Instructions to use ReadyArt/Serenity-26B-A4B-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use ReadyArt/Serenity-26B-A4B-GGUF with llama-cpp-python:
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="ReadyArt/Serenity-26B-A4B-GGUF", filename="Serenity-26B-A4B-HB16-Q6_K.gguf", )
llm.create_chat_completion( messages = "No input example has been defined for this model task." )
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- llama.cpp
How to use ReadyArt/Serenity-26B-A4B-GGUF with llama.cpp:
Install (macOS, Linux)
curl -LsSf https://llama.app/install.sh | sh # Start a local OpenAI-compatible server with a web UI: llama serve -hf ReadyArt/Serenity-26B-A4B-GGUF:Q4_K_M # Run inference directly in the terminal: llama cli -hf ReadyArt/Serenity-26B-A4B-GGUF:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama serve -hf ReadyArt/Serenity-26B-A4B-GGUF:Q4_K_M # Run inference directly in the terminal: llama cli -hf ReadyArt/Serenity-26B-A4B-GGUF:Q4_K_M
Use pre-built binary
# Download pre-built binary from: # https://github.com/ggerganov/llama.cpp/releases # Start a local OpenAI-compatible server with a web UI: ./llama-server -hf ReadyArt/Serenity-26B-A4B-GGUF:Q4_K_M # Run inference directly in the terminal: ./llama-cli -hf ReadyArt/Serenity-26B-A4B-GGUF:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git cd llama.cpp cmake -B build cmake --build build -j --target llama-server llama-cli # Start a local OpenAI-compatible server with a web UI: ./build/bin/llama-server -hf ReadyArt/Serenity-26B-A4B-GGUF:Q4_K_M # Run inference directly in the terminal: ./build/bin/llama-cli -hf ReadyArt/Serenity-26B-A4B-GGUF:Q4_K_M
Use Docker
docker model run hf.co/ReadyArt/Serenity-26B-A4B-GGUF:Q4_K_M
- LM Studio
- Jan
- Ollama
How to use ReadyArt/Serenity-26B-A4B-GGUF with Ollama:
ollama run hf.co/ReadyArt/Serenity-26B-A4B-GGUF:Q4_K_M
- Unsloth Studio
How to use ReadyArt/Serenity-26B-A4B-GGUF with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for ReadyArt/Serenity-26B-A4B-GGUF to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for ReadyArt/Serenity-26B-A4B-GGUF to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for ReadyArt/Serenity-26B-A4B-GGUF to start chatting
- Pi
How to use ReadyArt/Serenity-26B-A4B-GGUF with Pi:
Start the llama.cpp server
# Install llama.cpp: brew install llama.cpp # Start a local OpenAI-compatible server: llama serve -hf ReadyArt/Serenity-26B-A4B-GGUF:Q4_K_M
Configure the model in Pi
# Install Pi: npm install -g @mariozechner/pi-coding-agent # Add to ~/.pi/agent/models.json: { "providers": { "llama-cpp": { "baseUrl": "http://localhost:8080/v1", "api": "openai-completions", "apiKey": "none", "models": [ { "id": "ReadyArt/Serenity-26B-A4B-GGUF:Q4_K_M" } ] } } }Run Pi
# Start Pi in your project directory: pi
- Hermes Agent new
How to use ReadyArt/Serenity-26B-A4B-GGUF with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp: brew install llama.cpp # Start a local OpenAI-compatible server: llama serve -hf ReadyArt/Serenity-26B-A4B-GGUF:Q4_K_M
Configure Hermes
# Install Hermes: curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash hermes setup # Point Hermes at the local server: hermes config set model.provider custom hermes config set model.base_url http://127.0.0.1:8080/v1 hermes config set model.default ReadyArt/Serenity-26B-A4B-GGUF:Q4_K_M
Run Hermes
hermes
- Atomic Chat new
- Docker Model Runner
How to use ReadyArt/Serenity-26B-A4B-GGUF with Docker Model Runner:
docker model run hf.co/ReadyArt/Serenity-26B-A4B-GGUF:Q4_K_M
- Lemonade
How to use ReadyArt/Serenity-26B-A4B-GGUF with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/ lemonade pull ReadyArt/Serenity-26B-A4B-GGUF:Q4_K_M
Run and chat with the model
lemonade run user.Serenity-26B-A4B-GGUF-Q4_K_M
List all available models
lemonade list
Serenity-26B-A4B
Balance achieved. Control surrendered.
🌊 What Is Serenity?
Born from the convergence of two lineages, Serenity-26B-A4B harmonizes the best of both worlds — the expressive depth of Melody and the immersive presence of Darkside — into a single, balanced model.
- 🧠 Dual Heritage: Combines Melody's male → female perspective with Darkside's female → male perspective
- ⚖️ Balanced Tone: Refined personality that flows naturally without overwhelming the narrative
- 🎭 Enhanced Depth: LoRA 80 captures nuanced behavioral patterns from both parent approaches
🧬 Synthetic Life Engine
Dataset generated using our advanced Character Engine and Emotional Engine, creating genuine life and emotional resonance in every interaction.
🎭 Character Engine
Ensures consistent personality traits, speech patterns, and behavioral logic across all contexts. Characters remain true to themselves throughout.
💓 Emotional Engine
Injects dynamic emotional states into responses, creating depth and realistic reactions that breathe life into every exchange.
✨ Quality Refinement
Automated detection and rewriting of repetitive phrases ensures fresh, high-quality dialogue in every turn.
💬 Dialogue Integrity
Advanced quote normalization ensures balanced dialogue markers, preventing formatting errors and maintaining immersion.
🧠 Training Details & Parameters
Fine-tuned using LoRA (Low-Rank Adaptation) for efficient and targeted weight adjustment, preserving the base model's capabilities while imprinting new behavioral patterns.
🔢 Epochs
Full passes through the training dataset for thorough learning
📐 LoRA Rank
Higher rank for richer adaptation and nuanced expression
| Parameter | Value |
|---|---|
| Training Method | LoRA (Low-Rank Adaptation) |
| LoRA Rank (r) | 80 |
| Epochs | 2 |
| Trained Layers | Text layers only |
- 🎯 Method: LoRA fine-tuning for parameter-efficient adaptation, preserving base knowledge while imprinting new behaviors
- ⚡ Efficiency: Only a fraction of parameters updated, keeping the model lean and responsive
- 🎭 Result: Maintains coherence while exhibiting the desired personality traits and interaction style
🔮 Training Process
Model weights subjected to iterative refinement during data creation. Each conversation underwent multiple checks for stability and alignment.
- 🔄 Multi-Turn Generation: Conversations built turn-by-turn for natural context flow
- 🛡️ Refusal Filtering: Automated systems detected and removed unwanted refusals
- 🧹 Slop Cleaning: Undesirable phrases identified and rewritten by dedicated assistant models
📚 Dataset Overview
Model trained on a specialized adult-oriented roleplay dataset with diverse scenarios and emotional contexts, drawing from the strengths of both parent lineages.
- 🔞 Content Rating: Strictly 18+ (Adults Only)
- 🎭 Focus: Mature themes, immersive roleplay, uncensored dialogue
📜 Version Notes
- ⚠️ Content Warning: This model can generate mature content. Use responsibly.
- 🧹 Clean Output: Reasoning tags and excessive markdown stripped for cleaner roleplay
- 🔀 Hybrid Lineage: Combines the male → female perspective of Melody1437 with the female → male perspective of Darkside into a unified, balanced model
⚙️ Configuration
🎛️ Sampler Settings
Recommended parameters for optimal output
📦 GGUF Quantizations
Available formats for local inference
- Q4_K_MDecent Quality
- Q5_K_MRecommended
- Q6_KHigh Quality
- Q8_0Near Lossless
💖 Credits
-
GECFDO — Data Generation & iMatrix Quants
-
Sleep Deprived — Dataset Generator
-
FrenzyBiscuit — Fine-Tuning & Dataset Creation
🔖 License & Usage
- 📜 Built upon the Apache 2.0 license
- 🛡️ You accept full responsibility for all outputs generated
- 🔞 You confirm you are at least 18 years old
- 🌍 Creators bear no responsibility for how the model is used
- 🏡 For personal use only (non-profit/non-commercial) to the extent legally allowed
- Downloads last month
- 140,599
4-bit
5-bit
6-bit
8-bit
Model tree for ReadyArt/Serenity-26B-A4B-GGUF
Base model
google/gemma-4-26B-A4B