M3.5-128B-Animus-V14.0-EXL3
Send me your support to help me feed the data beast! Also taking commissions for universe-specific models.
Support on Ko-fi
Important: Chat Template
This model uses the Mistral V7 Tekken-R instruction template, an extension of the V7 Tekken format that adds a [MODEL_SETTINGS] block between the system prompt and the first user turn. This block controls reasoning effort. Ensure your client is configured correctly to avoid degraded performance.
Human-Readable Format (with reasoning on):
[SYSTEM_PROMPT]Your system prompt here[/SYSTEM_PROMPT][MODEL_SETTINGS]{"reasoning_effort": "high"}[/MODEL_SETTINGS][INST]user message[/INST][THINK]...prefill reasoning here...[/THINK]assistant reply</s>
Human-Readable Format (with reasoning off):
[SYSTEM_PROMPT]Your system prompt here[/SYSTEM_PROMPT][MODEL_SETTINGS]{"reasoning_effort": "none"}[/MODEL_SETTINGS][INST]user message[/INST]assistant reply</s>
Key differences from V7 Tekken:
- A [MODEL_SETTINGS] block is injected after the system prompt and before [INST], passing a JSON object to control reasoning.
- Set "reasoning_effort": "high" to enable chain-of-thought reasoning, or "none" to skip it entirely for faster responses.
- When reasoning is enabled, the assistant turn can be prefilled with a [THINK] block to guide the model's internal reasoning before generating its reply.
⚠️ Text Completion (SillyTavern, KoboldCpp, llama.cpp, TabbyAPI, etc.): You must manage the token sequence manually. Use the Mistral V7 Tekken-R template in your frontend and ensure [MODEL_SETTINGS] is injected at the correct position in the context. The [THINK] prefill is optional but recommended when reasoning is active.
Example with [THINK] prefill (roleplay use case):
[SYSTEM_PROMPT]Prompt[/SYSTEM_PROMPT][MODEL_SETTINGS]{"reasoning_effort": "high"}[/MODEL_SETTINGS][INST]user message[/INST][THINK]I must write in first person perspective the next reply only from {{user}} perspective, using I/My for {{user}}. {{input}}, I will make a short description and dialogue block writing now:[/THINK]assistant reply</s>
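For text-completion backends, the sequence above can be assembled programmatically. The following is a minimal Python sketch based only on the human-readable formats shown in this section. The function name `build_prompt` and its parameters are illustrative and not part of any official client; it covers a single turn only and leaves BOS token handling to the backend.

```python
import json


def build_prompt(system_prompt, user_message, reasoning_effort="high", think_prefill=None):
    """Assemble a single-turn Mistral V7 Tekken-R prompt string.

    Hypothetical helper: it mirrors the human-readable format from this
    model card and stops where the model is expected to start generating.
    """
    if reasoning_effort not in ("high", "none"):
        # Only these two values appear in the model card.
        raise ValueError("reasoning_effort must be 'high' or 'none'")

    # json.dumps produces {"reasoning_effort": "high"}, matching the card.
    settings = json.dumps({"reasoning_effort": reasoning_effort})
    prompt = (
        f"[SYSTEM_PROMPT]{system_prompt}[/SYSTEM_PROMPT]"
        f"[MODEL_SETTINGS]{settings}[/MODEL_SETTINGS]"
        f"[INST]{user_message}[/INST]"
    )

    # The [THINK] prefill is optional and only applied when reasoning is on.
    if think_prefill and reasoning_effort == "high":
        prompt += f"[THINK]{think_prefill}[/THINK]"
    return prompt


print(build_prompt("Your system prompt here", "user message", "none"))
```

The returned string is what would be sent to the completion endpoint; the model's reply and the closing `</s>` are produced by generation rather than by the prompt.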
Quantized Models
The quantized model files are available for download. Click the buttons below to view the files.
Download GGUF Files →
How to Download
You can download specific model quantizations using the Hugging Face Command Line Interface (CLI). This allows you to select the exact version you need.
1. Install huggingface-hub with CLI support:
pip install -U "huggingface_hub[cli]"
2. Download a specific quant:
Use the command below, replacing the revision with the desired model version from the repository's branches.
hf download Darkhn-Quants-4/M3.5-128B-Animus-V14.0-EXL3 --revision "6.0bpw_H16" --local-dir ./M3.5-128B-Animus-V14.0-6.0bpw_H16-EXL3
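The same download can also be scripted from Python with the `huggingface_hub` library installed in step 1. This is a sketch rather than the author's tooling: `local_dir_for`, `list_quant_branches`, and `download_quant` are hypothetical helper names, and the local folder naming simply imitates the CLI example above. The last two functions require network access.

```python
def local_dir_for(repo_id, revision):
    """Build a local folder name like the one in the CLI example (illustrative)."""
    name = repo_id.split("/")[-1]
    suffix = "-EXL3"
    if name.endswith(suffix):
        # Place the quant name before the format suffix, as in the CLI example.
        return f"./{name[:-len(suffix)]}-{revision}{suffix}"
    return f"./{name}-{revision}"


def list_quant_branches(repo_id):
    """Return the repository's branch names (each quant is stored on a branch)."""
    # Imported lazily so the naming helper works without the library installed.
    from huggingface_hub import HfApi
    return [branch.name for branch in HfApi().list_repo_refs(repo_id).branches]


def download_quant(repo_id, revision):
    """Download one branch of the repository into a local folder."""
    from huggingface_hub import snapshot_download
    return snapshot_download(
        repo_id=repo_id,
        revision=revision,
        local_dir=local_dir_for(repo_id, revision),
    )


# Example usage (performs network requests, so it is left commented out):
# repo = "Darkhn-Quants-4/M3.5-128B-Animus-V14.0-EXL3"
# print(list_quant_branches(repo))
# download_quant(repo, "6.0bpw_H16")
```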
Character Card & Lore Book
For the best roleplaying experience, it is highly recommended to use the provided character card and lore book. These files help guide the model's persona and provide rich, in-universe context.
Download Files →
Sampler Presets
For a seamless setup in SillyTavern, you can download pre-configured sampler presets. These are tuned to provide an optimal balance between creativity and narrative coherence for this model.
Simply download the .json file below and import it into SillyTavern's sampler presets menu.
Temperature: 1.0
Min P: 0.02
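For readers unfamiliar with Min P, the sketch below illustrates the commonly used definition: after temperature scaling, any token whose probability is below `min_p` times the top token's probability is discarded, and the rest are renormalized. It is an illustration only. The function name `min_p_filter` is hypothetical, and applying temperature before Min P is an assumption, since the sampler order differs between backends.

```python
import math


def min_p_filter(logits, temperature=1.0, min_p=0.02):
    """Return renormalized token probabilities after temperature and Min P."""
    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Drop tokens whose probability is below min_p * (top probability).
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]

    # Renormalize the surviving probabilities so they sum to 1.
    norm = sum(kept)
    return [p / norm for p in kept]


# The third token is far less likely than the top one, so it is removed.
print(min_p_filter([5.0, 4.0, 0.0]))
```

With a threshold as low as 0.02, only very unlikely tokens are removed, so most of the variety produced at temperature 1.0 is preserved.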
Roleplay Format Guide
For the best results, use this structured format. This helps the AI clearly distinguish between actions, inner thoughts, and dialogue.
- Actions / Descriptions
  *He walked across the room and stared out the window.*
- Inner Thoughts
  *-I wonder what she's thinking.-*
- Dialogue
  Alex (Curious): "What do you see out there?"
Standard novel-style formatting is also understood, but this structured format is preferred for clarity.
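If messages are generated or post-processed by a script, the three conventions above can be produced and recognized with a few small helpers. This is a sketch under the assumption that each element sits on its own line; the function names and regular expressions are illustrative and not part of the model or of any frontend.

```python
import re


def action(text):
    """Wrap an action or description in asterisks."""
    return f"*{text}*"


def thought(text):
    """Wrap an inner thought in the *-...-* markers."""
    return f"*-{text}-*"


def dialogue(name, emotion, line):
    """Format a spoken line as: Name (Emotion): "line"."""
    return f'{name} ({emotion}): "{line}"'


def classify_line(line):
    """Guess which of the three formats a single line uses."""
    line = line.strip()
    # Thoughts are checked first because they also start and end with '*'.
    if re.fullmatch(r"\*-.+-\*", line):
        return "thought"
    if re.fullmatch(r"\*.+\*", line):
        return "action"
    if re.fullmatch(r'[^:()]+ \([^)]+\): ".*"', line):
        return "dialogue"
    return "other"


print(dialogue("Alex", "Curious", "What do you see out there?"))
```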
Roleplay Example
Click the button below to view a full, unedited chatlog demonstrating the model's narrative style and character portrayal.
View Chatlog Example →
Model Description
This is Version 14.0 of the Animus series. V14.0 is built on Mistral-Medium-3.5-128B, offering a massive leap in parameter count and underlying logic compared to previous versions.
V14.0's strength comes from a novel dataset designed to teach the model the why behind the lore, not just the what. The training data has been heavily expanded for this version:
- Base Samples Doubled: The foundation of in-character study sessions and uncensored roleplays has been doubled in size (14,000) to deepen contextual understanding.
- 1,000 Instruction Q&A Samples: Additional Wings of Fire-based instruction formatting.
- 1,000 NSFW/BAD Ending Samples: Non-Wings of Fire scenarios added to diversify narrative flexibility and handle darker, complex outcomes.
The result is a model with exceptionally strong prose and a deep grasp of in-universe lore, making for a highly immersive and accurate roleplaying experience.
Note for roleplay: the model closely follows the system prompt and the first message. If the first assistant message is short, the following messages will tend to be short as well.
Training Details
V14.0 Training Process
V14.0 marks a shift from model merging to a focused, direct fine-tuning approach. This allows for greater control over the final model's characteristics.
- Base Model: mistralai/Mistral-Medium-3.5-128B
- Hardware: 1x B200
- Training Time: 56 hours
- Epochs: 2
- Rank: 256
- Alpha: 256
- RsLora: true
- Scaling: extra Lora scaling multiplier of x1.1
- End Scaling: x17.6
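The End Scaling figure is consistent with rank-stabilized LoRA (rsLoRA), which scales the adapter by alpha / sqrt(rank) instead of the standard alpha / rank. The arithmetic below is a reconstruction from the hyperparameters listed above, not the author's training code, and the variable names are illustrative.

```python
import math

rank = 256
alpha = 256
extra_multiplier = 1.1  # the "extra LoRA scaling multiplier" listed above

# Standard LoRA would scale the adapter by alpha / rank.
standard_scaling = alpha / rank

# rsLoRA scales by alpha / sqrt(rank) instead.
rslora_scaling = alpha / math.sqrt(rank)

# Applying the extra multiplier on top of the rsLoRA scaling.
end_scaling = rslora_scaling * extra_multiplier

print(standard_scaling, rslora_scaling, round(end_scaling, 1))
```

Under these assumptions the rsLoRA factor is 16, and 16 x 1.1 gives the listed x17.6.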
Training Dataset
The V14.0 dataset has been significantly expanded from previous versions:
- Doubled Base Dataset (14,000 examples): The original foundation of In-Character Q&A and Uncensored Roleplay examples was doubled to reinforce the lore foundation and enhance roleplay quality.
- Instruction Q&A (1,000 examples): Additional Wings of Fire-based instruction Q&A sets.
- NSFW / Bad Endings (1,000 examples): Non-Wings of Fire scenarios specifically targeting mature themes and bad endings to widen the model's range of dramatic narrative capabilities.
All datasets underwent a rigorous cleaning process to remove formatting artifacts, resulting in a cleaner and more natural narrative style.
Intended Use & Limitations
- Intended Use: The primary purpose of this model is for creative roleplay within the Wings of Fire universe. However, user feedback indicates it is also highly effective for general-purpose roleplaying.
- Limitations & Quirks:
- Performance on tasks outside of its training domain (general knowledge, coding, etc.) is not guaranteed and will likely be poor.
- Versatility: While it appears to be only a Wings of Fire tuned model, users have reported it is very capable of performing normal roleplay with other settings and characters.
- The model may "hallucinate" or generate plausible but non-canonical information, especially when pushed outside the established "what-if" scenarios.
- Content: The training data includes mature and darker themes from the Wings of Fire series, such as conflict, character death, and moral ambiguity. The model is capable of generating content reflecting these themes. As always, it is up to the user what they do with it.
- Formatting: Training data was cleaned to remove narrative artifacts like **scene transitions**. The model should now produce cleaner prose.
- Safety: This model has not undergone additional safety alignment beyond what was included in its base model. Standard responsible AI practices should be followed.
Acknowledgements
- Credit to Mistral AI for the Mistral-Medium-3.5-128B architecture.
- Credit to Google for the Gemini Pro model, used in dataset generation.
- Credit to Anthropic for Claude Sonnet, used in dataset generation.
- Credit to Hangzhou DeepSeek Artificial Intelligence for the DeepSeek model, used in dataset generation.
- Credit to Moonshot AI for the Kimi K2 model, used in dataset generation.