Trouper-12B
A character roleplay model trained on the custom "Actors" dataset, fine-tuned from Mistral-Nemo-Base-12B. This model was made to expand on the things I learned from TinyRP, and to overcome certain limitations I found from it; also on an entirely new dataset made just for this model.
This model writes more naturally, less like "AI"; even more so than the 24B model I'm also releasing. I suppose this is because the 12B model saw less synthethic data, and is thus less likely to use phrases typical in AI writing & prose.
-> If you're looking for the larger model in this series: Prima-24B
Looking for feedback, so please do share if you got any!
Key Features
- Clean prose: Minimal AI slop patterns, natural speech
- Character depth: Handles emotional progression and vulnerability well
- Efficient: 12B size provides fast inference while maintaining quality
- Template-dependent: Requires Mistral-V3-Tekken for proper stop behavior
Recommended Settings
Use chat completion mode
- Temperature: 0.7 (tested and validated)
- Template: Mistral-V3-Tekken OR ChatML, some users reported better results with ChatML (critical for proper formatting and stop behavior)
- Context: Handles 15-20+ turn conversations effectively
- Prompt Preprocessing: Semi-strict, no tools
Strengths
- Writing Quality: Direct, concrete descriptions without purple prose
- Natural Dialogue: Speech patterns feel authentic, not performative
- Emotional Range: Handles vulnerability, humor, and character growth
- Structural Variety: Avoids formulaic response patterns
- Show Don't Tell: Trusts the reader, doesn't over-explain emotions
Comparison to Prima-24B
Trouper-12B and Prima-24B are trained on identical data but offer different trade-offs:
| Aspect | Trouper-12B | Prima-24B |
|---|---|---|
| Prose Style | Direct and concrete | Slightly more elaborate |
| AI Slop | Minimal | Moderate (some patterns) |
| Reliability | Good (template-sensitive) | Excellent |
| Long Context | Good (12B) | Better (24B) |
| Inference Speed | Faster (12B) | Slower (24B) |
| Setup Difficulty | Moderate (template critical) | Easy |
| Action RP | Good | Excellent |
| Emotional RP | Excellent | Good |
Choose Trouper-12B if: You want best prose quality, natural dialogue, and don't mind template setup
Choose Prima-24B if: You want reliability, long context, or action-oriented RP
Comparison to TinyRP-12B
This model addresses several issues found in my previous TinyRP-12B release:
| Aspect | TinyRP-12B | Trouper-12B |
|---|---|---|
| Formulaic patterns | Yes (after 20+ turns) | No |
| Character stagnation | Yes | No - characters evolve |
| Opening variety | Repetitive | Varied |
| Training data | Original dataset | Custom "Actors" dataset |
| Long conversations | Degrades | Maintains quality |
Known Limitations
- Template Sensitivity: Without Mistral-V3-Tekken, may generate meta-narration or continue past appropriate stopping points
- Occasional Meta-Breaks: Rare instances of stepping outside character (regenerate if needed)
- Context Window: While good for 15-20+ turns, may be outperformed by larger models at 50+ turns. please let me know how it works for you!
Got Feedback?
Issues, questions, or feedback welcome! Particularly interested in:
- Long conversation quality (20+ turns)
- Template compatibility findings
- Comparison with other RP models
Feel free to make a post in the Community tab here!
Why train on a base model?
According to this paper: Base Models Beat Aligned Models at Randomness and Creativity; and to avoid any possible "GPT-isms", I decided to train on a base model. Think of it as more mallable clay vs re-shaping something that was already formed to be something else.
This is what led to the behavior observed in this model, where the model just legitimately doesn't understand being an "assistant" outside of being a character that is an assistant. SO while the model is probably not useful outside of RP, it is also not intended to be.
- Downloads last month
- 31