Trouper-12B

A character roleplay model trained on the custom "Actors" dataset, fine-tuned from Mistral-Nemo-Base-12B. This model was made to expand on the things I learned from TinyRP, and to overcome certain limitations I found from it; also on an entirely new dataset made just for this model.

This model writes more naturally, less like "AI"; even more so than the 24B model I'm also releasing. I suppose this is because the 12B model saw less synthethic data, and is thus less likely to use phrases typical in AI writing & prose.

-> If you're looking for the larger model in this series: Prima-24B

Looking for feedback, so please do share if you got any!

Key Features

Clean prose: Minimal AI slop patterns, natural speech
Character depth: Handles emotional progression and vulnerability well
Efficient: 12B size provides fast inference while maintaining quality
Template-dependent: Requires Mistral-V3-Tekken for proper stop behavior

Recommended Settings

Use chat completion mode

Temperature: 0.7 (tested and validated)
Template: Mistral-V3-Tekken OR ChatML, some users reported better results with ChatML (critical for proper formatting and stop behavior)
Context: Handles 15-20+ turn conversations effectively
Prompt Preprocessing: Semi-strict, no tools

Strengths

Writing Quality: Direct, concrete descriptions without purple prose
Natural Dialogue: Speech patterns feel authentic, not performative
Emotional Range: Handles vulnerability, humor, and character growth
Structural Variety: Avoids formulaic response patterns
Show Don't Tell: Trusts the reader, doesn't over-explain emotions

Comparison to Prima-24B

Trouper-12B and Prima-24B are trained on identical data but offer different trade-offs:

Aspect	Trouper-12B	Prima-24B
Prose Style	Direct and concrete	Slightly more elaborate
AI Slop	Minimal	Moderate (some patterns)
Reliability	Good (template-sensitive)	Excellent
Long Context	Good (12B)	Better (24B)
Inference Speed	Faster (12B)	Slower (24B)
Setup Difficulty	Moderate (template critical)	Easy
Action RP	Good	Excellent
Emotional RP	Excellent	Good

Choose Trouper-12B if: You want best prose quality, natural dialogue, and don't mind template setup
Choose Prima-24B if: You want reliability, long context, or action-oriented RP

Comparison to TinyRP-12B

This model addresses several issues found in my previous TinyRP-12B release:

Aspect	TinyRP-12B	Trouper-12B
Formulaic patterns	Yes (after 20+ turns)	No
Character stagnation	Yes	No - characters evolve
Opening variety	Repetitive	Varied
Training data	Original dataset	Custom "Actors" dataset
Long conversations	Degrades	Maintains quality

Known Limitations

Template Sensitivity: Without Mistral-V3-Tekken, may generate meta-narration or continue past appropriate stopping points
Occasional Meta-Breaks: Rare instances of stepping outside character (regenerate if needed)
Context Window: While good for 15-20+ turns, may be outperformed by larger models at 50+ turns. please let me know how it works for you!

Got Feedback?

Issues, questions, or feedback welcome! Particularly interested in:

Long conversation quality (20+ turns)
Template compatibility findings
Comparison with other RP models

Feel free to make a post in the Community tab here!

Why train on a base model?

According to this paper: Base Models Beat Aligned Models at Randomness and Creativity; and to avoid any possible "GPT-isms", I decided to train on a base model. Think of it as more mallable clay vs re-shaping something that was already formed to be something else.

This is what led to the behavior observed in this model, where the model just legitimately doesn't understand being an "assistant" outside of being a character that is an assistant. SO while the model is probably not useful outside of RP, it is also not intended to be.