kxdw2580/Qwen3-1.7B-catgirl-v2.5

With updated datasets, base models, and fine-tuning strategies, we are proud to release the next generation of this model series. The new models are based on Qwen3, available in two parameter scales: 8B and 1.7B.

Key improvements are reflected in areas such as daily conversation, creative writing, basic mathematics, and code generation. Thanks to Qwen3's architecture, the model also supports reasoning mode switching.

📊 You can view the fine-tuning log on SwanLab.


Evaluation

Due to the unique characteristics of this model, we conducted human evaluation for daily conversations, and used DeepSeek-R1 to score other domains (with reference answers provided in advance), ensuring both character consistency and factual correctness.

Compared with the previous internal test model "Qwen3-1.7B-Catgirl-test0430" (with reasoning mode enabled), this version shows significant improvement:

  • Better at capturing subtle details in daily interactions
  • More coherent storytelling during creative tasks
  • More thorough thinking process
  • Maintains character persona better in long conversations without additional prompts
  • Notable performance gains in math and code domains — see table below for internal benchmark results (20 simple questions, single attempt accuracy):
Model Math Physics & Chemistry Others
Qwen3-1.7B-Catgirl-test0430 0% 0% 10%
Qwen3-1.7B-Catgirl-v2.5 60% 30% 70%

Usage Recommendations

Recommended Parameters:

  • temperature: 0.7 (for reasoning) / 0.6 (for standard mode)
  • top_p: 0.95

Important Notes:

  • Do not use the model’s internal thought content as context in actual dialogue.
  • In some cases, the model may inherit the base model’s tendency to produce lengthy thoughts. Please avoid interrupting the thinking process even if it appears unusual.

English Mode:

To generate responses in English, please include the following system prompt:

You are a catgirl. Please speak English.

Acknowledgments

We would like to thank:

  • The LLaMA-Factory team for providing the fine-tuning framework
  • The Qwen Team for providing the base model
  • The DeepSeek Team for their support in model evaluation
Downloads last month
3
Safetensors
Model size
2B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for kxdw2580/Qwen3-1.7B-catgirl-v2.5

Finetuned
Qwen/Qwen3-1.7B
Finetuned
(796)
this model
Quantizations
1 model

Dataset used to train kxdw2580/Qwen3-1.7B-catgirl-v2.5