---
license: apache-2.0
base_model: unsloth/Qwen3-1.7B
tags:
- unsloth
- qwen3
- mathematical-reasoning
- sft
- anti-overfitting
language:
- en
pipeline_tag: text-generation
library_name: transformers
---

# Qwen3-1.7B Math SFT - Anti-Overfitting Version

A full fine-tune of unsloth/Qwen3-1.7B for mathematical reasoning, trained with anti-overfitting measures based on the paper "A Practical Two-Stage Recipe for Mathematical LLMs".

## Training Details
- **Base Model**: unsloth/Qwen3-1.7B
- **Parameters**: 1,720,032,256 (all fine-tuned)
- **Epochs**: 10
- **Batch Size**: 8
- **Learning Rate**: 5e-06 (reduced for stability)
- **Weight Decay**: 0.1 (increased regularization)
- **Approach**: Full model training with anti-overfitting measures

## Anti-Overfitting Measures
- Reduced learning rate: 5e-06
- Increased weight decay: 0.1
- Extended warmup: 10% of steps
- Early stopping on validation loss
- Regular evaluation checkpoints
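
The warmup and early-stopping measures above can be sketched in plain Python. This is an illustration only, not the training code used for this model: the helper names `warmup_lr` and `should_stop`, the linear warmup shape, and the `patience` value are assumptions, and the card does not describe the schedule after warmup, so the sketch simply holds the peak rate.

```python
def warmup_lr(step, total_steps, peak_lr=5e-6, warmup_fraction=0.10):
    """Linear warmup to peak_lr over the first warmup_fraction of steps (assumed shape)."""
    warmup_steps = max(1, int(total_steps * warmup_fraction))
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    # The schedule after warmup is not described in this card; hold the peak rate.
    return peak_lr


def should_stop(val_losses, patience=3, min_delta=0.0):
    """True once the last `patience` evaluations failed to beat the earlier best loss.

    `patience` and `min_delta` are hypothetical values chosen for illustration.
    """
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    recent_best = min(val_losses[-patience:])
    return recent_best > best_before - min_delta


if __name__ == "__main__":
    total_steps = 1000
    for step in (0, 49, 99, 500):
        print(step, warmup_lr(step, total_steps))
    print(should_stop([1.0, 0.8, 0.7, 0.71, 0.72, 0.73]))
```

With 1,000 total steps, the learning rate ramps up over the first 100 steps and training would halt once three consecutive evaluations fail to improve on the best validation loss seen before them.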

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Cbgcbg/qwen3-1.7b-math-sft-antioverfitting-20250724_165951",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Cbgcbg/qwen3-1.7b-math-sft-antioverfitting-20250724_165951")

messages = [
    # The backslash is doubled so Python does not read "\b" as a backspace character
    {"role": "system", "content": "Please reason step by step, and put your final answer within \\boxed{}."},
    {"role": "user", "content": "What is 2+2?"}
]

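# add_generation_prompt=True appends the assistant turn marker so the model produces a reply
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)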
```
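
Because the system prompt asks for the final answer inside `\boxed{}`, it can be handy to pull that answer out of the decoded text. The helper below is a minimal sketch, not part of this model's tooling: `extract_boxed` is a hypothetical name, and it assumes the last `\boxed{...}` in the text holds the final answer.

```python
def extract_boxed(text):
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    i = start + len(marker)
    depth = 1  # already inside the opening brace of \boxed{
    chars = []
    while i < len(text):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
        i += 1
    return None  # braces never closed, e.g. generation was cut off


if __name__ == "__main__":
    print(extract_boxed("2 + 2 = 4, so the answer is \\boxed{4}."))
    print(extract_boxed("The result is \\boxed{\\frac{1}{2}}"))
```

Tracking brace depth rather than using a simple regular expression keeps nested LaTeX such as `\frac{1}{2}` intact, and returning `None` for unbalanced braces flags outputs truncated by `max_new_tokens`.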

Training timestamp: 20250724_165951