---
license: apache-2.0
base_model: unsloth/Qwen3-1.7B
tags:
- unsloth
- qwen3
- mathematical-reasoning
- sft
- anti-overfitting
language:
- en
pipeline_tag: text-generation
library_name: transformers
---

# Qwen3-1.7B Math SFT - Anti-Overfitting Version

A supervised fine-tune of Qwen3-1.7B for mathematical reasoning, trained with anti-overfitting measures based on the paper "A Practical Two-Stage Recipe for Mathematical LLMs".

## Training Details

- **Base Model**: unsloth/Qwen3-1.7B
- **Parameters**: 1,720,032,256 (all fine-tuned)
- **Epochs**: 10
- **Batch Size**: 8
- **Learning Rate**: 5e-06 (reduced for stability)
- **Weight Decay**: 0.1 (increased regularization)
- **Approach**: Full model training with anti-overfitting measures

## Anti-Overfitting Measures

- Reduced learning rate: 5e-06
- Increased weight decay: 0.1
- Extended warmup: 10% of steps
- Early stopping on validation loss
- Regular evaluation checkpoints

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Cbgcbg/qwen3-1.7b-math-sft-antioverfitting-20250724_165951"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [
    # Double backslash: a bare "\b" in a Python string is a backspace character.
    {"role": "system", "content": "Please reason step by step, and put your final answer within \\boxed{}."},
    {"role": "user", "content": "What is 2+2?"}
]

# add_generation_prompt=True appends the assistant turn marker so the model starts answering.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

Training timestamp: 20250724_165951
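
For readers who want to see how two of the anti-overfitting measures listed above could work, here is a minimal pure-Python sketch of a 10% linear warmup and of early stopping on validation loss. It is not the actual training code for this model. The names `warmup_lr` and `EarlyStopper`, the `patience` and `min_delta` values, and the constant learning rate after warmup are all assumptions made for illustration; only the peak learning rate (5e-06) and the warmup fraction (10%) come from this card.

```python
from dataclasses import dataclass


def warmup_lr(step: int, total_steps: int,
              peak_lr: float = 5e-6, warmup_ratio: float = 0.1) -> float:
    """Linear warmup to peak_lr over the first warmup_ratio of steps.

    Assumption: the rate stays constant after warmup; the card does not
    say which schedule was used afterwards.
    """
    warmup_steps = max(1, int(total_steps * warmup_ratio))
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    return peak_lr


@dataclass
class EarlyStopper:
    """Stops training once validation loss has not improved for `patience` evaluations."""
    patience: int = 2        # hypothetical value, not from the card
    min_delta: float = 0.0   # hypothetical value, not from the card
    best: float = float("inf")
    bad_evals: int = 0

    def should_stop(self, val_loss: float) -> bool:
        if val_loss < self.best - self.min_delta:
            # New best validation loss: remember it and reset the counter.
            self.best = val_loss
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience


if __name__ == "__main__":
    stopper = EarlyStopper(patience=2)
    # Made-up validation losses, for demonstration only.
    for epoch, loss in enumerate([1.00, 0.80, 0.90, 0.85]):
        if stopper.should_stop(loss):
            print(f"stopping after evaluation {epoch}, best loss {stopper.best}")
            break
```

In a `transformers` training loop, the same behavior is usually obtained through the trainer's built-in warmup and early-stopping options rather than hand-written helpers like these.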