Qwen3-0.6B-Reasoning-Opus

This is a fine-tuned version of Qwen3-0.6B optimized for multi-step reasoning. It was trained using QLoRA on a filtered dataset of reasoning traces distilled from Claude 4.6 Opus. The goal of this project was to induce "System 2" (deliberate) thinking in a sub-1B parameter model.

Model Details

Developed by: Shreyansh Pathak
Institution: Dayananda Sagar College of Engineering (DSCE), Bangalore
Model type: Causal Language Model with Chain-of-Thought (CoT) capabilities
Base Model: Qwen/Qwen3-0.6B
Language(s): English
Fine-tuning Technique: QLoRA (Unsloth)
Rank (r): 16
Alpha: 16
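
For intuition on what rank 16 and alpha 16 imply, the sketch below computes the standard LoRA scaling factor (alpha / r) and the number of extra trainable parameters that an adapter adds to a single weight matrix. The 1024 x 1024 layer shape is a hypothetical chosen for illustration, not a dimension read from this model's config, and the function names are illustrative.

```python
def lora_scaling(alpha: float, r: int) -> float:
    """Standard LoRA scaling applied to the low-rank update B @ A."""
    return alpha / r


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Parameters in A (r x d_in) plus B (d_out x r)."""
    return r * (d_in + d_out)


r, alpha = 16, 16            # values from the model details above
d_in = d_out = 1024          # hypothetical layer shape, for illustration only

scaling = lora_scaling(alpha, r)
full = d_in * d_out
extra = lora_param_count(d_in, d_out, r)

print(f"scaling = {scaling}")
print(f"adapter params = {extra} ({extra / full:.3%} of the full matrix)")
```

With alpha equal to r, the scaling factor is 1.0, so the low-rank update is added to the frozen weights without rescaling.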

Performance & Evaluation

The model was evaluated against the base Qwen3-0.6B model on a 50-sample random subset of the GSM8K (Grade School Math) benchmark to measure logical consistency and arithmetic accuracy.

Model	GSM8K Accuracy (n=50)	Improvement
Base Qwen3-0.6B	26.0%	Baseline
Qwen3-0.6B-Reasoning-Opus	32.0%	+6.0 percentage points (absolute)
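
The evaluation script is not included in this card. As a rough sketch of how GSM8K accuracy is commonly scored, the example below extracts the final number from a reference solution (GSM8K references end with "#### <answer>") and from a model completion, then compares the two. The helper names and the "last number after the think block" heuristic are assumptions for illustration, not the author's method.

```python
import re

# Matches integers and decimals, optionally with thousands separators.
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def reference_answer(solution: str) -> str:
    """GSM8K reference solutions end with '#### <answer>'."""
    return solution.split("####")[-1].strip().replace(",", "")


def predicted_answer(completion: str):
    """Assumed heuristic: the last number that appears after </think>."""
    visible = completion.split("</think>")[-1]
    numbers = _NUMBER.findall(visible)
    return numbers[-1].replace(",", "") if numbers else None


def is_correct(completion: str, solution: str) -> bool:
    pred = predicted_answer(completion)
    return pred is not None and float(pred) == float(reference_answer(solution))


def accuracy(completions, solutions) -> float:
    correct = sum(is_correct(c, s) for c, s in zip(completions, solutions))
    return correct / len(solutions)


# Toy data, not taken from the actual evaluation run.
completions = [
    "<think>10 - 3 - 2 = 5</think>\nThe answer is 5.",
    "I think it is 12.",
]
solutions = ["10 - 3 - 2 = 5\n#### 5", "3 * 5 = 15\n#### 15"]
print(accuracy(completions, solutions))
```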

Key Findings

Reasoning Activation: The fine-tuned model successfully triggers a <think> block for complex queries, whereas the base model typically provides direct, often incorrect, answers.
Alignment Tax: While math accuracy increased, the model exhibits some "overthinking" on simple logic riddles, a common trade-off in small-parameter reasoning models.
Relative Gain: The 6-point absolute gain corresponds to a 23% relative improvement in GSM8K accuracy over the base model (26.0% to 32.0%).
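
The reported percentages can be checked with a few lines of arithmetic. With n=50, accuracies of 26.0% and 32.0% correspond to 13 and 16 correct answers, so the gain amounts to three additional problems. Variable names are illustrative.

```python
n = 50
base_correct = round(0.26 * n)    # 13 correct answers
tuned_correct = round(0.32 * n)   # 16 correct answers

# Absolute gain is measured in percentage points; relative gain is
# measured against the base model's score.
absolute_gain_pp = (tuned_correct - base_correct) / n * 100
relative_gain_pct = (tuned_correct - base_correct) / base_correct * 100

print(f"{absolute_gain_pp:.1f} pp absolute, {relative_gain_pct:.1f}% relative")
# prints: 6.0 pp absolute, 23.1% relative
```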

Training Procedure

Hardware: NVIDIA Tesla T4 (via Google Colab)
Optimizer: AdamW (8-bit)
Learning Rate: 2e-4
Batch Size: 2 (Gradient Accumulation: 4)
Training Steps: 60
Dataset: nohurry/Opus-4.6-Reasoning-3000x-filtered
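
Taken together, these settings imply a small training budget. The calculation below is a sketch that assumes a single GPU and that each training step is one optimizer update after gradient accumulation, which is the usual convention in Hugging Face trainers.

```python
per_device_batch_size = 2
gradient_accumulation_steps = 4
training_steps = 60
num_devices = 1  # assumption: a single Tesla T4

# Examples contributing to each optimizer update.
effective_batch_size = (
    per_device_batch_size * gradient_accumulation_steps * num_devices
)
# Total examples processed over the whole run.
examples_seen = effective_batch_size * training_steps

print(effective_batch_size, examples_seen)  # prints: 8 480
```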

Usage

This model uses the standard Qwen3 chat template but is optimized to generate reasoning traces inside <think> tags.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Shreyansh327/Qwen3-0.6B-Reasoning-Opus"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "If I have 10 oranges and give 3 to John and 2 to Mary, how many are left?"
messages = [{"role": "user", "content": prompt}]

# Build the chat-formatted prompt; return_dict=True also returns the attention mask.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens (reasoning trace plus final answer).
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
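
Because the generated text contains both the reasoning trace and the final answer, it is often convenient to separate them. The helper below is a minimal sketch: split_reasoning is a hypothetical name, and it assumes the trace is wrapped in a single <think>...</think> pair as described above.

```python
import re


def split_reasoning(text: str):
    """Return (reasoning, answer) from text with one <think>...</think> block."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        # No reasoning block was produced; treat everything as the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer


# Hand-written sample string, not actual model output.
sample = "<think>10 - 3 - 2 = 5</think>\n\n5 oranges are left."
reasoning, answer = split_reasoning(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```

If generation stops at max_new_tokens before the closing tag is emitted, the helper falls back to returning the whole text as the answer, so callers may want to check for an empty reasoning string.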
