How to use Shreyansh327/Qwen3-0.6B-Reasoning-Opus-LoRA with Unsloth Studio:

# macOS or Linux (shell): install Unsloth Studio
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Shreyansh327/Qwen3-0.6B-Reasoning-Opus-LoRA to start chatting

# Windows (PowerShell): install Unsloth Studio
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Shreyansh327/Qwen3-0.6B-Reasoning-Opus-LoRA to start chatting

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Shreyansh327/Qwen3-0.6B-Reasoning-Opus-LoRA to start chatting

To load the model in Python with Unsloth:
pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    model_name="Shreyansh327/Qwen3-0.6B-Reasoning-Opus-LoRA",
    max_seq_length=2048,
)

This is a fine-tuned version of Qwen3-0.6B optimized for multi-step reasoning. It was trained using QLoRA on a filtered dataset of reasoning traces distilled from Claude 4.6 Opus. The goal of this project was to induce "System 2" (deliberate) thinking in a sub-1B parameter model.
The model was evaluated against the base Qwen3-0.6B model on a 50-sample random subset of the GSM8K (Grade School Math) benchmark to measure logical consistency and arithmetic accuracy.
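The evaluation script is not part of this card. As a rough sketch of how exact-match scoring on GSM8K is commonly done, the snippet below compares the final number in each model output with the reference answer, which in GSM8K follows a `####` marker. The helper names (`gold_answer`, `predicted_answer`, `accuracy`) and the rule of taking the last number after the `</think>` tag are assumptions made for illustration, not the author's confirmed method.

```python
import re
from typing import List, Optional

# Matches integers and decimals, optionally with thousands separators.
NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def gold_answer(reference: str) -> float:
    """GSM8K reference solutions end with '#### <number>'."""
    return float(reference.split("####")[-1].strip().replace(",", ""))


def predicted_answer(output: str) -> Optional[float]:
    """Assumption: the last number after the reasoning block is the final answer."""
    visible = output.split("</think>")[-1]
    numbers = NUMBER.findall(visible)
    return float(numbers[-1].replace(",", "")) if numbers else None


def accuracy(outputs: List[str], references: List[str]) -> float:
    """Fraction of outputs whose final number equals the reference answer."""
    correct = sum(
        predicted_answer(out) == gold_answer(ref)
        for out, ref in zip(outputs, references)
    )
    return correct / len(references)


# Two made-up examples: the first is scored correct, the second incorrect.
outputs = [
    "<think>10 - 3 - 2 = 5</think>\nThere are 5 oranges left.",
    "The answer is 7.",
]
references = ["10 - 3 = 7\n7 - 2 = 5\n#### 5", "4 + 4 = 8\n#### 8"]
print(accuracy(outputs, references))  # 0.5
```

Comparing parsed floats rather than raw strings keeps answers such as `5` and `5.0` from being counted as mismatches.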
| Model | GSM8K Accuracy (n=50) | Improvement |
|---|---|---|
| Base Qwen3-0.6B | 26.0% | Baseline |
| Qwen3-0.6B-Reasoning-Opus | 32.0% | +6.0 percentage points (absolute) |
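To make the table concrete, the short calculation below converts the reported accuracies back into counts of correct answers and separates the absolute gain from the relative one. The counts (13 and 16 out of 50) are derived here from the percentages and n=50; they are not stated in the source.

```python
n = 50
base_acc, tuned_acc = 0.26, 0.32  # accuracies reported in the table

# Derived counts of correct answers (an inference from the table, not source data).
base_correct = round(base_acc * n)
tuned_correct = round(tuned_acc * n)

# Absolute gain in percentage points vs. gain relative to the baseline.
absolute_gain_pp = (tuned_correct - base_correct) / n * 100
relative_gain = (tuned_correct - base_correct) / base_correct

print(f"{base_correct}/{n} -> {tuned_correct}/{n}")
print(f"absolute: +{absolute_gain_pp:.1f} pp, relative: +{relative_gain:.1%}")
# 13/50 -> 16/50
# absolute: +6.0 pp, relative: +23.1%
```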
The fine-tuned model generates a `<think>` block for complex queries, whereas the base model typically provides direct, often incorrect, answers.

Dataset: nohurry/Opus-4.6-Reasoning-3000x-filtered

This model uses the standard Qwen3 chat template but is optimized to generate reasoning traces inside `<think>` tags.
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model_name = "Shreyansh327/Qwen3-0.6B-Reasoning-Opus"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "If I have 10 oranges and give 3 to John and 2 to Mary, how many are left?"
messages = [{"role": "user", "content": prompt}]

# Apply the Qwen3 chat template; return_dict=True also returns the attention mask
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode the full sequence: the prompt plus the generated <think> block and answer
print(tokenizer.decode(outputs[0]))
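Because the model writes its reasoning inside `<think>` tags, it is often convenient to separate the trace from the final answer before showing or scoring the output. The helper below is a minimal sketch that works on any decoded string; the name `split_reasoning` is hypothetical and not part of the model or of `transformers`.

```python
import re
from typing import Tuple

# Non-greedy match so only the first <think>...</think> block is captured.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> Tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no <think> block exists."""
    match = THINK_BLOCK.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer


# Made-up sample text standing in for a decoded generation.
sample = "<think>\n10 - 3 - 2 = 5\n</think>\n\nYou have 5 oranges left."
reasoning, answer = split_reasoning(sample)
print(reasoning)  # 10 - 3 - 2 = 5
print(answer)     # You have 5 oranges left.
```

When applying this to real generations, decode only the newly generated tokens (or strip the prompt first) so that the chat template text does not end up in the answer.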