---
license: apache-2.0
base_model: meta-llama/Llama-3.2-1B
tags:
- tinker
- distillation
- openthoughts
- lora
- peft
library_name: peft
---

# Llama 3.2 1B - Distillation Off-Policy LoRA

LoRA adapter trained with **Tinker** (by Thinking Machines) using off-policy distillation on the OpenThoughts3 dataset.

## Training Details

- **Base model:** meta-llama/Llama-3.2-1B
- **Method:** Off-policy distillation (SFT on OpenThoughts3)
- **LoRA rank:** 32, alpha: 32
- **Target modules:** all-linear
- **Checkpoint:** batch 700

## Usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach the LoRA adapter weights on top of it.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
model = PeftModel.from_pretrained(base, "arvindcr4/llama-3.2-1b-distillation-offpolicy-lora")

# The adapter reuses the base model's tokenizer.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")
```

## Platform

Trained using [Tinker](https://thinkingmachines.ai/tinker), a hosted fine-tuning service for open-source LLMs.
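
## LoRA settings in numbers

To make the rank and alpha values listed under Training Details more concrete, here is a minimal sketch of what they imply. It assumes the standard LoRA scaling `alpha / rank` (not a rank-stabilized variant), which this card does not state explicitly. The helper names and the 2048 x 2048 layer shape are hypothetical, chosen only for illustration.

```python
def lora_scaling(alpha: int, rank: int) -> float:
    """Scaling applied to the low-rank update B @ A (assumes standard LoRA: alpha / rank)."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one linear layer.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


# Values from the Training Details section of this card.
RANK, ALPHA = 32, 32

# Hypothetical layer shape, for illustration only.
D_IN, D_OUT = 2048, 2048

print("scaling:", lora_scaling(ALPHA, RANK))
print("added params for one layer:", lora_param_count(D_IN, D_OUT, RANK))
```

With rank equal to alpha, the low-rank update is added to the frozen weights without any extra rescaling under this assumption.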