---
license: apache-2.0
base_model: meta-llama/Llama-3.2-1B
tags:
  - tinker
  - distillation
  - openthoughts
  - lora
  - peft
library_name: peft
---

# Llama 3.2 1B - Distillation Off-Policy LoRA

LoRA adapter for Llama 3.2 1B, trained with **Tinker** (by Thinking Machines) using off-policy distillation on the OpenThoughts3 dataset.

## Training Details

- **Base model:** meta-llama/Llama-3.2-1B
- **Method:** Off-policy distillation (supervised fine-tuning, SFT, on OpenThoughts3)
- **LoRA rank:** 32, alpha: 32
- **Target modules:** all-linear
- **Checkpoint:** batch 700
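
To make the rank and alpha values above concrete, here is a minimal sketch of the usual LoRA arithmetic. It assumes the conventional `alpha / rank` scaling (the PEFT default); this card does not state how Tinker applies scaling during training. The 2048 x 2048 layer shape and the helper names `lora_scaling` and `lora_params` are illustrative, not part of any library.

```python
def lora_scaling(rank: int, alpha: int) -> float:
    """Conventional LoRA scaling factor applied to the low-rank update (assumed: alpha / rank)."""
    return alpha / rank


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one linear layer: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


RANK, ALPHA = 32, 32  # values listed on this card

print(lora_scaling(RANK, ALPHA))      # 1.0
print(lora_params(2048, 2048, RANK))  # 131072 (2048 x 2048 is an illustrative layer shape)
```

With rank equal to alpha, the scaling factor is 1.0, so the low-rank update is added to the frozen weights without extra rescaling under this convention.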

## Usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

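# Load the base model first, then attach the LoRA adapter weights on top of it.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
model = PeftModel.from_pretrained(base, "arvindcr4/llama-3.2-1b-distillation-offpolicy-lora")
# The tokenizer is loaded from the base model repository.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")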
```
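
Once the adapter is loaded, text can be generated with the standard `generate` method. The following is a rough sketch rather than a recommended recipe: the helper name `generate_text`, the plain-text prompt, greedy decoding, and the `max_new_tokens` default are all assumptions, since this card does not document the prompt format or decoding settings used in training.

```python
BASE_ID = "meta-llama/Llama-3.2-1B"
ADAPTER_ID = "arvindcr4/llama-3.2-1b-distillation-offpolicy-lora"


def generate_text(prompt: str, max_new_tokens: int = 256) -> str:
    """Hypothetical helper: load base model + adapter and greedily decode a continuation."""
    # Heavy imports stay inside the function so they load only when it is called.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
    base = AutoModelForCausalLM.from_pretrained(BASE_ID)
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,  # greedy decoding (an assumption, not a documented setting)
        )

    # Slice off the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A call such as `print(generate_text("What is 17 * 23? Think step by step."))` downloads both repositories on first use. Access to the base model may require accepting its license on the Hugging Face Hub.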

## Platform

Trained using [Tinker](https://thinkingmachines.ai/tinker), a hosted fine-tuning service for open-source LLMs.