Llama-3.2-1B — Norwegian Bokmål Grammar-Aligned (SAGA KL-SFT + Δ-DPO) [Ablation]
Fine-tuned with SAGA (Syntax-Aware Grammar Alignment) using forced KL-regularised SFT followed by Δ-DPO. This is an ablation model — the base model's parse success (83%) already exceeds the τ=0.80 Auto-SFT threshold, so SAGA would normally skip SFT. This model forces KL-SFT to measure the effect of the extra SFT stage.
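The gating rule described above can be sketched in a few lines. This is a minimal illustration, not the SAGA implementation: the function name `plan_stages`, the `force_sft` flag, and the stage labels are hypothetical, and only the threshold τ=0.80 and the 83% base parse success come from this card.

```python
# Hypothetical sketch of the Auto-SFT gate; names are illustrative, not from the SAGA code.
TAU = 0.80  # Auto-SFT threshold on base-model parse success (value from this card)


def plan_stages(base_parse_success: float, force_sft: bool = False) -> list[str]:
    """Return the training stages to run for a given base parse success."""
    stages = []
    # KL-SFT runs only when the base model falls below the threshold,
    # or when it is forced on for an ablation such as this model.
    if base_parse_success < TAU or force_sft:
        stages.append("kl_sft")
    stages.append("delta_dpo")
    return stages


print(plan_stages(0.83))                  # default pipeline for Llama 3.2 1B
print(plan_stages(0.83, force_sft=True))  # the ablation described in this card
```

Under these assumptions, the first call yields only the Δ-DPO stage, and the second adds the forced KL-SFT stage in front of it.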
This is a LoRA adapter. Load it on top of meta-llama/Llama-3.2-1B.
Ablation finding: Forcing KL-SFT when base PS ≥ 80% does not improve results. Parse success is comparable (98.0% vs 98.5%) but PPL degrades (+10.8) and ScaLA AUROC drops (0.592 vs 0.617). The no-SFT variant (emilcw/llama-3.2-1b-nb-saga-delta-dpo) is the recommended model.
Results (Stanza NB — independent held-out evaluator)
| Metric | Base | No-SFT Δ-DPO | KL-SFT + Δ-DPO (this) |
|---|---|---|---|
| Stanza PS ↑ | 83.0% | 98.5% | 98.0% |
| Parse score ↑ | 0.392 | 0.622 | 0.630 |
| PPL-Wiki ↓ | 30.1 | 33.0 | 43.8 |
| ScaLA AUROC ↑ | 0.605 | 0.617 | 0.592 |
| Summ PS ↑ | 63.0% | 100% | 100% |
| Summ score ↑ | 0.202 | 0.722 | 0.758 |
| RC APS ↑ | 31.0% | 87.5% | 93.0% |
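To make the comparison between the two Δ-DPO variants easier to read, the differences can be recomputed directly from the table. The snippet below is only a convenience sketch: the numbers are copied from the table above, while the dictionary layout and the helper name `deltas` are illustrative.

```python
# Values copied from the results table: (No-SFT Δ-DPO, KL-SFT + Δ-DPO).
results = {
    "Stanza PS (%)": (98.5, 98.0),
    "Parse score": (0.622, 0.630),
    "PPL-Wiki": (33.0, 43.8),
    "ScaLA AUROC": (0.617, 0.592),
    "Summ score": (0.722, 0.758),
    "RC APS (%)": (87.5, 93.0),
}


def deltas(table):
    """Return (KL-SFT variant minus no-SFT variant) per metric, rounded to 3 decimals."""
    return {name: round(kl - no_sft, 3) for name, (no_sft, kl) in table.items()}


for name, diff in deltas(results).items():
    print(f"{name}: {diff:+}")
```

A positive difference is an improvement for the ↑ metrics but a regression for PPL-Wiki, which is a ↓ metric.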
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B", torch_dtype="auto")
model = PeftModel.from_pretrained(base, "emilcw/llama-3.2-1b-nb-saga-kl-sft-delta-dpo")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")
prompt = "Norsk er"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=60, temperature=0.8, do_sample=True)
print(tokenizer.decode(output[0], skip_special_tokens=True))
Training details
- Base model: Llama 3.2 1B (base NB PS 83% ≥ τ=0.80, but SFT forced for ablation)
- Stage 1: KL-SFT on 10k Norwegian Wikipedia sentences, 3 epochs, λ=0.10
- Stage 2: Δ-DPO from KL-SFT checkpoint — N=8 candidates, δ≥0.25, β=0.1
- Anti-hacking: MATTR diversity weight=0.2, repetition_penalty=1.3
- Oracle: spaCy nb_core_news_lg (Norwegian dependency parser)
- LoRA: rank 16, α=32, all linear layers, bfloat16
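One plausible reading of the Stage 2 settings is that N=8 candidates are sampled per prompt, each is scored by the oracle, and a preference pair is kept only when the best and worst scores differ by at least δ. The sketch below illustrates that reading only. It is an assumption, not the SAGA implementation: the function `select_pair`, the example sentences, and the scores are all hypothetical, and only N=8 and δ≥0.25 come from this card.

```python
from typing import Optional

N_CANDIDATES = 8  # candidates sampled per prompt (value from this card)
DELTA_MIN = 0.25  # minimum score gap for a usable pair (value from this card)


def select_pair(candidates: list[str], scores: list[float]) -> Optional[tuple[str, str]]:
    """Return (chosen, rejected) if the best/worst score gap is at least DELTA_MIN, else None.

    Assumption: the pair is built from the highest- and lowest-scoring candidates.
    """
    if len(candidates) != len(scores) or len(candidates) < 2:
        raise ValueError("need at least two scored candidates")
    best = max(range(len(scores)), key=scores.__getitem__)
    worst = min(range(len(scores)), key=scores.__getitem__)
    if scores[best] - scores[worst] < DELTA_MIN:
        return None  # gap too small: no preference pair for this prompt
    return candidates[best], candidates[worst]


# Made-up candidates and oracle scores for a single prompt.
cands = [f"kandidat {i}" for i in range(N_CANDIDATES)]
scores = [0.50, 0.90, 0.60, 0.30, 0.70, 0.55, 0.65, 0.45]
print(select_pair(cands, scores))
```

With these made-up scores the gap is 0.60, so the pair is kept. A prompt whose candidates all score within 0.25 of each other would contribute no training pair under this reading.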
Citation
@article{fakhar2025saga,
title={SAGA: Syntax-Aware Grammar Alignment for Low-Resource Nordic Languages},
author={Fakhar, Hoda and others},
year={2025},
note={Under review}
}
License
Meta Llama 3.2 Community License.