GPT-SW3 1.3B — Icelandic Grammar-Aligned (SAGA Δ-DPO)

This model was fine-tuned with SAGA (Syntax-Aware Grammar Alignment), a two-stage pipeline that trains language models to generate grammatically correct Icelandic text using reinforcement learning from a symbolic parser oracle: Greynir, an Icelandic constituency parser.

This is a LoRA adapter. Load it on top of AI-Sweden-Models/gpt-sw3-1.3b.

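No-SFT ablation: Δ-DPO was trained directly on the raw pretrained GPT-SW3-1.3B, with no KL-SFT warm-start. This confirms the Auto-SFT rule: the base parse success (PS) of 76.5% is below the threshold τ = 80%, and skipping KL-SFT hurts. Parse success drops from 76.5% to 75.5% and Wikipedia perplexity rises from 17.0 to 20.6, while parse score barely moves (0.364 to 0.371).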
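The Auto-SFT rule amounts to a single threshold check. The sketch below is illustrative only: the function name `needs_kl_sft` and its signature are hypothetical and are not taken from the SAGA code; only the threshold τ = 80% and the base parse success of 76.5% come from this card.

```python
def needs_kl_sft(base_parse_success: float, tau: float = 0.80) -> bool:
    """Return True when the base model's parse success is below the threshold tau.

    Hypothetical helper: per the Auto-SFT rule, a KL-SFT warm-start is applied
    before Δ-DPO only when the base parse success falls below tau.
    """
    return base_parse_success < tau


# Base GPT-SW3-1.3B parse success reported in this card: 76.5%
print(needs_kl_sft(0.765))
```

For this model the check returns `True`, so the rule would call for a KL-SFT warm-start; this ablation skips it on purpose to measure the effect.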

Results (independent Stanza evaluation)

| Metric        | Base  | + Δ-DPO |
|---------------|-------|---------|
| Parse success | 76.5% | 75.5%   |
| Parse score   | 0.364 | 0.371   |
| PPL-Wiki      | 17.0  | 20.6    |
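To make the size of each change explicit, the absolute and relative differences between the two columns can be computed directly. This is a minimal sketch; the dictionary keys and the `deltas` helper are hypothetical names, and the numbers are copied from the results above.

```python
# (base, + Δ-DPO) values copied from the results above
results = {
    "parse_success_pct": (76.5, 75.5),
    "parse_score": (0.364, 0.371),
    "ppl_wiki": (17.0, 20.6),
}


def deltas(base: float, tuned: float) -> tuple[float, float]:
    """Return (absolute change, relative change in percent) from base to tuned."""
    diff = tuned - base
    return diff, diff / base * 100.0


for name, (base, tuned) in results.items():
    abs_d, rel_d = deltas(base, tuned)
    print(f"{name}: {abs_d:+.3f} absolute, {rel_d:+.1f}% relative")
```

Lower is better for perplexity, so a positive change in `ppl_wiki` is a regression.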

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model first, then attach the LoRA adapter on top of it.
base = AutoModelForCausalLM.from_pretrained("AI-Sweden-Models/gpt-sw3-1.3b", torch_dtype="auto")
model = PeftModel.from_pretrained(base, "emilcw/gpt-sw3-1b3-is-saga-nosft-delta-dpo")

# The adapter does not change the vocabulary, so use the base model's tokenizer.
tokenizer = AutoTokenizer.from_pretrained("AI-Sweden-Models/gpt-sw3-1.3b")

prompt = "Íslenska er"
inputs = tokenizer(prompt, return_tensors="pt")
# Sample a continuation; temperature only takes effect when do_sample=True.
output = model.generate(**inputs, max_new_tokens=60, temperature=0.8, do_sample=True)
print(tokenizer.decode(output[0], skip_special_tokens=True))

Training details

  • Data: 10 000 Icelandic Wikipedia sentences (filtered for quality)
  • Method: Δ-DPO — generate N=8 candidates per prompt, keep pairs with parser score gap Δ ≥ 0.25, train with standard DPO loss (β=0.1)
  • Parser oracle: Greynir (Icelandic constituency parser)
  • LoRA: rank 16, α=32, all linear layers, bfloat16
  • Auto-SFT rule: SFT is applied first only if base parse success < 80%
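The Δ-DPO data step and loss described above can be sketched in plain Python. This is a sketch under stated assumptions, not the SAGA implementation: `select_pairs` and `dpo_loss` are hypothetical names, the parser scores are taken as given rather than computed with Greynir, and keeping every qualifying pair (rather than, say, only best versus worst) is an assumption. The loss is the standard per-pair DPO objective, -log σ(β · (policy margin − reference margin)), with β = 0.1.

```python
import math
from itertools import combinations


def select_pairs(candidates, scores, min_gap=0.25):
    """Build (chosen, rejected) pairs whose parser-score gap is at least min_gap.

    Assumption: every qualifying pair among the candidates is kept.
    """
    pairs = []
    for i, j in combinations(range(len(candidates)), 2):
        gap = scores[i] - scores[j]
        if abs(gap) >= min_gap:
            # The higher-scoring candidate becomes the preferred ("chosen") one.
            chosen, rejected = (i, j) if gap > 0 else (j, i)
            pairs.append((candidates[chosen], candidates[rejected]))
    return pairs


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for a single preference pair of sequence log-probabilities."""
    margin = (policy_chosen_logp - policy_rejected_logp) - (
        ref_chosen_logp - ref_rejected_logp
    )
    x = beta * margin
    # -log(sigmoid(x)) == log(1 + exp(-x)), written in a numerically stable form.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


# Toy example with made-up scores standing in for parser scores.
cands = ["sentence A", "sentence B", "sentence C"]
print(select_pairs(cands, [1.0, 0.5, 0.875]))
print(dpo_loss(-1.0, -5.0, -2.0, -2.0))
```

With N = 8 candidates per prompt there are at most 28 candidate pairs, and the gap filter discards those where the parser signal is too weak to indicate a clear preference.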

Citation

@article{fakhar2025saga,
  title={SAGA: Syntax-Aware Grammar Alignment for Low-Resource Nordic Languages},
  author={Fakhar, Hoda and others},
  year={2025},
  note={Under review}
}

License

Inherits the base model license (AI Sweden LLM License / LumiOpen).
