---
base_model: facebook/nllb-200-distilled-600M
library_name: peft
tags:
- translation
- kalenjin
- swahili
- lora
- transformers
- nllb
- peft
- qlora
language:
- swh
- kln
metrics:
- bleu
- chrf
datasets:
- thinkKenya/kenyan-low-resource-language-data
---

# NLLB-200 600M - Swahili to Kalenjin (LoRA Adapter)

This is a LoRA adapter for `facebook/nllb-200-distilled-600M`, fine-tuned to translate **Swahili (SWA) to Kalenjin (KLN)**.

## 📊 Evaluation Results (SWA -> KLN)

- **BLEU Score**: `40.24`
- **chrF Score**: `62.38`

## 🛠️ Technical Details & Training

- **Hardware**: Trained locally for 42 epochs on a 5060 8GB GPU, taking 13 hours.
- **LoRA Config**: `r=64`, `alpha=128`, targeting `["q_proj", "v_proj", "k_proj", "out_proj", "fc1", "fc2"]`.
- **Token Strategy**: Kalenjin uses "token hijacking": it is routed through the existing `luo_Latn` token space rather than a newly initialized language token, which avoids the catastrophic forgetting that comes with initializing a raw token.

## 💻 Usage

```python
import torch
from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM, NllbTokenizerFast

# Load the base model
model_id = "facebook/nllb-200-distilled-600M"
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Load the LoRA adapter on top of the base model
adapter_id = "mutaician/nllb-swahili-kalenjin-v3"
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

# Load the tokenizer shipped with the adapter.
# The language codes below are used as published in this card and are assumed
# to be defined by the adapter's tokenizer. Note that the stock NLLB-200
# tokenizer uses `swh_Latn` for Swahili.
tokenizer = NllbTokenizerFast.from_pretrained(adapter_id)
tokenizer.src_lang = "swa_Latn"

text = "Habari yako?"
inputs = tokenizer(text, return_tensors="pt")

# Force the decoder to start with the target-language token
target_lang_id = tokenizer.convert_tokens_to_ids("kln_Latn")

with torch.no_grad():
    generated_tokens = model.generate(
        **inputs,
        forced_bos_token_id=target_lang_id,
        num_beams=5,
        early_stopping=True,
        max_length=256,
    )

print(tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)[0])
```
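
To give a rough sense of what the LoRA configuration listed under Technical Details implies, the sketch below estimates the number of trainable adapter parameters for `r=64` on the listed target modules. The model dimensions (`D_MODEL`, `FFN_DIM`, and the layer counts) are assumptions about `facebook/nllb-200-distilled-600M` and should be checked against its `config.json`. The sketch also assumes that the attention projection names match both the self-attention and the decoder's cross-attention blocks, and that the linear layers have no extra LoRA bias terms.

```python
# Assumed dimensions of the base model (not stated in this card; verify in config.json).
D_MODEL = 1024
FFN_DIM = 4096
ENCODER_LAYERS = 12
DECODER_LAYERS = 12

# LoRA hyperparameters as listed in this card.
R = 64
ALPHA = 128


def lora_params(d_in, d_out, r=R):
    """A LoRA pair A (r x d_in) and B (d_out x r) adds r * (d_in + d_out) weights."""
    return r * (d_in + d_out)


def layer_params(num_attention_blocks):
    """Adapter weights in one transformer layer.

    Each attention block has q_proj, k_proj, v_proj and out_proj (all D_MODEL x D_MODEL);
    the feed-forward part has fc1 (D_MODEL -> FFN_DIM) and fc2 (FFN_DIM -> D_MODEL).
    """
    attention = num_attention_blocks * 4 * lora_params(D_MODEL, D_MODEL)
    feed_forward = lora_params(D_MODEL, FFN_DIM) + lora_params(FFN_DIM, D_MODEL)
    return attention + feed_forward


# Encoder layers have self-attention only; decoder layers also have cross-attention.
encoder_total = ENCODER_LAYERS * layer_params(num_attention_blocks=1)
decoder_total = DECODER_LAYERS * layer_params(num_attention_blocks=2)
total = encoder_total + decoder_total

# LoRA scales the low-rank update by alpha / r.
scaling = ALPHA / R

print(f"Trainable LoRA parameters (estimate): {total:,}")
print(f"LoRA scaling factor (alpha / r): {scaling}")
```

Under these assumptions the estimate comes to about 34.6M trainable parameters with a scaling factor of 2.0. The count reported by `model.print_trainable_parameters()` on the loaded `PeftModel` is the authoritative figure.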