---
base_model: CohereLabs/aya-expanse-8b
library_name: peft
model_name: TounsiLM-8b
tags:
  - base_model:adapter:CohereLabs/aya-expanse-8b
  - peft
  - lora
  - sft
  - transformers
  - trl
  - tunisian-arabic
  - text-generation
pipeline_tag: text-generation
language:
  - ar
license: apache-2.0
---

# TounsiLM-8b

`TounsiLM-8b` is a supervised fine-tuning (SFT) adapter for Tunisian Arabic, built on top of [CohereLabs/aya-expanse-8b](https://huggingface.co/CohereLabs/aya-expanse-8b).

It is trained to answer in Tunisian Darija (دارجة), stay on topic, and keep responses short and direct when appropriate.

## Model type

- Base model: `CohereLabs/aya-expanse-8b`
- Fine-tuning method: PEFT / LoRA-style SFT adapter
- Format: adapter checkpoint, not a fully merged standalone base model

## Training dataset

- Dataset: `Syrinesmati/tunisian-question-response-dataset`
- Train split: `25,340` rows
- Eval split: `6,336` rows
- Input format: conversational messages built from the dataset fields `instruction` and `response`
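The exact preprocessing script is not included in this card. As a rough sketch of how an `instruction` / `response` row could be turned into conversational messages, something like the following would work. The function name `to_messages` and the absence of a system prompt are assumptions, not details confirmed by the training code.

```python
def to_messages(example: dict) -> dict:
    """Map one dataset row to the chat format that TRL's SFTTrainer accepts.

    Hypothetical helper: the real preprocessing may differ, for example
    by prepending a system prompt.
    """
    return {
        "messages": [
            {"role": "user", "content": example["instruction"]},
            {"role": "assistant", "content": example["response"]},
        ]
    }


# Example with a made-up row (not taken from the dataset):
row = {"instruction": "شنوة أحوالك؟", "response": "لاباس، الحمد لله."}
converted = to_messages(row)
print([m["role"] for m in converted["messages"]])
```

With the `datasets` library, a function like this would typically be applied with `dataset.map(to_messages)`.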

## Training setup

- Trainer: TRL `SFTTrainer`
- Epochs: `2`
- Max sequence length: `1024`
- Learning rate: `1e-5`
- Per-device train batch size: `8`
- Gradient accumulation: `4`
- Precision: `bf16` when supported
- Checkpoint resume: enabled

## Training metrics

Final reported training metrics:

- Training loss: `1.1876104943680041`
- Mean token accuracy: `0.7577789686620235`
- Training runtime: `50353.3546` seconds (about 14 hours)
- Training steps: `1584`
- Total tokens seen: `9,585,534`

These are training metrics from the final log. No separate validation loss was recorded in the saved metrics file.
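As a sanity check, the reported step count is consistent with the training setup listed above if training ran on a single device. The device count is an assumption, since the card does not state it. A minimal sketch of the arithmetic:

```python
import math

# Values taken from this card.
train_rows = 25_340
per_device_batch = 8
grad_accum = 4
epochs = 2
total_tokens = 9_585_534
runtime_s = 50353.3546

# Assumption: one device. The card does not state the device count.
num_devices = 1

effective_batch = per_device_batch * grad_accum * num_devices
steps_per_epoch = math.ceil(train_rows / effective_batch)
total_steps = steps_per_epoch * epochs
print(total_steps)  # 1584, matching the reported training steps

# Derived averages, for orientation only.
tokens_per_step = total_tokens / total_steps
seconds_per_step = runtime_s / total_steps
print(round(tokens_per_step), round(seconds_per_step, 2))
```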

## Intended use

Use this model for:

- Tunisian Arabic question answering
- chat-style assistant replies in Tunisian دارجة
- short, direct conversational responses

Not intended for:

- safety-critical factual advice
- medical, legal, or financial decisions without independent verification
- languages other than Arabic / Tunisian Arabic

## How to use

### Option 1: load the adapter with the base model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model_name = "CohereLabs/aya-expanse-8b"
adapter_dir = "TounsiLM-8b"  # local path or Hub repo id of this adapter

tokenizer = AutoTokenizer.from_pretrained(base_model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, adapter_dir)

messages = [
    {"role": "system", "content": "أنت مساعد تونسي تجاوب بالتونسي الدارج فقط."},
    {"role": "user", "content": "شنوة تعمل كان الواحد يحس روحو تعبان؟"},
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
)

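# Move the tokenized prompt to the model's device, then decode greedily.
inputs = {k: v.to(model.device) for k, v in inputs.items()}
output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens, not the echoed prompt.
prompt_len = inputs["input_ids"].shape[1]
print(tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True))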
```

### Option 2: use the model in a pipeline

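```python
from transformers import pipeline

# Reuses `model`, `tokenizer`, and `messages` from Option 1.
gen = pipeline("text-generation", model=model, tokenizer=tokenizer)

# return_full_text=False returns only the assistant reply.
out = gen(messages, max_new_tokens=128, do_sample=False, return_full_text=False)
print(out[0]["generated_text"])
```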

## Recommended inference settings

- `do_sample=False` for more stable answers
- `max_new_tokens=128` to reduce rambling
- `repetition_penalty=1.1`
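One way to keep these settings in one place is a small helper that returns them as keyword arguments for `generate`. The helper name `generation_kwargs` is hypothetical; the three parameters are standard `transformers` generation arguments.

```python
def generation_kwargs(max_new_tokens: int = 128) -> dict:
    """Return the decoding settings recommended in this card."""
    return {
        "do_sample": False,                # greedy decoding for stable answers
        "max_new_tokens": max_new_tokens,  # cap length to reduce rambling
        "repetition_penalty": 1.1,         # mild penalty on repeated tokens
    }


settings = generation_kwargs()
print(settings)

# Usage, assuming `model` and `inputs` from the loading example:
# output_ids = model.generate(**inputs, **generation_kwargs())
```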

## Files included in this repository

- `adapter_model.safetensors`
- `adapter_config.json`
- `chat_template.jinja`
- tokenizer files
- training metrics and logs

## Framework versions

- PEFT: `0.19.1`
- TRL: `1.3.0`
- Transformers: `4.57.6`
- PyTorch: `2.11.0`
- Datasets: `4.8.5`
- Tokenizers: `0.22.2`

## Notes

This repository contains the fine-tuned adapter. To use it, load it on top of the base model `CohereLabs/aya-expanse-8b`.

If you want a merged standalone model later, the adapter can be merged into the base model and re-uploaded as a separate repo.
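A minimal sketch of such a merge, assuming a LoRA adapter (which PEFT's `merge_and_unload()` supports) and enough memory to hold the base model in bf16. The function name `merge_adapter` and the `output_dir` argument are illustrative, and the tokenizer is taken from the base model to match the loading example in this card.

```python
def merge_adapter(base_model_name: str, adapter_dir: str, output_dir: str) -> str:
    """Merge a LoRA adapter into its base model and save the result.

    Imports are done lazily so that defining this function does not
    require the heavy libraries to be loaded.
    """
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(
        base_model_name, torch_dtype=torch.bfloat16
    )
    # Fold the adapter weights into the base weights and drop the PEFT wrappers.
    merged = PeftModel.from_pretrained(base, adapter_dir).merge_and_unload()
    merged.save_pretrained(output_dir)

    # Save a tokenizer next to the merged weights so the folder is self-contained.
    AutoTokenizer.from_pretrained(base_model_name).save_pretrained(output_dir)
    return output_dir


# Example call (downloads the 8B base model, so it is left commented out):
# merge_adapter("CohereLabs/aya-expanse-8b", "TounsiLM-8b", "TounsiLM-8b-merged")
```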

## Citation

If you use this model, please cite the base model and the training stack used to create it.

### TRL citation

```bibtex
@software{vonwerra2020trl,
  title   = {{TRL: Transformers Reinforcement Learning}},
  author  = {von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif and Gallouédec, Quentin},
  license = {Apache-2.0},
  url     = {https://github.com/huggingface/trl},
  year    = {2020}
}
```