---
base_model: CohereLabs/aya-expanse-8b
library_name: peft
model_name: TounsiLM-8b
tags:
- base_model:adapter:CohereLabs/aya-expanse-8b
- peft
- lora
- sft
- transformers
- trl
- tunisian-arabic
- text-generation
pipeline_tag: text-generation
language:
- ar
license: apache-2.0
---

# TounsiLM-8b

`TounsiLM-8b` is a Tunisian Arabic supervised fine-tuning adapter built on top of [CohereLabs/aya-expanse-8b](https://huggingface.co/CohereLabs/aya-expanse-8b). It is trained to answer in Tunisian دارجة, stay on topic, and keep responses short and direct when appropriate.

## Model type

- Base model: `CohereLabs/aya-expanse-8b`
- Fine-tuning method: PEFT / LoRA-style SFT adapter
- Format: adapter checkpoint, not a fully merged standalone base model

## Training dataset

- Dataset: `Syrinesmati/tunisian-question-response-dataset`
- Train split: `25,340` rows
- Eval split: `6,336` rows
- Input format: conversational messages built from the dataset fields `instruction` and `response`

## Training setup

- Trainer: TRL `SFTTrainer`
- Epochs: `2`
- Max sequence length: `1024`
- Learning rate: `1e-5`
- Per-device train batch size: `8`
- Gradient accumulation: `4`
- Precision: `bf16` when supported
- Checkpoint resume: enabled

## Training metrics

Final reported training metrics:

- Training loss: `1.1876104943680041`
- Mean token accuracy: `0.7577789686620235`
- Training runtime: `50353.3546` seconds
- Training steps: `1584`
- Total tokens seen: `9,585,534`

These are training metrics from the final log. No separate validation loss was recorded in the saved metrics file.

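As a quick sanity check, the reported step count can be reproduced from the batch settings and the train split size listed above. The sketch below assumes training ran on a single device, which this card does not state; `num_devices` is a hypothetical variable introduced only for this check. It also derives rough throughput figures from the reported runtime and token count.

```python
import math

# Values reported in this model card.
train_rows = 25_340
per_device_batch_size = 8
gradient_accumulation_steps = 4
epochs = 2
reported_steps = 1584
total_tokens = 9_585_534
runtime_seconds = 50353.3546

# Assumption (not stated in the card): training ran on a single device.
num_devices = 1

# Number of examples consumed per optimizer step.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_devices

# Optimizer steps per epoch, counting a final partial batch as one step.
steps_per_epoch = math.ceil(train_rows / effective_batch_size)
expected_steps = steps_per_epoch * epochs

# Rough throughput derived from the reported totals.
tokens_per_second = total_tokens / runtime_seconds
runtime_hours = runtime_seconds / 3600

print(f"effective batch size: {effective_batch_size}")
print(f"expected optimizer steps: {expected_steps} (reported: {reported_steps})")
print(f"throughput: {tokens_per_second:.1f} tokens/s over {runtime_hours:.1f} h")
```

Under the single-device assumption the computed step count matches the reported `1584`. With more devices the expected count would shrink proportionally, so the match is suggestive of a single-device run rather than proof of one.
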
## Intended use

Use this model for:

- Tunisian Arabic question answering
- chat-style assistant replies in Tunisian دارجة
- short, direct conversational responses

Not intended for:

- safety-critical factual advice
- medical, legal, or financial decisions without independent verification
- languages other than Arabic and Tunisian Arabic

## How to use

### Option 1: load the adapter with the base model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model_name = "CohereLabs/aya-expanse-8b"
adapter_dir = "TounsiLM-8b"  # local path or Hub repo id of this adapter

# Load the base tokenizer and model first.
tokenizer = AutoTokenizer.from_pretrained(base_model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

# Attach the fine-tuned adapter weights on top of the base model.
model = PeftModel.from_pretrained(model, adapter_dir)

messages = [
    # System prompt: "You are a Tunisian assistant; answer only in Tunisian dialect."
    {"role": "system", "content": "أنت مساعد تونسي تجاوب بالتونسي الدارج فقط."},
    # User prompt: "What should someone do when they feel tired?"
    {"role": "user", "content": "شنوة تعمل كان الواحد يحس روحو تعبان؟"},
]

# Render the conversation with the chat template and tokenize it.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
)
inputs = {k: v.to(model.device) for k, v in inputs.items()}

# Greedy decoding; the decoded text includes the prompt followed by the reply.
output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

### Option 2: use the model in a pipeline

```python
from transformers import pipeline

# Reuses the `model` and `tokenizer` objects loaded in Option 1.
gen = pipeline("text-generation", model=model, tokenizer=tokenizer)
```

## Recommended inference settings

- `do_sample=False` for more stable answers
- `max_new_tokens=128` to reduce rambling
- `repetition_penalty=1.1`

## Files included in this repository

- `adapter_model.safetensors`
- `adapter_config.json`
- `chat_template.jinja`
- tokenizer files
- training metrics and logs

## Framework versions

- PEFT: `0.19.1`
- TRL: `1.3.0`
- Transformers: `4.57.6`
- PyTorch: `2.11.0`
- Datasets: `4.8.5`
- Tokenizers: `0.22.2`

## Notes

This repository contains the
fine-tuned adapter. To use it, load it on top of the base model `CohereLabs/aya-expanse-8b`.

If you want a merged standalone model later, the adapter can be merged into the base model and re-uploaded as a separate repo.

## Citation

If you use this model, please cite the base model and the training stack used to create it.

### TRL citation

```bibtex
@software{vonwerra2020trl,
  title   = {{TRL: Transformers Reinforcement Learning}},
  author  = {von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif and Gallouédec, Quentin},
  license = {Apache-2.0},
  url     = {https://github.com/huggingface/trl},
  year    = {2020}
}
```