assn2-sft-qwen2.5-1.5b-lora
This repository contains the SFT LoRA adapter trained for CAS4133 Assignment 2.
Base model
Qwen/Qwen2.5-1.5B-Instruct
Evaluation accuracy on 100 held-out GSM8K-style examples
| Model | Accuracy |
|---|---|
| Base | 0.69 |
| SFT | 0.58 |
| DPO | 0.57 |
| SimPO | 0.58 |
Dataset
https://huggingface.co/datasets/Riacrdo/assn2-preference-dataset
- Downloads last month
- 22
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support