assn2-simpo-qwen2.5-1.5b-lora
SimPO LoRA adapter for Qwen/Qwen2.5-1.5B-Instruct.
This repository contains a LoRA adapter trained for CAS4133 Assignment 2.
Base model
Qwen/Qwen2.5-1.5B-Instruct
Evaluation accuracy on 100 held-out GSM8K-style examples
| Model | Accuracy |
|---|---|
| Base | 0.69 |
| SFT | 0.58 |
| DPO | 0.57 |
| SimPO | 0.58 |
Dataset
https://huggingface.co/datasets/Riacrdo/assn2-preference-dataset
Note
The full merged model was not uploaded due to network instability. This adapter can be loaded with the base model.
- Downloads last month
- 16
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support