assn2-sft-qwen2.5-1.5b-lora

This repository contains the SFT LoRA adapter trained for CAS4133 Assignment 2.

Base model

Qwen/Qwen2.5-1.5B-Instruct

Evaluation accuracy on 100 held-out GSM8K-style examples

Model Accuracy
Base 0.69
SFT 0.58
DPO 0.57
SimPO 0.58

Dataset

https://huggingface.co/datasets/Riacrdo/assn2-preference-dataset

Downloads last month
22
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for Riacrdo/assn2-sft-qwen2.5-1.5b-lora

Adapter
(1032)
this model