tenacious-bench-simpo-judge-v1

LoRA adapter for Qwen2.5-0.5B-Instruct trained as a B2B sales compliance judge via CPO (Contrastive Preference Optimization) on 137 Tenacious-Bench preference pairs.

  • Accuracy: 92.7% on held-out partition (vs 69.1% rule-only baseline)
  • Training: CPO, LoRA r=16, beta=2.0, 3 epochs on Colab T4
  • Dataset: eyobed7b/tenacious-bench
  • Author: Eyobed Feleke
  • License: CC-BY-4.0
Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for eyobed7b/tenacious-bench-simpo-judge-v1

Adapter
(623)
this model