tenacious-bench-simpo-judge-v1
LoRA adapter for Qwen2.5-0.5B-Instruct, trained with CPO (Contrastive Preference Optimization) on 137 Tenacious-Bench preference pairs to act as a B2B sales compliance judge.
- Accuracy: 92.7% on the held-out partition (vs. 69.1% for the rule-only baseline)
- Training: CPO, LoRA r=16, beta=2.0, 3 epochs on Colab T4
- Dataset: eyobed7b/tenacious-bench
- Author: Eyobed Feleke
- License: CC-BY-4.0
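
For readers unfamiliar with the objective named under Training, here is a minimal sketch of a per-pair CPO-style loss in plain Python: a sigmoid preference term scaled by beta, plus a likelihood term on the chosen response. It illustrates the general technique, not the training code used for this adapter. The function names, the `nll_weight` parameter, and the sample log-probabilities are hypothetical, and how the sequence log-probabilities are aggregated (summed or length-normalized) is left as an assumption.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def cpo_loss(
    logp_chosen: float,
    logp_rejected: float,
    beta: float = 2.0,        # matches the beta listed on this card
    nll_weight: float = 1.0,  # hypothetical weight on the likelihood term
) -> float:
    """CPO-style loss for one preference pair (illustrative sketch).

    logp_chosen / logp_rejected are the policy's log-probabilities of the
    preferred and dispreferred responses. No reference model is involved.
    """
    margin = logp_chosen - logp_rejected
    preference_term = -log_sigmoid(beta * margin)  # pushes chosen above rejected
    nll_term = -logp_chosen                        # keeps chosen responses likely
    return preference_term + nll_weight * nll_term


# Made-up log-probabilities, for illustration only.
well_ranked = cpo_loss(logp_chosen=-1.0, logp_rejected=-3.0)
badly_ranked = cpo_loss(logp_chosen=-3.0, logp_rejected=-1.0)
print(f"well ranked:  {well_ranked:.3f}")
print(f"badly ranked: {badly_ranked:.3f}")
```

A pair the policy already ranks correctly yields a lower loss than one it ranks the wrong way round, and a larger beta sharpens the penalty on small or negative margins.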