Qwen2.5-7B-GRPO-1M-Context-Medical-Reasoning-f16 / model-00002-of-00004.safetensors

Commit History

Trained with Unsloth
1666e4b
verified

dumbequation committed on