GLM-5.2 / .eval_results /deep-swe.yaml
nielsr HF Staff
Add community evaluation results for DEEP-SWE, GPQA, HLE, SWE-BENCH_PRO
dc7bbb1 verified
153 Bytes
- dataset:
    id: datacurve/deep-swe
    task_id: deep_swe
  value: 46.2
  source:
    url: https://huggingface.co/zai-org/GLM-5.2
    name: Model Card