- dataset:
    id: allenai/olmOCR-bench
    task_id: overall
  value: 83.2
  notes: "The headers_footers (H&F) category is excluded from this Overall score. H&F rewards omission, not transcription, so a model that outputs nothing scores perfectly; excluding it keeps Overall focused on real OCR quality."
  source:
    url: https://huggingface.co/papers/2601.14251
    name: LightOnOCR technical report
    user: nielsr
- dataset:
    id: allenai/olmOCR-bench
    task_id: arxiv_math
  value: 89.6
  source:
    url: https://huggingface.co/papers/2601.14251
    name: LightOnOCR technical report
    user: nielsr
- dataset:
    id: allenai/olmOCR-bench
    task_id: old_scans_math
  value: 85.6
  source:
    url: https://huggingface.co/papers/2601.14251
    name: LightOnOCR technical report
    user: nielsr
- dataset:
    id: allenai/olmOCR-bench
    task_id: table_tests
  value: 89.0
  source:
    url: https://huggingface.co/papers/2601.14251
    name: LightOnOCR technical report
    user: nielsr
- dataset:
    id: allenai/olmOCR-bench
    task_id: old_scans
  value: 42.2
  source:
    url: https://huggingface.co/papers/2601.14251
    name: LightOnOCR technical report
    user: nielsr
- dataset:
    id: allenai/olmOCR-bench
    task_id: multi_column
  value: 84.8
  source:
    url: https://huggingface.co/papers/2601.14251
    name: LightOnOCR technical report
    user: nielsr
- dataset:
    id: allenai/olmOCR-bench
    task_id: long_tiny_text
  value: 91.4
  source:
    url: https://huggingface.co/papers/2601.14251
    name: LightOnOCR technical report
    user: nielsr
- dataset:
    id: allenai/olmOCR-bench
    task_id: headers_footers
  value: 19.7
  notes: "Our model is trained for full-page transcription rather than header and footer removal, and its training explicitly rewards their presence (via flipped RLVR tests), which lowers this score under the original benchmark scoring."
  source:
    url: https://huggingface.co/papers/2601.14251
    name: LightOnOCR technical report
    user: staghado
- dataset:
    id: allenai/olmOCR-bench
    task_id: baseline
  value: 99.6
  source:
    url: https://huggingface.co/papers/2601.14251
    name: LightOnOCR technical report
    user: staghado