nvidia/DLER-Llama-Nemotron-8B-Merge-Research


sliuau committed on Sep 3, 2025

Commit cc47ceb · verified · 1 Parent(s): c068f26

Update README.md

Files changed (1)

README.md +8 -0

@@ -9,6 +9,14 @@ base_model:
 - nvidia/Llama-3.1-Nemotron-Nano-8B-v1
 ---
 # Model Overview
+<div align="center">
+<span style="font-family: default; font-size: 1.5em;">DLER-Llama-Nemotron-8B-Merge</span>
+<div>
+🚀 The leading efficient reasoning model for cutting-edge research and development 🌟
+</div>
+</div>
+![Comparison between Llama-3.1-Nemotron-Nano-8B-v1 and DLER-Llama-Nemotron-8B-Merge](./asset/latency_8b.png)
 ### Description:
 DLER-Llama-3.1-Nemotron-8B is an ultra-efficient 8B open-weight reasoning model designed for challenging tasks such as mathematics, programming, and scientific problem-solving. It is first trained with the DLER algorithm on agentica-org/DeepScaleR-Preview-Dataset and then enhanced using a weight-merging technique to merge with the base model to mitigate accuracy degradation. Compared to the Llama-3.1-Nemotron-8B model, DLER-Llama-Nemotron-8B-Merge achieves substantial efficiency gains, reducing the average response length by nearly 50% across diverse mathematical benchmarks without sacrificing accuracy.
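The description mentions a weight-merging step with the base model but does not say which technique is used. As a rough illustration only, the sketch below assumes plain linear interpolation between two checkpoints; the function name `merge_weights`, the `alpha` parameter, and the toy list-based checkpoints are hypothetical and not taken from the DLER release.

```python
from typing import Dict, List

# Toy stand-in for a checkpoint: parameter name -> flat list of weights.
# Real checkpoints hold tensors; plain lists keep this sketch dependency-free.
StateDict = Dict[str, List[float]]


def merge_weights(base: StateDict, tuned: StateDict, alpha: float = 0.5) -> StateDict:
    """Return (1 - alpha) * base + alpha * tuned for every parameter.

    ASSUMPTION: linear interpolation is used here for illustration only;
    the model card does not state the actual merging method or ratio.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if base.keys() != tuned.keys():
        raise ValueError("checkpoints must have the same parameter names")

    merged: StateDict = {}
    for name, base_vals in base.items():
        tuned_vals = tuned[name]
        if len(base_vals) != len(tuned_vals):
            raise ValueError(f"size mismatch for parameter {name!r}")
        merged[name] = [
            (1.0 - alpha) * b + alpha * t for b, t in zip(base_vals, tuned_vals)
        ]
    return merged


if __name__ == "__main__":
    base = {"w": [0.0, 2.0]}
    tuned = {"w": [1.0, 4.0]}
    print(merge_weights(base, tuned, alpha=0.5))
```

With `alpha=0` the result equals the base checkpoint and with `alpha=1` it equals the trained one, so under this assumption `alpha` trades the shorter responses of the trained model against the accuracy of the base model.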