DFlash Collection Block Diffusion for Flash Speculative Decoding • 23 items • Updated 4 days ago • 142
Article Accelerating Transformers Fine-Tuning with NVIDIA NeMo AutoModel nvidia • 8 days ago • 34
Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular +4 itazap, ariG23498, ArthurZ, sergiopaniego, merve, pcuenq • Dec 18, 2025 • 125
Nemotron-Cascade Collection Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models • 14 items • Updated 21 days ago • 55
Article Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models nvidia • Dec 15, 2025 • 113
Article Transformers v5: Simple model definitions powering the AI ecosystem +2 lysandre, ArthurZ, cyrilvallez, reach-vb • Dec 1, 2025 • 312
Article Welcome GPT OSS, the new open-source model family from OpenAI! +10 reach-vb, pcuenq, lewtun, clem, Rocketknight1, clefourrier, celinah, Wauplin, marcsun13, pagezyhf, ahadnagy, joaogante • Aug 5, 2025 • 513
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The model preserves similar quality as half precision while using 3x less memory • 15 items • Updated Mar 12 • 219