FlashMemory-DeepSeek-V4: Lightning Index Ultra-Long Context via Lookahead Sparse Attention • Paper • 2606.09079 • Published 3 days ago • 51
nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16 • Text Generation • 561B • Updated about 9 hours ago • 56.9k • 183
Gemma 4 QAT • Collection • Gemma 4 QAT (Quantization-Aware Training) for 3x less memory use and near original accuracy. • 16 items • Updated 4 days ago • 72
Domino • Collection • Domino: Decoupling Causal Modeling from Autoregressive Drafting in Speculative Decoding • 3 items • Updated 8 days ago • 2
unsloth/gemma-4-26B-A4B-it-qat-GGUF • Image-Text-to-Text • 25B • Updated about 2 hours ago • 96.1k • 124