Profiling in PyTorch (Part 1): A Beginner's Guide to torch.profiler Article • ariG23498, sayakpaul, sergiopaniego, ror, pcuenq • about 1 month ago • 129
DFlare: Scaling Up Draft Capacity for Block Diffusion Speculative Decoding Paper • 2606.02091 • Published 27 days ago • 1
A Comprehensive Survey on Long Context Language Modeling Paper • 2503.17407 • Published Mar 20, 2025 • 49
The Ultra-Scale Playbook 🌌 • Running • 3.91k • The ultimate guide to training LLMs on large GPU clusters