Post-Trained MoE Can Skip Half Experts via Self-Distillation Paper • 2605.18643 • Published 24 days ago • 30
Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models Paper • 2511.08577 • Published Nov 11, 2025 • 110
FlowRL: Matching Reward Distributions for LLM Reasoning Paper • 2509.15207 • Published Sep 18, 2025 • 119
Towards a Unified View of Large Language Model Post-Training Paper • 2509.04419 • Published Sep 4, 2025 • 77
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing Paper • 2505.21600 • Published May 27, 2025 • 71
OpenDelta: A Plug-and-play Library for Parameter-efficient Adaptation of Pre-trained Models Paper • 2307.03084 • Published Jul 5, 2023 • 1
Mastering Text, Code and Math Simultaneously via Fusing Highly Specialized Language Models Paper • 2403.08281 • Published Mar 13, 2024
Intuitive Fine-Tuning: Towards Unifying SFT and RLHF into a Single Process Paper • 2405.11870 • Published May 20, 2024
UltraMedical: Building Specialized Generalists in Biomedicine Paper • 2406.03949 • Published Jun 6, 2024
Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding Paper • 2406.12295 • Published Jun 18, 2024