dynn-datasets

university

https://huggingface.co/

AI & ML interests

None defined yet.

Recent Activity

XingtaiHF authored a paper 22 days ago

Post-Trained MoE Can Skip Half Experts via Self-Distillation

youyc22 authored a paper 22 days ago

Post-Trained MoE Can Skip Half Experts via Self-Distillation

bambisheng authored a paper 22 days ago

Post-Trained MoE Can Skip Half Experts via Self-Distillation

authored a paper 22 days ago

Post-Trained MoE Can Skip Half Experts via Self-Distillation

Paper • 2605.18643 • Published 24 days ago • 30

submitted a paper to Daily Papers 22 days ago

Post-Trained MoE Can Skip Half Experts via Self-Distillation

Paper • 2605.18643 • Published 24 days ago • 30

updated a dataset 3 months ago

dynn-datasets/Evaluation

Preview • Updated Mar 24 • 80

in dynn-datasets/Evaluation 3 months ago

Upload aime-2026.jsonl

#1 opened 3 months ago by

published a dataset 3 months ago

dynn-datasets/Evaluation

Preview • Updated Mar 24 • 80

authored a paper 3 months ago

How Far Can Unsupervised RLVR Scale LLM Training?

Paper • 2603.08660 • Published Mar 9 • 60

authored a paper 7 months ago

Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models

Paper • 2511.08577 • Published Nov 11, 2025 • 110

authored 2 papers 9 months ago

FlowRL: Matching Reward Distributions for LLM Reasoning

Paper • 2509.15207 • Published Sep 18, 2025 • 119

Towards a Unified View of Large Language Model Post-Training

Paper • 2509.04419 • Published Sep 4, 2025 • 77

authored a paper about 1 year ago

R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing

Paper • 2505.21600 • Published May 27, 2025 • 71

authored a paper about 1 year ago

TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22, 2025 • 122

authored 5 papers about 1 year ago

OpenDelta: A Plug-and-play Library for Parameter-efficient Adaptation of Pre-trained Models

Paper • 2307.03084 • Published Jul 5, 2023 • 1

Mastering Text, Code and Math Simultaneously via Fusing Highly Specialized Language Models

Paper • 2403.08281 • Published Mar 13, 2024

Intuitive Fine-Tuning: Towards Unifying SFT and RLHF into a Single Process

Paper • 2405.11870 • Published May 20, 2024

UltraMedical: Building Specialized Generalists in Biomedicine

Paper • 2406.03949 • Published Jun 6, 2024

Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding

Paper • 2406.12295 • Published Jun 18, 2024