Gaotang Li

gaotang

https://gaotangli.github.io/

GaotangLi

AI & ML interests

None yet

Recent Activity

commented on a paper 2 days ago

RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards

upvoted a paper 6 days ago

Code as Agent Harness

authored a paper 11 days ago

RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards

Organizations

None yet

Collections 3

Papers 5

arxiv:2605.10899

arxiv:2510.00526

arxiv:2506.06444

arxiv:2505.02387

Models 11

gaotang/deepseek-math-7b-base

Text Generation • 7B • Updated Aug 28, 2025 • 2

gaotang/RM-R1-DeepSeek-Distilled-Qwen-7B

Text Generation • 8B • Updated Jun 28, 2025 • 61 • 3

gaotang/RM-R1-Qwen2.5-Instruct-7B

Text Generation • 8B • Updated Jun 28, 2025 • 238 • 4

gaotang/RM-R1-DeepSeek-Distilled-Qwen-14B

Text Generation • 15B • Updated Jun 28, 2025 • 962 • 1

gaotang/RM-R1-Qwen2.5-Instruct-14B

Text Generation • 15B • Updated Jun 28, 2025 • 6 • 1

gaotang/RM-R1-Qwen2.5-Instruct-32B

Text Generation • 33B • Updated Jun 28, 2025 • 42 • 1

gaotang/RM-R1-DeepSeek-Distilled-Qwen-32B

Text Generation • 33B • Updated Jun 28, 2025 • 35 • 2

gaotang/qwen_7b_sky_filtered_code8k_math_10k_distilled_Claude_o3_0419

8B • Updated Apr 19, 2025 • 1

gaotang/qwen_7b_sky_filtered_code8k_math_10k_distilled_OpenAI

8B • Updated Apr 18, 2025

gaotang/qwen_14b_sky_filtered_code8k_math_10k_distilled_OpenAI

15B • Updated Apr 18, 2025

Datasets 35

gaotang/figlet_font

Viewer • Updated Sep 8, 2025 • 45k • 13

gaotang/figlet_font_train

Viewer • Updated Sep 8, 2025 • 5 • 10

gaotang/huatuo_medical_sft_processed

Viewer • Updated Sep 7, 2025 • 19.7k • 11

gaotang/medical_sft_processed

Viewer • Updated Sep 6, 2025 • 23.5k • 48

gaotang/ParaConflict

Viewer • Updated Aug 31, 2025 • 2.15k • 40

gaotang/numina-cot-subset-val

Viewer • Updated Aug 28, 2025 • 128 • 8

gaotang/numina-cot-subset-67k

Viewer • Updated Aug 28, 2025 • 67.6k • 32

gaotang/ParaConfilct

Viewer • Updated Jul 17, 2025 • 2.15k • 188

gaotang/RM-R1-Reasoning-RLVR

Viewer • Updated May 20, 2025 • 73k • 21 • 1

gaotang/RM-R1-Entire-RLVR-Train

Viewer • Updated May 20, 2025 • 73k • 67 • 2
