Shuxin Lin

shuxinl

AI & ML interests

None yet

Recent Activity

new activity about 1 month ago

ibm-research/AssetOpsBench:Update data/scenarios/all_utterance.jsonl

upvoted a paper 13 days ago

Beyond Static Leaderboards: Predictive Validity for the Evaluation of LLM Agents

Paper • 2606.19704 • Published 14 days ago • 41

upvoted a paper about 1 month ago

MCP-Cosmos: World Model-Augmented Agents for Complex Task Execution in MCP Environments

Paper • 2605.09131 • Published May 9 • 59

upvoted a paper about 1 year ago

FailureSensorIQ: A Multi-Choice QA Dataset for Understanding Sensor Relationships and Failure Modes

Paper • 2506.03278 • Published Jun 3, 2025 • 7