- Continuous Latent Diffusion Language Model
  Paper • 2605.06548 • Published • 80
- Scaling Latent Reasoning via Looped Language Models
  Paper • 2510.25741 • Published • 231
- Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
  Paper • 2502.05171 • Published • 156
- Pretraining Language Models to Ponder in Continuous Space
  Paper • 2505.20674 • Published • 3

Collections
Collections including paper arxiv:2601.07372

- MegaFlow: Large-Scale Distributed Orchestration System for the Agentic Era
  Paper • 2601.07526 • Published • 23
- Intelligent AI Delegation
  Paper • 2602.11865 • Published • 16
- ENGRAM: Effective, Lightweight Memory Orchestration for Conversational Agents
  Paper • 2511.12960 • Published • 1
- CityRAG: Stepping Into a City via Spatially-Grounded Video Generation
  Paper • 2604.19741 • Published • 17

- HuggingFaceFW/finetranslations
  Viewer • Updated • 3.33B • 10.8k • 294
- LLM-Inference-Bench: Inference Benchmarking of Large Language Models on AI Accelerators
  Paper • 2411.00136 • Published
- The Illusion of Readiness in Health AI
  Paper • 2509.18234 • Published • 1
- The Roots of Performance Disparity in Multilingual Language Models: Intrinsic Modeling Difficulty or Design Choices?
  Paper • 2601.07220 • Published

- Xolver: Multi-Agent Reasoning with Holistic Experience Learning Just Like an Olympiad Team
  Paper • 2506.14234 • Published • 41
- MoTE: Mixture of Ternary Experts for Memory-efficient Large Multimodal Models
  Paper • 2506.14435 • Published • 7
- Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory
  Paper • 2504.19413 • Published • 60
- MemOS: A Memory OS for AI System
  Paper • 2507.03724 • Published • 167

- DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior
  Paper • 2310.16818 • Published • 33
- DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
  Paper • 2401.02954 • Published • 56
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
  Paper • 2401.06066 • Published • 62
- DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
  Paper • 2401.14196 • Published • 73

- Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models
  Paper • 2508.10751 • Published • 29
- Reinforcement Pre-Training
  Paper • 2506.08007 • Published • 265
- MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers
  Paper • 2508.14704 • Published • 43
- AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
  Paper • 2508.16153 • Published • 162

- The Leaderboard Illusion
  Paper • 2504.20879 • Published • 72
- SmolVLM: Redefining small and efficient multimodal models
  Paper • 2504.05299 • Published • 208
- Seedance 1.0: Exploring the Boundaries of Video Generation Models
  Paper • 2506.09113 • Published • 109
- Small Language Models are the Future of Agentic AI
  Paper • 2506.02153 • Published • 25