Supersede: Diagnosing and Training the Memory-Update Gap in LLM Agents Paper • 2606.27472 • Published 8 days ago
Supersede Base vs Trained 🧠 Space (Agents) • Live base vs GRPO-trained Qwen2.5-3B on supersession
Supersede: Memory-Update Gap in LLM Agents Collection • Open RL environment where the reward is temporal fact-currency. A GRPO-trained Qwen2.5-3B LoRA lifts held-out supersession from 9.0% to 16.7%. • 4 items • Updated 3 days ago