v4 global residual sparse RL - single model deterministic eval

Total:   150/240 = 62.50%
Seen:    136/200 = 68.00%
Unseen:   14/40  = 35.00%

Per type:
  Top_Long:    43/60 = 71.67%  (seen 74.0% / unseen 60.0%)
  Top_Short:   25/60 = 41.67%  (seen 48.0% / unseen 10.0%)
  Pant_Long:   28/60 = 46.67%  (seen 54.0% / unseen 10.0%)
  Pant_Short:  54/60 = 90.00%  (seen 96.0% / unseen 60.0%)

Reference baseline:
  SmolVLA four_types 30K (no residual):    60.42% (96 ep, 4 types)
  Historical v4 global with explore=True:  58.75% (240 ep)
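
The per-type rows can be cross-checked against the Total / Seen / Unseen lines with a short script. This is only a sanity-check sketch, not part of the eval pipeline. It assumes each type has 50 seen and 10 unseen episodes, a split the log does not state but which is consistent with the 200/40 totals over 4 types. The names `SEEN_N`, `UNSEEN_N`, `reported`, and `split_counts` are made up for illustration.

```python
# Assumption (not stated in the log): 50 seen + 10 unseen episodes per type.
SEEN_N, UNSEEN_N = 50, 10

# type -> (successes out of 60, seen %, unseen %), copied from the log above.
reported = {
    "Top_Long":   (43, 74.0, 60.0),
    "Top_Short":  (25, 48.0, 10.0),
    "Pant_Long":  (28, 54.0, 10.0),
    "Pant_Short": (54, 96.0, 60.0),
}


def split_counts(seen_pct, unseen_pct):
    """Convert reported percentages back to success counts."""
    seen = round(seen_pct * SEEN_N / 100)
    unseen = round(unseen_pct * UNSEEN_N / 100)
    return seen, unseen


seen_total = 0
unseen_total = 0
for name, (successes, seen_pct, unseen_pct) in reported.items():
    seen, unseen = split_counts(seen_pct, unseen_pct)
    # Each row's seen + unseen counts should add up to its success count.
    assert seen + unseen == successes, name
    seen_total += seen
    unseen_total += unseen

total = seen_total + unseen_total
print(f"Seen:   {seen_total}/{4 * SEEN_N} = {100 * seen_total / (4 * SEEN_N):.2f}%")
print(f"Unseen: {unseen_total}/{4 * UNSEEN_N} = {100 * unseen_total / (4 * UNSEEN_N):.2f}%")
print(f"Total:  {total}/240 = {100 * total / 240:.2f}%")
```

Under that assumed split, the per-type rows sum to the aggregate counts reported in the log.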