RDT2-FM finetuned on the RoboTwin multi-task dataset, with the VLM not frozen.
Full-parameter finetuning.
Batch size 16, 30,000 steps.
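
For a rough sense of scale, the training budget above can be converted into a count of samples processed. This is a minimal sketch, not taken from the training code: it assumes the batch size of 16 is the global batch size with no gradient accumulation, which the card does not state. The helper `samples_seen` and its `grad_accum` and `num_devices` parameters are hypothetical and are there only to show how the number changes if 16 is actually a per-device value.

```python
# Values stated in this card.
BATCH_SIZE = 16
TRAIN_STEPS = 30_000


def samples_seen(batch_size: int, steps: int,
                 grad_accum: int = 1, num_devices: int = 1) -> int:
    """Number of training samples processed over the whole run.

    grad_accum and num_devices are hypothetical knobs; the card does not
    say whether 16 is a global or a per-device batch size.
    """
    return batch_size * steps * grad_accum * num_devices


# Assumption: 16 is the global batch size, no gradient accumulation.
total = samples_seen(BATCH_SIZE, TRAIN_STEPS)
print(total)  # 480000
```

If 16 is a per-device batch size, pass the device count, for example `samples_seen(16, 30_000, num_devices=8)`.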