How to use kaku/smolvla-candy-pick with LeRobot:
# See https://github.com/huggingface/lerobot?tab=readme-ov-file#installation for more details
git clone https://github.com/huggingface/lerobot.git
cd lerobot
pip install -e ".[smolvla]"

# Launch finetuning on your dataset
python lerobot/scripts/train.py \
  --policy.path=kaku/smolvla-candy-pick \
  --dataset.repo_id=lerobot/svla_so101_pickplace \
  --batch_size=64 \
  --steps=20000 \
  --output_dir=outputs/train/my_smolvla \
  --job_name=my_smolvla_training \
  --policy.device=cuda \
  --wandb.enable=true

# Run the policy using the record function.
# Replace the port, robot id, and cameras with your own.
# Use the same task description you used in your dataset recording.
# --dataset.repo_id will be the dataset name on HF Hub.
python -m lerobot.record \
  --robot.type=so101_follower \
  --robot.port=/dev/ttyACM0 \
  --robot.id=my_blue_follower_arm \
  --robot.cameras="{ front: {type: opencv, index_or_path: 8, width: 640, height: 480, fps: 30}}" \
  --dataset.single_task="Grasp a lego block and put it in the bin." \
  --dataset.repo_id=HF_USER/dataset_name \
  --dataset.episode_time_s=50 \
  --dataset.num_episodes=10 \
  --policy.path=kaku/smolvla-candy-pick
How do I fine-tune this SmolVLA model with my own dataset?
Has your model worked? In my attempt, I recorded 40 episodes at the same position and trained for 20,000 steps using the official fine-tuning code. However, my model was unable to complete the task.
python lerobot/scripts/train.py --policy.path=lerobot/smolvla_base --dataset.repo_id=Xiaoyan97/pick_up_scissors_v1 --dataset.root=Xiaoyan97/pick_up_scissors_v1 --batch_size=64 --steps=20000 --output_dir=outputs/smolvla_v1 --job_name=my_smolvla_training --policy.device=cuda
Are my training steps insufficient, or is there some other reason?
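One rough way to judge whether the step count is the bottleneck is to estimate how many times training passes over the data. The sketch below takes `steps` and `batch_size` from the command above. The episode length and frame rate are assumptions for illustration only (50 s at 30 fps, borrowed from the example record command, not read from the actual dataset), so substitute the real frame count of your dataset.

```python
def passes_over_dataset(steps: int, batch_size: int, total_frames: int) -> float:
    """Approximate number of times training visits each frame."""
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    return steps * batch_size / total_frames


# Values taken from the training command above.
steps = 20_000
batch_size = 64

# Assumptions (not taken from the dataset): 40 episodes of 50 s at 30 fps.
episodes = 40
episode_seconds = 50
fps = 30
total_frames = episodes * episode_seconds * fps

samples_seen = steps * batch_size
epochs = passes_over_dataset(steps, batch_size, total_frames)

print(f"total frames (assumed): {total_frames}")
print(f"samples seen during training: {samples_seen}")
print(f"approximate passes over the data: {epochs:.1f}")
```

If the number of passes is already large, adding more steps is less likely to help than checking the data itself (camera views, task description, variation in object position).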
@Xiaoyan97
・The training steps should be sufficient: the policy trained for only 4,000 steps can already grasp the candy to some extent.
・Also, the wrist and top camera setup seems appropriate.
Did you also provide language instructions during data collection and when running the policy?
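A quick sanity check for the question above is to compare the task string used during recording with the one passed when running the policy. The following is a minimal sketch in plain Python. `compare_tasks` is a hypothetical helper, not part of LeRobot, and the scissors task string is made up for illustration.

```python
def normalize_task(text: str) -> str:
    """Collapse runs of whitespace and lowercase the text."""
    return " ".join(text.split()).lower()


def compare_tasks(recorded_task: str, inference_task: str) -> str:
    """Classify how two task descriptions relate to each other."""
    if recorded_task == inference_task:
        return "identical"
    if normalize_task(recorded_task) == normalize_task(inference_task):
        return "differs only in case or whitespace"
    return "different"


# Task string from the example record command.
recorded = "Grasp a lego block and put it in the bin."

# Hypothetical strings that might be passed at inference time.
same = "Grasp a lego block and put it in the bin."
sloppy = "grasp a lego block  and put it in the bin."
other = "Pick up the scissors."

for candidate in (same, sloppy, other):
    print(f"{candidate!r}: {compare_tasks(recorded, candidate)}")
```

Even a case or whitespace difference changes the text the policy receives, so the safest option is to reuse the exact string from recording.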
I used top and wrist cameras; the details can be found in my dataset. Before your reply, I had already increased the number of training steps to see whether that helps. I will report back once it works.