license: apache-2.0
library_name: lerobot
pipeline_tag: robotics
model_name: pi05
base_model: lerobot/pi05_base
datasets:
  - CoRL2026-CSI/UR7e_CaP_PickandPlace_100epi_10fps
tags:
  - robotics
  - lerobot
  - pi05
  - vision-language-action
  - imitation-learning
  - safetensors
  - ur7e

Model Card for π0.5 — UR7e PickandPlace (30 epoch)

π₀.₅ (Pi05) Policy

π₀.₅ is a Vision-Language-Action model with open-world generalization, developed by Physical Intelligence. The LeRobot implementation is adapted from their open-source OpenPI repository. See the Physical Intelligence π₀.₅ blog post.

This checkpoint is a fine-tune of lerobot/pi05_base on the CoRL2026-CSI/UR7e_CaP_PickandPlace_100epi_10fps dataset for a UR7e single-arm pick-and-place task.

This policy has been trained and pushed to the Hub using LeRobot. See the full documentation at LeRobot Docs.

Training Summary

Field	Value
Base model	`lerobot/pi05_base`
Dataset	`CoRL2026-CSI/UR7e_CaP_PickandPlace_100epi_10fps` (100 episodes, 35,878 frames, 10 fps)
Robot	UR7e single-arm, 7-DoF (6 joints + gripper)
Cameras	`realsense_topview`, `realsense_wrist` (renamed to `base_0_rgb` and `left_wrist_0_rgb`)
Steps	4,300 (≈ 30 epochs: 35,878 × 30 / 256 ≈ 4,204)
Batch	32 × 2 GPUs × 4 grad_accum = 256 samples per optimizer step
VLM / Action expert	PaliGemma `gemma_2b` / `gemma_300m`, `bfloat16`
Optimizer	AdamW (lr 1e-4, betas (0.9, 0.95), weight decay 1e-10), cosine decay with 1,000 warmup steps
Chunk / Action steps	50 / 50
Memory	`gradient_checkpointing=true`, `compile_model=false`
Normalization	ACTION/STATE = `MEAN_STD`, VISUAL = `IDENTITY`
Image augmentation	brightness, contrast, saturation, hue, sharpness, affine (up to 3 transforms per sample, random order)
Hardware	2× NVIDIA RTX PRO 6000 Blackwell
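
The batch and step figures in the table can be cross-checked with a few lines of arithmetic. This is a minimal sketch that uses only the numbers listed above; it ignores dataloader details such as dropped partial batches, so the epoch count is approximate.

```python
# Values copied from the Training Summary table.
per_gpu_batch = 32
num_gpus = 2
grad_accum = 4
num_frames = 35_878
steps = 4_300
fps = 10
chunk_size = 50

# Samples consumed per optimizer step.
effective_batch = per_gpu_batch * num_gpus * grad_accum

# Optimizer steps needed to see every frame once.
steps_per_epoch = num_frames / effective_batch

# Epochs covered by the 4,300 steps of this checkpoint.
epochs_covered = steps / steps_per_epoch

# Steps that exactly 30 epochs would take.
steps_for_30_epochs = 30 * steps_per_epoch

# Wall-clock span of one action chunk at the dataset frame rate.
chunk_seconds = chunk_size / fps

print(f"effective batch:     {effective_batch}")
print(f"steps per epoch:     {steps_per_epoch:.2f}")
print(f"epochs covered:      {epochs_covered:.2f}")
print(f"steps for 30 epochs: {steps_for_30_epochs:.0f}")
print(f"chunk duration (s):  {chunk_seconds}")
```

With these numbers, 4,300 steps correspond to slightly more than 30 passes over the dataset, and one 50-step action chunk spans 5 seconds at 10 fps.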

The action and observation.state dimensions are both 7; they are automatically zero-padded to π0.5's max_action_dim=32 and max_state_dim=32.
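
To make the padding concrete, the sketch below zero-pads a 7-dimensional vector to 32 dimensions. It only illustrates the idea: `pad_vector` is a hypothetical helper, not the LeRobot implementation, and the state values are made up.

```python
MAX_STATE_DIM = 32   # from the policy config (max_state_dim)
MAX_ACTION_DIM = 32  # from the policy config (max_action_dim)


def pad_vector(values, target_dim):
    """Right-pad a vector with zeros up to target_dim (hypothetical helper)."""
    if len(values) > target_dim:
        raise ValueError(f"vector of length {len(values)} exceeds {target_dim}")
    return list(values) + [0.0] * (target_dim - len(values))


# Made-up 7-dim state: 6 joint positions followed by a gripper value.
state = [0.1, -1.2, 1.5, -0.4, 0.9, 0.0, 1.0]
padded_state = pad_vector(state, MAX_STATE_DIM)

print(len(padded_state))   # padded length
print(padded_state[:7])    # original values are kept at the front
```

Because the policy pads internally, callers are expected to pass the original 7-dimensional state rather than a pre-padded one.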

How to Get Started

Inference (load + step)

import torch
from lerobot.policies.pi05.modeling_pi05 import PI05Policy

policy = PI05Policy.from_pretrained("CoRL2026-CSI/pi05-UR7e-PickandPlace-30epoch")
policy.to("cuda").eval()

# `observation` is a dict of batched tensors prepared by the caller. Its camera keys
# must match the names used during training (`observation.images.base_0_rgb`,
# `observation.images.left_wrist_0_rgb`).
with torch.inference_mode():
    action = policy.select_action(observation)
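
If the robot pipeline still emits the original camera names, the keys need to be renamed before calling the policy. The sketch below shows one way to do that. The raw key names (`observation.images.realsense_topview`, `observation.images.realsense_wrist`) and the top-view-to-base, wrist-to-left-wrist pairing are assumptions inferred from the order in the Cameras row, and `rename_camera_keys` is a hypothetical helper.

```python
# Assumed mapping from raw camera keys to the keys used during training.
CAMERA_RENAME = {
    "observation.images.realsense_topview": "observation.images.base_0_rgb",
    "observation.images.realsense_wrist": "observation.images.left_wrist_0_rgb",
}


def rename_camera_keys(observation):
    """Return a copy of the observation dict with camera keys renamed."""
    return {CAMERA_RENAME.get(key, key): value for key, value in observation.items()}


# Placeholder values stand in for image and state tensors.
raw_observation = {
    "observation.images.realsense_topview": "top_image",
    "observation.images.realsense_wrist": "wrist_image",
    "observation.state": "state_vector",
}
renamed = rename_camera_keys(raw_observation)
print(sorted(renamed))
```

Keys that are not in the mapping, such as `observation.state`, pass through unchanged.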

Continue fine-tuning

lerobot-train \
  --policy.path=CoRL2026-CSI/pi05-UR7e-PickandPlace-30epoch \
  --dataset.repo_id=CoRL2026-CSI/UR7e_CaP_PickandPlace_100epi_10fps \
  --output_dir=outputs/train/pi05_ur7e_pickandplace_ft \
  --job_name=pi05_ur7e_pickandplace_ft \
  --batch_size=32 --gradient_accumulation_steps=4 --steps=1000 \
  --policy.device=cuda --policy.dtype=bfloat16 \
  --policy.gradient_checkpointing=true --wandb.enable=true

The original training script is scripts/cap/pi05_cap_ur7e_pickandplace.sh; the exact hyperparameters can also be reconstructed from the train_config.json in this repo.
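
As a starting point for reconstructing the hyperparameters, the sketch below reads a locally downloaded train_config.json and prints a few top-level entries. The key names in `ASSUMED_KEYS` are assumptions about the file layout, and `summarize_train_config` is a hypothetical helper; keys that are absent are simply skipped.

```python
import json
from pathlib import Path

# Assumed top-level keys; adjust to whatever the file actually contains.
ASSUMED_KEYS = ("batch_size", "steps", "optimizer", "scheduler")


def summarize_train_config(path, keys=ASSUMED_KEYS):
    """Return the subset of top-level config entries named in keys."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    return {key: config[key] for key in keys if key in config}


# Only runs if the file has been downloaded next to this script.
config_path = Path("train_config.json")
if config_path.exists():
    for key, value in summarize_train_config(config_path).items():
        print(f"{key}: {value}")
```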

Model Details

License: apache-2.0
Base model: lerobot/pi05_base
Library: LeRobot
Trained by: CoRL2026-CSI