---
license: apache-2.0
library_name: lerobot
pipeline_tag: robotics
model_name: pi05
base_model: lerobot/pi05_base
datasets:
- CoRL2026-CSI/UR7e_CaP_PickandPlace_100epi_10fps
tags:
- robotics
- lerobot
- pi05
- vision-language-action
- imitation-learning
- safetensors
- ur7e
---
# Model Card for π0.5 - UR7e PickandPlace (30 epoch)

**π₀.₅ (Pi05) Policy**

π₀.₅ is a Vision-Language-Action model with open-world generalization, from
Physical Intelligence. The LeRobot implementation is adapted from their open
source OpenPI repository. See the
[Physical Intelligence π₀.₅ blog post](https://www.physicalintelligence.company/blog/pi05).
This checkpoint is a **fine-tune of [`lerobot/pi05_base`](https://huggingface.co/lerobot/pi05_base)**
on the [`CoRL2026-CSI/UR7e_CaP_PickandPlace_100epi_10fps`](https://huggingface.co/datasets/CoRL2026-CSI/UR7e_CaP_PickandPlace_100epi_10fps)
dataset for a UR7e single-arm pick-and-place task.
This policy has been trained and pushed to the Hub using
[LeRobot](https://github.com/huggingface/lerobot). See the full documentation at
[LeRobot Docs](https://huggingface.co/docs/lerobot/index).
---
## Training Summary
| Field | Value |
|---|---|
| Base model | `lerobot/pi05_base` |
| Dataset | `CoRL2026-CSI/UR7e_CaP_PickandPlace_100epi_10fps` (100 eps, 35,878 frames, 10 fps) |
| Robot | UR7e single-arm, 7-DoF (6 joints + gripper) |
| Cameras | `realsense_topview`, `realsense_wrist` (renamed → `base_0_rgb`/`left_wrist_0_rgb`) |
| Steps | 4,300 (≈ 30 epochs · 35878 × 30 / 256) |
| Batch | 32 × 2 GPUs × 4 grad_accum = 256 samples per optimizer step |
| VLM / Action expert | PaliGemma `gemma_2b` / `gemma_300m`, `bfloat16` |
| Optimizer | AdamW (lr 1e-4, betas (0.9, 0.95), wd 1e-10), cosine decay w/ warmup 1000 |
| Chunk / Action steps | 50 / 50 |
| Memory | `gradient_checkpointing=true`, `compile_model=false` |
| Normalization | ACTION/STATE = `MEAN_STD`, VISUAL = `IDENTITY` |
| Image augmentation | brightness, contrast, saturation, hue, sharpness, affine (max 3, random order) |
| Hardware | 2× NVIDIA RTX PRO 6000 Blackwell |
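
The step count and batch size in the table can be cross-checked with a few lines of arithmetic. This is only a sanity check using the numbers listed above; the variable names are illustrative and do not correspond to LeRobot config fields.

```python
# Values copied from the Training Summary table.
frames = 35_878        # total frames in the dataset
per_gpu_batch = 32     # batch size per GPU
num_gpus = 2
grad_accum = 4         # gradient accumulation steps
steps = 4_300          # optimizer steps

# Samples consumed per optimizer step.
effective_batch = per_gpu_batch * num_gpus * grad_accum

# Optimizer steps needed for one pass over the dataset.
steps_per_epoch = frames / effective_batch

# Number of dataset passes covered by the full run.
epochs = steps / steps_per_epoch

print(f"effective batch: {effective_batch}")
print(f"steps per epoch: {steps_per_epoch:.1f}")
print(f"epochs covered by {steps} steps: {epochs:.1f}")
```

The result is slightly above 30 epochs, which matches the "≈ 30" in the table: 4,300 is the exact figure of about 4,204 steps rounded up.
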
The `action` and `observation.state` dimensions are both 7; they are automatically zero-padded to π0.5's `max_action_dim=32` and `max_state_dim=32`.
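
To make the padding concrete, here is a minimal sketch of what zero-padding a 7-dimensional vector to 32 dimensions looks like. It uses plain Python lists so it runs without dependencies. The helpers `zero_pad` and `unpad` are hypothetical names, not LeRobot functions, and padding at the end of the vector is an assumption; the policy handles this internally, so you do not need to do it yourself.

```python
MAX_DIM = 32  # max_action_dim / max_state_dim from the model card


def zero_pad(vector, target_dim=MAX_DIM):
    """Right-pad a 1-D sequence with zeros up to target_dim (assumed layout)."""
    if len(vector) > target_dim:
        raise ValueError(f"vector has {len(vector)} dims, more than {target_dim}")
    return list(vector) + [0.0] * (target_dim - len(vector))


def unpad(vector, original_dim):
    """Drop the padding and keep only the first original_dim entries."""
    return list(vector[:original_dim])


# Made-up example state: 6 joint positions + 1 gripper value.
state = [0.1, -0.5, 1.2, 0.0, 0.3, -1.0, 0.04]

padded = zero_pad(state)
print(len(padded))                 # padded length
print(unpad(padded, len(state)))   # original 7 values recovered
```
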
---
## How to Get Started
### Inference (load + step)
```python
import torch
from lerobot.policies.pi05.modeling_pi05 import PI05Policy
policy = PI05Policy.from_pretrained("CoRL2026-CSI/pi05-UR7e-PickandPlace-30epoch")
policy.to("cuda").eval()
# `observation` is a batch dict you build from your robot or dataset (not defined here).
# Its camera keys must match the names used during training
# (`observation.images.base_0_rgb`, `observation.images.left_wrist_0_rgb`).
with torch.inference_mode():
action = policy.select_action(observation)
```
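
If your robot stack still publishes images under the original camera names, the keys need to be renamed before calling the policy. The sketch below shows one way to do that with a plain dictionary. The helper `rename_camera_keys` is hypothetical, the `observation.images.` prefix on the raw names is an assumption, and the pairing (top view to `base_0_rgb`, wrist to `left_wrist_0_rgb`) is inferred from the order in the Training Summary table.

```python
# Assumed mapping from raw camera keys to the names used during training.
CAMERA_RENAME = {
    "observation.images.realsense_topview": "observation.images.base_0_rgb",
    "observation.images.realsense_wrist": "observation.images.left_wrist_0_rgb",
}


def rename_camera_keys(observation, rename_map=CAMERA_RENAME):
    """Return a copy of observation with camera keys renamed.

    Raises KeyError if any expected training-time camera key is still missing.
    """
    renamed = {rename_map.get(key, key): value for key, value in observation.items()}
    missing = [key for key in rename_map.values() if key not in renamed]
    if missing:
        raise KeyError(f"missing camera keys: {missing}")
    return renamed


# Placeholder values stand in for image tensors and the state vector.
raw_observation = {
    "observation.images.realsense_topview": "top-view image",
    "observation.images.realsense_wrist": "wrist image",
    "observation.state": [0.0] * 7,
}

observation = rename_camera_keys(raw_observation)
print(sorted(observation))
```

Non-camera entries such as `observation.state` pass through unchanged, and the check fails early if a camera is missing rather than at inference time.
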
### Continue fine-tuning
```bash
lerobot-train \
--policy.path=CoRL2026-CSI/pi05-UR7e-PickandPlace-30epoch \
--dataset.repo_id=CoRL2026-CSI/UR7e_CaP_PickandPlace_100epi_10fps \
--output_dir=outputs/train/pi05_ur7e_pickandplace_ft \
--job_name=pi05_ur7e_pickandplace_ft \
--batch_size=32 --gradient_accumulation_steps=4 --steps=1000 \
--policy.device=cuda --policy.dtype=bfloat16 \
--policy.gradient_checkpointing=true --wandb.enable=true
```
The original training script is `scripts/cap/pi05_cap_ur7e_pickandplace.sh`;
the exact hyperparameters can also be reconstructed from this repo's `train_config.json`.
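
As a rough sketch of how one might inspect `train_config.json` after downloading it, the snippet below flattens a nested JSON config into dotted keys so the hyperparameters are easy to scan or diff. The layout of the file is not shown in this card, so the code is generic; the helper names `flatten` and `load_hyperparameters` and the keys in the usage example are hypothetical.

```python
import json
from pathlib import Path


def flatten(config, prefix=""):
    """Flatten a nested dict into {"a.b.c": value} form."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def load_hyperparameters(path):
    """Read a JSON config file and return its flattened contents."""
    return flatten(json.loads(Path(path).read_text()))


# Hypothetical example; the real keys in train_config.json may differ.
example = {"steps": 4300, "policy": {"dtype": "bfloat16"}}
print(flatten(example))

# Print every setting if a local copy of the config exists.
config_path = Path("train_config.json")
if config_path.exists():
    for name, value in sorted(load_hyperparameters(config_path).items()):
        print(f"{name} = {value}")
```
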
---
## Model Details
- **License:** apache-2.0
- **Base model:** [`lerobot/pi05_base`](https://huggingface.co/lerobot/pi05_base)
- **Library:** [LeRobot](https://github.com/huggingface/lerobot)
- **Trained by:** CoRL2026-CSI