---
tags:
- reinforcement-learning
- robotics
- locomotion
- unitree
- go2
- mujoco
- ppo
library_name: rsl-rl
license: bsd-3-clause
---
# Unitree Go2 — Velocity Flat (PPO)
RL locomotion policy for the [Unitree Go2](https://www.unitree.com/go2/) quadruped robot, trained on flat terrain using PPO.
## Demo
[Watch the demo video on YouTube](https://www.youtube.com/watch?v=smxh8Uu2Zpo)
## Training
- **Framework**: [unitree_rl_mjlab](https://github.com/unitreerobotics/unitree_rl_mjlab) (MuJoCo Warp)
- **Task**: `Mjlab-Velocity-Flat-Unitree-Go2`
- **Algorithm**: PPO (RSL-RL)
- **Hardware**: 10× NVIDIA RTX A4000, 56 CPU cores
- **Environments**: 8192 parallel
- **Training time**: ~18 minutes (506 iterations)
## Results
| Metric | Value |
|---|---|
| Mean reward | **52.9** |
| Mean episode length | **1000** (max, no falls) |
| Steps/sec | 628K-738K |
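
As a rough sanity check on these numbers, the sketch below estimates the total number of environment steps from the reported throughput and training time. It assumes the reported steps/sec rate was sustained for the whole ~18 minute run, which is an approximation; the function name `total_steps` is illustrative only.

```python
# Rough estimate of total environment steps, ASSUMING the reported
# throughput was sustained for the whole ~18 minute run.
TRAINING_MINUTES = 18               # from the Training section (approximate)
ITERATIONS = 506                    # from the Training section
STEPS_PER_SEC = (628_000, 738_000)  # from the Results table


def total_steps(steps_per_sec, minutes):
    """Environment steps collected at a constant rate over `minutes`."""
    return steps_per_sec * minutes * 60


low, high = (total_steps(rate, TRAINING_MINUTES) for rate in STEPS_PER_SEC)
seconds_per_iteration = TRAINING_MINUTES * 60 / ITERATIONS

print(f"total steps: {low / 1e6:.0f}M to {high / 1e6:.0f}M")
print(f"wall-clock per iteration: {seconds_per_iteration:.2f} s")
```

Under that assumption the run collected roughly 678M to 797M environment steps, at about 2.13 s of wall-clock time per PPO iteration.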
## Files
| File | Description |
|---|---|
| `policy.onnx` + `policy.onnx.data` | ONNX model for deployment (go2_ctrl) |
| `model_500.pt` | Final PyTorch checkpoint (best for fine-tuning) |
| `model_0.pt` ... `model_400.pt` | Intermediate checkpoints, saved every 100 iterations |
| `params/deploy.yaml` | Deploy configuration (obs order, action scale, joint mapping) |
| `params/env.yaml` | Environment configuration |
| `params/agent.yaml` | Agent/PPO configuration |
| `events.out.tfevents.*` | TensorBoard training logs |
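
If you script against the checkpoint files, the iteration number can be read from the `model_<N>.pt` file name. The following is a minimal sketch that picks the checkpoint with the highest iteration in a directory; the helper name `latest_checkpoint` is hypothetical and not part of unitree_rl_mjlab or RSL-RL.

```python
import re
from pathlib import Path

# Matches the checkpoint naming used in this repo: model_<iteration>.pt
CHECKPOINT_RE = re.compile(r"model_(\d+)\.pt")


def latest_checkpoint(directory):
    """Return the model_<N>.pt path with the largest N, or None if absent."""
    best = None
    for path in Path(directory).iterdir():
        match = CHECKPOINT_RE.fullmatch(path.name)
        if match is None:
            continue  # skip policy.onnx, TensorBoard logs, etc.
        iteration = int(match.group(1))
        if best is None or iteration > best[0]:
            best = (iteration, path)
    return None if best is None else best[1]
```

Comparing the parsed integers rather than the raw file names keeps the ordering correct even if checkpoints with more digits (for example `model_1000.pt`) are added later.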
## Usage
### Deploy in MuJoCo simulator
```bash
# Copy ONNX model + deploy config
cp policy.onnx policy.onnx.data \
unitree_rl_mjlab/deploy/robots/go2/config/policy/velocity/v0/exported/
cp params/deploy.yaml \
unitree_rl_mjlab/deploy/robots/go2/config/policy/velocity/v0/params/
# Build controller
cd unitree_rl_mjlab/deploy/robots/go2
mkdir -p build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release && make -j$(nproc)
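# Run simulator + controller, each in its own terminal
# (the simulator keeps running in the foreground; paths are relative to your workspace root)
# Terminal 1:
cd unitree_mujoco/simulate/build && ./unitree_mujoco
# Terminal 2:
cd unitree_rl_mjlab/deploy/robots/go2/build && ./go2_ctrl --network=lo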
```
> **Important:** This model was trained **without `gait_phase`** and with **action scale 0.5**. The default `deploy.yaml` in unitree_rl_mjlab may differ — use `params/deploy.yaml` from this repo.
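
One quick way to catch a mismatched config before deploying is a plain-text scan of the `deploy.yaml` you copied. The sketch below assumes that a gait-phase observation term, if present, would appear in the file under the literal name `gait_phase`; the actual key layout of `deploy.yaml` is not documented here, so treat this as a heuristic rather than a validator. The helper name `mentions_gait_phase` is hypothetical.

```python
from pathlib import Path


def mentions_gait_phase(deploy_yaml_path):
    """True if the literal string 'gait_phase' appears outside YAML comments.

    ASSUMPTION: a gait-phase observation would be spelled 'gait_phase'
    in the file. Splitting on '#' is a naive comment filter and would
    also truncate a '#' that appears inside a quoted string.
    """
    text = Path(deploy_yaml_path).read_text(encoding="utf-8")
    for line in text.splitlines():
        content = line.split("#", 1)[0]  # drop a trailing YAML comment
        if "gait_phase" in content:
            return True
    return False
```

For the `params/deploy.yaml` shipped with this model the function would be expected to return `False`, since the policy was trained without `gait_phase`; a `True` result for the file in your deploy directory suggests the upstream default is still in place.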
### Fine-tune on rough terrain
```bash
# Place model_500.pt in logs/rsl_rl/go2_velocity/<run_name>/
python scripts/train.py Mjlab-Velocity-Rough-Unitree-Go2 \
--agent.resume=True \
--agent.load-run="<run_name>" \
--agent.load-checkpoint="model_500.pt" \
--agent.algorithm.learning-rate=1e-4
```
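
The "place `model_500.pt`" step from the comment above can also be scripted. This is a minimal sketch that copies a checkpoint into `logs/rsl_rl/go2_velocity/<run_name>/`, creating the run directory if needed; the helper name `stage_checkpoint` and its `log_root` parameter are hypothetical, and the default path is simply the one quoted in the command above.

```python
import shutil
from pathlib import Path


def stage_checkpoint(checkpoint, run_name, log_root="logs/rsl_rl/go2_velocity"):
    """Copy `checkpoint` into <log_root>/<run_name>/ and return the new path."""
    run_dir = Path(log_root) / run_name
    run_dir.mkdir(parents=True, exist_ok=True)  # create the run directory if missing
    # copy2 preserves file metadata and returns the destination file path
    return Path(shutil.copy2(checkpoint, run_dir))
```

Run from the root of the training repository, `stage_checkpoint("model_500.pt", "my_run")` would leave the checkpoint at `logs/rsl_rl/go2_velocity/my_run/model_500.pt`, after which `my_run` is the value to pass as `--agent.load-run`.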
## Known Issues
The upstream `unitree_rl_mjlab` has bugs that crash multi-GPU training on rough terrain — see [Issue #9](https://github.com/unitreerobotics/unitree_rl_mjlab/issues/9) and [PR #8](https://github.com/unitreerobotics/unitree_rl_mjlab/pull/8).