Go2+Z1 Walking Policy V2 (rotation-capable + heading-tracking)

PPO walking policy for the Unitree Go2 + Z1 composite robot, upgraded over V1 to support large in-place rotations and heading commands, both of which are needed for autonomous navigation through warehouse aisles.

What's new vs V1

| Setting | V1 | V2 |
|---|---|---|
| Yaw command range ω_z | [-0.5, 0.5] rad/s | [-2.0, 2.0] rad/s (covers 180° pivot) |
| Heading command | none | 30% of envs receive heading command (heading_command=True, rel_heading_envs=0.3) |
| track_ang_vel_z_exp reward weight | 0.75 | 1.2 |
| Iterations | 1500 | 3000 |
| Task ID | Isaac-Velocity-Flat-Go2Z1-v0 | Isaac-Velocity-Flat-Go2Z1-V2-v0 |

V1 oscillated whenever the commanded yaw error exceeded ~90°. V2 fixes that by giving the policy direct experience with large angular commands during training.
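
To put the wider yaw range in perspective, the short calculation below estimates how long a 180° pivot takes at each version's maximum commanded yaw rate. It is a rough lower bound only: it assumes the robot tracks the commanded rate perfectly and ignores acceleration. The helper name `pivot_time_s` is hypothetical and not part of the training code.

```python
import math


def pivot_time_s(angle_deg: float, max_yaw_rate: float) -> float:
    """Lower bound on the time (s) to rotate by angle_deg at a constant yaw rate (rad/s)."""
    return math.radians(abs(angle_deg)) / max_yaw_rate


v1_time = pivot_time_s(180.0, 0.5)  # V1 yaw limit: 0.5 rad/s
v2_time = pivot_time_s(180.0, 2.0)  # V2 yaw limit: 2.0 rad/s
print(f"V1: {v1_time:.2f} s, V2: {v2_time:.2f} s")  # V1: 6.28 s, V2: 1.57 s
```

Under these assumptions, the V2 command range cuts the best-case pivot time to a quarter of V1's.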

Files

  • model_*.pt — actor-critic checkpoint (rsl-rl OnPolicyRunner format)
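
When several checkpoints are present, a small helper like the one below can pick the most recent one. This is a sketch that assumes the `*` in `model_*.pt` is the integer training iteration; the file names in the example are hypothetical, and `latest_checkpoint` is not part of rsl-rl.

```python
import re
from pathlib import Path
from typing import Iterable, Optional

# Assumption: checkpoints are named model_<iteration>.pt
_CKPT_RE = re.compile(r"^model_(\d+)\.pt$")


def latest_checkpoint(names: Iterable[str]) -> Optional[str]:
    """Return the entry with the highest iteration number, or None if none match."""
    best_name = None
    best_iter = -1
    for name in names:
        match = _CKPT_RE.match(Path(name).name)
        if match and int(match.group(1)) > best_iter:
            best_iter = int(match.group(1))
            best_name = name
    return best_name


# Hypothetical file listing, for illustration only.
print(latest_checkpoint(["model_500.pt", "model_2999.pt", "model_1000.pt", "notes.txt"]))
```

Comparing the parsed integers rather than sorting the names avoids the lexicographic trap where `model_500.pt` would sort after `model_2999.pt`.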

Usage

Identical to V1 (same architecture). See V1 README for code: https://huggingface.co/m3/go2z1-walking-rsl-rl-v1

For end-to-end inference inside Isaac Sim, a goal-directed navigation script drives this policy. Its core command computation is:

# Pseudocode — see go2_z1_warehouse/stage4_joint_eval/walk_warehouse_navigate.py
yaw_err = (target_yaw - cur_yaw + π) % 2π - π
v_fwd   = clip(0.8 * cos(yaw_err), 0.2, 0.8)
w_z     = clip(1.5 * yaw_err, -2.0, 2.0)   # V2 supports the full range
cmd_term.vel_command_b[:] = (v_fwd, 0, w_z)
action = actor(obs)
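
The command computation in the pseudocode above can be tried outside the simulator with plain Python. The sketch below reproduces only the yaw-error wrapping and the two clipped gains; the function names `wrap_to_pi`, `clip`, and `nav_command` are hypothetical, and the real script operates on simulator state rather than scalar floats.

```python
import math
from typing import Tuple


def wrap_to_pi(angle: float) -> float:
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def clip(value: float, low: float, high: float) -> float:
    """Clamp value to the closed interval [low, high]."""
    return max(low, min(high, value))


def nav_command(target_yaw: float, cur_yaw: float) -> Tuple[float, float, float]:
    """Return (v_fwd, v_lat, w_z) using the gains and limits from the pseudocode."""
    yaw_err = wrap_to_pi(target_yaw - cur_yaw)
    v_fwd = clip(0.8 * math.cos(yaw_err), 0.2, 0.8)  # slow down when misaligned, floor at 0.2 m/s
    w_z = clip(1.5 * yaw_err, -2.0, 2.0)             # saturates at the V2 yaw limit
    return (v_fwd, 0.0, w_z)


print(nav_command(0.0, 0.0))          # aligned: full forward speed, no turn
print(nav_command(math.pi / 2, 0.0))  # 90 degrees off: minimum speed, saturated turn
```

With a gain of 1.5 and a limit of 2.0 rad/s, the yaw command saturates once the heading error exceeds about 1.33 rad (roughly 76°), so any larger error requests the full yaw rate that V2 was trained to track.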

Training data

On-policy RL — no offline dataset. The full task definition lives in:

Predecessor

V1: https://huggingface.co/m3/go2z1-walking-rsl-rl-v1

Citation

@misc{go2z1-walking-v2,
  title  = {Go2+Z1 Walking Policy V2 (rotation-capable + heading-tracking)},
  author = {m3},
  year   = {2026},
  url    = {https://huggingface.co/m3/go2z1-walking-rsl-rl-v2}
}