---
tags:
- reinforcement-learning
- robotics
- mujoco
- onnx
- rsl-rl
library_name: pytorch
---

# yam_lift_cube_vision Policy

**Run:** `2026-05-03_08-02-44`

**Uploaded:** 2026-05-03 13:39:35 UTC

## Training Configuration

| Parameter | Value |
|-----------|-------|
| Algorithm | PPO |
| Max Iterations | unknown |
| Final Iteration | 2999 |
| Learning Rate | unknown |
| Gamma | unknown |
| Learning Epochs | unknown |
| Mini Batches | unknown |
| Entropy Coefficient | unknown |

## Network Architecture

| Component | Hidden Dimensions | Activation |
|-----------|------------------|------------|
| Actor | unknown | unknown |
| Critic | unknown | unknown |

## Environment

| Parameter | Value |
|-----------|-------|
| Num Environments | unknown |
| Decimation | unknown |
| Episode Length (s) | unknown |

## Command Ranges

- No command ranges found

## Reward Functions

- No reward terms found

## Files

| File | Description |
|------|-------------|
| `model_final.pt` | Final PyTorch checkpoint (iteration 2999) |
| `policy.onnx` | Exported ONNX policy |
| `agent.yaml` | Agent configuration |
| `env.yaml` | Environment configuration |
| `mjlab.diff` | Git diff snapshot for reproducibility |

## Usage

### Load PyTorch Checkpoint

```python
import torch
from mjlab.rl.runner import MjlabOnPolicyRunner

# Load the checkpoint on CPU and pull out the actor weights
checkpoint = torch.load("model_final.pt", map_location="cpu")
actor_state = checkpoint["actor_state_dict"]
# Use with your environment setup
```

### Load ONNX Policy

```python
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("policy.onnx")
policy_input = session.get_inputs()[0]
# Read the observation size from the model; this assumes a flat (batch, obs_dim) input
obs_dim = policy_input.shape[-1]
observation = np.zeros((1, obs_dim), dtype=np.float32)
action = session.run(None, {policy_input.name: observation})[0]
```

### Load with MjLab

```bash
# Option 1: Clone the HF repository
git lfs install
git clone https://huggingface.co/robomotic/mjlab-policies
cd mjlab-policies
# Navigate to the appropriate directory for this run

# Option 2: Download just this run using HF CLI
huggingface-cli download robomotic/mjlab-policies yam_lift_cube_vision/2026-05-03_08-02-44/model_final.pt

# Play the policy
uv run play --task yam_lift_cube_vision --checkpoint path/to/model_final.pt
```
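
Because this card does not record the observation layout, the ONNX example above may need adjusting if the exported policy takes more than one input (for example an image plus proprioception) or uses symbolic dimensions. The following is a minimal sketch, not part of MjLab, that reads every input's shape from the model and feeds zeros. The helper names `resolve_input_shape` and `run_zero_observation` are hypothetical, and the sketch assumes all inputs are `float32` and that any dynamic dimension can be set to 1.

```python
from pathlib import Path


def resolve_input_shape(shape, fill=1):
    """Replace symbolic or unknown ONNX dimensions (str, None, or <= 0) with `fill`."""
    return tuple(d if isinstance(d, int) and d > 0 else fill for d in shape)


def run_zero_observation(model_path="policy.onnx"):
    """Run the policy once on all-zero inputs and return the list of outputs."""
    # Imported lazily so resolve_input_shape works without these packages installed.
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(model_path)
    feeds = {}
    for inp in session.get_inputs():
        # Assumption: every input is float32; check inp.type if the run fails.
        feeds[inp.name] = np.zeros(resolve_input_shape(inp.shape), dtype=np.float32)
    return session.run(None, feeds)


if __name__ == "__main__" and Path("policy.onnx").exists():
    # Print output shapes as a quick sanity check of the exported policy.
    for out in run_zero_observation():
        print(out.shape)
```

A zero observation only confirms that the model loads and produces outputs of the expected shape; meaningful actions require observations built the same way as in `env.yaml`.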