---
tags:
- reinforcement-learning
- robotics
- mujoco
- onnx
- rsl-rl
library_name: pytorch
---
yam_lift_cube_vision Policy
Run: 2026-05-03_08-02-44
Uploaded: 2026-05-03 13:39:35 UTC
Training Configuration
| Parameter | Value |
|---|---|
| Algorithm | PPO |
| Max Iterations | unknown |
| Final Iteration | 2999 |
| Learning Rate | unknown |
| Gamma | unknown |
| Learning Epochs | unknown |
| Mini Batches | unknown |
| Entropy Coefficient | unknown |
Network Architecture
| Component | Hidden Dimensions | Activation |
|---|---|---|
| Actor | unknown | unknown |
| Critic | unknown | unknown |
Environment
| Parameter | Value |
|---|---|
| Num Environments | unknown |
| Decimation | unknown |
| Episode Length (s) | unknown |
Command Ranges
- No command ranges found
Reward Functions
- No reward terms found
Files
| File | Description |
|---|---|
| model_final.pt | Final PyTorch checkpoint (iteration 2999) |
| policy.onnx | Exported ONNX policy |
| agent.yaml | Agent configuration |
| env.yaml | Environment configuration |
| mjlab.diff | Git diff snapshot for reproducibility |
Usage
Load PyTorch Checkpoint
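import torch
from mjlab.rl.runner import MjlabOnPolicyRunner  # for use with your environment setup
# Load on CPU so the checkpoint can be read without a GPU.
checkpoint = torch.load("model_final.pt", map_location="cpu")
# Actor weights only; the other keys in the checkpoint are not listed in this card.
actor_state = checkpoint["actor_state_dict"]
# Load actor_state into a policy built from agent.yaml and env.yaml in your environment setup.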
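Because the training tables above list the network sizes as unknown, it can help to look inside the checkpoint before wiring it into an environment. The sketch below is one way to do that. `describe_checkpoint` and `main` are hypothetical helper names, not part of mjlab or rsl-rl, and the only key this card confirms is `actor_state_dict`.

```python
def describe_checkpoint(checkpoint):
    """Map each top-level key to a short description of what it holds."""
    summary = {}
    for key, value in checkpoint.items():
        if isinstance(value, dict):
            # State dicts map parameter names to tensors; record shapes where
            # available and fall back to the type name for anything else.
            summary[key] = {
                name: tuple(item.shape) if hasattr(item, "shape") else type(item).__name__
                for name, item in value.items()
            }
        else:
            summary[key] = type(value).__name__
    return summary


def main(path="model_final.pt"):
    # Imported here so describe_checkpoint itself has no dependencies.
    import torch

    # Depending on your PyTorch version and on what the checkpoint stores,
    # torch.load may need weights_only=False; only use that for trusted files.
    checkpoint = torch.load(path, map_location="cpu")
    for key, info in describe_checkpoint(checkpoint).items():
        print(key, info)
```

Calling `main()` next to `model_final.pt` prints one line per top-level key. The layer shapes under `actor_state_dict` should indicate the hidden dimensions that the architecture table leaves blank.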
Load ONNX Policy
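import onnxruntime as ort
import numpy as np
session = ort.InferenceSession("policy.onnx")
# Read the observation size from the model rather than hard-coding it.
# Assumes one flat input of shape (batch, obs_dim) with a static last dimension.
input_meta = session.get_inputs()[0]
obs_dim = input_meta.shape[-1]
observation = np.zeros((1, obs_dim), dtype=np.float32)
action = session.run(None, {input_meta.name: observation})[0]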
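The snippet above feeds a single input. Since this is a vision task, the exported graph may declare more than one input (for example an image alongside proprioception); that is an assumption, not something this card states. The sketch below builds a zero-filled feed for every declared input. `resolve_shape` and `run_with_zeros` are hypothetical helpers. They also assume that every input is float32 and that any symbolic dimension is the batch dimension.

```python
def resolve_shape(declared_shape, batch_size=1):
    """Replace symbolic or unknown dimensions with batch_size."""
    # ONNX Runtime reports dynamic dimensions as strings or None.
    return tuple(
        dim if isinstance(dim, int) and dim > 0 else batch_size
        for dim in declared_shape
    )


def run_with_zeros(path="policy.onnx"):
    # Imported here so resolve_shape can be used without these packages.
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(path)
    # One zero array per declared input, keyed by the input's name.
    feed = {
        inp.name: np.zeros(resolve_shape(inp.shape), dtype=np.float32)
        for inp in session.get_inputs()
    }
    # Passing None requests every output of the graph.
    return session.run(None, feed)
```

`run_with_zeros()` returns the list of model outputs. Replace the zero arrays with real observations from your environment once the input names and shapes are known.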
Load with MjLab
# Option 1: Clone the HF repository
git lfs install
git clone https://huggingface.co/robomotic/mjlab-policies
cd mjlab-policies
# Navigate to the directory for this run
cd yam_lift_cube_vision/2026-05-03_08-02-44
# Option 2: Download just this run using HF CLI
huggingface-cli download robomotic/mjlab-policies yam_lift_cube_vision/2026-05-03_08-02-44/model_final.pt
# Play the policy
uv run play --task yam_lift_cube_vision --checkpoint path/to/model_final.pt
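The single-file download in Option 2 can also be done from Python with `hf_hub_download` from the `huggingface_hub` package. This is a minimal sketch: `run_file` and `download_checkpoint` are hypothetical helpers, and the repository ID, task name, and run name are taken from the commands above.

```python
def run_file(task, run, filename):
    """Build the repository-relative path of a file in one training run."""
    return "/".join([task, run, filename])


def download_checkpoint(repo_id="robomotic/mjlab-policies"):
    # Imported here so run_file can be used without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    # Returns the local path of the cached file.
    return hf_hub_download(
        repo_id=repo_id,
        filename=run_file("yam_lift_cube_vision", "2026-05-03_08-02-44", "model_final.pt"),
    )
```

The path returned by `download_checkpoint()` can then be passed to the `--checkpoint` argument of the play command.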