---
tags:
- reinforcement-learning
- robotics
- mujoco
- onnx
- rsl-rl
library_name: pytorch
---

# yam_lift_cube_vision Policy

**Run:** `2026-05-03_08-02-44`
**Uploaded:** 2026-05-03 13:39:35 UTC

## Training Configuration

| Parameter | Value |
|-----------|-------|
| Algorithm | PPO |
| Max Iterations | unknown |
| Final Iteration | 2999 |
| Learning Rate | unknown |
| Gamma | unknown |
| Learning Epochs | unknown |
| Mini Batches | unknown |
| Entropy Coefficient | unknown |

## Network Architecture

| Component | Hidden Dimensions | Activation |
|-----------|------------------|------------|
| Actor | unknown | unknown |
| Critic | unknown | unknown |

## Environment

| Parameter | Value |
|-----------|-------|
| Num Environments | unknown |
| Decimation | unknown |
| Episode Length (s) | unknown |

## Command Ranges

- No command ranges found

## Reward Functions

- No reward terms found

## Files

| File | Description |
|------|-------------|
| `model_final.pt` | Final PyTorch checkpoint (iteration 2999) |
| `policy.onnx` | Exported ONNX policy |
| `agent.yaml` | Agent configuration |
| `env.yaml` | Environment configuration |
| `mjlab.diff` | Git diff snapshot for reproducibility |

## Usage

### Load PyTorch Checkpoint

```python
import torch
from mjlab.rl.runner import MjlabOnPolicyRunner

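checkpoint = torch.load("model_final.pt", map_location="cpu")
# Key name as given for this checkpoint; print(checkpoint.keys()) if it is missing
actor_state = checkpoint["actor_state_dict"]
# Load actor_state into an actor built from agent.yaml, inside your environment setup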
```
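
Before indexing into the checkpoint it can help to see what it actually contains. The helper below is a minimal sketch, not part of mjlab or rsl-rl: `summarize` is a hypothetical name, and it only assumes the checkpoint is a nested dict whose leaves are either tensors (anything with a `.shape`) or plain Python values.

```python
def summarize(obj, prefix=""):
    """Flatten a nested checkpoint dict into {path: shape or type name}.

    Hypothetical helper for inspection only. Paths are joined with "/"
    because state-dict keys usually contain dots already.
    """
    out = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested dicts such as state dicts
            out.update(summarize(value, prefix=f"{name}/"))
        elif hasattr(value, "shape"):
            # Tensor-like leaf: record its shape
            out[name] = tuple(value.shape)
        else:
            # Anything else (int, list, None, ...): record the type name
            out[name] = type(value).__name__
    return out
```

Calling `summarize(checkpoint)` on the loaded file and printing the result shows which top-level keys exist and the layer shapes stored under each, which is enough to rebuild a matching actor when the architecture table above reads `unknown`.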

### Load ONNX Policy

```python
import onnxruntime as ort
import numpy as np

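session = ort.InferenceSession("policy.onnx")
policy_input = session.get_inputs()[0]
# Assumes a single float32 input; dynamic (non-integer) dims are set to 1
obs_shape = [d if isinstance(d, int) else 1 for d in policy_input.shape]
observation = np.zeros(obs_shape, dtype=np.float32)
action = session.run(None, {policy_input.name: observation})[0]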
```
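
A vision policy may expose more than one ONNX input (for example, a state vector and an image). Whether this export does is not stated here, so the sketch below makes no assumption about it: it builds an all-zeros feed for every input the session reports. `concrete_shapes` and `make_zero_feed` are hypothetical helper names, and all inputs are assumed to be float32.

```python
def concrete_shapes(specs):
    """Map input name -> shape, replacing dynamic dims with 1.

    `specs` is an iterable of (name, shape) pairs. ONNX Runtime reports
    dynamic dimensions as strings or None rather than integers.
    """
    return {
        name: [d if isinstance(d, int) else 1 for d in shape]
        for name, shape in specs
    }


def make_zero_feed(session):
    """Build a zero-filled feed dict for an onnxruntime.InferenceSession.

    Assumes every input is float32; check session.get_inputs()[i].type
    if the model says otherwise.
    """
    import numpy as np  # imported lazily so concrete_shapes stays dependency-free

    specs = [(inp.name, inp.shape) for inp in session.get_inputs()]
    return {
        name: np.zeros(shape, dtype=np.float32)
        for name, shape in concrete_shapes(specs).items()
    }
```

With a session created as above, `session.run(None, make_zero_feed(session))` gives a quick smoke test that the export loads and produces an action of the expected size.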

### Load with MjLab

```bash
# Option 1: Clone the HF repository
git lfs install
git clone https://huggingface.co/robomotic/mjlab-policies
cd mjlab-policies
# This run's files are under yam_lift_cube_vision/2026-05-03_08-02-44/

# Option 2: Download just this run using HF CLI
huggingface-cli download robomotic/mjlab-policies yam_lift_cube_vision/2026-05-03_08-02-44/model_final.pt

# Play the policy
uv run play --task yam_lift_cube_vision --checkpoint path/to/model_final.pt
```
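
The CLI download in Option 2 can also be done from Python with `huggingface_hub.hf_hub_download`. This is a sketch under one assumption: that the other files listed under Files sit in the same run directory as `model_final.pt`. `run_file` and `download_run_file` are hypothetical helper names.

```python
REPO_ID = "robomotic/mjlab-policies"
TASK = "yam_lift_cube_vision"
RUN = "2026-05-03_08-02-44"


def run_file(filename, task=TASK, run=RUN):
    """Return the repo-relative path of a file belonging to this run."""
    return f"{task}/{run}/{filename}"


def download_run_file(filename):
    """Download one file from the run and return its local cache path.

    Assumes the file lives next to model_final.pt in the repository.
    """
    from huggingface_hub import hf_hub_download  # needs `pip install huggingface_hub`

    return hf_hub_download(repo_id=REPO_ID, filename=run_file(filename))
```

For example, `download_run_file("model_final.pt")` returns a local path that can be passed to `--checkpoint` in the play command above.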