OpenArm RL best policy
Best PPO teacher policy chain from the OpenArm MuJoCo cube-in-box campaign.
This is not a single monolithic checkpoint. The best teacher is a chained controller:
- Grasp leg: `checkpoints/grasp_hover_v3_vm_ppo_1000000_steps.zip`
- Place leg: `checkpoints/place_fixed_v5_ppo_1000000_steps.zip`
- Runtime gate/controller: `code/eval_chained_gated.py`
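Because the teacher is split across two archives, both legs have to be fetched and loaded separately. The following is a minimal sketch, assuming the two files sit at the paths listed above inside the Hub repo and that each is a standard stable-baselines3 PPO archive; the helper name `load_chain` is hypothetical and is not part of this repository.

```python
REPO_ID = "qualia-robotics/openarm-rl-best-policy"
GRASP_FILE = "checkpoints/grasp_hover_v3_vm_ppo_1000000_steps.zip"
PLACE_FILE = "checkpoints/place_fixed_v5_ppo_1000000_steps.zip"


def load_chain(repo_id=REPO_ID):
    """Download both legs and return (grasp_policy, place_policy).

    Hypothetical helper: assumes both archives load with PPO.load as-is.
    """
    # Imported lazily so the constants above can be inspected
    # without huggingface_sb3 or stable_baselines3 installed.
    from huggingface_sb3 import load_from_hub
    from stable_baselines3 import PPO

    # load_from_hub returns the local path of the downloaded file.
    grasp_path = load_from_hub(repo_id=repo_id, filename=GRASP_FILE)
    place_path = load_from_hub(repo_id=repo_id, filename=PLACE_FILE)
    return PPO.load(grasp_path), PPO.load(place_path)
```

Calling `load_chain()` should return two PPO models, but neither is useful alone: the switching and release logic lives in `code/eval_chained_gated.py`. Recent huggingface_sb3 releases may refuse to download pickled `.zip` files unless you opt in through an environment variable, so check that library's README if the download call raises.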
Final validated numbers, from BEST_POLICY.md:
- In-box: 43/50 = 86.0%
- Gentle: 42/50 = 84.0%
- Eval seeds: 1000-1049
- Handover: switch after 10 consecutive grasped/lifted steps (`STREAK=10`, `ZTHR=0.52`)
- Release gate: keep the gripper closed until the cube is above the box footprint and its horizontal speed is low
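The handover and release rules above can be sketched as two small pieces of logic. This is an illustration only, not the code in `code/eval_chained_gated.py`: it assumes `ZTHR` is a cube-height threshold that defines "lifted", and the names `HandoverGate`, `may_release`, `box_half_extents`, and the `max_speed` default are hypothetical.

```python
import math
from dataclasses import dataclass

STREAK = 10   # from the model card
ZTHR = 0.52   # from the model card; assumed here to be a cube-height threshold


@dataclass
class HandoverGate:
    """Latches to the place leg after enough consecutive grasped/lifted steps."""
    streak_needed: int = STREAK
    z_threshold: float = ZTHR
    streak: int = 0
    switched: bool = False

    def update(self, grasped: bool, cube_z: float) -> bool:
        """Return True once the place policy should be in control."""
        if self.switched:
            return True  # the switch is one-way in this sketch
        if grasped and cube_z > self.z_threshold:
            self.streak += 1
        else:
            self.streak = 0  # any miss resets the run
        if self.streak >= self.streak_needed:
            self.switched = True
        return self.switched


def may_release(cube_xy, cube_vxy, box_center_xy, box_half_extents,
                max_speed=0.05):
    """True when the cube is over the box footprint and moving slowly.

    max_speed is a hypothetical value, not taken from the source.
    """
    dx = abs(cube_xy[0] - box_center_xy[0])
    dy = abs(cube_xy[1] - box_center_xy[1])
    over_box = dx <= box_half_extents[0] and dy <= box_half_extents[1]
    slow = math.hypot(cube_vxy[0], cube_vxy[1]) <= max_speed
    return over_box and slow
```

In a control loop, the grasp policy would act until `HandoverGate.update` returns `True`, after which the place policy acts and any gripper-open command is overridden while `may_release` is `False`. Requiring a streak rather than a single step guards against switching on a momentary, unstable grasp.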
See BEST_POLICY.md for the exact reproduction command, known failures, and history.
Source working tree at upload time: /home/nvidia/.openclaw/workspace/projects/openarmmujoco.