Commit d66b7b6 by mohan007 · verified · 1 parent: 8df43a9

Add README

Files changed (1)
  1. README.md +50 -0
README.md ADDED
@@ -0,0 +1,50 @@
+ ---
+ license: other
+ tags:
+ - mlx
+ - robotics
+ - pi0.5
+ - quantized
+ - apple-silicon
+ ---
+
+ # pi0.5: 4-bit Quantized MLX Weights
+
+ 4-bit quantized weights for [lerobot/pi05_base](https://huggingface.co/lerobot/pi05_base), converted to Apple MLX format.
+
+ Runs on **Apple Silicon (M1/M2/M3)** with ~2.4 GB RAM. Loads in ~6 s; inference takes ~2 s per action chunk.
+
+ ## Architecture
+ - PaliGemma 2B VLM (SigLIP + Gemma 2B) + Gemma 300M action expert
+ - Flow-matching policy: 10-step Forward Euler denoising
+ - Output: action chunk [B, 50, 32]
+
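The "10-step Forward Euler denoising" bullet above can be illustrated with a small, dependency-free sketch. It is not the code from `mlx_pi05`: the function `euler_sample`, the toy velocity field, and the choice to integrate from t=0 to t=1 are all assumptions made for illustration (the real model may use the opposite time direction, and its velocity comes from the action expert network).

```python
from typing import Callable, List

Vector = List[float]


def euler_sample(
    velocity: Callable[[Vector, float], Vector],
    x0: Vector,
    num_steps: int = 10,
) -> Vector:
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with Forward Euler."""
    dt = 1.0 / num_steps
    x = list(x0)
    for step in range(num_steps):
        t = step * dt
        v = velocity(x, t)
        # One Euler update: move along the predicted velocity for a time dt.
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Hypothetical stand-in for the learned network: a constant velocity that
# points from the initial noise sample straight to a fixed target.
noise = [0.5, -1.0, 2.0]
target = [1.0, 0.0, -1.0]


def toy_velocity(x: Vector, t: float) -> Vector:
    return [b - a for a, b in zip(noise, target)]


result = euler_sample(toy_velocity, noise, num_steps=10)
```

With a constant velocity the Euler steps are exact, so `result` lands on `target` up to floating-point error. With a learned, state-dependent velocity the step count trades accuracy against inference time.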
+ ## Usage
+
+ ```python
+ import mlx.core as mx
+ import numpy as np
+ from huggingface_hub import hf_hub_download
+ from mlx_pi05.load import load_model
+
+ # Download quantized weights (~2.6 GB, one-time; cached afterwards)
+ npz_path = hf_hub_download("mohan007/pi05-mlx-4bit", "pi05_mlx_4bit.npz")
+
+ # Load with mlx_pi05
+ model = load_model(quantized_path=npz_path, quantize=True)
+ model.eval()
+
+ # Run inference on placeholder inputs: one blank 224x224 image and five dummy token IDs
+ image_mlx = mx.array(np.zeros((1, 3, 224, 224), dtype=np.float32))
+ lang_mlx = mx.array(np.array([[1, 2, 3, 4, 5]], dtype=np.int32))
+ actions = model.sample_actions(image_mlx, lang_mlx)  # [1, 50, 32]
+ ```
+
+ ## Quantization
+ - Gemma 2B + expert layers: 4-bit (group_size=64)
+ - SigLIP kept in float16 (fc2 input dim 4304 is not divisible by 64)
+ - Total: ~2.4 GB vs ~7.2 GB in float16
+
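The divisibility constraint in the Quantization list can be checked directly, and a generic group-wise scheme shows why it exists: each group of 64 weights shares one scale and one offset, so a dimension must split evenly into groups. The sketch below is a plain-Python illustration of affine 4-bit quantization, not MLX's implementation. The helper names are hypothetical, and 2048 is used only as an example of a dimension that divides evenly.

```python
from typing import List, Tuple


def can_quantize(input_dim: int, group_size: int = 64) -> bool:
    """A dimension can be group-quantized only if it splits into whole groups."""
    return input_dim % group_size == 0


def quantize_group(values: List[float], bits: int = 4) -> Tuple[List[int], float, float]:
    """Map one group of floats to integer codes in [0, 2**bits - 1]."""
    levels = 2**bits - 1
    lo, hi = min(values), max(values)
    # Guard against a constant group, where hi == lo.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize_group(codes: List[int], scale: float, lo: float) -> List[float]:
    """Recover approximate floats from the codes and the group's scale/offset."""
    return [c * scale + lo for c in codes]


siglip_fc2_ok = can_quantize(4304)   # 4304 % 64 == 16, so this is False
example_ok = can_quantize(2048)      # 2048 == 32 * 64, so this is True

group = [i / 63 for i in range(64)]  # one toy group of 64 weights
codes, scale, lo = quantize_group(group)
restored = dequantize_group(codes, scale, lo)
max_error = max(abs(a - b) for a, b in zip(group, restored))
```

In this scheme the reconstruction error per weight is bounded by half of the group's `scale`, which is why smaller groups (with more scales to store) generally give higher fidelity.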
+ ## Source
+ Converted from the original float32 safetensors of [lerobot/pi05_base](https://huggingface.co/lerobot/pi05_base).