Omega-QVLA: pi0.5 LIBERO W4A4 quantization packs
W4A4 quantization packs for the pi0.5 (openpi) LIBERO action policies, produced by Omega-QVLA.
Recipe (per side)
| side | rotation | quantizer | per-step scales |
|---|---|---|---|
| PaliGemma backbone | DuQuant svd_hadamard | GPTQ | single bucket |
| Gemma Expert (action head) | DuQuant svd_hadamard | RTN residual | yes (act_scale_table, 10 steps) |
On both sides, weights are quantized to 4 bits (W4) and activations to 4 bits (A4).
The DuQuant rotation is stored in compact block form
(duquant_rotation_blocks + duquant_rotation_perm).
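To make the "per-step" column concrete, here is a minimal sketch of symmetric 4-bit round-to-nearest fake quantization driven by a per-step scale table. It is an illustration only, not Omega-QVLA's implementation: the helper names, the flat list layout of `act_scale_table`, the scale values, and the choice of a symmetric signed range are all assumptions.

```python
from typing import List, Sequence

# Signed 4-bit integer range (assumption: symmetric signed codes).
QMIN, QMAX = -8, 7


def fake_quant_a4(x: Sequence[float], scale: float) -> List[float]:
    """Quantize to 4-bit codes with one scale, then dequantize."""
    out = []
    for v in x:
        q = round(v / scale)         # nearest integer code (Python rounds ties to even)
        q = max(QMIN, min(QMAX, q))  # clamp into the 4-bit range
        out.append(q * scale)        # back to the real-valued domain
    return out


def quantize_for_step(x: Sequence[float],
                      act_scale_table: Sequence[float],
                      step: int) -> List[float]:
    """Pick the scale stored for this step, then fake-quantize."""
    return fake_quant_a4(x, act_scale_table[step])


# Hypothetical table with one scale per step (10 steps, made-up values).
act_scale_table = [0.5] * 5 + [0.25] * 5
activations = [0.3, -1.2, 5.0, -9.0]

for step in (0, 9):
    print(step, quantize_for_step(activations, act_scale_table, step))
```

A "single bucket" side would correspond to a table with one entry that is used for every step.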
What these files are
These are quantization packs, not standalone models. Each quantized.pt is a
dict keyed by layer name; every record holds the quantized weight
(weight_res_q / baseline_q), the block-form rotation, and (Expert side only) the
per-step act_scale_table. At inference time, Omega-QVLA's GptqLinear loads them
on top of the original pi0.5 FP checkpoint. They cannot be loaded directly with
from_pretrained.
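A quick way to sanity-check a pack is to count its records per side. The sketch below assumes each record is a plain mapping whose keys include the field names listed above, and that layer names contain the `paligemma.model.language_model` or `gemma_expert.model` substrings used by the include pattern in Usage. The helpers `side_of`, `summarize_pack`, and `load_pack` are hypothetical and not part of Omega-QVLA.

```python
from collections import Counter
from typing import Any, Dict, Mapping


def side_of(layer_name: str) -> str:
    """Classify a layer name by side (assumed naming convention)."""
    if "gemma_expert.model" in layer_name:
        return "expert"
    if "paligemma.model.language_model" in layer_name:
        return "paligemma"
    return "other"


def summarize_pack(pack: Mapping[str, Mapping[str, Any]]) -> Dict[str, Any]:
    """Count records overall, per side, and with a per-step scale table."""
    per_side = Counter(side_of(name) for name in pack)
    with_table = sum(1 for rec in pack.values() if "act_scale_table" in rec)
    return {
        "records": len(pack),
        "per_side": dict(per_side),
        "with_act_scale_table": with_table,
    }


def load_pack(path: str) -> Mapping[str, Mapping[str, Any]]:
    """Load a pack on CPU. torch is imported lazily so the helpers above
    stay usable without it. Depending on the PyTorch version and the pack
    contents, torch.load may need weights_only=False; only do that for
    files you trust."""
    import torch
    return torch.load(path, map_location="cpu")


# Tiny stand-in pack with made-up contents, just to show the output shape.
demo = {
    "paligemma_with_expert.paligemma.model.language_model.layers.0.self_attn.q_proj":
        {"baseline_q": None},
    "paligemma_with_expert.gemma_expert.model.layers.0.mlp.up_proj":
        {"weight_res_q": None, "act_scale_table": [1.0] * 10},
}
print(summarize_pack(demo))
```

For a real file, `summarize_pack(load_pack("/path/to/pi05_object/quantized.pt"))` should report 252 records if the assumptions about the layout hold.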
| file | suite | records |
|---|---|---|
| pi05_object/quantized.pt | libero_object | 252 (126 PaliGemma + 126 Expert) |
| pi05_spatial/quantized.pt | libero_spatial | 252 |
| pi05_goal/quantized.pt | libero_goal | 252 |
| pi05_long/quantized.pt | libero_10 | 252 |
Usage
# 1. Get the repo + the original pi0.5 PyTorch checkpoint
git clone https://github.com/UCMP13753/Omega-QVLA && cd Omega-QVLA
SUITE=object
PACK=/path/to/pi05_${SUITE}/quantized.pt
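# Regex selecting the attention and MLP projections of both the PaliGemma language model and the Gemma Expert
INCLUDE_BOTH='.*paligemma_with_expert\.(paligemma\.model\.language_model|gemma_expert\.model)\.layers\.[0-9]+\..*\.(q_proj|k_proj|v_proj|o_proj|gate_proj|up_proj|down_proj).*'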
env CONDA_ROOT=$HOME/miniconda3 METHOD=gptq SUITE=$SUITE WBITS=4 ABITS=4 \
GPU_LIST=0,1 PORT_BASE=8600 NUM_TRIALS_PER_TASK=10 GR00T_EVAL_INIT_OFFSET=10 \
OPENPI_ROOT=$HOME/openpi OPENPI_PY=$HOME/openpi/.venv/bin/python \
OPENPI_CONFIG=pi05_libero OPENPI_CHECKPOINT=/path/to/pi05_libero_pytorch \
OPENPI_GPTQ_PATH="$PACK" OPENPI_GPTQ_INCLUDE="$INCLUDE_BOTH" \
OUTPUT_ROOT=results/eval/pi05_${SUITE} \
bash scripts/run_pi05_libero_benchmark.sh
Both sides load through the GPTQ pack path (GptqLinear); the block-form rotation
and per-step scales are applied automatically.
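The include pattern can be checked offline before launching a run. The sketch below applies the same regex to hypothetical module names. The `self_attn` / `mlp` sub-module names and the 18 layers per side are assumptions chosen to be consistent with the 126 records per side (18 layers x 7 projections), and `re.search` is an assumption about how the pattern is applied.

```python
import re

# Same pattern as INCLUDE_BOTH in the usage snippet.
INCLUDE_BOTH = (
    r'.*paligemma_with_expert\.(paligemma\.model\.language_model|gemma_expert\.model)'
    r'\.layers\.[0-9]+\..*\.(q_proj|k_proj|v_proj|o_proj|gate_proj|up_proj|down_proj).*'
)
pattern = re.compile(INCLUDE_BOTH)

# Hypothetical module layout: 4 attention + 3 MLP projections per layer.
ATTN = ["q_proj", "k_proj", "v_proj", "o_proj"]
MLP = ["gate_proj", "up_proj", "down_proj"]
SIDES = ["paligemma.model.language_model", "gemma_expert.model"]


def candidate_names(num_layers: int = 18):
    """Generate the linear-layer names the pattern is expected to select."""
    names = []
    for side in SIDES:
        for i in range(num_layers):
            base = f"paligemma_with_expert.{side}.layers.{i}"
            names += [f"{base}.self_attn.{p}" for p in ATTN]
            names += [f"{base}.mlp.{p}" for p in MLP]
    return names


matched = [n for n in candidate_names() if pattern.search(n)]
print(len(matched))

# Names outside the two decoder stacks should not be selected.
excluded = [
    "paligemma_with_expert.paligemma.model.vision_tower.vision_model.encoder.layers.0.self_attn.q_proj",
    "paligemma_with_expert.gemma_expert.model.norm",
]
print([bool(pattern.search(n)) for n in excluded])
```

Running the same check against the real module names of your checkpoint (for example from `model.named_modules()`) shows which layers a given pattern would cover.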