---
library_name: transformers
base_model: Qwen/Qwen3.5-2B
tags:
  - vla
  - robotics
  - vision-language-action
  - qwen3.5
license: cc-by-nc-4.0
---

# Qwen3.5-2B-LoRA-LAP-UR5e-PyAV

[VLA-0](https://github.com/omron-sinicx/vla0) checkpoint: **Qwen/Qwen3.5-2B** fine-tuned with **LoRA** on **UR5e (RoboVerse)** data.

VLA-0 represents robot actions directly as text tokens, so the base VLM needs no architectural changes.
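As a rough illustration of what "actions as text" means on the consumer side, the sketch below parses a generated string of integers into per-timestep action bins. It assumes the output is a flat run of whitespace-separated integers, with `horizon=8` and `action_dim=7` taken from the training config in this card. `parse_action_text` is a hypothetical helper, not part of VLA-0; the real output format (including any reasoning text) is defined by the framework.

```python
import re


def parse_action_text(text, horizon=8, action_dim=7):
    """Split a string of integers into `horizon` rows of `action_dim` bins.

    Assumption: the text holds exactly horizon * action_dim integers.
    """
    values = [int(tok) for tok in re.findall(r"-?\d+", text)]
    expected = horizon * action_dim
    if len(values) != expected:
        raise ValueError(f"expected {expected} integers, got {len(values)}")
    # Group the flat list into one row per timestep
    return [values[i:i + action_dim] for i in range(0, expected, action_dim)]


# Usage with a synthetic string of 56 integers
rows = parse_action_text(" ".join(str(i) for i in range(56)))
print(len(rows), len(rows[0]))
```

Raising on a count mismatch keeps a malformed generation from being silently reshaped into the wrong timesteps.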

## Quick Start

### 1. Download

```bash
pip install huggingface_hub
huggingface-cli download denkiwakame/Qwen3.5-2B-LoRA-LAP-UR5e-PyAV --local-dir ./Qwen3.5-2B-LoRA-LAP-UR5e-PyAV
```

### 2. Load with PEFT + transformers

```python
import pickle
import torch
from peft import PeftModel
from transformers import AutoModelForImageTextToText, AutoProcessor

ckpt_dir = "./Qwen3.5-2B-LoRA-LAP-UR5e-PyAV"

# Load the base model, then attach the LoRA adapter.
# The Auto* classes resolve the architecture from the base model's config,
# so the code does not hard-code a Qwen2.5-VL class for a Qwen3.5 checkpoint.
base = AutoModelForImageTextToText.from_pretrained(
    "Qwen/Qwen3.5-2B", torch_dtype=torch.bfloat16, device_map="auto",
)
model = PeftModel.from_pretrained(base, f"{ckpt_dir}/model_final")
model.eval()

# If the adapter folder lacks processor files, load from "Qwen/Qwen3.5-2B" instead.
processor = AutoProcessor.from_pretrained(f"{ckpt_dir}/model_final")

# Load dataset stats (required for action denormalization)
with open(f"{ckpt_dir}/dataset_stats.pkl", "rb") as f:
    dataset_stats = pickle.load(f)
```

### 3. Load with VLA-0 framework

```python
from rv_train.train import get_pretrained_model

model, cfg = get_pretrained_model("./Qwen3.5-2B-LoRA-LAP-UR5e-PyAV", device=0)
model.eval()
```

## `dataset_stats.pkl`

Action normalization statistics computed from the training dataset.
Required at inference time to denormalize model outputs back to the original action space.

```python
import pickle

with open("dataset_stats.pkl", "rb") as f:
    stats = pickle.load(f)
# stats contains mean/std for action dimensions
```
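To make the denormalization step concrete, here is a minimal sketch that assumes the statistics are per-dimension mean and standard deviation, as the comment above states, and that the model output has already been converted to normalized floats. The function name and the argument layout are hypothetical; check the actual structure of `stats` and the VLA-0 code before relying on this.

```python
def denormalize(normalized, mean, std):
    """Invert z-score normalization per action dimension: x = z * std + mean."""
    if not (len(normalized) == len(mean) == len(std)):
        raise ValueError("normalized, mean and std must have the same length")
    return [z * s + m for z, m, s in zip(normalized, mean, std)]


# Usage with made-up numbers (not taken from dataset_stats.pkl)
action = denormalize([0.0, 1.0, -2.0], mean=[0.5, 0.0, 1.0], std=[2.0, 0.5, 0.25])
print(action)
```

If the pickle stores bounds such as min/max instead, the same idea applies with a different affine map.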

## Intermediate Checkpoints

`main` holds the recommended/final weights.
Earlier training-step snapshots are published as **branches** named `step-<global_step>` (e.g., `step-17000`, `step-18000`).
Load any of them by passing `revision=`:

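```python
from huggingface_hub import snapshot_download
from peft import PeftModel
from transformers import AutoModelForImageTextToText

# Download a specific revision. Shell equivalent:
#   huggingface-cli download denkiwakame/Qwen3.5-2B-LoRA-LAP-UR5e-PyAV \
#     --revision step-18000 --local-dir ./Qwen3.5-2B-LoRA-LAP-UR5e-PyAV-step-18000
ckpt_dir = snapshot_download(
    "denkiwakame/Qwen3.5-2B-LoRA-LAP-UR5e-PyAV",
    revision="step-18000",
    local_dir="./Qwen3.5-2B-LoRA-LAP-UR5e-PyAV-step-18000",
)

# The repository holds only the LoRA adapter, so attach it to the base model
base = AutoModelForImageTextToText.from_pretrained("Qwen/Qwen3.5-2B")
model = PeftModel.from_pretrained(base, f"{ckpt_dir}/model_final")
```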

See the [repository branches tab](https://huggingface.co/denkiwakame/Qwen3.5-2B-LoRA-LAP-UR5e-PyAV/refs) for the full list.
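The branch list can also be inspected programmatically. The sketch below filters branch names that follow the `step-<global_step>` convention and sorts them by step; `step_branches` is a hypothetical helper, and the commented usage relies on `huggingface_hub.list_repo_refs`, which needs network access.

```python
import re


def step_branches(branch_names):
    """Return (step, name) pairs for branches named step-<global_step>, sorted by step."""
    pattern = re.compile(r"^step-(\d+)$")
    found = []
    for name in branch_names:
        match = pattern.match(name)
        if match:
            found.append((int(match.group(1)), name))
    return sorted(found)


# Usage (requires network access):
# from huggingface_hub import list_repo_refs
# refs = list_repo_refs("denkiwakame/Qwen3.5-2B-LoRA-LAP-UR5e-PyAV")
# print(step_branches(branch.name for branch in refs.branches))
```

Sorting on the parsed integer avoids the lexicographic trap where `step-18000` would sort before `step-2000`.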

## Training Details

- **Base Model**: `Qwen/Qwen3.5-2B`
- **Method**: LoRA
- **Dataset**: UR5e (RoboVerse)
- **Framework**: [VLA-0](https://github.com/omron-sinicx/vla0)

<details>
<summary>Training Config</summary>

```yaml
DATALOADER:
  ROBOVERSE:
    cfg_opts: IMAGE.crop_img:0.9:IMAGE.img_size:224:IMAGE.cam_list:('3p1','wrist_right1')
    cfg_path: libs/RoboVerse/roboverse/configs/ur5e_cluttered_pick_3obj_120.yaml
  batch_size: 16
  num_workers: 8
EXP:
  AMP: true
  DATASET: roboverse
  EXP_ID: lap_qwen3_5_2b_fft_ur5e_cluttered_pick_3obj_120_lora
  LOSS: {}
  LR_SCHED: none
  MODEL: qwen
  OPTIMIZER: adamw
  SEED: 0
EXP_EXTRA:
  no_test: true
  no_track: true
  no_val: true
  save_at_steps:
  - 2000
  - 4000
  - 6000
  - 8000
  save_ckp: 0
  save_last_ckpt: true
  test_eval_freq: 1
  val_eval_freq: 1
LR_SCHED:
  lr_clip: 1.0e-08
  lr_decay_factor: 0.5
  lr_patience: 4
MODEL:
  QWEN:
    action_mask_aug_per: 0.4
    action_type: original
    add_vision_id: true
    attention_dropout: 0.0
    enable_thinking: true
    grad_checkpoint: false
    history: 1
    horizon: 8
    lap_action_is_absolute: true
    lap_emit_holds: false
    lap_rotation_precision: 1
    lap_sum_decimal: 1f
    lora_config: default
    lora_rank: 8
    num_bins_actions: 1000
    num_cam: 2
    original_action_dim: 7
    qwen_model_id: Qwen/Qwen3.5-2B
    reasoning: true
    rgb_img_size:
    - 224
    - 224
    rgb_input: true
    tiled_rgb_imgs: true
    use_flash_attention_2: true
    use_lora: true
    use_qlora: false
TRAIN:
  clip_grad_norm: 0.0
  l2: 1.0e-10
  lr: 1.0e-05
  num_epochs: 100
  num_iters: 10000
  save_iter_ckp: 2500
WANDB:
  enable: true
  entity: ''
  log_interval: 100
  mode: online
  project: vla0
  resume_id: ''
  run_name: ''
  tags: ''
```

</details>

## Files

| File | Description |
|------|-------------|
| `model_final/adapter_config.json` | PEFT adapter configuration (includes `base_model_name_or_path`) |
| `model_final/adapter_model.safetensors` | LoRA adapter weights |
| `model_final/tokenizer.json` | Tokenizer |
| `dataset_stats.pkl` | Action normalization statistics (required for inference) |
| `config.yaml` | Training configuration |
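After downloading, a quick sanity check of the layout can catch an incomplete download before loading anything. This sketch takes the file names from the table above; `missing_files` is a hypothetical helper and the list may not be exhaustive.

```python
from pathlib import Path

# File list copied from the table above
EXPECTED_FILES = [
    "model_final/adapter_config.json",
    "model_final/adapter_model.safetensors",
    "model_final/tokenizer.json",
    "dataset_stats.pkl",
    "config.yaml",
]


def missing_files(ckpt_dir):
    """Return the expected files that are absent from ckpt_dir."""
    root = Path(ckpt_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


# Usage: an empty list means every listed file is present
print(missing_files("./Qwen3.5-2B-LoRA-LAP-UR5e-PyAV"))
```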

## License

CC-BY-NC-4.0 (following the upstream VLA-0 license).
The base model is additionally subject to the [Qwen License](https://huggingface.co/Qwen/Qwen3.5-2B).