Qwen3.5-4B Capability Vector v2 — same-model contrast (2026-05-14)
STATUS: null result in aggregate, with a new methodological finding. v2 replicates v1's lack of behavioural lift, but with a cleaner mathematical signal (margin/ambient-norm ratio at L22 of 0.29 vs 0.17, ~1.7× v1). The same-model contrast removes the model-identity confound behind v1's "AUC=1.0 must mean capability" claim. Behaviourally, the sprint pass rate is 1/5 at α=4 (same as base) and 0/5 at α=2. Across 9 sweeps and ~140 docker runs in this run dir, there is no aggregate lift. This confirms that the direction encodes output style (a parse_fail ↔ no_cmd trade-off), not task-solving capability.
What changed from v1
| dimension | v1 | v2 |
|---|---|---|
| positives | 5 SFT-pass traces | 6 SFT/RIFT-pass traces (reuses v1) |
| negatives | 12 traces from different LoRAs (base, cp600, dpo) | 20 traces from same SFT LoRA with reward=0 + parse_fail=0 + steps≥15 |
| confound | model identity baked into direction | same-model: only outcome varies |
| AUC at L22 | 1.000 | 1.000 |
| margin / ambient norm at L22 | 0.17 | 0.29 (~1.7× cleaner) |
How to use
import torch
from transformers import AutoTokenizer, AutoModelForImageTextToText
from huggingface_hub import hf_hub_download
tok = AutoTokenizer.from_pretrained('Qwen/Qwen3.5-4B')
model = AutoModelForImageTextToText.from_pretrained(
'Qwen/Qwen3.5-4B', dtype=torch.bfloat16, device_map={'':0})
vec_path = hf_hub_download('AlexWortega/qwen3.5-4b-capvec-v2-samemodel-20260514', 'vectors/dir.pt')
vec = torch.load(vec_path, weights_only=False)
# See vectors/ranking.csv for AUC-ordered layer list. L=22 chosen for cross-comparison with v1.
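The snippet above loads the vector file but stops before applying a direction. As a minimal sketch of activation steering, the block below adds α times a unit-normalised direction to a layer's output through a PyTorch forward hook. Several things here are assumptions rather than facts from this card: the helper name `make_steering_hook` is hypothetical, normalising the direction before scaling is one common choice and may not match how the sweeps were run, and a toy `nn.Linear` stands in for a real decoder layer. The internal layout of `vectors/dir.pt` is not assumed; inspect the loaded object before indexing into it.

```python
import torch
from torch import nn


def make_steering_hook(direction: torch.Tensor, alpha: float):
    """Return a forward hook that adds alpha * unit(direction) to a layer's output.

    Hypothetical helper for illustration; not taken from the repo's scripts/.
    """
    unit = direction / direction.norm()

    def hook(module, inputs, output):
        # Decoder layers often return a tuple whose first item is the hidden state.
        if isinstance(output, tuple):
            hidden = output[0] + alpha * unit.to(output[0].dtype)
            return (hidden,) + tuple(output[1:])
        return output + alpha * unit.to(output.dtype)

    return hook


# Toy stand-in for one decoder layer (assumption: NOT the real Qwen module).
torch.manual_seed(0)
layer = nn.Linear(8, 8)
direction = torch.randn(8)   # stand-in for one direction from dir.pt
x = torch.randn(2, 3, 8)     # (batch, sequence, hidden)

baseline = layer(x)
handle = layer.register_forward_hook(make_steering_hook(direction, alpha=4.0))
steered = layer(x)
handle.remove()              # always remove the hook once steering is done

# Every position is shifted by the same vector, whose norm equals alpha.
shift_norm = (steered - baseline)[0, 0].norm().item()
print(round(shift_norm, 4))
```

With a real model, the hook would be registered on the chosen decoder layer (L22 in this card's comparison) and removed after generation, so that baseline and steered runs share the same weights.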
Behavioural results
Multi-task (3 configs × 6 tasks):
| config | pass / 6 | parse_fail/run | no_cmd/run |
|---|---|---|---|
| baseline | 2 | 2.2 | 1.2 |
| steered-L22-α4 | 2 | 1.3 ↓ | 2.7 ↑ |
| steered-L30-α4 | 1 | 1.8 | 0.3 |
At L22-α4, steering trades parse_fail for no_cmd: format compliance improves, action emission degrades, and the net pass rate is unchanged.
Across all sweeps, Fisher's exact test on pass rate gives p > 0.5 versus base.
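To make the significance claim concrete, here is a small pure-Python sketch of a two-sided Fisher exact test applied to the multi-task table above. It assumes one trial per task, so the counts are 2/6 vs 2/6 for L22 and 1/6 vs 2/6 for L30. The card does not say which variant of the test was used, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1 = a + b
    col1 = a + c
    total = a + b + c + d
    denom = comb(total, row1)

    def prob(x: int) -> float:
        return comb(col1, x) * comb(total - col1, row1 - x) / denom

    p_obs = prob(a)
    lo = max(0, row1 - (total - col1))
    hi = min(row1, col1)
    # Small tolerance so that ties with the observed table are included.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Rows: steered, baseline. Columns: pass, fail (assumed one trial per task).
p_l22 = fisher_exact_two_sided(2, 4, 2, 4)   # steered-L22-α4 vs baseline
p_l30 = fisher_exact_two_sided(1, 5, 2, 4)   # steered-L30-α4 vs baseline
print(round(p_l22, 6), round(p_l30, 6))      # 1.0 1.0
```

At six trials per configuration, even a one-task difference is indistinguishable from chance, which is why only a much larger effect or many more runs could reach significance.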
Key files
- vectors/dir.pt: 32 directions, AUC=1.0 on L19–L31
- vectors/ranking.csv: full AUC ranking
- RESULTS.md, RESULTS_FINAL.md: honest write-up
- scripts/: collect, capture, compute, serve, sweep_eval (sgang-compatible)
- results/*/master_summary.csv: per-task trial data
Caveats
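- n=26 contrast traces (6 positive, 20 negative). AUC=1.0 is plausible at this sample size but does not translate into behavioural lift.
- Cosine similarity between the v2 and v1 directions at L22 is ≈ 0.5, so the two only partially overlap.
- The α-grid sweep confirmed that the model breaks at higher α: at α=8 it averages 1.4 steps before bailing with "done"; at α=6 it stays within the step budget but fails the task.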
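The first caveat can be illustrated with synthetic data. The sketch below draws 6 "positive" and 20 "negative" vectors from the same Gaussian, so there is no real signal, then fits a difference-of-means direction and scores the same vectors with it. It assumes the direction is a difference of means and that AUC is measured on the traces used to fit it; the card states neither, and `dim = 1024` is an arbitrary illustrative size, not the model's hidden size.

```python
import random


def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


def mean_vector(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sample(rng, count, dim):
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
dim = 1024  # illustrative only

# Both classes come from the SAME distribution: any separation is an artefact.
pos, neg = sample(rng, 6, dim), sample(rng, 20, dim)
direction = [p - n for p, n in zip(mean_vector(pos), mean_vector(neg))]

in_sample_auc = auc([dot(x, direction) for x in pos],
                    [dot(x, direction) for x in neg])

# Fresh draws from the same distribution, scored with the fitted direction.
new_pos, new_neg = sample(rng, 6, dim), sample(rng, 20, dim)
held_out_auc = auc([dot(x, direction) for x in new_pos],
                   [dot(x, direction) for x in new_neg])

print(in_sample_auc, held_out_auc)
```

In this setup the in-sample AUC comes out at 1.0 even though the labels carry no information, while the held-out value fluctuates around chance. Perfect separation on 26 high-dimensional traces is therefore weak evidence on its own.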