Qwen3.5-4B Capability Vector v2 — same-model contrast (2026-05-14)

STATUS: null result on aggregate, with a new methodological finding. v2 replicates v1's null behavioural lift, but with a cleaner mathematical signal: the margin / ambient-norm ratio at L22 is 0.29 against v1's 0.17, about 1.7× higher. The same-model contrast removes the model-identity confound behind v1's "AUC=1.0 must mean capability" claim. Behaviourally, the sprint pass rate is 1/5 at α=4 (same as base) and 0/5 at α=2. Across 9 sweeps and ~140 Docker runs in this run directory there is no aggregate lift. This is consistent with the direction encoding output style (a parse_fail ↔ no_cmd trade-off), not task-solving capability.

What changed from v1

dimension	v1	v2
positives	5 SFT-pass traces	6 SFT/RIFT-pass traces (reuses v1)
negatives	12 traces from different LoRAs (base, cp600, dpo)	20 traces from same SFT LoRA with reward=0 + parse_fail=0 + steps≥15
confound	model identity baked into direction	same-model: only outcome varies
AUC at L22	1.000	1.000
margin / ambient norm at L22	0.17	0.29 (~1.7× cleaner)
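
Both versions reach AUC = 1.000 at L22 with only a handful of traces, so it helps to see how little that number can mean on its own. The sketch below is an illustration, not the author's pipeline. It assumes the direction is a difference of class means and that AUC is measured on the same traces used to fit it; the card states neither. It fits such a direction on pure Gaussian noise with the v2 class sizes (6 positives, 20 negatives) and a hypothetical hidden size of 512, then scores a fresh noise sample. All function names are made up for the example.

```python
import random

def mean_diff_direction(pos, neg):
    """Difference of class means, one value per hidden dimension."""
    dim = len(pos[0])
    mu_pos = [sum(v[i] for v in pos) / len(pos) for i in range(dim)]
    mu_neg = [sum(v[i] for v in neg) / len(neg) for i in range(dim)]
    return [p - n for p, n in zip(mu_pos, mu_neg)]

def project(vectors, direction):
    """Dot product of each vector with the direction."""
    return [sum(a * b for a, b in zip(v, direction)) for v in vectors]

def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

rng = random.Random(0)
DIM = 512  # hypothetical hidden size, chosen only for the illustration

def noise(n):
    """n vectors of i.i.d. standard-normal noise: no real class signal at all."""
    return [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(n)]

# Fit on noise with the same class sizes as the v2 contrast set.
fit_pos, fit_neg = noise(6), noise(20)
direction = mean_diff_direction(fit_pos, fit_neg)
in_sample_auc = auc(project(fit_pos, direction), project(fit_neg, direction))

# Score fresh noise that the direction has never seen.
new_pos, new_neg = noise(6), noise(20)
held_out_auc = auc(project(new_pos, direction), project(new_neg, direction))

print(f"in-sample AUC: {in_sample_auc:.3f}, held-out AUC: {held_out_auc:.3f}")
```

With 26 samples in a few hundred dimensions, the in-sample AUC should land at or very near 1.0 even though the labels are meaningless, while the held-out value carries no such guarantee. A held-out AUC or a behavioural test is the more informative check.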

How to use

import torch
from transformers import AutoTokenizer, AutoModelForImageTextToText
from huggingface_hub import hf_hub_download

# Load the tokenizer and the model in bfloat16 on GPU 0.
tok = AutoTokenizer.from_pretrained('Qwen/Qwen3.5-4B')
model = AutoModelForImageTextToText.from_pretrained(
    'Qwen/Qwen3.5-4B', dtype=torch.bfloat16, device_map={'':0})

# Download the direction file from the Hub.
vec_path = hf_hub_download('AlexWortega/qwen3.5-4b-capvec-v2-samemodel-20260514', 'vectors/dir.pt')
# weights_only=False unpickles arbitrary Python objects, so only load files you trust.
vec = torch.load(vec_path, weights_only=False)
# See vectors/ranking.csv for the AUC-ordered layer list. Layer 22 (L22) was chosen for cross-comparison with v1.
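
The snippet above stops at loading the vector. One common way to apply a steering direction is a forward hook that adds a scaled copy of it to a layer's output. The sketch below shows that mechanism on a toy `nn.Linear` layer so that it runs without the 4B model. It is an assumption-laden illustration: the card does not document the layout of `dir.pt`, whether the direction is normalised before scaling, or how α is defined, and `add_steering_hook` is a hypothetical helper, not part of this repository.

```python
import torch
from torch import nn

def add_steering_hook(layer: nn.Module, direction: torch.Tensor, alpha: float):
    """Register a forward hook that adds alpha * unit(direction) to the layer output.

    Normalising to unit length is an assumption made for this sketch.
    """
    unit = direction / direction.norm()

    def hook(module, inputs, output):
        # Some layers return a tuple whose first item is the hidden state.
        if isinstance(output, tuple):
            hidden = output[0]
            shift = alpha * unit.to(device=hidden.device, dtype=hidden.dtype)
            return (hidden + shift,) + tuple(output[1:])
        shift = alpha * unit.to(device=output.device, dtype=output.dtype)
        return output + shift

    # Returning a value from a forward hook replaces the layer's output.
    return layer.register_forward_hook(hook)

# Toy stand-ins: a single linear layer and a random direction (not the real vector).
torch.manual_seed(0)
hidden_size = 16
toy_layer = nn.Linear(hidden_size, hidden_size)
direction = torch.randn(hidden_size)
x = torch.randn(2, 5, hidden_size)  # (batch, sequence, hidden)

with torch.no_grad():
    baseline = toy_layer(x)
    handle = add_steering_hook(toy_layer, direction, alpha=4.0)
    steered = toy_layer(x)
    handle.remove()  # always remove the hook to restore unsteered behaviour
    restored = toy_layer(x)

# Every position is shifted by the same vector of norm alpha.
shift_norm = (steered - baseline)[0, 0].norm().item()
print(f"shift norm: {shift_norm:.3f}")
```

On the real model the hook would be registered on the chosen decoder layer instead of `toy_layer`; the attribute path to that layer depends on the model class and is not specified in this card.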

Behavioural results

Multi-task (3 configs × 6 tasks):

config	pass / 6	parse_fail/run	no_cmd/run
baseline	2	2.2	1.2
steered-L22-α4	2	1.3 ↓	2.7 ↑
steered-L30-α4	1	1.8	0.3

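At L22, steering trades parse_fail for no_cmd: format compliance improves (parse_fail 2.2 → 1.3 per run), action emission degrades (no_cmd 1.2 → 2.7 per run), and the pass rate stays at 2/6. At L30 the pattern differs: no_cmd falls to 0.3 per run, but the pass rate drops to 1/6.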

Across all sweeps, Fisher's exact test on pass rate gives p > 0.5 against the baseline.
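
To make the significance claim concrete, the following is a minimal pure-Python sketch of a two-sided Fisher's exact test, applied to the pass counts quoted in this card (sprint: 1/5 steered at α=4 against 1/5 base; multi-task: 1/6 at L30 against 2/6 baseline). The card does not say which implementation produced its p-values, and `fisher_exact_two_sided` is a name chosen for this example.

```python
from math import comb

def fisher_exact_two_sided(pass_a: int, fail_a: int, pass_b: int, fail_b: int) -> float:
    """Two-sided Fisher's exact test for a 2x2 table of pass/fail counts.

    Sums the hypergeometric probability of every table with the same margins
    that is no more likely than the observed one.
    """
    n_a, n_b = pass_a + fail_a, pass_b + fail_b
    total_pass = pass_a + pass_b
    lo = max(0, total_pass - n_b)
    hi = min(n_a, total_pass)

    def weight(x: int) -> int:
        # Unnormalised probability of x passes in group A, kept as an exact integer.
        return comb(n_a, x) * comb(n_b, total_pass - x)

    observed = weight(pass_a)
    extreme = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= observed)
    return extreme / comb(n_a + n_b, total_pass)

# Sprint: steered at alpha=4 passes 1 of 5, base passes 1 of 5.
p_sprint = fisher_exact_two_sided(1, 4, 1, 4)
# Multi-task: steered at L30 passes 1 of 6, baseline passes 2 of 6.
p_multi = fisher_exact_two_sided(1, 5, 2, 4)

print(f"sprint p = {p_sprint:.3f}, multi-task p = {p_multi:.3f}")
```

With samples this small, no difference of one pass can be distinguished from chance, which matches the reported p > 0.5.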

Key files

vectors/dir.pt — 32 directions, AUC=1.0 on L19–L31
vectors/ranking.csv — full AUC ranking
RESULTS.md, RESULTS_FINAL.md — honest write-up
scripts/ — collect, capture, compute, serve, sweep_eval (sgang-compatible)
results/*/master_summary.csv — per-task trial data

Caveats

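n=26 contrast traces (6 positive, 20 negative). With so few traces, AUC=1.0 is plausible, but it is unrelated to behavioural lift.
The cosine similarity between the v2 and v1 directions at L22 is ≈ 0.5, so the two directions overlap only partially.
The α-grid sweep confirmed that the model breaks at higher α: at α=8 it averages 1.4 steps before bailing out with "done"; at α=6 it stays within the step budget but fails the task.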
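
As a worked reading of the cosine figure above: a cosine of 0.5 corresponds to an angle of 60°, and projecting one unit direction onto the other recovers cos² = 25% of its squared norm. The sketch below checks this arithmetic on hypothetical 2-D stand-ins built to have cosine 0.5; they are not the real v1 and v2 vectors.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical stand-ins, constructed so that their cosine is 0.5.
v1_dir = [1.0, 0.0]
v2_dir = [0.5, math.sqrt(3) / 2]

cos = cosine(v1_dir, v2_dir)
# Clamp before acos to guard against floating-point overshoot.
angle_deg = math.degrees(math.acos(max(-1.0, min(1.0, cos))))
shared_fraction = cos ** 2  # share of squared norm captured by projection

print(f"cos = {cos:.3f}, angle = {angle_deg:.1f} deg, shared = {shared_fraction:.2f}")
```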

