ModuleWarden Auditor - Qwen3.6-27B LoRA (v2, verdict-calling)

A LoRA adapter that turns the abliterated Qwen3.6-27B into the auditor for ModuleWarden, an auditable npm supply-chain submission gate. It reads an audit dossier (a structured diff between two package versions) and writes an evidence-cited modulewarden.audit_report.v1: the verdict, the capability deltas that drove it, and a developer-facing summary.

What changed from v1

v1 was a narrator. It wrote the report in the right schema, but its verdicts collapsed to always-quarantine because the training set had no allow examples. v2 adds neutral and allow cases (the "rich-neutral" set), and the collapse is gone: on the held-out A/B it now calls the verdict correctly, not just describes it. In production the deterministic gate still owns the verdict; this adapter agrees with it on what has been tested and writes the auditable explanation.

One line

Reads a dossier, returns a verdict (allow / quarantine / block) with cited evidence in a fixed schema. The deterministic gate remains the production authority.

Intended use

Input: a modulewarden.audit_dossier.v1 (version_diff mode) - declared package purpose, semver delta, notable file changes with evidence refs, dependency changes, capability deltas.
Output: a modulewarden.audit_report.v1 - verdict, risk level, primary findings each tied to an evidence ref, benign explanations considered, developer-safe summary.
Built for AppSec review of internal code submissions (a PR that adds a dependency, or an engineer vendoring an open-source package).

Results (measured 2026-05-30, held-out only, greedy decode)

40 held-out cases the adapter never saw (12 gold-block, 28 gold-flag), sampled from the rich-neutral set:

Metric	Tuned LoRA auditor
In-schema audit report	100.0% (40/40)
Refuses / declines	0.0%
Verdict-match (exact allow / quarantine / block)	100.0% (40/40)
Block-recall (gold=block called block)	100.0% (12/12)
Flag-recall (gold in block/quarantine flagged)	100.0% (28/28)

Read this before quoting the number

These are 40 clean, in-distribution cases with zero adversarial or evasion cases (n_adversarial = 0). 100% across the board is a real fix over the v1 collapse (block-recall was 0.0), but it is not a production malware-detection rate. A broader and adversarial evaluation, and the cross-config ranking, are still pending. Do not read this as "100% detection." In production the deterministic gate owns the verdict; this adapter writes the cited report and matches the gate on what we have tested.

Why an abliterated base: a stock instruct model refuses to read and describe malicious npm code, and the auditor has to. The base is pre-abliterated with the Arditi refusal-direction method; the prompts are security-analysis framing, not jailbreaks.

How to load (PEFT)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "huihui-ai/Huihui-Qwen3.6-27B-abliterated"
adapter = "ademczuk/modulewarden-auditor-qwen3.6-27b-lora"

tok = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base, dtype=torch.bfloat16, device_map="auto", trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, adapter)

Serving

vLLM: serves the adapter directly. --enable-lora --lora-modules mw=ademczuk/modulewarden-auditor-qwen3.6-27b-lora.
llama.cpp: merge the adapter first, then convert the merged model to GGUF. Qwen3.6 is a Gated DeltaNet plus Gated Attention hybrid, so a current llama.cpp build with the qwen3next operators is required.

Training

Base: huihui-ai/Huihui-Qwen3.6-27B-abliterated (a qwen3_5 vision-language model, loaded text-only to skip the vision tower).
Method: LoRA r16, alpha 32, dropout 0.05 on q/k/v/o/gate/up/down_proj.
Data: ModuleWarden audit dossiers from real GHSA cve_diff cases, plus added neutral and allow examples (the rich-neutral set) to remove the verdict skew that collapsed v1.
Config: the neutral-r16-lr1e4 run from the collapse-fix sweep (learning rate 1e-4), best by block-recall.
Hardware: 4x A100-SXM-64GB on CINECA Leonardo, bf16.

Limitations

40-case held-out eval, cve_diff plus neutral cases, zero adversarial cases tested. The 100% figures do not transfer to a claim about evasive or novel malware.
Cross-config ranking is not finished; this is the current best by block-recall, not a final selection.
In production the deterministic gate owns the verdict. This adapter can describe a risk the gate did not flag, and cannot override a verdict.
License inherits the Qwen3.6 base via the huihui base model.

Project

ModuleWarden is an auditable npm supply-chain gate built for the Zero-One Hack Vienna 2026 Sybilion Forecast lane. A forecast ranks dependencies by growth trajectory so reviewers vet the climbing ones first, a deterministic gate detects the known-bad, and this adapter calls and narrates the verdict into a git-committed Control Evidence Memo.

Downloads last month: 40

Model tree for ademczuk/modulewarden-auditor-qwen3.6-27b-lora

Base model

Qwen/Qwen3.6-27B

Finetuned

huihui-ai/Huihui-Qwen3.6-27B-abliterated

Adapter

(5)

this model