---
library_name: peft
base_model: huihui-ai/Huihui-Qwen3.6-27B-abliterated
pipeline_tag: text-generation
language:
- en
license: other
license_name: qwen
license_link: https://huggingface.co/huihui-ai/Huihui-Qwen3.6-27B-abliterated
tags:
- lora
- peft
- sft
- trl
- security
- supply-chain
- npm
- code-audit
---

# ModuleWarden Auditor - Qwen3.6-27B LoRA (v2, verdict-calling)

A LoRA adapter that turns the abliterated Qwen3.6-27B into the auditor for ModuleWarden, an auditable npm supply-chain submission gate. It reads an audit dossier (a structured diff between two package versions) and writes an evidence-cited `modulewarden.audit_report.v1`: the verdict, the capability deltas that drove it, and a developer-facing summary.

## What changed from v1

v1 was a narrator. It wrote the report in the right schema, but its verdicts collapsed to always-quarantine because the training set had no allow examples. v2 adds neutral and allow cases (the "rich-neutral" set), and the collapse is gone: on the held-out A/B it now calls the verdict correctly, not just describes it. In production the deterministic gate still owns the verdict; this adapter agrees with it on what has been tested and writes the auditable explanation.

## One line

Reads a dossier, returns a verdict (allow / quarantine / block) with cited evidence in a fixed schema. The deterministic gate remains the production authority.

## Intended use

- Input: a `modulewarden.audit_dossier.v1` (version_diff mode) - declared package purpose, semver delta, notable file changes with evidence refs, dependency changes, capability deltas.
- Output: a `modulewarden.audit_report.v1` - verdict, risk level, primary findings each tied to an evidence ref, benign explanations considered, developer-safe summary.
- Built for AppSec review of internal code submissions (a PR that adds a dependency, or an engineer vendoring an open-source package).

## Results (measured 2026-05-30, held-out only, greedy decode)

40 held-out cases the adapter never saw (12 gold-block, 28 gold-flag), sampled from the rich-neutral set:

| Metric | Tuned LoRA auditor |
|---|---|
| In-schema audit report | 100.0% (40/40) |
| Refuses / declines | 0.0% |
| Verdict-match (exact allow / quarantine / block) | 100.0% (40/40) |
| Block-recall (gold=block called block) | 100.0% (12/12) |
| Flag-recall (gold in block/quarantine flagged) | 100.0% (28/28) |

### Read this before quoting the number

These are 40 clean, in-distribution cases with **zero adversarial or evasion cases** (`n_adversarial = 0`). 100% across the board is a real fix over the v1 collapse (block-recall was 0.0), but it is **not** a production malware-detection rate. A broader and adversarial evaluation, and the cross-config ranking, are still pending. Do not read this as "100% detection." In production the deterministic gate owns the verdict; this adapter writes the cited report and matches the gate on what we have tested.

Why an abliterated base: a stock instruct model refuses to read and describe malicious npm code, and the auditor has to. The base is pre-abliterated with the Arditi refusal-direction method; the prompts are security-analysis framing, not jailbreaks.

## How to load (PEFT)

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = "huihui-ai/Huihui-Qwen3.6-27B-abliterated"
adapter = "ademczuk/modulewarden-auditor-qwen3.6-27b-lora"

tok = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base, dtype=torch.bfloat16, device_map="auto", trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, adapter)
```

## Serving

- **vLLM**: serves the adapter directly. `--enable-lora --lora-modules mw=ademczuk/modulewarden-auditor-qwen3.6-27b-lora`.
- **llama.cpp**: merge the adapter first, then convert the merged model to GGUF. Qwen3.6 is a Gated DeltaNet plus Gated Attention hybrid, so a current llama.cpp build with the qwen3next operators is required.

## Training

- Base: `huihui-ai/Huihui-Qwen3.6-27B-abliterated` (a qwen3_5 vision-language model, loaded text-only to skip the vision tower).
- Method: LoRA r16, alpha 32, dropout 0.05 on `q/k/v/o/gate/up/down_proj`.
- Data: ModuleWarden audit dossiers from real GHSA cve_diff cases, plus added neutral and allow examples (the rich-neutral set) to remove the verdict skew that collapsed v1.
- Config: the `neutral-r16-lr1e4` run from the collapse-fix sweep (learning rate 1e-4), best by block-recall.
- Hardware: 4x A100-SXM-64GB on CINECA Leonardo, bf16.

## Limitations

- 40-case held-out eval, cve_diff plus neutral cases, **zero adversarial cases tested**. The 100% figures do not transfer to a claim about evasive or novel malware.
- Cross-config ranking is not finished; this is the current best by block-recall, not a final selection.
- In production the deterministic gate owns the verdict. This adapter can describe a risk the gate did not flag, and cannot override a verdict.
- License inherits the Qwen3.6 base via the huihui base model.

## Project

ModuleWarden is an auditable npm supply-chain gate built for the Zero-One Hack Vienna 2026 Sybilion Forecast lane. A forecast ranks dependencies by growth trajectory so reviewers vet the climbing ones first, a deterministic gate detects the known-bad, and this adapter calls and narrates the verdict into a git-committed Control Evidence Memo.