How to use Ailiance-fr/qwen3-4b-mascarade-emc-lora with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-4B-Instruct-2507")
model = PeftModel.from_pretrained(base_model, "Ailiance-fr/qwen3-4b-mascarade-emc-lora")
Model Card for qwen3-4b-mascarade-emc-lora
This model is a LoRA adapter fine-tuned from Qwen/Qwen3-4B-Instruct-2507. It was trained using TRL with supervised fine-tuning (SFT) on an EMC compliance corpus as part of the Ailiance mascarade LoRA family.
Quick start
from transformers import pipeline
question = "What decoupling capacitor strategy minimizes conducted emissions on a switching regulator?"
generator = pipeline("text-generation", model="Ailiance-fr/qwen3-4b-mascarade-emc-lora", device="cuda")
output = generator([{"role": "user", "content": question}], max_new_tokens=256, return_full_text=False)[0]
print(output["generated_text"])
Bench results — ailiance-bench Phase 7 (CUDA, 2026-05-11)
Functional eval via the parsers/scorers from ailiance/ailiance-bench Phase 1 (bench_kicad_functional), ported to CUDA / transformers + PEFT for the Qwen3-4B-Instruct-2507 base.
| Dataset | n | Composite score | Duration |
|---|---|---|---|
| emc-dsp-power | 10 | 0.646 | 1221.9s |
Composite score combines structural-parse-ok, component-count match, ground-node presence, etc. — see bench_kicad_functional.score_* for the exact formula. Greedy decoding, max_tokens per GEN_PARAMS.
Upstream base model — official evaluations
These are the official scores for the unmodified base model, Qwen/Qwen3-4B-Instruct-2507, as reported by the Alibaba Qwen team. They represent the floor of capability that this LoRA inherits before the hardware-domain fine-tune adapts its behavior.
| Category | Benchmark | Qwen3-4B-Instruct-2507 |
|---|---|---|
| Knowledge | MMLU-Pro | 69.6 |
| Knowledge | MMLU-Redux | 84.2 |
| Knowledge | GPQA | 62.0 |
| Knowledge | SuperGPQA | 42.8 |
| Reasoning | AIME25 | 47.4 |
| Reasoning | HMMT25 | 31.0 |
| Reasoning | ZebraLogic | 80.2 |
| Reasoning | LiveBench 2024-11-25 | 63.0 |
| Coding | LiveCodeBench v6 | 35.1 |
| Coding | MultiPL-E | 76.8 |
| Coding | Aider-Polyglot | 12.9 |
| Alignment | IFEval | 83.4 |
| Alignment | Arena-Hard v2 | 43.4 |
| Alignment | Creative Writing v3 | 83.5 |
| Alignment | WritingBench | 83.4 |
| Agent | BFCL-v3 | 61.9 |
| Agent | TAU1-Retail | 48.7 |
| Agent | TAU1-Airline | 32.0 |
| Agent | TAU2-Retail | 40.4 |
| Multilingual | MultiIF | 69.0 |
| Multilingual | MMLU-ProX | 61.6 |
| Multilingual | INCLUDE | 60.1 |
| Multilingual | PolyMATH | 31.1 |
Source: official Qwen3-4B-Instruct-2507 model card.
Reading these numbers alongside the Phase 6 bench reported in this card: the upstream scores measure general capability (knowledge, reasoning, coding, alignment). The Phase 6 deltas measure hardware-domain specialization (KiCad, SPICE, schematic extraction). A rank-16 LoRA adapter adds trainable parameters amounting to less than 1% of the base model's parameter count, so the upstream scores remain approximately the floor: this LoRA adds the Phase 6 deltas on top of these inherited capabilities.
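The "less than 1%" figure can be sanity-checked with a short calculation. The sketch below is illustrative only: the layer dimensions, the set of adapted projections (all attention and MLP linear layers), and the 4.0e9 base parameter count are assumptions not stated in this card, so check them against the base model's `config.json` and the adapter's `adapter_config.json` before relying on the result.

```python
def lora_params(rank, shapes):
    """Extra parameters added by LoRA: rank * (d_in + d_out) per adapted matrix."""
    return sum(rank * (d_in + d_out) for d_in, d_out in shapes)

# ASSUMED dimensions for the base model (not taken from this card).
HIDDEN = 2560
INTERMEDIATE = 9728
Q_OUT = 32 * 128     # assumed attention heads * head_dim
KV_OUT = 8 * 128     # assumed key/value heads * head_dim
LAYERS = 36
BASE_PARAMS = 4.0e9  # approximate, assumed

# ASSUMED target modules: every linear projection in each decoder layer.
per_layer = [
    (HIDDEN, Q_OUT),         # q_proj
    (HIDDEN, KV_OUT),        # k_proj
    (HIDDEN, KV_OUT),        # v_proj
    (Q_OUT, HIDDEN),         # o_proj
    (HIDDEN, INTERMEDIATE),  # gate_proj
    (HIDDEN, INTERMEDIATE),  # up_proj
    (INTERMEDIATE, HIDDEN),  # down_proj
]

adapter_params = LAYERS * lora_params(16, per_layer)
fraction = adapter_params / BASE_PARAMS
print(f"{adapter_params:,} adapter parameters, {fraction:.2%} of the base")
```

Under these assumptions the adapter holds roughly 33 million parameters, about 0.8% of the base. Adapting fewer modules would lower the fraction further.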
Training procedure
This model was trained with SFT on an EMC compliance corpus.
Framework versions
- TRL: 1.4.0
- Transformers: 5.8.0
- PyTorch: 2.11.0
- Datasets: 4.8.5
- Tokenizers: 0.22.2
Bench results — held-out token-overlap (eval_mascarade_lora, n=10)
Evaluated on 10 random held-out prompts from Ailiance-fr/mascarade-emc-dataset (seed=101 ≠ train seed 42).
| Metric | Value |
|---|---|
| Avg Jaccard token-overlap | 0.06 |
| Avg generation tokens | 143.5 |
| Avg latency (per sample, RTX 4090) | 7.8s |
Token-overlap is a coarse quality proxy — high overlap (>0.4) suggests the LoRA reproduces domain vocabulary; low overlap indicates either domain-shift or stylistic divergence from the reference. See ailiance/ailiance-bench for richer functional evaluations (KiCad DRC, SPICE convergence, etc.) on the same family.
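For readers unfamiliar with the metric, Jaccard token-overlap is the size of the intersection of two token sets divided by the size of their union. The sketch below is a minimal illustration, not the `eval_mascarade_lora` implementation: lowercased whitespace tokenization, the zero score for two empty strings, and the example sentences are all assumptions made here.

```python
def jaccard_overlap(generated: str, reference: str) -> float:
    """Jaccard similarity between the token sets of two strings."""
    # ASSUMPTION: lowercase + whitespace split; the real evaluator may tokenize differently.
    gen_tokens = set(generated.lower().split())
    ref_tokens = set(reference.lower().split())
    if not gen_tokens and not ref_tokens:
        return 0.0  # convention chosen for this sketch
    return len(gen_tokens & ref_tokens) / len(gen_tokens | ref_tokens)

# Hypothetical generation/reference pair, for illustration only.
generated = "add a ferrite bead on the input"
reference = "place a ferrite bead near the input"

score = jaccard_overlap(generated, reference)
print(f"Jaccard token-overlap: {score:.3f}")
```

The two example sentences share 5 of 9 distinct tokens, so the score is 5/9. Averaging this value over the held-out prompts would give a figure comparable in kind to the one in the table, subject to the tokenization caveat.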
Citations
@software{vonwerra2020trl,
title = {{TRL: Transformers Reinforcement Learning}},
author = {von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif and Gallouédec, Quentin},
license = {Apache-2.0},
url = {https://github.com/huggingface/trl},
year = {2020}
}
Bench (vs base Qwen3-4B)
Consolidated comparison of this LoRA against its base, drawing on two complementary evaluation streams. The reference base used for cross-adapter comparison in Phase 6 is gemma-e4b-eu-kiki-base (legacy Gemma-4 ancestor). A dedicated Qwen3-4B-Instruct-2507 baseline run is not in our pipeline yet — those rows are n/a.
Phase 6 — cross-adapter scoreboard (reference base: gemma-e4b-eu-kiki-base)
| Phase | iact-bench task | Base | Tuned (+mascarade) | Δ |
|---|---|---|---|---|
| P3 | kicad-sch-extract (cross-domain) | 0.308 | 0.785 | +0.477 |
Phase 7 — CUDA functional eval on Qwen3-4B base (production-aligned)
| Dataset | n | Base (Qwen3-4B) | Tuned (this LoRA) | Δ |
|---|---|---|---|---|
| emc-dsp-power | 10 | n/a | 0.646 | n/a |
Methodology: iact-bench v0.2.0 (audit-grade Docker validators), greedy decoding, max_tokens per GEN_PARAMS. NDJSON audit trail in ailiance/ailiance-bench. Scoring date: 2026-05-11 (commit 46801af).
Phase 6 numbers reflect adapter behavior on a Gemma-4 reference base; domain semantics transfer to the Qwen3-4B production base served via Tower Ollama :8004, but absolute scores may shift. A Qwen3-4B baseline run is tracked for a future bench refresh.
Cross-domain forgetting check (Phase 9, 2026-05-11)
For each domain's eval set (seed=101, n samples held-out), compare this LoRA's Jaccard token-overlap vs the Qwen3-4B-Instruct-2507 baseline (no adapter) on the SAME prompts. Negative Δ = the LoRA degrades base behaviour on that domain.
| Eval domain | LoRA Jaccard | Δ vs base |
|---|---|---|
| kicad | 0.09 | +0.003 |
| spice | 0.007 | +0.002 |
| stm32 | 0.055 | +0.005 |
| emc | 0.061 | -0.005 ⬅ in-domain |
| embedded | 0.072 | -0.002 |
| platformio | 0.047 | +0.005 |
| freecad | 0.03 | +0.009 |
| dsp | 0.098 | -0.003 |
| iot | 0.05 | -0.018 |
| power | 0.078 | +0.010 |
In-domain Δ: -0.005. Out-of-domain mean Δ: +0.001.
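The two summary figures follow directly from the per-domain deltas in the table. As a minimal sketch (the dictionary layout and variable names are illustrative, not part of the bench code), the out-of-domain mean is the average of every delta except the `emc` row:

```python
# Per-domain Δ vs base, copied from the table above.
deltas = {
    "kicad": 0.003,
    "spice": 0.002,
    "stm32": 0.005,
    "emc": -0.005,
    "embedded": -0.002,
    "platformio": 0.005,
    "freecad": 0.009,
    "dsp": -0.003,
    "iot": -0.018,
    "power": 0.010,
}

IN_DOMAIN = "emc"  # the domain this adapter was trained on

in_domain_delta = deltas[IN_DOMAIN]
out_of_domain = [d for name, d in deltas.items() if name != IN_DOMAIN]
out_of_domain_mean = sum(out_of_domain) / len(out_of_domain)

print(f"In-domain Δ: {in_domain_delta:+.3f}")
print(f"Out-of-domain mean Δ: {out_of_domain_mean:+.3f}")
```

The nine out-of-domain deltas sum to 0.011, so the mean rounds to +0.001, matching the reported value.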
Model tree for Ailiance-fr/qwen3-4b-mascarade-emc-lora
Base model
Qwen/Qwen3-4B-Instruct-2507