---
library_name: transformers
base_model: Qwen/Qwen3.5-2B
tags:
- pii
- privacy
- guard
- qwen
- lora
- merged
- vllm
license: apache-2.0
---

# Qwen PII Guard (merged)

Fine-tuned from `Qwen/Qwen3.5-2B` to detect personally identifiable information (PII) in user prompts and emit a single JSON object listing the values found in each of 14 categories.

Output schema:

```json
{"is_valid": true, "category": {"Name": ["John Doe"], "Email": ["john@example.com"]}}
```

When the prompt contains no PII, `is_valid` is `false` and `category` is `{}`.

## Categories

name, email, phone_number, address, date, national_id, passport_number, drivers_license, tax_id, card_number, bank_account, credentials, ip_address, username

## Evaluation (transformers reference path)

- test rows: **200** (held-out, from `test_dataset_pii.csv`)
- `is_valid` accuracy: **1.0000**
- category key-set accuracy: **0.9350**
- category value-set accuracy: **0.8300**
- binary F1 (`is_valid`): **1.0000** (P=1.000 R=1.000)
- macro F1 over categories (key-presence): **0.9791**
- macro F1 over categories (value-set): **0.9529**
- parse errors: 0/200

Binary confusion matrix (positive = "contains PII"):

| | predicted PII | predicted clean |
|---|---:|---:|
| actual PII | 177 | 0 |
| actual clean | 0 | 23 |

Per-category KEY-presence (did the model emit this category at all?):

| Category | Support | Precision | Recall | F1 |
|---|---:|---:|---:|---:|
| address | 79 | 0.987 | 0.987 | 0.987 |
| bank_account | 12 | 1.000 | 1.000 | 1.000 |
| card_number | 25 | 1.000 | 1.000 | 1.000 |
| credentials | 10 | 1.000 | 1.000 | 1.000 |
| date | 95 | 1.000 | 1.000 | 1.000 |
| drivers_license | 27 | 0.957 | 0.815 | 0.880 |
| email | 76 | 0.987 | 1.000 | 0.993 |
| ip_address | 9 | 1.000 | 1.000 | 1.000 |
| name | 107 | 1.000 | 0.991 | 0.995 |
| national_id | 52 | 0.911 | 0.981 | 0.944 |
| passport_number | 21 | 0.955 | 1.000 | 0.977 |
| phone_number | 63 | 1.000 | 0.984 | 0.992 |
| tax_id | 24 | 0.920 | 0.958 | 0.939 |
| username | 9 | 1.000 | 1.000 | 1.000 |

Per-category VALUE-set (did the exact strings match within the category?):

| Category | Support (string spans) | Precision | Recall | F1 |
|---|---:|---:|---:|---:|
| address | 79 | 0.924 | 0.924 | 0.924 |
| bank_account | 12 | 1.000 | 1.000 | 1.000 |
| card_number | 26 | 1.000 | 1.000 | 1.000 |
| credentials | 10 | 1.000 | 1.000 | 1.000 |
| date | 123 | 1.000 | 1.000 | 1.000 |
| drivers_license | 27 | 0.957 | 0.815 | 0.880 |
| email | 82 | 0.988 | 1.000 | 0.994 |
| ip_address | 9 | 1.000 | 1.000 | 1.000 |
| name | 242 | 0.863 | 0.835 | 0.849 |
| national_id | 59 | 0.869 | 0.898 | 0.883 |
| passport_number | 21 | 0.955 | 1.000 | 0.977 |
| phone_number | 65 | 0.984 | 0.969 | 0.977 |
| tax_id | 24 | 0.840 | 0.875 | 0.857 |
| username | 9 | 1.000 | 1.000 | 1.000 |

Latency (transformers, single-prompt, greedy decoding):

| mean | median | p95 | max |
|---:|---:|---:|---:|
| 3.15 s | 2.77 s | 6.45 s | 9.82 s |

## Quick start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Accuknoxtechnologies/PII-Qwen3.5-2B-v8"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

prompt = "Please contact me at jane@example.com or +1 415 555 0100."
msgs = [
    {"role": "system", "content": ""},
    {"role": "user", "content": prompt},
]
text = tok.apply_chat_template(msgs, tokenize=False, add_generation_prompt=True)
inputs = tok(text, return_tensors="pt").to(model.device)

# Greedy decoding, matching the evaluation setup above.
out = model.generate(**inputs, max_new_tokens=512, do_sample=False,
                     pad_token_id=tok.pad_token_id)

# Decode only the newly generated tokens so the prompt is not echoed back.
new_tokens = out[0][inputs["input_ids"].shape[1]:]
print(tok.decode(new_tokens, skip_special_tokens=True))
```

## Evaluation: vLLM serving (merged model, text-only)

Same **200 held-out prompts**, served through **vLLM `0.21.0`** instead of the transformers `.generate()` loop. Settings: greedy decoding, dtype bf16, `enable_prefix_caching=True`, `enable_chunked_prefill=True`. These numbers reflect accuracy and latency in a production-style serving setup.
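
The serving settings listed above could be reproduced with vLLM's offline `LLM` API roughly as follows. This is a minimal sketch, not the evaluation harness used for this card. The `max_tokens=512` limit is an assumption carried over from the Quick start, `parse_guard_output` and `run_guard` are hypothetical helper names, and the sketch assumes the installed vLLM version accepts these engine arguments and applies the model's chat template.

```python
import json


def parse_guard_output(text):
    """Parse one completion into (is_valid, category); return None on a parse error.

    Hypothetical helper: it accepts only a JSON object with a boolean
    `is_valid` and a dict `category`, matching the schema on this card.
    """
    try:
        obj = json.loads(text.strip())
    except json.JSONDecodeError:
        return None
    if not isinstance(obj, dict):
        return None
    is_valid, category = obj.get("is_valid"), obj.get("category")
    if not isinstance(is_valid, bool) or not isinstance(category, dict):
        return None
    return is_valid, category


def run_guard(prompts, model_id="Accuknoxtechnologies/PII-Qwen3.5-2B-v8"):
    """Run prompts through vLLM with the settings named above (needs vLLM and a GPU)."""
    # Imported lazily so the parser above can be used without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(
        model=model_id,
        dtype="bfloat16",
        enable_prefix_caching=True,
        enable_chunked_prefill=True,
    )
    # temperature=0.0 gives greedy decoding; max_tokens=512 is an assumption
    # borrowed from the transformers Quick start.
    params = SamplingParams(temperature=0.0, max_tokens=512)
    conversations = [
        [{"role": "system", "content": ""}, {"role": "user", "content": p}]
        for p in prompts
    ]
    outputs = llm.chat(conversations, sampling_params=params)
    return [parse_guard_output(o.outputs[0].text) for o in outputs]
```

Calling `run_guard(["Please contact me at jane@example.com"])` would return one parsed `(is_valid, category)` tuple per prompt, with `None` marking a completion that a harness like this would count as a JSON parse error.
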

- JSON parse errors: `0/200` (`0.0%`)

### Accuracy (vLLM)

| Metric | Value |
|---|---:|
| `is_valid` accuracy | **1.0000** |
| category key-set accuracy | **0.9350** |
| category value-set accuracy | **0.8300** |
| Binary F1 (positive = contains PII) | **1.0000** |
| Binary precision | 1.0000 |
| Binary recall | 1.0000 |
| Macro F1 (key-presence) | **0.9791** |
| Macro F1 (value-set) | **0.9529** |

### Confusion matrix: binary `is_valid` (vLLM)

| | predicted PII | predicted clean |
|---|---:|---:|
| **actual PII** | TP = 177 | FN = 0 |
| **actual clean** | FP = 0 | TN = 23 |

### Per-category key-presence (vLLM)

| Category | Support | Precision | Recall | F1 |
|---|---:|---:|---:|---:|
| address | 79 | 0.987 | 0.987 | 0.987 |
| bank_account | 12 | 1.000 | 1.000 | 1.000 |
| card_number | 25 | 1.000 | 1.000 | 1.000 |
| credentials | 10 | 1.000 | 1.000 | 1.000 |
| date | 95 | 1.000 | 1.000 | 1.000 |
| drivers_license | 27 | 0.957 | 0.815 | 0.880 |
| email | 76 | 0.987 | 1.000 | 0.993 |
| ip_address | 9 | 1.000 | 1.000 | 1.000 |
| name | 107 | 1.000 | 0.991 | 0.995 |
| national_id | 52 | 0.911 | 0.981 | 0.944 |
| passport_number | 21 | 0.955 | 1.000 | 0.977 |
| phone_number | 63 | 1.000 | 0.984 | 0.992 |
| tax_id | 24 | 0.920 | 0.958 | 0.939 |
| username | 9 | 1.000 | 1.000 | 1.000 |

### vLLM inference latency (single-stream, batch = 1)

| Stat | ms / prompt |
|---|---:|
| Mean | **576.0** |
| Median | 511.6 |
| p95 | 1151.7 |
| p99 | 1440.7 |
| Max | 3209.3 |
| Under 1 s | 89.0% |

### vLLM throughput (single batched submit)

- Prompts/sec: **27.73**
- Output tokens/sec: 1569.0
- Input tokens/sec: 35596.5
- Batched wall time for all 200 prompts: 7.21 s

---

*Card generated at 2026-05-31 07:39 UTC. Adapter weights: `Accuknoxtechnologies/PII-Qwen3.5-2B-v8`.*