Yash1005 committed on
Commit 502ecf7 · verified · 1 Parent(s): 00e9841

update model card

Files changed (1)
  1. README.md +57 -1
README.md CHANGED
@@ -112,5 +112,61 @@ out = model.generate(**tok(text, return_tensors="pt").to(model.device),
  print(tok.decode(out[0], skip_special_tokens=True))
  ```
 
  ---
- *Card generated at 2026-05-31 07:14 UTC. Adapter weights: `Accuknoxtechnologies/PII-Qwen3.5-2B-v8`.*
 
  print(tok.decode(out[0], skip_special_tokens=True))
  ```
 
+ ## Evaluation — vLLM serving (merged model, text-only)
+ Same **200 held-out prompts**, served through **vLLM `0.21.0`** instead of the transformers `.generate()` loop. Greedy decoding, dtype bf16, `enable_prefix_caching=True`, `enable_chunked_prefill=True`. These numbers reflect accuracy and latency under production-style serving.
+ - JSON parse errors: `0/200` (`0.0%`)
+
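The serving setup described above can be sketched roughly as follows. This is a minimal illustration, not the script behind the card: `generate_all`, `count_json_parse_errors`, the `model_path` argument, and the `max_tokens` cap are assumptions, and whether the evaluation post-processed the output before parsing is not stated. Only the engine flags and greedy decoding come from the text.

```python
import json

# Engine settings named in the evaluation notes.
ENGINE_KWARGS = {
    "dtype": "bfloat16",
    "enable_prefix_caching": True,
    "enable_chunked_prefill": True,
}

# Greedy decoding means temperature 0. max_tokens is an assumed cap,
# not a value taken from the model card.
SAMPLING_KWARGS = {"temperature": 0.0, "max_tokens": 256}


def generate_all(model_path, prompts):
    """Run all prompts through vLLM in one batched call (hypothetical helper)."""
    # Imported lazily so the rest of this sketch runs without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(model=model_path, **ENGINE_KWARGS)
    outputs = llm.generate(prompts, SamplingParams(**SAMPLING_KWARGS))
    return [request.outputs[0].text for request in outputs]


def count_json_parse_errors(texts):
    """Count generations that do not parse as JSON."""
    errors = 0
    for text in texts:
        try:
            json.loads(text)
        except json.JSONDecodeError:
            errors += 1
    return errors
```

With 200 generations, a return value of `0` from `count_json_parse_errors` corresponds to the `0/200` figure reported above.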
+ ### Accuracy (vLLM)
+ | Metric | Value |
+ |---|---:|
+ | `is_valid` accuracy | **1.0000** |
+ | category key-set accuracy | **0.9350** |
+ | category value-set accuracy | **0.8300** |
+ | Binary F1 (positive = contains PII) | **1.0000** |
+ | Binary precision | 1.0000 |
+ | Binary recall | 1.0000 |
+ | Macro F1 (key-presence) | **0.9791** |
+ | Macro F1 (value-set) | **0.9529** |
+
+ ### Confusion matrix — binary `is_valid` (vLLM)
+ | | predicted PII | predicted clean |
+ |---|---:|---:|
+ | **actual PII** | TP = 177 | FN = 0 |
+ | **actual clean** | FP = 0 | TN = 23 |
+
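As a cross-check, the binary metrics in the accuracy table follow directly from the four confusion-matrix counts. The sketch below uses the standard definitions; the function name `binary_metrics` is illustrative and not taken from the evaluation code.

```python
def binary_metrics(tp, fn, fp, tn):
    """Standard precision, recall, F1 and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Counts from the confusion matrix above (177 PII prompts, 23 clean prompts).
metrics = binary_metrics(tp=177, fn=0, fp=0, tn=23)
```

With no false positives and no false negatives, all four values come out to 1.0, matching the reported `is_valid` accuracy, binary precision, recall, and F1.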
+ ### Per-category key-presence (vLLM)
+ | Category | Support | Precision | Recall | F1 |
+ |---|---:|---:|---:|---:|
+ | address | 79 | 0.987 | 0.987 | 0.987 |
+ | bank_account | 12 | 1.000 | 1.000 | 1.000 |
+ | card_number | 25 | 1.000 | 1.000 | 1.000 |
+ | credentials | 10 | 1.000 | 1.000 | 1.000 |
+ | date | 95 | 1.000 | 1.000 | 1.000 |
+ | drivers_license | 27 | 0.957 | 0.815 | 0.880 |
+ | email | 76 | 0.987 | 1.000 | 0.993 |
+ | ip_address | 9 | 1.000 | 1.000 | 1.000 |
+ | name | 107 | 1.000 | 0.991 | 0.995 |
+ | national_id | 52 | 0.911 | 0.981 | 0.944 |
+ | passport_number | 21 | 0.955 | 1.000 | 0.977 |
+ | phone_number | 63 | 1.000 | 0.984 | 0.992 |
+ | tax_id | 24 | 0.920 | 0.958 | 0.939 |
+ | username | 9 | 1.000 | 1.000 | 1.000 |
+
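The macro F1 (key-presence) of 0.9791 in the accuracy table appears to be the unweighted mean of the per-category F1 column. The sketch below reproduces that from the rounded table values and also recomputes each F1 from its precision and recall; small differences in the third decimal are expected because the table inputs are already rounded. The dictionary and variable names are illustrative.

```python
# (precision, recall, f1) per category, copied from the table above.
PER_CATEGORY = {
    "address": (0.987, 0.987, 0.987),
    "bank_account": (1.000, 1.000, 1.000),
    "card_number": (1.000, 1.000, 1.000),
    "credentials": (1.000, 1.000, 1.000),
    "date": (1.000, 1.000, 1.000),
    "drivers_license": (0.957, 0.815, 0.880),
    "email": (0.987, 1.000, 0.993),
    "ip_address": (1.000, 1.000, 1.000),
    "name": (1.000, 0.991, 0.995),
    "national_id": (0.911, 0.981, 0.944),
    "passport_number": (0.955, 1.000, 0.977),
    "phone_number": (1.000, 0.984, 0.992),
    "tax_id": (0.920, 0.958, 0.939),
    "username": (1.000, 1.000, 1.000),
}

# Macro F1: every category counts equally, regardless of support.
macro_f1 = sum(f1 for _, _, f1 in PER_CATEGORY.values()) / len(PER_CATEGORY)

# Harmonic mean of precision and recall, compared with the reported F1.
max_f1_gap = max(
    abs(2 * p * r / (p + r) - f1) for p, r, f1 in PER_CATEGORY.values()
)
```

Because the average is unweighted, a low-support category such as `drivers_license` (27 examples, recall 0.815) pulls the macro score down as much as a high-support one would.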
+ ### vLLM inference latency (single-stream, batch = 1)
+ | Stat | ms / prompt |
+ |---|---:|
+ | Mean | **576.0** |
+ | Median | 511.6 |
+ | p95 | 1151.7 |
+ | p99 | 1440.7 |
+ | Max | 3209.3 |
+ | Under 1 s | 89.0% |
+
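For readers who want to produce the same kind of summary from their own timings, here is a minimal sketch. The percentile method used for the card is not stated, so nearest-rank is an assumption, and `sample_ms` is made-up illustrative data rather than the 200 measured latencies.

```python
import math
import statistics


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize_latencies(latencies_ms):
    """Return the same statistics as the latency table, in milliseconds."""
    under_1s = sum(1 for v in latencies_ms if v < 1000.0)
    return {
        "mean": statistics.mean(latencies_ms),
        "median": statistics.median(latencies_ms),
        "p95": percentile_nearest_rank(latencies_ms, 95),
        "p99": percentile_nearest_rank(latencies_ms, 99),
        "max": max(latencies_ms),
        "under_1s_pct": 100.0 * under_1s / len(latencies_ms),
    }


# Synthetic timings for illustration only, not data from the evaluation.
sample_ms = [100.0, 200.0, 300.0, 400.0, 500.0, 600.0, 700.0, 800.0, 900.0, 1000.0]
summary = summarize_latencies(sample_ms)
```

On the real run, the 89.0% "under 1 s" figure corresponds to 178 of the 200 prompts.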
+ ### vLLM throughput (single batched submit)
+ - Prompts/sec: **27.73**
+ - Output tokens/sec: 1569.0
+ - Input tokens/sec: 35596.5
+ - Batched wall time for all 200 prompts: 7.21 s
+
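The throughput figures can be related to each other and to the single-stream latency table with some simple arithmetic. The sketch below uses only numbers reported in this card; the derived averages are estimates, since the inputs are rounded.

```python
# Reported values.
N_PROMPTS = 200
BATCH_WALL_S = 7.21
PROMPTS_PER_S = 27.73
OUTPUT_TOK_PER_S = 1569.0
INPUT_TOK_PER_S = 35596.5
MEAN_LATENCY_MS = 576.0  # single-stream, batch = 1

# 200 prompts over the rounded wall time gives roughly the reported rate.
recomputed_prompts_per_s = N_PROMPTS / BATCH_WALL_S

# Implied average prompt and completion lengths (tokens per prompt).
avg_output_tokens = OUTPUT_TOK_PER_S / PROMPTS_PER_S  # about 56.6
avg_input_tokens = INPUT_TOK_PER_S / PROMPTS_PER_S    # about 1283.7

# Single-stream rate implied by the mean latency, and the gain from batching.
single_stream_prompts_per_s = 1000.0 / MEAN_LATENCY_MS  # about 1.74
batching_speedup = PROMPTS_PER_S / single_stream_prompts_per_s  # about 16x
```

The small gap between 200 / 7.21 and the reported 27.73 prompts/sec is consistent with the wall time having been rounded to two decimals.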
  ---
+ *Card generated at 2026-05-31 07:39 UTC. Adapter weights: `Accuknoxtechnologies/PII-Qwen3.5-2B-v8`.*