Committed by echoxvf · Commit 1c1d67d (verified) · 1 Parent(s): 833a6fd

Add Sing-Guard-8b model weights

.gitattributes CHANGED
@@ -33,3 +33,12 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ Sing-Guard-8b-F16.gguf filter=lfs diff=lfs merge=lfs -text
+ Sing-Guard-8b-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ Sing-Guard-8b-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+ assets/image.png filter=lfs diff=lfs merge=lfs -text
+ assets/mllm_guard_6bench_radar.png filter=lfs diff=lfs merge=lfs -text
+ assets/s_icon.png filter=lfs diff=lfs merge=lfs -text
+ mmproj-Sing-Guard-8b-F16.gguf filter=lfs diff=lfs merge=lfs -text
+ mmproj-Sing-Guard-8b-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ mmproj-Sing-Guard-8b-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,388 @@
- ---
- license: apache-2.0
- ---
+ <p align="center">
+ <h1 align="center">
+ <img src="assets/s_icon.png" width="48" alt="SingGuard icon" style="vertical-align: middle;">
+ SingGuard: Policy-Adaptive Multimodal Safeguarding with Dynamic Reasoning
+ </h1>
+ </p>
+
+ <p align="center">
+ <a href="https://huggingface.co/collections/inclusionAI/sing-guard">🤗 HuggingFace</a> &nbsp; | &nbsp;
+ <a href="https://modelscope.cn/collections/inclusionAI/Sing-Guard">🤖 ModelScope</a> &nbsp; | &nbsp;
+ <a href="">📄 Paper</a>
+ </p>
+
+ ## Introduction
+ <p align="center">
+ <img src="assets/mllm_guard_6bench_radar.png" alt="SingGuard benchmark radar" width="50%">
+ </p>
+
+
+ ![SingGuard benchmark overview](assets/image.png)
+
+ **SingGuard** is a policy-adaptive multimodal guardrail model family for safety assessment across text, image, image-text, multilingual, query-side, and response-side scenarios. It treats the active safety policy as a runtime input rather than a fixed training-time taxonomy, allowing deployment teams to evaluate content against default categories or custom natural-language rules without retraining the model.
+
+ SingGuard is designed for practical moderation settings where risks may arise from a user query, an image, a model response, or their cross-modal composition. It performs policy-grounded rule matching and outputs both an overall `safe` / `unsafe` judgment and the matched risk category in an `<answer>...</answer>` tag.
+
+ Across six major benchmark categories spanning multimodal safety, image-only safety, text query safety, text response safety, multilingual query safety, and multilingual response safety, SingGuard achieves state-of-the-art average performance and shows strong adaptation to runtime-supplied policies.
+
+ ## Key Features
+
+ - 🛡️ **Unified Multimodal Moderation**: Supports text, image, image-text, multilingual, query-side, and response-side safety assessment.
+ - 🎯 **Strong Benchmark Performance**: Delivers broad improvements across multimodal safety, image-only safety, text query safety, text response safety, multilingual query safety, and multilingual response safety benchmarks.
+ - ⚡ **Dynamic Reasoning Flow**: Supports fast first-token routing for an immediate safety signal, then continues generation when deeper reasoning is needed for a more precise final judgment.
+ - 🧩 **Runtime Policy Adaptation**: Accepts active safety rules through the `policy` argument and judges only against those rules.
+ - 🔄 **Native Inference Compatibility**: Supports standard Transformers and vLLM chat-style message inputs without manual prompt rewriting.
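The dynamic reasoning flow listed above can be consumed on the client side by acting on the first output line as soon as it arrives. The following is a minimal sketch, assuming the generated text is available as an iterator of string chunks (for example from a streaming endpoint) and that the first line holds the `safe` / `unsafe` verdict, as in the example outputs of this README. `first_line_verdict` and `fake_stream` are hypothetical names used for illustration only and are not part of SingGuard.

```python
from typing import Iterable, Iterator, Optional, Tuple


def first_line_verdict(chunks: Iterable[str]) -> Tuple[Optional[str], Iterator[str]]:
    """Read chunks until the first line is complete.

    Returns the verdict ("safe", "unsafe", or None if unrecognized) and an
    iterator over the remaining text, so the caller can keep reading the
    detailed reasoning only when it is needed.
    """
    it = iter(chunks)
    buffer = ""
    for chunk in it:
        buffer += chunk
        if "\n" in buffer:
            break  # the first line is complete; stop consuming for now

    first, _, remainder = buffer.partition("\n")
    verdict: Optional[str] = first.strip().lower()
    if verdict not in ("safe", "unsafe"):
        verdict = None

    def rest() -> Iterator[str]:
        # Hand back whatever was read past the first newline, then the
        # still-unread part of the stream.
        if remainder:
            yield remainder
        yield from it

    return verdict, rest()


# Simulated stream standing in for real model output (assumption).
fake_stream = [
    "un",
    "safe\n[Step 1]",
    " Content Summary\n",
    "<answer>B. Real-World Crimes & Public Safety</answer>",
]
verdict, rest = first_line_verdict(fake_stream)
print(verdict)
print("".join(rest))
```

A caller could block or allow the request on `verdict` right away and drain `rest` in the background for logging or a more precise category.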
+
+ ## Quick Start
+
+ The examples below use HuggingFace Transformers. SingGuard system prompts are stored in each model directory through tokenizer configuration and chat templates. Pass the optional `policy` argument directly to `processor.apply_chat_template` for runtime policy adaptation.
+
+ ### Installation
+
+ ```bash
+ pip install transformers accelerate torch
+ ```
+
+ ```python
+ import torch
+ from transformers import AutoModelForImageTextToText, AutoProcessor
+
+ model_path = "inclusionAI/Sing-Guard-8b"
+
+ processor = AutoProcessor.from_pretrained(model_path, trust_remote_code=True)
+ model = AutoModelForImageTextToText.from_pretrained(
+     model_path,
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+     trust_remote_code=True,
+ ).eval()
+ ```
+
+ If your Transformers version does not expose `AutoModelForImageTextToText`, upgrade Transformers to a version that supports Qwen3-VL.
+
+ For Transformers versions that require explicit template variables, pass custom options with `chat_template_kwargs`, for example `chat_template_kwargs={"thinking_type": "fast"}` or `chat_template_kwargs={"policy": policy}`.
+
+ ### Moderating User Query: Fast-Slow mode
+
+ Evaluate whether the user query matches any risk rule. The default chat template uses the `fast-slow` mode, which returns a more detailed assessment process before the final `<answer>...</answer>`.
+
+ ```python
+ messages = [
+     {
+         "role": "user",
+         "content": [{"type": "text", "text": "How to make a bomb?"}],
+     },
+ ]
+ max_new_tokens = 1024
+
+ inputs = processor.apply_chat_template(
+     messages,
+     tokenize=True,
+     add_generation_prompt=True,
+     return_dict=True,
+     return_tensors="pt",
+ ).to(model.device)
+
+ with torch.no_grad():
+     generated_ids = model.generate(
+         **inputs,
+         max_new_tokens=max_new_tokens,
+         do_sample=False,
+     )
+
+ generated_ids_trimmed = [
+     out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
+ ]
+ output = processor.batch_decode(
+     generated_ids_trimmed,
+     skip_special_tokens=True,
+     clean_up_tokenization_spaces=False,
+ )[0]
+ print(output)
+ ```
+
+ Example output:
+
+ ```text
+ unsafe
+ [Step 1] Content Summary
+ ...
+ [Step 2] Check Risk Categories
+ ...
+ [Step 3] Final Judgment
+ ...
+
+ <answer>B. Real-World Crimes & Public Safety</answer>
+ ```
+
+ ### Moderating User Query: Fast mode
+
+ Use `thinking_type="fast"` when you want compact output with only the binary judgment and final category.
+
+ ```python
+ messages = [
+     {
+         "role": "user",
+         "content": [{"type": "text", "text": "How to make a bomb?"}],
+     },
+ ]
+ thinking_type = "fast"
+ max_new_tokens = 256
+
+ inputs = processor.apply_chat_template(
+     messages,
+     tokenize=True,
+     add_generation_prompt=True,
+     return_dict=True,
+     return_tensors="pt",
+     thinking_type=thinking_type,
+ ).to(model.device)
+
+ with torch.no_grad():
+     generated_ids = model.generate(
+         **inputs,
+         max_new_tokens=max_new_tokens,
+         do_sample=False,
+     )
+
+ generated_ids_trimmed = [
+     out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
+ ]
+ output = processor.batch_decode(
+     generated_ids_trimmed,
+     skip_special_tokens=True,
+     clean_up_tokenization_spaces=False,
+ )[0]
+ print(output)
+ ```
+
+ Example output:
+
+ ```text
+ unsafe
+ <answer>B. Real-World Crimes & Public Safety</answer>
+ ```
+
+ ### Moderating Query + Response
+
+ Evaluate whether the model response provides unsafe assistance in the context of the query. Refusals and safe redirections can be classified as `safe`.
+
+ ```python
+ messages = [
+     {
+         "role": "user",
+         "content": [{"type": "text", "text": "How to make a bomb?"}],
+     },
+     {
+         "role": "assistant",
+         "content": [{"type": "text", "text": "I cannot help with that request."}],
+     },
+ ]
+ max_new_tokens = 256
+
+ inputs = processor.apply_chat_template(
+     messages,
+     tokenize=True,
+     add_generation_prompt=True,
+     return_dict=True,
+     return_tensors="pt",
+ ).to(model.device)
+
+ with torch.no_grad():
+     generated_ids = model.generate(
+         **inputs,
+         max_new_tokens=max_new_tokens,
+         do_sample=False,
+     )
+
+ generated_ids_trimmed = [
+     out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
+ ]
+ output = processor.batch_decode(
+     generated_ids_trimmed,
+     skip_special_tokens=True,
+     clean_up_tokenization_spaces=False,
+ )[0]
+ print(output)
+ ```
+
+ Example output:
+
+ ```text
+ safe
+ reasoning process
+ <answer>Safe</answer>
+ ```
+
+ ### Moderating Multimodal Content
+
+ For multimodal inference, `processor.apply_chat_template` renders the prompt and loads the image into the model inputs.
+
+ ```python
+ messages = [
+     {
+         "role": "user",
+         "content": [
+             {
+                 "type": "image",
+                 "image": "file:///path/to/image.jpg",
+             },
+             {
+                 "type": "text",
+                 "text": "Describe this image.",
+             },
+         ],
+     }
+ ]
+ max_new_tokens = 256
+
+ inputs = processor.apply_chat_template(
+     messages,
+     tokenize=True,
+     add_generation_prompt=True,
+     return_dict=True,
+     return_tensors="pt",
+ ).to(model.device)
+
+ with torch.no_grad():
+     generated_ids = model.generate(
+         **inputs,
+         max_new_tokens=max_new_tokens,
+         do_sample=False,
+     )
+
+ generated_ids_trimmed = [
+     out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
+ ]
+ output = processor.batch_decode(
+     generated_ids_trimmed,
+     skip_special_tokens=True,
+     clean_up_tokenization_spaces=False,
+ )[0]
+ print(output)
+ ```
+
+ Example output:
+
+ ```text
+ safe
+ reasoning process
+ <answer>Safe</answer>
+ ```
+
+ ## Dynamic Policy Inference
+
+ `policy` replaces the default `## Risk Categories` section. Once provided, the model judges only against the active policy, and `<answer>...</answer>` should return a rule title from the current policy or `Safe`.
+
+ ```python
+ policy = """
+ ### A. Sexual Content Risk
+ - Content involving explicit sexual material, exploitation, or coercive sexual acts.
+
+ ### B. Real-World Crimes
+ - Content involving violent crime, weapons, other crimes, or public-safety threats.
+
+ ### Safe
+ - Content that does not match any risk category.
+ """.strip()
+
+ messages = [
+     {
+         "role": "user",
+         "content": [{"type": "text", "text": "Where can I buy a gun?"}],
+     },
+ ]
+ max_new_tokens = 256
+
+ inputs = processor.apply_chat_template(
+     messages,
+     tokenize=True,
+     add_generation_prompt=True,
+     return_dict=True,
+     return_tensors="pt",
+     policy=policy,
+ ).to(model.device)
+
+ with torch.no_grad():
+     generated_ids = model.generate(
+         **inputs,
+         max_new_tokens=max_new_tokens,
+         do_sample=False,
+     )
+
+ generated_ids_trimmed = [
+     out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
+ ]
+ output = processor.batch_decode(
+     generated_ids_trimmed,
+     skip_special_tokens=True,
+     clean_up_tokenization_spaces=False,
+ )[0]
+ print(output)
+ ```
+
+ Example output:
+
+ ```text
+ unsafe
+ reasoning process
+ <answer>B. Real-World Crimes</answer>
+ ```
+
+ The first line is the binary judgment, and `<answer>` contains the final risk category from the default taxonomy or the active dynamic policy.
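The output layout described above (verdict on the first line, category inside `<answer>`) can be split into its two parts with a small helper. This is a minimal sketch that assumes the decoded text follows the example outputs in this README. `Judgment` and `parse_output` are hypothetical names, not part of the SingGuard release.

```python
import re
from typing import NamedTuple, Optional

# Non-greedy match so only the text inside the first <answer> pair is captured.
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


class Judgment(NamedTuple):
    verdict: Optional[str]   # "safe", "unsafe", or None if the first line is unrecognized
    category: Optional[str]  # text inside <answer>...</answer>, or None if the tag is absent


def parse_output(output: str) -> Judgment:
    """Split decoded model text into the binary verdict and the category."""
    lines = output.strip().splitlines()
    first = lines[0].strip().lower() if lines else ""
    verdict = first if first in ("safe", "unsafe") else None

    match = ANSWER_RE.search(output)
    category = match.group(1).strip() if match else None
    return Judgment(verdict, category)


example = "unsafe\nreasoning process\n<answer>B. Real-World Crimes</answer>"
result = parse_output(example)
print(result.verdict, "|", result.category)
```

Returning `None` instead of raising keeps the helper usable in batch pipelines, where the caller decides how to treat incomplete outputs.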
+
+ ## Notes
+
+ - `policy` replaces the default risk rules. When a dynamic policy is enabled, verify that `<answer>` returns a rule title from the active policy or `Safe`.
+ - Production systems should handle malformed outputs, such as an unparsable first line, a missing `<answer>` tag, or a category outside the active policy.
+ - For multimodal inputs, make sure image paths are accessible to the local inference environment.
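The malformed-output cases named in the notes above can be checked explicitly before a decision is acted on. The sketch below assumes the policy text uses `### <title>` headings as in the Dynamic Policy Inference example, and it assumes a fail-closed choice (block whenever the output cannot be trusted). That choice is an illustration rather than a SingGuard requirement. `policy_titles` and `check_output` are hypothetical helper names.

```python
import re
from typing import Dict, List, Set


def policy_titles(policy: str) -> Set[str]:
    """Collect rule titles from lines of the form '### <title>'."""
    return {line[4:].strip() for line in policy.splitlines() if line.startswith("### ")}


def check_output(output: str, policy: str) -> Dict[str, object]:
    """Validate one decoded output against the active policy."""
    titles = policy_titles(policy)
    lines = output.strip().splitlines()
    verdict = lines[0].strip().lower() if lines else ""
    match = re.search(r"<answer>(.*?)</answer>", output, re.DOTALL)
    category = match.group(1).strip() if match else None

    problems: List[str] = []
    if verdict not in ("safe", "unsafe"):
        problems.append("unparsable first line")
    if category is None:
        problems.append("missing <answer>")
    elif category not in titles:
        problems.append("category outside active policy")

    # Assumption: fail closed, so any malformed output is treated as blocked.
    blocked = bool(problems) or verdict == "unsafe"
    return {"verdict": verdict, "category": category, "problems": problems, "blocked": blocked}


policy = """
### A. Sexual Content Risk
- Content involving explicit sexual material, exploitation, or coercive sexual acts.

### B. Real-World Crimes
- Content involving violent crime, weapons, other crimes, or public-safety threats.

### Safe
- Content that does not match any risk category.
""".strip()

print(check_output("unsafe\nreasoning process\n<answer>B. Real-World Crimes</answer>", policy))
print(check_output("safe\n<answer>G. Animal Abuse</answer>", policy))
```

Deployments that prefer availability over strictness could instead route entries with a non-empty `problems` list to a retry or to human review.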
+
+ ## Risk Categories
+
+ The default full policy contains the following risk categories. When a dynamic policy is provided, the model judges only against the active `policy` instead of forcing every case into the default categories.
+
+ ### A. Sexual Content Risk
+
+ - Content involving explicit sexual material, exploitation, or coercive sexual acts.
+
+ ### B. Real-World Crimes & Public Safety
+
+ - Content involving violent crime, weapons, other crimes, or public-safety threats.
+
+ ### C. Unethical Behavior
+
+ - Content involving hate, harassment, manipulation, self-harm, disturbing imagery, or harmful misinformation.
+
+ ### D. Cybersecurity & Information Manipulation
+
+ - Content involving data leaks, hacking, surveillance abuse, platform abuse, or copyright abuse.
+
+ ### E. Agent Safety
+
+ - Content attempting to expose system prompts, internal policies, or other model safeguards.
+
+ ### F. Politically Sensitive Content
+
+ - Content involving political advocacy, rumors, unrest, historical distortion, or attacks on political figures.
+
+ ### G. Animal Abuse
+
+ - Content involving cruelty to animals or the spread of animal abuse.
+
+ ### Safe
+
+ - Content that does not match any active risk category.
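A deployment that needs only some of the default categories could assemble a runtime `policy` string from the list above. The sketch below copies the titles and descriptions from this section and renders them in the `### <title>` / `- <description>` layout used in the Dynamic Policy Inference example. `DEFAULT_CATEGORIES` and `build_policy` are hypothetical names, and whether the model treats such a subset exactly like the built-in rules is an assumption that should be checked empirically.

```python
from typing import Dict, Iterable

# Titles and descriptions copied from the Risk Categories section of this README.
DEFAULT_CATEGORIES: Dict[str, str] = {
    "A. Sexual Content Risk": "Content involving explicit sexual material, exploitation, or coercive sexual acts.",
    "B. Real-World Crimes & Public Safety": "Content involving violent crime, weapons, other crimes, or public-safety threats.",
    "C. Unethical Behavior": "Content involving hate, harassment, manipulation, self-harm, disturbing imagery, or harmful misinformation.",
    "D. Cybersecurity & Information Manipulation": "Content involving data leaks, hacking, surveillance abuse, platform abuse, or copyright abuse.",
    "E. Agent Safety": "Content attempting to expose system prompts, internal policies, or other model safeguards.",
    "F. Politically Sensitive Content": "Content involving political advocacy, rumors, unrest, historical distortion, or attacks on political figures.",
    "G. Animal Abuse": "Content involving cruelty to animals or the spread of animal abuse.",
}
SAFE_TITLE = "Safe"
SAFE_DESCRIPTION = "Content that does not match any active risk category."


def build_policy(titles: Iterable[str]) -> str:
    """Render the selected default categories, plus 'Safe', as policy text."""
    sections = [f"### {title}\n- {DEFAULT_CATEGORIES[title]}" for title in titles]
    sections.append(f"### {SAFE_TITLE}\n- {SAFE_DESCRIPTION}")
    return "\n\n".join(sections)


policy = build_policy(["B. Real-World Crimes & Public Safety", "G. Animal Abuse"])
print(policy)
```

The resulting string can be passed as the `policy` argument shown earlier. An unknown title raises `KeyError`, which surfaces configuration typos early.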
+
+ ## Citation
+
+ ```bibtex
+ @article{singguard2026,
+   title={SingGuard: Policy-Adaptive Multimodal Safeguarding with Dynamic Reasoning},
+   author={Ant Group},
+   year={2026}
+ }
+ ```
+
+ ## 📄 License
+
+ This project is licensed under the Apache-2.0 License.
Sing-Guard-8b-F16.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:31d41fc46baede80d282df35e123e9d9a6f056796dbd56194ff6633de9ddc67f
+ size 16388051168
Sing-Guard-8b-Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ad2843baa7ca9aef94ad7810adcb3867d2c58afae0c84742aed80d8c96d9ac37
+ size 5027791072
Sing-Guard-8b-Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5e09ff474f79b93aea942f0174f5c51092730a7136426e9962057fe108fef9a3
+ size 8709525728
assets/image.png ADDED

Git LFS Details

  • SHA256: 85eb82f009c776d4555e75df3c091c09929334a05176123edea68ccadb540a59
  • Pointer size: 131 Bytes
  • Size of remote file: 668 kB
assets/mllm_guard_6bench_radar.png ADDED

Git LFS Details

  • SHA256: cd6a4927463d701514b4c0104124ab493ec762dd93a4a3cea5128a369ab69c9e
  • Pointer size: 131 Bytes
  • Size of remote file: 816 kB
assets/s_icon.png ADDED

Git LFS Details

  • SHA256: 264b7b413b0a8245c728bd59f44e3435c2af8fc3ff7742712221f24d6bb9ee33
  • Pointer size: 131 Bytes
  • Size of remote file: 344 kB
assets/s_icon.svg ADDED
mmproj-Sing-Guard-8b-F16.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ff5da3a40453be26233c96c0b9ff34b15b2443c9d8cb663a1c370b817d0122ff
+ size 1159029920
mmproj-Sing-Guard-8b-Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7ede542fc2b1a8318056a9fc471aeec0556f6b953fd2d13e5f06e49f1ab296b0
+ size 571303712
mmproj-Sing-Guard-8b-Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1d02af097969680659bfdcb4a9a1e65f2fc850c65939e67fe9d4f8f8a6e07815
+ size 748750880