Update README.md
README.md
CHANGED
@@ -1,8 +1,17 @@
+---
+license: mit
+library_name: diffusers
+tags:
+- text-to-image
+- safety-alignment
+- stable-diffusion
+- ECCV
+pipeline_tag: text-to-image
+---
 
 # SAGE: Structure-Aware Geometric Regularization (ECCV-26)
 
 **Paper:** The Illusion of High Utility in Safety Alignment of Text-to-Image Diffusion Models
-**Venue:** ECCV 2026
 **Authors:** Adeel Yousaf, Soumik Ghosh, James Beetham, Amrit Singh Bedi, Mubarak Shah
 **Institution:** University of Central Florida
 **Project Page:** [https://adeelyousaf.github.io/SAGE_ECCV26_Project_Page/](https://adeelyousaf.github.io/SAGE_ECCV26_Project_Page/)
@@ -11,21 +20,25 @@
 
 ## Overview
 
-We show that existing T2I safety alignment methods create an **illusion of high utility** — they appear to
+We show that existing T2I safety alignment methods create an **illusion of high utility** — they appear to maintain high utility under coarse metrics (FID, CLIPScore) but suffer significant drops in fine-grained semantic fidelity (TIFA). We trace this to **semantic collapse** in the text encoder embedding space.
 
 **SAGE** is a geometry-aware safety alignment method that preserves embedding spread and local similarity structure during fine-tuning, achieving only a **−1.2% TIFA drop** vs. **−6.2% for DES** while maintaining strong safety (Avg. ASR 1.2%).
 
 ---
 
-## Model
-
-This is the fine-tuned **text encoder** of Stable Diffusion v1.4. The UNet remains frozen.
+## Use this Model
 
 ```python
 import torch
 from diffusers import StableDiffusionPipeline
+from huggingface_hub import hf_hub_download
 
+# Load base pipeline
 pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
-
-
+
+# Download and load SAGE text encoder weights
+ckpt_path = hf_hub_download(repo_id="Adeely93/SAGE", filename="SAGE.pt")
+pipe.text_encoder.load_state_dict(torch.load(ckpt_path, map_location="cpu"))
+
 pipe = pipe.to("cuda")
+image = pipe("a photo of a dog in a park").images[0]
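
Before calling `load_state_dict` as in the README snippet, it can help to confirm that the parameter names in the downloaded checkpoint match those of the text encoder. The helper below is a minimal sketch and not part of the SAGE release: `diff_state_dict_keys` is a hypothetical name, the stand-in key names are made up for illustration, and the sketch assumes `SAGE.pt` holds a flat state dict keyed like `pipe.text_encoder.state_dict()`, which the README does not state explicitly.

```python
from typing import Iterable, List, Tuple


def diff_state_dict_keys(
    expected: Iterable[str], loaded: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Compare parameter names from a module and from a checkpoint.

    Returns (missing, unexpected): names the module expects but the
    checkpoint lacks, and names the checkpoint has but the module does not.
    """
    expected_keys, loaded_keys = set(expected), set(loaded)
    missing = sorted(expected_keys - loaded_keys)
    unexpected = sorted(loaded_keys - expected_keys)
    return missing, unexpected


# Stand-in dicts with hypothetical key names. In practice, pass
# pipe.text_encoder.state_dict() and the dict returned by torch.load(ckpt_path).
module_state = {"embeddings.weight": 0, "final_layer_norm.bias": 0}
checkpoint = {"embeddings.weight": 0, "extra.bias": 0}
missing, unexpected = diff_state_dict_keys(module_state, checkpoint)
print(missing, unexpected)
```

If both lists come back empty, the key names line up; a shape mismatch would still be reported by `load_state_dict` itself.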