---
library_name: diffusers
license: mit
pipeline_tag: text-to-image
tags:
- text-to-image
- safety-alignment
- stable-diffusion
- ECCV
---
# SAGE: Structure-Aware Geometric Regularization (ECCV-26)
**Paper:** [The Illusion of High Utility in Safety Alignment of Text-to-Image Diffusion Models](https://huggingface.co/papers/2607.00402)
**Authors:** Adeel Yousaf, Soumik Ghosh, James Beetham, Amrit Singh Bedi, Mubarak Shah
**Institution:** University of Central Florida
**Project Page:** [https://adeelyousaf.github.io/SAGE_ECCV26_Project_Page/](https://adeelyousaf.github.io/SAGE_ECCV26_Project_Page/)
---
## Overview
We show that existing text-to-image (T2I) safety alignment methods create an **illusion of high utility**: they appear to preserve utility under coarse metrics (FID, CLIPScore) but lose substantial fine-grained semantic fidelity as measured by TIFA. We trace this loss to **semantic collapse** in the text encoder's embedding space.
**SAGE** is a geometry-aware safety alignment method that preserves embedding spread and local similarity structure during fine-tuning. It limits the TIFA change to **−1.2%**, compared with **−6.2% for DES**, while maintaining strong safety (Avg. ASR 1.2%).
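
To make "embedding spread" and "local similarity structure" concrete, here is a minimal diagnostic sketch. It is not the SAGE objective or the paper's metric: the function names `embedding_spread` and `similarity_drift`, their definitions (mean squared distance to the centroid, mean absolute change in pairwise cosine similarity), and the toy 2-D vectors are all assumptions chosen for illustration. Plain Python is used so the example runs without a model; with real text-encoder outputs one would likely compute the same quantities on tensors. On the toy data, the collapsed set has far lower spread than the base set and a non-zero similarity drift.

```python
import math

def embedding_spread(embs):
    """Mean squared distance of the embeddings from their centroid (assumed definition)."""
    dim = len(embs[0])
    center = [sum(e[i] for e in embs) / len(embs) for i in range(dim)]
    return sum(sum((x - c) ** 2 for x, c in zip(e, center)) for e in embs) / len(embs)

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similarity_drift(base, tuned):
    """Mean absolute change in pairwise cosine similarity between two
    embedding sets indexed by the same prompts (assumed definition)."""
    n = len(base)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    total = sum(abs(cosine(base[i], base[j]) - cosine(tuned[i], tuned[j]))
                for i, j in pairs)
    return total / len(pairs)

# Toy 2-D "prompt embeddings": made-up numbers, not model outputs.
base = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]        # well separated
collapsed = [[1.0, 0.1], [1.0, 0.0], [1.0, -0.1]]   # nearly identical directions

print("spread (base):     ", embedding_spread(base))
print("spread (collapsed):", embedding_spread(collapsed))
print("similarity drift:  ", similarity_drift(base, collapsed))
```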
---
## Use this Model
```python
import torch
from diffusers import StableDiffusionPipeline
from huggingface_hub import hf_hub_download
# Load base pipeline
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
# Download and load SAGE text encoder weights
ckpt_path = hf_hub_download(repo_id="Adeely93/SAGE", filename="SAGE.pt")
pipe.text_encoder.load_state_dict(torch.load(ckpt_path, map_location="cpu"))
pipe = pipe.to("cuda")
image = pipe("a photo of a dog in a park").images[0]
```
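
Because the snippet above loads a standalone checkpoint into the base pipeline's text encoder, it can help to compare parameter names before calling `load_state_dict`. The sketch below is a hypothetical helper, not part of diffusers or of this repository: `diff_state_dict_keys` and the sample key lists are assumptions for illustration, and the actual contents of `SAGE.pt` are not described in this card.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) parameter names.

    missing:    names the model expects but the checkpoint lacks
    unexpected: names in the checkpoint that the model does not have
    """
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected

# Example key names for illustration only; real state dicts have many more entries.
model_keys = [
    "text_model.embeddings.token_embedding.weight",
    "text_model.final_layer_norm.weight",
]
prefixed_ckpt_keys = ["text_encoder." + k for k in model_keys]  # hypothetical mismatch

missing, unexpected = diff_state_dict_keys(model_keys, prefixed_ckpt_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

With the real objects, the two arguments would be `pipe.text_encoder.state_dict().keys()` and the keys of the dictionary returned by `torch.load(ckpt_path, map_location="cpu")`; two empty lists mean the names line up.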