license: other
license_name: ideogram-4-non-commercial
license_link: https://huggingface.co/ideogram-ai/ideogram-4-fp8
base_model: ideogram-ai/ideogram-4-fp8
pipeline_tag: text-to-image
tags:
- text-to-image
- diffusion
- flow-matching
- quantization
- gguf
- q8_0
- ideogram
Ideogram 4 — GGUF Q8_0 (Transformer Lab)
A GGUF Q8_0 (8.5 bits/weight) quantization of the Ideogram 4 DiT.
Note: this checkpoint contains only the quantized DiT (both CFG branches). To run it you also need the Qwen3-VL text encoder and VAE from the base repo
ideogram-ai/ideogram-4-fp8 and the custom inference code at github.com/ideogram-oss/ideogram4. The quantization recipe and loader are included in this repo (recipe-q8_0.json, gguf_loader.py).
Why this one
Q8_0 is quality-neutral vs the FP8 reference (Pick 18.71 vs ceiling 18.71): a clean, near-lossless 8-bit GGUF at 19.7 GB.
Method
Weight-only GGUF Q8_0 (round-to-nearest) quantization of the DiT linear-layer weights; all other tensors (those that are not linear-layer weights) are kept in F16.
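As an illustration of what round-to-nearest Q8_0 means, here is a minimal NumPy sketch. It assumes the standard GGML Q8_0 layout (blocks of 32 weights, each stored as one float16 scale plus 32 int8 values), which is also where the 8.5 bits/weight figure comes from. The function names are hypothetical, and this is not the code in gguf_loader.py or the recipe used to produce this checkpoint. It uses NumPy's round-half-to-even, which may differ from a reference implementation on exact ties.

```python
import numpy as np

BLOCK = 32  # number of weights per Q8_0 block (assumed GGML layout)

# 32 int8 values (8 bits each) plus one float16 scale (16 bits) per block.
BITS_PER_WEIGHT = (BLOCK * 8 + 16) / BLOCK  # 8.5


def quantize_q8_0(w):
    """Hypothetical helper: round-to-nearest Q8_0 of a float array.

    The number of elements must be a multiple of BLOCK.
    Returns (scale, q): float16 scales of shape (n_blocks, 1) and
    int8 values of shape (n_blocks, BLOCK).
    """
    blocks = np.asarray(w, dtype=np.float32).reshape(-1, BLOCK)
    amax = np.abs(blocks).max(axis=1, keepdims=True)
    d = amax / 127.0  # per-block scale, computed in float32
    # Guard against division by zero for all-zero blocks.
    safe_d = np.where(d == 0.0, 1.0, d)
    q = np.clip(np.rint(blocks / safe_d), -127, 127).astype(np.int8)
    return d.astype(np.float16), q  # the scale is stored as float16


def dequantize_q8_0(scale, q, shape):
    """Hypothetical helper: rebuild float32 weights as scale * q."""
    out = scale.astype(np.float32) * q.astype(np.float32)
    return out.reshape(shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((4, 64)).astype(np.float32)
    scale, q = quantize_q8_0(w)
    w_hat = dequantize_q8_0(scale, q, w.shape)
    print("bits/weight:", BITS_PER_WEIGHT)
    print("max abs error:", float(np.abs(w - w_hat).max()))
```

The per-weight error is bounded by roughly half a quantization step (half the block's scale), plus a small contribution from rounding the scale itself to float16.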
Numbers
- Quality: neutral vs FP8 on a 50-prompt slice.
- Latency: ~176 s/img (48 steps, 1024×1024, RTX 3090).
How to run (self-contained)
Everything you need is in this repo. The GGUF is the quantized DiT only, so the first command below fetches the text encoder, the VAE, and the inference package.
python download_deps.py # one-time (needs gated access to ideogram-ai/ideogram-4-fp8)
python usage.py "a poster that says HELLO"
Files here: ideogram4-q8_0.gguf (the Q8_0 DiT), gguf_loader.py (dequant + load, reference),
download_deps.py, usage.py, recipe-q8_0.json.
gguf_loader.py is a reference implementation (dequant math validated; standalone loader not yet GPU-tested). This is not a llama.cpp / stable-diffusion.cpp file; it loads only via this PyTorch path and the ideogram4 pipeline.
License
Derived from Ideogram 4 under its non-commercial, research-only license. See LICENSE.