license: other
license_name: ideogram-4-non-commercial
license_link: https://huggingface.co/ideogram-ai/ideogram-4-fp8
base_model: ideogram-ai/ideogram-4-fp8
pipeline_tag: text-to-image
tags:
- text-to-image
- diffusion
- flow-matching
- quantization
- gguf
- q4_k
- ideogram
Ideogram 4 — GGUF Q4_K (Transformer Lab)
A GGUF Q4_K (4.5 bits/weight) quantization of the Ideogram 4 DiT, for consumer GPUs.
Note: this checkpoint is the quantized DiT only (both CFG branches). To run it you also need the Qwen3-VL text encoder and VAE from the base repo
ideogram-ai/ideogram-4-fp8 and the custom inference code at github.com/ideogram-oss/ideogram4. Quantization recipe + loader: see recipe*.json and the Transformer Lab repo.
Why this one
In our preliminary measurements, Q4_K comes out ahead on the quality-vs-memory frontier: at 10.4 GB (the same on-disk size class as the published NF4 build) it scores higher than NF4 by +0.84 Pick / +2.93 CLIP on a 50-prompt slice. If you're tight on VRAM, this is the build to grab.
Method
Weight-only GGUF Q4_K quantization of the DiT linear layers (custom NumPy quantizer, verified bit-exact against the gguf-py reference decoder); all tensors that are not linear-layer weights are kept in F16.
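For readers wondering where the 4.5 bits/weight figure comes from, here is a small sketch. The byte counts follow the Q4_K super-block layout in ggml (256 weights, 8 sub-blocks of 32, a 6-bit scale and 6-bit min per sub-block, plus two F16 super-block factors). The `quantize_sub_block` / `dequantize_sub_block` helpers are hypothetical and deliberately simplified: they illustrate the scale-plus-min idea on one sub-block in plain Python and are not the quantizer used for this checkpoint, which additionally quantizes the per-sub-block scales and mins to 6 bits.

```python
# Q4_K super-block layout (per ggml's block_q4_K definition).
QK_K = 256        # weights per super-block
SUB_BLOCK = 32    # weights per sub-block, so 8 sub-blocks per super-block


def q4_k_bits_per_weight() -> float:
    """Effective storage cost of Q4_K in bits per weight."""
    n_sub = QK_K // SUB_BLOCK                 # 8 sub-blocks
    quant_bytes = QK_K * 4 // 8               # 4-bit codes: 128 bytes
    scale_min_bytes = n_sub * (6 + 6) // 8    # 6-bit scale + 6-bit min each: 12 bytes
    super_bytes = 2 + 2                       # F16 d and F16 dmin: 4 bytes
    total_bytes = quant_bytes + scale_min_bytes + super_bytes  # 144 bytes
    return total_bytes * 8 / QK_K


def quantize_sub_block(xs):
    """Simplified 4-bit scale+min quantization of one sub-block (illustrative only)."""
    lo, hi = min(xs), max(xs)
    scale = (hi - lo) / 15
    if scale == 0.0:          # constant block: any non-zero scale works
        scale = 1.0
    codes = [min(15, max(0, round((x - lo) / scale))) for x in xs]
    return codes, scale, lo


def dequantize_sub_block(codes, scale, lo):
    """Reconstruct approximate weights from 4-bit codes."""
    return [c * scale + lo for c in codes]


if __name__ == "__main__":
    print(q4_k_bits_per_weight())
    xs = [i / 10 - 1.6 for i in range(SUB_BLOCK)]
    codes, scale, lo = quantize_sub_block(xs)
    recon = dequantize_sub_block(codes, scale, lo)
    print(max(abs(a - b) for a, b in zip(xs, recon)))
```

In this simplified scheme the per-weight reconstruction error is bounded by half the sub-block scale, which is why small sub-blocks with their own scale and min help on weight distributions that vary across a layer.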
Numbers (preliminary — single n=50 slice)
- Q4_K: Pick 19.08 / CLIP 18.68, vs NF4: Pick 18.24 / CLIP 15.75, at equal size.
- Latency ~203 s/img (48 steps, 1024², RTX 3090); ~23% slower than NF4.
- Full-battery validation is in progress.
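As a quick sanity check on the figures above, the following sketch recomputes the score deltas and backs out an approximate NF4 latency. The NF4 latency is not reported in this card; the value below is only implied, under the assumption that "~23% slower" means the Q4_K time is about 1.23 times the NF4 time. All variable names are illustrative.

```python
# Scores reported above (single n=50 slice, preliminary).
q4k = {"pick": 19.08, "clip": 18.68}
nf4 = {"pick": 18.24, "clip": 15.75}


def deltas(a, b):
    """Per-metric difference a - b, rounded to two decimals."""
    return {k: round(a[k] - b[k], 2) for k in a}


Q4K_LATENCY_S = 203.0   # ~203 s/img, 48 steps, 1024x1024, RTX 3090
SLOWDOWN = 0.23         # assumption: Q4_K time = (1 + 0.23) * NF4 time

# Implied NF4 latency under that assumption (not a measured number).
implied_nf4_latency_s = Q4K_LATENCY_S / (1 + SLOWDOWN)

if __name__ == "__main__":
    print(deltas(q4k, nf4))
    print(round(implied_nf4_latency_s))
```

Under that reading, NF4 would land at roughly 165 s/img on the same setup; treat this as a back-of-the-envelope estimate until the full-battery numbers are published.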
License
Derived from Ideogram 4 under its non-commercial, research-only license. See LICENSE.