Ideogram 4: GGUF Q8_0 (Transformer Lab)

A GGUF Q8_0 (8.5 bits/weight) quantization of the Ideogram 4 DiT.

⚠️ Not a llama.cpp / stable-diffusion.cpp file. Despite the .gguf extension, this loads only via the included PyTorch gguf_loader.py + the ideogram4 pipeline. It is not compatible with llama.cpp, stable-diffusion.cpp, Ollama, etc.

ℹ️ Quantized DiT only. This checkpoint is the DiT (both CFG branches). To generate you also need the Qwen3-VL text encoder and VAE from the base repo ideogram-ai/ideogram-4-fp8 and the custom inference code at github.com/ideogram-oss/ideogram4. The quantization recipe and loader are included in this repo (recipe-q8_0.json, gguf_loader.py).

Why this one

Q8_0 is quality-neutral vs the FP8 reference (Pick 18.71 vs ceiling 18.71): a clean, near-lossless 8-bit GGUF at 19.7 GB.

Method

Weight-only GGUF Q8_0 (round-to-nearest) quantization of the DiT linear layers; all other tensors are kept in F16.
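To make the 8.5 bits/weight figure concrete, here is a minimal sketch of the standard GGUF Q8_0 block layout: 32 weights share one fp16 scale, followed by 32 int8 values, for 34 bytes per block. This is an illustration only, not the code in gguf_loader.py; the function names are hypothetical, and Python's `round` breaks ties to even, whereas a reference implementation may round halves away from zero.

```python
import struct

QK8_0 = 32                 # weights per Q8_0 block
BLOCK_FMT = "<e32b"        # little-endian: one fp16 scale + 32 int8 quants
BLOCK_BYTES = struct.calcsize(BLOCK_FMT)  # 34 bytes

def quantize_q8_0(values):
    """Round-to-nearest Q8_0: pack floats into 34-byte blocks (hypothetical helper)."""
    if len(values) % QK8_0 != 0:
        raise ValueError("length must be a multiple of 32")
    out = bytearray()
    for i in range(0, len(values), QK8_0):
        block = values[i:i + QK8_0]
        amax = max(abs(v) for v in block)
        d = amax / 127.0                    # per-block scale
        inv = 1.0 / d if d else 0.0         # an all-zero block keeps scale 0
        quants = [max(-127, min(127, round(v * inv))) for v in block]
        out += struct.pack(BLOCK_FMT, d, *quants)
    return bytes(out)

def dequantize_q8_0(data):
    """Inverse of quantize_q8_0: weight = scale * int8 quant."""
    values = []
    for off in range(0, len(data), BLOCK_BYTES):
        d, *quants = struct.unpack_from(BLOCK_FMT, data, off)
        values.extend(d * q for q in quants)
    return values

weights = [(i - 16) / 10 for i in range(64)]   # two blocks of toy weights
packed = quantize_q8_0(weights)
restored = dequantize_q8_0(packed)
print(len(packed) * 8 / len(weights))          # 8.5 bits per weight
print(max(abs(a - b) for a, b in zip(weights, restored)))
```

The per-weight error is bounded by roughly half the block scale, which is why a block with a small maximum magnitude is reproduced more precisely than one with a large outlier.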

Numbers

  • Quality-neutral vs FP8 on a 50-prompt slice. Latency ~176 s/img (48 steps, 1024², RTX 3090).

How to run (self-contained)

All the scripts you need are in this repo. The GGUF holds only the quantized DiT, so the first command fetches the text encoder, the VAE, and the inference package.

python download_deps.py            # one-time (needs gated access to ideogram-ai/ideogram-4-fp8)
python usage.py "a poster that says HELLO"

Files here: ideogram4-q8_0.gguf (the Q8_0 DiT), gguf_loader.py (dequant + load, reference), download_deps.py, usage.py, recipe-q8_0.json.

gguf_loader.py is a reference implementation: the dequantization math is validated, but the standalone loader has not yet been GPU-tested. This is not a llama.cpp / stable-diffusion.cpp file; it loads only via this PyTorch path and the ideogram4 pipeline.
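Before running the pipeline, it can be useful to check that the download is intact. The sketch below reads only the fixed-size GGUF header (magic, version, tensor count, metadata count) using the standard library. It assumes the file uses the usual little-endian GGUF container, version 2 or later; `read_gguf_header` is a hypothetical helper and is not part of gguf_loader.py.

```python
import struct

def read_gguf_header(path):
    """Return the fixed GGUF header fields (assumes little-endian, version >= 2)."""
    with open(path, "rb") as f:
        head = f.read(24)                  # 4-byte magic + uint32 + 2 x uint64
    if len(head) < 24 or head[:4] != b"GGUF":
        raise ValueError("not a GGUF file (bad magic or truncated header)")
    version, n_tensors, n_kv = struct.unpack_from("<IQQ", head, 4)
    if version < 2:
        raise ValueError("GGUF v1 uses 32-bit counts; not handled in this sketch")
    return {"version": version, "tensor_count": n_tensors, "metadata_kv_count": n_kv}
```

Calling `read_gguf_header("ideogram4-q8_0.gguf")` should return a small dictionary if the container is well-formed. A valid header says nothing about runtime compatibility: the tensors still need the ideogram4 pipeline to be usable.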

License

Derived from Ideogram 4 under its non-commercial, research-only license. See LICENSE.

GGUF metadata: model size 19B params, architecture ideogram4.