license: other
license_name: ideogram-4-non-commercial
license_link: https://huggingface.co/ideogram-ai/ideogram-4-fp8
base_model: ideogram-ai/ideogram-4-fp8
pipeline_tag: text-to-image
tags:
  - text-to-image
  - diffusion
  - flow-matching
  - quantization
  - gguf
  - q4_k
  - ideogram

Ideogram 4 — GGUF Q4_K (Transformer Lab)

A GGUF Q4_K (4.5 bits/weight) quantization of the Ideogram 4 DiT, for consumer GPUs.
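The 4.5 bits/weight figure follows from the Q4_K super-block layout defined in ggml: 256 weights per super-block, split into 8 sub-blocks of 32, each with a 6-bit scale and a 6-bit minimum, plus two fp16 super-block scales. The sketch below only reproduces that arithmetic; the helper `q4k_bytes` is a hypothetical name for illustration and is not part of this repo's loader.

```python
# Q4_K super-block layout as defined in ggml.
WEIGHTS_PER_SUPERBLOCK = 256
SUB_BLOCKS = 8  # 8 sub-blocks of 32 weights each

quant_bytes = WEIGHTS_PER_SUPERBLOCK * 4 // 8  # 4-bit codes: 128 bytes
scale_min_bytes = SUB_BLOCKS * 2 * 6 // 8      # 6-bit scale + 6-bit min per sub-block: 12 bytes
super_scale_bytes = 2 * 2                      # fp16 d and dmin: 4 bytes

block_bytes = quant_bytes + scale_min_bytes + super_scale_bytes
bits_per_weight = block_bytes * 8 / WEIGHTS_PER_SUPERBLOCK


def q4k_bytes(n_weights: int) -> int:
    """Storage in bytes for a Q4_K tensor (hypothetical helper for illustration)."""
    if n_weights % WEIGHTS_PER_SUPERBLOCK:
        raise ValueError("Q4_K tensors must hold a multiple of 256 weights")
    return n_weights // WEIGHTS_PER_SUPERBLOCK * block_bytes


print(block_bytes, bits_per_weight)
```

Tensors kept in F16 cost 16 bits/weight, so the on-disk size of the whole checkpoint is somewhat above what 4.5 bits/weight alone would give.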

Note: this checkpoint contains only the quantized DiT (both CFG branches). To run it, you also need the Qwen3-VL text encoder and the VAE from the base repo ideogram-ai/ideogram-4-fp8, plus the custom inference code at github.com/ideogram-oss/ideogram4. For the quantization recipe and loader, see recipe*.json and the Transformer Lab repo.

Why this one

On the preliminary 50-prompt slice reported below, Q4_K is the Pareto winner on the quality-vs-memory frontier: at 10.4 GB (the same on-disk size class as the published NF4 build), it beats NF4 by +0.84 Pick and +2.93 CLIP. If you're tight on VRAM, this is the build to grab.

Method

Weight-only GGUF Q4_K quantization of the DiT linear layers, produced with a custom NumPy quantizer and verified bit-exact against the gguf-py reference decoder. All other tensors are kept in F16.
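To give a feel for what a Q4_K-style format stores, here is a deliberately simplified sketch of 4-bit quantization for one 32-weight sub-block, using the same affine reconstruction form as Q4_K (`weight ≈ scale * code - offset`). It is not the quantizer used for this checkpoint. It omits the 6-bit quantization of the per-sub-block scale and minimum, the fp16 super-block scales, the bit packing, and the iterative scale search found in the reference implementation. The function names are illustrative.

```python
import random


def quantize_sub_block(weights):
    """Map floats to 4-bit codes plus a (scale, offset) pair."""
    lo = min(min(weights), 0.0)  # the minimum is clamped to be <= 0
    hi = max(weights)
    scale = (hi - lo) / 15 if hi > lo else 0.0
    offset = -lo  # stored as a non-negative value
    if scale == 0.0:
        codes = [0] * len(weights)
    else:
        codes = [min(15, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, offset


def dequantize_sub_block(codes, scale, offset):
    """Reconstruct approximate weights from codes."""
    return [scale * q - offset for q in codes]


rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(32)]
codes, scale, offset = quantize_sub_block(weights)
restored = dequantize_sub_block(codes, scale, offset)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(f"scale={scale:.4f} offset={offset:.4f} max_err={max_err:.4f}")
```

In this simplified scheme the reconstruction error of each weight is at most half the scale. The real format adds a little more error because the scale and minimum are themselves stored at reduced precision.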

Numbers (preliminary — single n=50 slice)

  • Q4_K: Pick 19.08 / CLIP 18.68. NF4: Pick 18.24 / CLIP 15.75, at equal size.
  • Latency ~203 s/img (48 steps, 1024², RTX 3090); ~23% slower than NF4.
  • Full-battery validation is in progress.
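As a quick consistency check, the deltas quoted earlier can be recomputed from the scores above. The implied NF4 latency is only an estimate: it assumes "~23% slower" means the Q4_K time per image is about 1.23 times the NF4 time, which the card does not state explicitly.

```python
# Scores and latency as reported in this card (single n=50 slice).
q4k = {"pick": 19.08, "clip": 18.68, "sec_per_img": 203.0}
nf4 = {"pick": 18.24, "clip": 15.75}

pick_gain = round(q4k["pick"] - nf4["pick"], 2)
clip_gain = round(q4k["clip"] - nf4["clip"], 2)

# Assumption: Q4_K time = 1.23 x NF4 time (one reading of "~23% slower").
nf4_sec_per_img = q4k["sec_per_img"] / 1.23

print(pick_gain, clip_gain, round(nf4_sec_per_img))
```

Since all of these numbers come from one 50-prompt slice, treat small differences as provisional until the full validation is done.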

License

Derived from Ideogram 4 under its non-commercial, research-only license. See LICENSE.