Z-Image Turbo + Base: INT8 ConvRot (with Qwen 3 4B Text Encoder)
INT8 row-wise quantized versions of Z-Image Base and Z-Image Turbo, using ConvRot (Hadamard-rotation outlier suppression) for improved quantization fidelity, plus a matching INT8 ConvRot quantization of the Qwen 3 4B text encoder. Converted with convert_to_quant for native ComfyUI compatibility.
Files
| File | Description |
|---|---|
| z_image_int8_convrot.safetensors | Z-Image Base, INT8 + ConvRot |
| z_image_turbo_int8_convrot.safetensors | Z-Image Turbo, INT8 + ConvRot |
| qwen_3_4b_int8_convrot.safetensors | Qwen 3 4B text encoder, INT8 + ConvRot |
Why ConvRot + Row-Wise Scaling
ConvRot applies a group-wise Hadamard rotation to suppress weight outliers before quantization, improving INT8 fidelity over plain per-tensor or per-row quantization. Critically, these conversions use --scaling_mode row, not tensor. Tensor-wise scaling computes a single scale factor for an entire weight matrix, so even a few outlier values widen that global scale and coarsen quantization precision across the rest of the matrix. In testing, ConvRot combined with tensor-wise scaling produced visibly fuzzy, detail-smoothed output. Switching to row-wise scaling, which computes an independent scale per row and so confines each outlier's effect to the row that contains it, resolved this and produced output sharpness matching or exceeding plain INT8 row-wise quantization.
If you encounter other ConvRot-quantized models with soft or "waxy" output, this scaling mode mismatch is the most likely culprit.
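The difference between the two scaling modes can be seen on a toy matrix. The following is a minimal sketch in plain Python, assuming symmetric INT8 quantization (scale = max|w| / 127); the matrix values and helper names are invented for illustration, and this is not the code convert_to_quant runs.

```python
# Illustrative only: symmetric INT8 quantization with two scale granularities.
# Matrix values and function names are made up for this sketch.

def quantize_dequantize(values, scale):
    """Round each value to the INT8 grid defined by `scale`, then map back to floats."""
    out = []
    for v in values:
        q = max(-127, min(127, round(v / scale)))
        out.append(q * scale)
    return out

def mean_abs_error(weights, scales):
    """Mean |w - dequant(quant(w))|, using one scale per row."""
    total, count = 0.0, 0
    for row, scale in zip(weights, scales):
        restored = quantize_dequantize(row, scale)
        total += sum(abs(a - b) for a, b in zip(row, restored))
        count += len(row)
    return total / count

W = [
    [0.020, -0.030, 0.010, 8.000],   # a single outlier lives in this row
    [0.020, -0.010, 0.030, -0.020],
    [0.015, 0.025, -0.035, 0.005],
    [-0.040, 0.010, 0.020, -0.015],
]

# Tensor-wise: one scale from the global max, shared by every row.
global_max = max(abs(v) for row in W for v in row)
tensor_scales = [global_max / 127] * len(W)

# Row-wise: each row gets a scale derived from its own max.
row_scales = [max(abs(v) for v in row) / 127 for row in W]

err_tensor = mean_abs_error(W, tensor_scales)
err_row = mean_abs_error(W, row_scales)
print(f"tensor-wise mean abs error: {err_tensor:.6f}")
print(f"row-wise    mean abs error: {err_row:.6f}")
```

With tensor-wise scaling the outlier sets the step size for all four rows, so the small weights in rows without an outlier mostly collapse to zero. With row-wise scaling only the outlier's own row pays that cost.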
Quantization Recipe
ctq -i <model>.safetensors -o <model>-int8-convrot.safetensors \
--int8 --scaling_mode row --simple --low-memory \
--convrot --convrot-group-size 64 \
--zimage --comfy_quant --save-quant-metadata
The Qwen 3 4B text encoder was converted with the same flags minus --zimage, since this text encoder needs no architecture-specific preset. Before quantizing, verify that its native hidden dimensions divide cleanly by the chosen group size.
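One way to run that divisibility check without loading any weights is to read only the safetensors header, which is an 8-byte little-endian length followed by a JSON table of tensor names, dtypes, and shapes. The sketch below uses only the standard library; the function names and the file path in the usage comment are hypothetical, and checking both dimensions of every 2D tensor is a conservative assumption rather than a statement about which axis the tool rotates.

```python
import json
import struct

def read_safetensors_shapes(path):
    """Return {tensor_name: shape} from a .safetensors header without loading weights."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian unsigned 64-bit
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return {name: info["shape"] for name, info in header.items() if name != "__metadata__"}

def find_indivisible(shapes, group_size=64):
    """List 2D tensors that have a dimension which is not a multiple of group_size."""
    return {
        name: shape
        for name, shape in shapes.items()
        if len(shape) == 2 and any(dim % group_size for dim in shape)
    }

# Hypothetical usage; the path is a placeholder for your unquantized checkpoint:
# shapes = read_safetensors_shapes("path/to/text_encoder.safetensors")
# print(find_indivisible(shapes, group_size=64))  # an empty dict means every 2D weight fits
```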
Why group size 64
ComfyUI's comfy_kitchen runtime requires the ConvRot Hadamard block size to be a power of 4 (4, 16, 64, 256, 1024, ...), not merely a power of 2. A group size of 64 was chosen because it divides cleanly into every 2D weight dimension in the Z-Image architecture, requiring no manual layer exclusions.
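To make the two constraints concrete (block size a power of 4, row length a multiple of the block size), here is a small sketch of a group-wise Hadamard rotation in plain Python. It uses the textbook Sylvester construction and a block size of 4 to keep the numbers readable; the function names are invented, and this is an illustration of the idea, not the comfy_kitchen or convert_to_quant implementation.

```python
def is_power_of_four(n):
    """True for 1, 4, 16, 64, ...: a power of two whose single set bit sits at an even position."""
    return n > 0 and (n & (n - 1)) == 0 and (n.bit_length() - 1) % 2 == 0

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n must be a power of two)."""
    H = [[1]]
    while len(H) < n:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H

def rotate_groups(row, group_size):
    """Apply a normalized Hadamard rotation to each contiguous group of `row`."""
    if not is_power_of_four(group_size):
        raise ValueError("group_size must be a power of 4")  # mirrors the constraint described above
    if len(row) % group_size:
        raise ValueError("row length must be a multiple of group_size")
    H = hadamard(group_size)
    norm = group_size ** 0.5  # makes the rotation orthogonal, so it preserves vector length
    out = []
    for start in range(0, len(row), group_size):
        group = row[start:start + group_size]
        out.extend(sum(h * v for h, v in zip(h_row, group)) / norm for h_row in H)
    return out

row = [8.0, 0.02, -0.03, 0.01, 0.02, -0.01, 0.03, -0.02]  # one outlier in the first group
rotated = rotate_groups(row, group_size=4)
print("max |w| before:", max(abs(v) for v in row))
print("max |w| after: ", max(abs(v) for v in rotated))
```

The rotation spreads the outlier's magnitude across its group, which lowers the peak value the quantizer has to cover. Because the normalized Sylvester matrix is symmetric and orthogonal, applying the same rotation a second time recovers the original row.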
Usage in ComfyUI
Load z_image_int8_convrot.safetensors or z_image_turbo_int8_convrot.safetensors with a standard UNETLoader node, and qwen_3_4b_int8_convrot.safetensors with a CLIPLoader node (type: your Z-Image workflow's text encoder type). No special ConvRot-aware nodes are required; the rotation metadata is embedded via --save-quant-metadata and read automatically by ComfyUI's mixed-precision quantization ops.
Hardware
Converted and tested on an RTX 3070 (8GB VRAM) using --low-memory streaming conversion.
Credits
Quantization tooling: silveroxides/convert_to_quant
Base models: Tongyi-MAI/Z-Image, Tongyi-MAI/Z-Image-Turbo