---
license: apache-2.0
language:
- en
base_model: Tongyi-MAI/Z-Image-Turbo
base_model_relation: quantized
pipeline_tag: text-to-image
library_name: diffusers
tags:
- image
- image-generation
- z-image
- z-image-turbo
- fp8
- z-image-fp8
- z-image-turbo-fp8
---
# Z-Image-Turbo (FP8 E5M2 & E4M3FN)

This is a quantization of [Tongyi-MAI/Z-Image-Turbo](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo) to **FP8 E5M2** and **FP8 E4M3FN**.

**License & Usage:**
This model strictly follows the original licensing terms and usage restrictions. Please refer to the [original model card](https://huggingface.co/Tongyi-MAI/Z-Image-Turbo) for details.

## Files in this repo
- Full Diffusers pipeline copied from `Tongyi-MAI/Z-Image-Turbo` (cached snapshot) with FP8 transformer weights.
- Default transformer weights: `transformer/diffusion_pytorch_model.safetensors` (E4M3FN).
- Alternate transformer weights: `transformer/diffusion_pytorch_model_e5m2.safetensors` (E5M2).

To switch to the E5M2 variant, load the pipeline as usual, then replace the transformer weights with those from `transformer/diffusion_pytorch_model_e5m2.safetensors`.
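
The variant switch described above can be sketched roughly as follows. This is a minimal sketch, not the method used by the repo's scripts: `transformer_weights_path` and `swap_transformer_weights` are hypothetical helper names, `repo_dir` is assumed to be a local copy of this repo, and the sketch assumes the checkpoint keys match the transformer's own state dict.

```python
from pathlib import Path

# File names taken from the "Files in this repo" list above.
VARIANT_FILES = {
    "e4m3fn": "diffusion_pytorch_model.safetensors",
    "e5m2": "diffusion_pytorch_model_e5m2.safetensors",
}


def transformer_weights_path(repo_dir, variant="e4m3fn"):
    """Return the path of the transformer weights file for an FP8 variant."""
    key = variant.lower()
    if key not in VARIANT_FILES:
        raise ValueError(f"unknown variant {variant!r}; expected one of {sorted(VARIANT_FILES)}")
    return Path(repo_dir) / "transformer" / VARIANT_FILES[key]


def swap_transformer_weights(pipe, repo_dir, variant):
    """Load the chosen variant's weights into an already loaded pipeline.

    Assumption: `pipe.transformer` is a torch.nn.Module whose parameter
    names match the keys stored in the safetensors file.
    """
    # Imported lazily so the path helper above works without torch installed.
    from safetensors.torch import load_file

    state_dict = load_file(str(transformer_weights_path(repo_dir, variant)))
    # assign=True (PyTorch 2.1+) keeps the dtype of the loaded tensors
    # instead of copying their values into the existing parameters.
    pipe.transformer.load_state_dict(state_dict, assign=True)
    return pipe
```

With a pipeline already loaded as `pipe`, calling `swap_transformer_weights(pipe, ".", "e5m2")` from the repo directory would switch it to the E5M2 weights under these assumptions.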

## Requirements
- PyTorch with CUDA support (tested with 2.10.0+cu130)
- Diffusers (latest `main` recommended)
- For FP8 execution: NVIDIA Transformer Engine (TE) built for your CUDA + Python version

## FP8 execution (Transformer Engine)
The sample script `create-image.py` uses NVIDIA Transformer Engine (TE) to run FP8 kernels on supported GPUs (e.g., Blackwell).
Install TE in your environment and run the script from this repo directory.

## BF16 fallback (no FP8 kernels)
For GPUs without FP8 kernel support (or if TE is unavailable), use `create-image-bf16.py`. It loads the same FP8 weights
but casts them to BF16 for compute, so it does not depend on FP8 kernels (at lower speed than true FP8 execution).
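
The choice between the two scripts can be sketched as a small check. This is an illustrative sketch only: `choose_script` and `detect_environment` are hypothetical helpers, and the `(8, 9)` compute-capability threshold is an assumption about where FP8 tensor-core support begins (Ada), not a requirement stated by this repo or verified against your TE build.

```python
import importlib.util

# Assumption: FP8 kernels need compute capability 8.9 (Ada) or newer,
# which also covers Hopper (9.0) and Blackwell.
FP8_MIN_CAPABILITY = (8, 9)


def choose_script(capability, te_installed):
    """Pick the sample script for a GPU capability tuple and TE availability."""
    has_fp8_gpu = capability is not None and tuple(capability) >= FP8_MIN_CAPABILITY
    if has_fp8_gpu and te_installed:
        return "create-image.py"
    return "create-image-bf16.py"


def detect_environment():
    """Return (compute capability or None, whether transformer_engine is importable)."""
    te_installed = importlib.util.find_spec("transformer_engine") is not None
    try:
        import torch
    except ImportError:
        return None, te_installed
    if not torch.cuda.is_available():
        return None, te_installed
    return torch.cuda.get_device_capability(), te_installed
```

Calling `choose_script(*detect_environment())` returns the name of the script to run on the current machine under these assumptions.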

## Usage
- Once downloaded, the scripts use `MODEL_ID=ykarout/Z-Image-Turbo-FP8-Full` by default.
- To force local loading, set `USE_LOCAL=1`.
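
The two settings above can be read as environment variables. The sketch below shows one plausible way to resolve them; it is not taken from the sample scripts. `resolve_model_source` is a hypothetical helper, and treating the current directory as the local path is an assumption based on running the scripts from this repo directory.

```python
import os

DEFAULT_MODEL_ID = "ykarout/Z-Image-Turbo-FP8-Full"


def resolve_model_source(env=None, local_dir="."):
    """Return (source, use_local), where source is a Hub ID or a local path.

    Assumption: USE_LOCAL=1 means "load from local_dir"; any other value
    (or an unset variable) means "load MODEL_ID".
    """
    env = os.environ if env is None else env
    model_id = env.get("MODEL_ID", DEFAULT_MODEL_ID)
    use_local = env.get("USE_LOCAL", "0") == "1"
    source = local_dir if use_local else model_id
    return source, use_local
```

Under this reading, a command such as `USE_LOCAL=1 python create-image.py` would load the weights from the repo directory rather than resolving the model ID.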