---
license: apache-2.0
base_model: black-forest-labs/FLUX.2-klein-4B
tags:
- core-ai
- coreai
- text-to-image
- flux
- flux2
- on-device
- apple-silicon
pipeline_tag: text-to-image
library_name: coreai
---

# FLUX.2 klein 4B — Core AI

[Black Forest Labs' **FLUX.2 [klein] 4B**](https://huggingface.co/black-forest-labs/FLUX.2-klein-4B) converted to **Core AI** for on-device image generation on Apple Silicon (macOS 27+), running on Apple's official diffusion runtime in [apple/coreai-models](https://github.com/apple/coreai-models).

FLUX.2 [klein] is step-distilled: **4 denoising steps at guidance 1.0** produce a full 1024×1024 image. It pairs a 4B flow-matching diffusion transformer (DiT) with an 8B Qwen3 text encoder.

> **macOS only.** At 4B the peak footprint (~6.5 GB; the text encoder stays resident
> through the transformer) exceeds a 12 GB iPhone's ~6.1 GB per-process memory limit, even
> with the transformer AOT-compiled. Use a smaller diffusion model (e.g. Stable Diffusion
> 0.9B) for on-device iOS image generation.

## Components

| Component | Description |
| --- | --- |
| `Transformer.aimodel` | Flow-matching DiT (25 blocks), 1024×1024 |
| `TextEncoder.aimodel` | Qwen3 text encoder (hidden states 9 / 18 / 27) |
| `VAEDecoder.aimodel` | Latent → 1024×1024 RGB image |
| `VAEEncoder.aimodel` | 1024×1024 RGB image → latent (image-to-image) |
| `tokenizer/`, `pipeline.json`, `vae_bn_*.npy` | Sidecar assets (auto-loaded) |

Weights are 4-bit quantized (int4, per-block, block size 32); compute precision is float16. The full bundle is **4.0 GB**: Transformer 2.0 GB · TextEncoder 1.8 GB · VAE 0.16 GB.

## Usage

### Sample app (easiest)

[**CoreAIImageGen** (macOS)](https://github.com/john-rocky/coreai-model-zoo/tree/main/apps/CoreAIImageGen): run the `CoreAIImageGenMac` scheme, tap **Download & Load**, type a prompt, then **Generate**.
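### Checking headroom before downloading (optional)

The footprint and bundle figures quoted above can be turned into a quick pre-flight check. The following is a minimal sketch rather than part of the sample app or the Core AI runtime: `canHostModel` and its `memoryHeadroom` factor are hypothetical names and values chosen for illustration, and only the 6.5 GB and 4.0 GB figures come from this card.

```swift
import Foundation

// Figures quoted in this card, in GB.
let peakFootprintGB = 6.5   // peak memory footprint during generation
let bundleSizeGB = 4.0      // size of the full model bundle

/// Returns true when there is room for both the download and the peak footprint.
/// `memoryHeadroom` is a hypothetical safety factor (an assumption, not from this card).
func canHostModel(physicalMemoryBytes: UInt64,
                  freeDiskBytes: Int64,
                  memoryHeadroom: Double = 1.5) -> Bool {
    let bytesPerGB = 1_000_000_000.0
    let memoryOK = Double(physicalMemoryBytes) / bytesPerGB >= peakFootprintGB * memoryHeadroom
    let diskOK = Double(freeDiskBytes) / bytesPerGB >= bundleSizeGB
    return memoryOK && diskOK
}

// Query the current machine: installed RAM and free space on the home volume.
let physicalMemory = ProcessInfo.processInfo.physicalMemory
let homeURL = URL(fileURLWithPath: NSHomeDirectory())
let volumeInfo = try? homeURL.resourceValues(forKeys: [.volumeAvailableCapacityForImportantUsageKey])
let freeDisk = volumeInfo?.volumeAvailableCapacityForImportantUsage ?? 0

print(canHostModel(physicalMemoryBytes: physicalMemory, freeDiskBytes: freeDisk))
```

Installed RAM is only a rough proxy for what a process can actually use, so treat a `true` result as "worth trying", not as a guarantee.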
### Swift

```swift
import CoreAIDiffusionPipeline

// `modelURL` is the URL of the downloaded model bundle.
let pipeline = try await Flux2Pipeline(from: modelURL)

// 4 steps at guidance 1.0 match the step-distilled schedule.
let config = PipelineConfiguration(
    prompt: "a photo of a cat",
    stepCount: 4,
    guidanceScale: 1.0,
    schedulerType: .discreteFlow
)

// The trailing closure ignores its argument and always returns true.
let result = try await pipeline.generateImages(configuration: config) { _ in true }
let image = result.images.first!
```

### Command line (zoo reference tool)

```bash
swift run -c release diffusion-runner \
  --model path/to/FLUX.2-klein-4B \
  --prompt "a photo of a cat" --steps 4 --guidance-scale 1.0
```

## How it was converted

```bash
uv run coreai.diffusion.export flux2-klein-4b --platform macOS
```

## Performance

M4 Max (128 GB): **~17 s** for a 4-step 1024×1024 image (cold model load + 4 denoising steps + VAE decode). Because the model is step-distilled, it runs at guidance 1.0, so no negative prompt or classifier-free guidance (CFG) pass is needed.

## License

Apache 2.0, inherited from the base model [black-forest-labs/FLUX.2-klein-4B](https://huggingface.co/black-forest-labs/FLUX.2-klein-4B). The converted weights are redistributed under the same terms, with attribution to Black Forest Labs.
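
## Note: guidance 1.0 and negative prompts

The Performance section says guidance 1.0 removes the need for a negative prompt. One way to see this is the standard classifier-free guidance blend, sketched below. This assumes the common formulation `uncond + scale * (cond - uncond)`; this card does not confirm how the Core AI runtime implements guidance, and `applyGuidance` is a hypothetical helper, not a runtime API.

```swift
// Standard classifier-free guidance blend (an assumed formulation, not taken
// from the Core AI runtime): unconditional + scale * (conditional - unconditional).
func applyGuidance(conditional: [Float], unconditional: [Float], scale: Float) -> [Float] {
    precondition(conditional.count == unconditional.count,
                 "predictions must have equal length")
    return zip(conditional, unconditional).map { c, u in u + scale * (c - u) }
}

// Toy model predictions for a prompt and for an empty/negative prompt.
let conditional: [Float] = [0.5, -1.0, 2.0]
let unconditional: [Float] = [0.25, 0.0, 1.0]

// At scale 1.0 the unconditional term cancels, leaving the conditional prediction.
let atOne = applyGuidance(conditional: conditional, unconditional: unconditional, scale: 1.0)

// At a larger scale the result depends on the unconditional prediction as well.
let atFour = applyGuidance(conditional: conditional, unconditional: unconditional, scale: 4.0)

print(atOne)
print(atFour)
```

Under this formulation, a scale of 1.0 makes the unconditional prediction drop out, so a pipeline can skip that second model evaluation entirely. That is consistent with the 4-step, single-pass usage shown in this card.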