---
license: apache-2.0
base_model: black-forest-labs/FLUX.2-klein-4B
tags:
  - core-ai
  - coreai
  - text-to-image
  - flux
  - flux2
  - on-device
  - apple-silicon
pipeline_tag: text-to-image
library_name: coreai
---

# FLUX.2 klein 4B — Core AI

[Black Forest Labs' **FLUX.2 [klein] 4B**](https://huggingface.co/black-forest-labs/FLUX.2-klein-4B)
converted to **Core AI** for on-device image generation on Apple Silicon (macOS 27+),
running on Apple's official diffusion runtime in
[apple/coreai-models](https://github.com/apple/coreai-models).

FLUX.2 [klein] is step-distilled: **4 denoising steps at guidance 1.0** produce a full
1024×1024 image. It pairs a 4B flow-matching diffusion transformer (DiT) with a
Qwen3 text encoder.
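
To make the "4 denoising steps" concrete, here is a minimal sketch of Euler sampling for a flow-matching model. It is an illustration, not the runtime's code: `sampleFlow`, the `velocity` closure (standing in for the DiT), and the uniform sigma schedule are all assumptions, and real pipelines often use a shifted schedule.

```swift
import Foundation

/// Euler integration of a flow-matching ODE from sigma = 1 (pure noise)
/// down to sigma = 0 (clean latent). Hypothetical helper for illustration.
func sampleFlow(
    from noise: [Float],
    stepCount: Int,
    velocity: (_ latent: [Float], _ sigma: Float) -> [Float]
) -> [Float] {
    // Uniform schedule, e.g. 1, 0.75, 0.5, 0.25, 0 for four steps (assumed).
    let sigmas: [Float] = (0...stepCount).map { 1 - Float($0) / Float(stepCount) }
    var latent = noise
    for i in 0..<stepCount {
        let v = velocity(latent, sigmas[i])   // model call in a real pipeline
        let dt = sigmas[i + 1] - sigmas[i]    // negative: moving toward sigma = 0
        latent = zip(latent, v).map { $0 + dt * $1 }
    }
    return latent
}
```

Each step costs one transformer evaluation, so a 4-step schedule needs only four forward passes of the DiT per image.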

> **macOS only.** The 4B model's peak footprint is ~6.5 GB, because the text encoder
> stays resident while the transformer runs. That exceeds the ~6.1 GB per-process memory
> limit of a 12 GB iPhone, even with the transformer AOT-compiled. For on-device iOS image
> generation, use a smaller diffusion model (e.g. Stable Diffusion 0.9B).

## Components

| Component | Description |
| --- | --- |
| `Transformer.aimodel` | Flow-matching DiT (25 blocks), 1024×1024 |
| `TextEncoder.aimodel` | Qwen3 text encoder (hidden states 9 / 18 / 27) |
| `VAEDecoder.aimodel` | Latent → 1024×1024 RGB image |
| `VAEEncoder.aimodel` | 1024×1024 RGB image → latent (image-to-image) |
| `tokenizer/`, `pipeline.json`, `vae_bn_*.npy` | Sidecar assets (auto-loaded) |
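
Before loading, it can help to confirm the bundle is complete. The sketch below checks for the files named in the table using only Foundation; `expectedComponents` and `missingComponents(in:)` are hypothetical helpers, not part of the Core AI API, and the `vae_bn_*.npy` files are skipped because their exact names are not listed here.

```swift
import Foundation

// Names copied from the component table above.
let expectedComponents = [
    "Transformer.aimodel", "TextEncoder.aimodel",
    "VAEDecoder.aimodel", "VAEEncoder.aimodel",
    "tokenizer", "pipeline.json",
]

/// Returns the expected entries that do not exist inside `modelURL`.
func missingComponents(in modelURL: URL) -> [String] {
    expectedComponents.filter { name in
        let path = modelURL.appendingPathComponent(name).path
        return !FileManager.default.fileExists(atPath: path)
    }
}
```

An empty result means every listed component is present; anything else names what still needs to be downloaded.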

Weights are 4-bit quantized (int4, per-block, block size 32); compute precision is
float16. The full bundle is **4.0 GB**: Transformer 2.0 GB · TextEncoder 1.8 GB ·
VAE 0.16 GB.
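
As a rough illustration of what per-block int4 quantization with block size 32 means, the sketch below quantizes a weight array block by block and reconstructs it. It assumes a symmetric scheme with one scale per block; the converter's actual scheme (symmetric or asymmetric, scale precision, packing) is not documented here, and `QuantizedBlock`, `quantizeInt4`, and `dequantize` are hypothetical names.

```swift
import Foundation

/// One block: a shared scale plus 4-bit codes.
/// Codes are stored one per Int8 for clarity; real formats pack two per byte.
struct QuantizedBlock {
    let scale: Float
    let codes: [Int8]   // each value lies in -8...7
}

/// Symmetric per-block int4 quantization (assumed scheme).
func quantizeInt4(_ weights: [Float], blockSize: Int = 32) -> [QuantizedBlock] {
    stride(from: 0, to: weights.count, by: blockSize).map { start -> QuantizedBlock in
        let block = weights[start..<min(start + blockSize, weights.count)]
        let maxAbs = block.map { abs($0) }.max() ?? 0
        // Map the largest magnitude in the block to code 7.
        let scale: Float = maxAbs > 0 ? maxAbs / 7 : 1
        let codes = block.map { w -> Int8 in
            let q = (w / scale).rounded()
            return Int8(max(-8, min(7, q)))
        }
        return QuantizedBlock(scale: scale, codes: codes)
    }
}

/// Reconstructs approximate float weights from the quantized blocks.
func dequantize(_ blocks: [QuantizedBlock]) -> [Float] {
    blocks.flatMap { block in block.codes.map { Float($0) * block.scale } }
}
```

Under this scheme the reconstruction error of each weight is at most half the block's scale. If the scale were stored as a 16-bit value (an assumption), storage would come to 4 + 16/32 = 4.5 bits per weight.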

## Usage

### Sample app (easiest)

[**CoreAIImageGen** (macOS)](https://github.com/john-rocky/coreai-model-zoo/tree/main/apps/CoreAIImageGen):
run the `CoreAIImageGenMac` scheme, click **Download & Load**, type a prompt, then click **Generate**.

### Swift

```swift
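import CoreAIDiffusionPipeline

// `modelURL` is the local file URL of the downloaded model folder.
let pipeline = try await Flux2Pipeline(from: modelURL)

// 4 steps at guidance 1.0 match the distilled schedule.
let config = PipelineConfiguration(
    prompt: "a photo of a cat",
    stepCount: 4,
    guidanceScale: 1.0,
    schedulerType: .discreteFlow
)
let result = try await pipeline.generateImages(configuration: config) { _ in true }

// Avoid force-unwrapping in case no image was produced.
guard let image = result.images.first else {
    fatalError("Pipeline returned no images")
}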
```

### Command line (zoo reference tool)

```bash
swift run -c release diffusion-runner \
  --model path/to/FLUX.2-klein-4B \
  --prompt "a photo of a cat" --steps 4 --guidance-scale 1.0
```

## How it was converted

```bash
uv run coreai.diffusion.export flux2-klein-4b --platform macOS
```

## Performance

M4 Max (128 GB): **~17 s** for a 4-step 1024×1024 image, including cold model load,
4 denoising steps, and VAE decode. Because the model is distilled to run at guidance 1.0,
it needs no negative prompt or classifier-free guidance (CFG) pass.
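
To see why guidance 1.0 removes the need for a negative prompt, consider the standard classifier-free guidance blend, sketched below. The `guided` function is a hypothetical illustration of that textbook formula and is not taken from the Core AI runtime.

```swift
import Foundation

/// Standard CFG blend: uncond + scale * (cond - uncond).
func guided(cond: [Float], uncond: [Float], scale: Float) -> [Float] {
    zip(cond, uncond).map { c, u in u + scale * (c - u) }
}

// At scale 1.0 the unconditional term cancels, so the result is just `cond`
// and the second (negative-prompt) forward pass can be skipped.
let blended = guided(cond: [1, 2], uncond: [0, 5], scale: 1.0)
```

At any other scale the unconditional prediction contributes to the output, which is what doubles the per-step cost in models that rely on CFG.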

## License

Apache 2.0, inherited from the base model
[black-forest-labs/FLUX.2-klein-4B](https://huggingface.co/black-forest-labs/FLUX.2-klein-4B).
The converted weights are redistributed under the same terms, with attribution to
Black Forest Labs.