language: en
library_name: optimum.neuron
tags:
  - diffusion
  - image-generation
  - aws
  - neuronx
  - inf2
  - flux
  - compiled
  - bfloat16
license: creativeml-openrail-m
datasets:
  - n/a
pipeline_tag: text-to-image
base_model: Freepik/flux.1-lite-8B

# Flux Lite 8B – 1024×1024 (Tensor Parallelism 2, AWS Inf2)

🚀 This repository contains the **compiled NeuronX graph** for running [Freepik’s Flux.1-Lite-8B](https://huggingface.co/Freepik/flux.1-lite-8B) model on **AWS Inferentia2 (Inf2)** instances, optimized for **1024×1024 image generation** with **tensor parallelism = 2**.

The model has been compiled using [🤗 Optimum Neuron](https://huggingface.co/docs/optimum/neuron/index) to leverage AWS NeuronCores for efficient inference at scale.

---

## 🔧 Compilation Details
- **Base model:** `Freepik/flux.1-lite-8B`
- **Framework:** [optimum-neuron](https://github.com/huggingface/optimum-neuron)
- **Tensor Parallelism:** `2` (the model is sharded across 2 NeuronCores)
- **Input resolution:** `1024 × 1024`
- **Batch size:** `1`
- **Precision:** `bfloat16`
- **Auto-casting:** disabled (`auto_cast="none"`)
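
Because Neuron graphs are compiled for static input shapes, this artifact is expected to serve only the batch size and resolution listed above. A minimal sketch of a pre-flight check follows; `COMPILED_SHAPES` and `check_request` are hypothetical names used for illustration and are not part of optimum-neuron.

```python
# Shapes this graph was compiled for (copied from the compilation details).
COMPILED_SHAPES = {"batch_size": 1, "height": 1024, "width": 1024}


def check_request(batch_size: int, height: int, width: int) -> None:
    """Raise ValueError if a request does not match the compiled shapes.

    Hypothetical helper for illustration; not an optimum-neuron API.
    """
    requested = {"batch_size": batch_size, "height": height, "width": width}
    # Collect every field whose requested value differs from the compiled one.
    mismatched = {
        key: (value, COMPILED_SHAPES[key])
        for key, value in requested.items()
        if value != COMPILED_SHAPES[key]
    }
    if mismatched:
        details = ", ".join(
            f"{key}={got} (compiled: {expected})"
            for key, (got, expected) in mismatched.items()
        )
        raise ValueError(f"Request does not match compiled shapes: {details}")


check_request(1, 1024, 1024)  # matches the compiled shapes, so nothing is raised
```

Running a check like this before calling the pipeline gives a readable error for unsupported sizes; other resolutions or batch sizes would need a separate compilation.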

---

## 📥 Installation

Make sure you are running on an **AWS Inf2 instance** with the [AWS Neuron SDK](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/setup/neuron-intro.html) installed.

```bash
pip install "optimum[neuronx]" torch torchvision
```
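
To confirm the Python side of the install, a quick check like the sketch below can report which top-level packages are not importable. `missing_packages` is a hypothetical helper; it only looks for the packages and says nothing about whether the Neuron driver and runtime are set up correctly.

```python
from importlib.util import find_spec

# Top-level packages installed by the pip command in this section.
REQUIRED = ("torch", "torchvision", "optimum")


def missing_packages(names=REQUIRED):
    """Return the package names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages()
print("missing packages:", ", ".join(missing) if missing else "none")
```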

---


## 🛠 Re-compilation Example

To compile this model yourself:

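```python
import torch
from optimum.neuron import NeuronFluxPipeline

# Disable compiler auto-casting; precision is set via torch_dtype below.
compiler_args = {"auto_cast": "none"}
# Neuron graphs are compiled for static shapes, so fix them at export time.
input_shapes = {"batch_size": 1, "height": 1024, "width": 1024}

pipe = NeuronFluxPipeline.from_pretrained(
    "Freepik/flux.1-lite-8B",
    torch_dtype=torch.bfloat16,
    export=True,  # compile for Neuron instead of loading compiled artifacts
    tensor_parallel_size=2,  # shard the model across 2 NeuronCores
    **compiler_args,
    **input_shapes,
)

# Save the compiled artifacts so later runs can skip compilation.
pipe.save_pretrained("flux_lite_neuronx_1024_tp2/")
```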
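
Once compiled artifacts exist on disk, inference might look like the sketch below. It assumes the directory written by `save_pretrained` in the re-compilation example and an Inf2 instance with at least 2 free NeuronCores. `generate` is a hypothetical wrapper, and the default step count and guidance scale are illustrative placeholders rather than tuned settings.

```python
def generate(
    prompt: str,
    model_dir: str = "flux_lite_neuronx_1024_tp2/",
    num_inference_steps: int = 28,  # placeholder value, adjust as needed
    guidance_scale: float = 3.5,  # placeholder value, adjust as needed
):
    """Load the compiled pipeline from model_dir and return one PIL image.

    Hypothetical wrapper for illustration; requires Neuron hardware to run.
    """
    # Imported lazily so this module can still be imported off-device.
    from optimum.neuron import NeuronFluxPipeline

    # No export=True here: model_dir already holds compiled artifacts.
    pipe = NeuronFluxPipeline.from_pretrained(model_dir)
    result = pipe(
        prompt,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,
    )
    return result.images[0]
```

On the instance, `generate("a lighthouse at dusk").save("out.png")` would write a single 1024×1024 image. For repeated requests it is better to load the pipeline once and reuse it, since loading the compiled graph onto the NeuronCores takes time.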


---
license: apache-2.0
---