---
language: en
library_name: optimum.neuron
tags:
- diffusion
- image-generation
- aws
- neuronx
- inf2
- flux
- compiled
- bfloat16
license: creativeml-openrail-m
datasets:
- n/a
pipeline_tag: text-to-image
base_model: Freepik/flux.1-lite-8B
---

# Flux Lite 8B – 1024×1024 (Tensor Parallelism 2, AWS Inf2) 🚀

This repository contains the **compiled NeuronX graph** for running [Freepik’s Flux.1-Lite-8B](https://huggingface.co/Freepik/flux.1-lite-8B) model on **AWS Inferentia2 (Inf2)** instances, optimized for **1024×1024 image generation** with **tensor parallelism = 2**.

The model was compiled with [🤗 Optimum Neuron](https://huggingface.co/docs/optimum/neuron/index) so that inference runs on AWS NeuronCores.

---

## 🔧 Compilation Details

- **Base model:** `Freepik/flux.1-lite-8B`
- **Framework:** [optimum-neuron](https://github.com/huggingface/optimum-neuron)
- **Tensor Parallelism:** `2` (splits the model across 2 NeuronCores)
- **Input resolution:** `1024 × 1024`
- **Batch size:** `1`
- **Precision:** `bfloat16`
- **Auto-casting:** disabled (`auto_cast="none"`)

---

## 📥 Installation

Make sure you are running on an **AWS Inf2 instance** with the [AWS Neuron SDK](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/setup/neuron-intro.html) installed.

```bash
pip install "optimum[neuronx]" torch torchvision
```

---

## 🛠 Re-compilation Example

To compile this model yourself:

```python
from optimum.neuron import NeuronFluxPipeline

# Compiler options: keep bfloat16 weights as-is, no automatic down-casting.
compiler_args = {"auto_cast": "none"}
# Static input shapes baked into the compiled graph.
input_shapes = {"batch_size": 1, "height": 1024, "width": 1024}

# export=True triggers compilation; the model is sharded across 2 NeuronCores.
pipe = NeuronFluxPipeline.from_pretrained(
    "Freepik/flux.1-lite-8B",
    torch_dtype="bfloat16",
    export=True,
    tensor_parallel_size=2,
    **compiler_args,
    **input_shapes,
)

# Save the compiled artifacts so they can be reloaded without recompiling.
pipe.save_pretrained("flux_lite_neuronx_1024_tp2/")
```

---
license: apache-2.0
---
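
## 🖼 Inference Example

The compilation settings above fix the batch size at 1 and the resolution at 1024×1024, so a request with any other shape needs a recompiled graph. The following is a minimal sketch of how the compiled pipeline might be loaded and called, not a definitive implementation. It assumes the artifacts sit in the `flux_lite_neuronx_1024_tp2/` directory written by the re-compilation example. The helper names `CompiledShapes`, `check_request`, and `generate` are hypothetical, and the default `num_inference_steps` and `guidance_scale` values are illustrative assumptions rather than settings stated in this card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompiledShapes:
    """Static shapes copied from the Compilation Details list."""
    batch_size: int = 1
    height: int = 1024
    width: int = 1024


def check_request(height: int, width: int, num_prompts: int,
                  shapes: CompiledShapes = CompiledShapes()) -> None:
    """Raise ValueError if a request does not match the compiled shapes."""
    if (height, width) != (shapes.height, shapes.width):
        raise ValueError(
            f"compiled for {shapes.height}x{shapes.width}, got {height}x{width}"
        )
    if num_prompts != shapes.batch_size:
        raise ValueError(
            f"compiled for batch size {shapes.batch_size}, got {num_prompts}"
        )


def generate(prompt: str,
             model_dir: str = "flux_lite_neuronx_1024_tp2/",  # assumed local path
             height: int = 1024,
             width: int = 1024,
             num_inference_steps: int = 28,   # illustrative default (assumption)
             guidance_scale: float = 3.5):    # illustrative default (assumption)
    """Load the compiled pipeline and return one PIL image.

    Requires an Inf2 instance with the Neuron SDK and optimum-neuron installed.
    """
    # Fail fast before loading the model if the shape cannot be served.
    check_request(height, width, num_prompts=1)

    # Imported lazily so the helpers above can be used off-device.
    from optimum.neuron import NeuronFluxPipeline

    pipe = NeuronFluxPipeline.from_pretrained(model_dir)
    result = pipe(
        prompt,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,
    )
    return result.images[0]
```

On an Inf2 instance, `generate("a lighthouse at dusk").save("out.png")` would write a single 1024×1024 image. Calling `check_request(512, 512, 1)` raises a `ValueError` without touching the device, which is handy for rejecting unsupported requests early in a serving layer.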