---
license: apache-2.0
tags:
- defect-generation
- anomaly-detection
- industrial-inspection
- lora
- flux
- diffusion
- rlhf
language: en
pipeline_tag: image-to-image
---
# UniDG-RFT-LoRA
LoRA weights for **UniDG** (Universal Defect Generation), trained via **Consistency-RFT** with Flow-GRPO and dual reward models on the UDG dataset (300K quadruplets).
[[Paper]](https://arxiv.org/abs/2604.08915) [[Code]](https://github.com/RetoFan233/UniDG) [[UniDG-SFT-LoRA]](https://huggingface.co/retofan23333/UniDG-SFT-LoRA-Release)
## Overview
UniDG is a universal defect generation foundation model that transfers defects from a reference image to a target region via **Defect-Context Editing** and **MM-DiT multimodal attention**, without per-category fine-tuning. This checkpoint is the **Consistency-RFT** variant, further refined from UniDG-SFT using Flow-GRPO with dual reward models (Defect-Und-Reward & Defect-Recog-Reward) for improved defect fidelity and consistency.
| Variant | Training | Focus |
|---------|----------|-------|
| UniDG-SFT | Diversity-SFT with complementary sampling | Diverse defect patterns |
| **UniDG-RFT** (this) | Consistency-RFT with Flow-GRPO + dual rewards | Consistent & faithful defects |
## Important: Usage Difference from UniDG-SFT-LoRA
**The UniDG-RFT-LoRA weights are stored in PEFT format** (`adapter_model.safetensors` + `adapter_config.json`), unlike UniDG-SFT-LoRA, which ships a single `pytorch_lora_weights.safetensors` file. As a result:
- **UniDG-SFT-LoRA** can be directly loaded via the `lora_weights_path` parameter in `ImageUniDG`.
- **UniDG-RFT-LoRA** must first be **merged into the base SFT model** using the provided `combine_peft_weights.py` script. After merging, the resulting model can be loaded directly without any additional LoRA loading step.
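
The two layouts can be told apart by file name before deciding which path to take. The helper below is a minimal sketch and not part of the UniDG code: `detect_lora_format` is a hypothetical name, and it only checks for the file names listed above.

```python
from pathlib import Path


def detect_lora_format(lora_dir: str) -> str:
    """Classify a LoRA directory by the weight files it contains.

    Returns "peft", "diffusers", or "unknown".
    """
    root = Path(lora_dir)
    # PEFT layout needs both the weights and the adapter config.
    has_peft = (root / "adapter_model.safetensors").is_file() and (
        root / "adapter_config.json"
    ).is_file()
    has_diffusers = (root / "pytorch_lora_weights.safetensors").is_file()

    if has_peft:
        return "peft"  # UniDG-RFT-LoRA layout: merge before use
    if has_diffusers:
        return "diffusers"  # UniDG-SFT-LoRA layout: pass via lora_weights_path
    return "unknown"
```

A result of `"peft"` means the merge step is required; `"diffusers"` means the directory can be passed to `lora_weights_path` directly.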
## Repository Contents
| File | Description |
|------|-------------|
| `adapter_model.safetensors` | PEFT LoRA weights (Consistency-RFT) |
| `adapter_config.json` | LoRA configuration (rank=64, alpha=128) |
| `combine_peft_weights.py` | Script to merge RFT LoRA into the base SFT model |
## Step-by-Step Usage
### Prerequisites
- [FLUX.1-Fill-dev](https://huggingface.co/black-forest-labs/FLUX.1-Fill-dev) (inpainting backbone)
- [FLUX.1-Redux-dev](https://huggingface.co/black-forest-labs/FLUX.1-Redux-dev) (reference conditioning)
- [UniDG-SFT-LoRA](https://huggingface.co/retofan23333/UniDG-SFT-LoRA-Release) (base SFT model — the RFT LoRA is fine-tuned on top of this)
- [UniDG code](https://github.com/RetoFan233/UniDG) (inference framework)
- Python dependencies: `diffusers`, `peft`, `torch`
### Step 1: Prepare the Base SFT Model
This step produces a FLUX.1-Fill-dev checkpoint with the UniDG-SFT-LoRA weights already merged in, which Step 2 uses as its base. If you do not have one yet, load the SFT LoRA into the base model and save the merged weights:
```python
import torch
from diffusers import FluxFillPipeline

# Load base FLUX.1-Fill-dev
pipe = FluxFillPipeline.from_pretrained(
    "path/to/FLUX.1-Fill-dev",
    torch_dtype=torch.bfloat16,
)
# Load SFT LoRA weights
pipe.load_lora_weights("path/to/UniDG-SFT-LoRA-Release/pytorch_lora_weights.safetensors")
# Fuse the LoRA into the base weights, then drop the adapter layers,
# so that the saved checkpoint actually contains the merged weights
pipe.fuse_lora()
pipe.unload_lora_weights()
# Save the merged SFT model as the base for RFT merging
pipe.save_pretrained("path/to/FLUX.1-Fill-dev-UDG-SFT", safe_serialization=True, max_shard_size="5GB")
```
### Step 2: Merge RFT LoRA into the Base SFT Model
Use the provided `combine_peft_weights.py` to merge the RFT LoRA weights into the base SFT model:
```bash
python combine_peft_weights.py \
    --base_model_path path/to/FLUX.1-Fill-dev-UDG-SFT \
    --lora_weights_path path/to/UniDG-RFT-LoRA-Release \
    --output_path path/to/FLUX.1-Fill-dev-UDG-RFT \
    --save_full_pipeline
```
Parameters:
- `--base_model_path`: Path to the base SFT model (from Step 1)
- `--lora_weights_path`: Path to this RFT LoRA repository (containing `adapter_model.safetensors` and `adapter_config.json`)
- `--output_path`: Output path for the merged model
- `--save_full_pipeline`: Save the full pipeline (including VAE, text encoder, etc.) so you can load it directly later
- `--dtype`: Data type, default `bfloat16`
- `--device`: Device for loading, default `cpu` (recommended to avoid OOM)
> **Tip**: Use `--device cpu` (default) to save GPU memory during the merge process. The merge only needs to run once.
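
Before moving on, it can help to confirm that the merge wrote a complete pipeline directory. The check below is a sketch under one assumption: that `--save_full_pipeline` produces the standard diffusers layout (a `model_index.json` at the root and per-component subfolders such as `transformer/` and `vae/`). The script's actual output layout is not documented here, and `missing_pipeline_files` is a hypothetical helper.

```python
from pathlib import Path


def missing_pipeline_files(output_path: str) -> list[str]:
    """Return the expected entries that are absent from a merged pipeline directory."""
    root = Path(output_path)
    # Assumed layout: what diffusers' save_pretrained writes for a full pipeline.
    expected = ["model_index.json", "transformer/config.json", "vae/config.json"]
    return [rel for rel in expected if not (root / rel).is_file()]
```

An empty list suggests the directory can be passed as `flux_model_path`; any listed entry points to an incomplete save, for example a run without `--save_full_pipeline`.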
### Step 3: Use the Merged Model with UniDG
After merging, the model can be used directly with the UniDG inference code — **no additional LoRA loading is needed**:
```python
from unidg import ImageUniDG
from PIL import Image
import torch
# Load the merged RFT model; lora_weights_path is "" because the LoRA is already merged
model = ImageUniDG(
    flux_model_path="path/to/FLUX.1-Fill-dev-UDG-RFT",
    redux_model_path="path/to/FLUX.1-Redux-dev",
    lora_weights_path="",  # No additional LoRA needed!
    device="cuda:0",
    dtype=torch.bfloat16,
)
result, mask = model.process_images(
    target_image=Image.open("target.jpg"),
    reference_image=Image.open("reference.jpg"),
    reference_mask=Image.open("reference_mask.png"),
    target_mask=Image.open("target_mask.png"),
    num_inference_steps=28,
    guidance_scale=3.5,
    seed=42,
)
result.save("result.png")
```
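
Since `process_images` takes a `seed`, several candidate defects can be generated for the same image pair by varying it. The loop below is a sketch: `generate_variants` is a hypothetical helper, and it assumes only the keyword arguments and the `(result, mask)` return shape shown in the example above.

```python
def generate_variants(model, target_image, reference_image,
                      reference_mask, target_mask, seeds):
    """Run process_images once per seed and return (seed, result) pairs."""
    results = []
    for seed in seeds:
        # Same settings as the single-image example; only the seed changes.
        result, _mask = model.process_images(
            target_image=target_image,
            reference_image=reference_image,
            reference_mask=reference_mask,
            target_mask=target_mask,
            num_inference_steps=28,
            guidance_scale=3.5,
            seed=seed,
        )
        results.append((seed, result))
    return results
```

Keeping the seed next to each result makes it possible to reproduce a chosen sample later.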
### Quick Reference: SFT vs RFT Usage
| | UniDG-SFT | UniDG-RFT |
|---|-----------|-----------|
| Weight format | `pytorch_lora_weights.safetensors` | `adapter_model.safetensors` + `adapter_config.json` |
| Merge required? | No | Yes (with SFT base model) |
| `lora_weights_path` | Path to SFT weights | `""` (empty, after merge) |
| `flux_model_path` | `path/to/FLUX.1-Fill-dev` | `path/to/merged-RFT-model` |
| Load time | LoRA loaded on-the-fly | Pre-merged, no LoRA overhead |
## LoRA Configuration
| Parameter | Value |
|-----------|-------|
| PEFT type | LORA |
| Rank (r) | 64 |
| Alpha | 128 |
| Dropout | 0.0 |
| Target modules | `ff.net.0.proj`, `ff.net.2`, `ff_context.net.0.proj`, `proj_mlp`, `attn.to_q`, `attn.to_v`, `attn.to_add_out`, `attn.add_k_proj`, `attn.add_v_proj`, `ff_context.net.2`, `attn.add_q_proj`, `attn.to_out.0`, `attn.to_k` |
| Base model | FLUX.1-Fill-dev + UniDG-SFT-LoRA |
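
To make the rank and alpha values concrete: assuming the standard LoRA scaling of `alpha / rank` (not the rank-stabilized variant), this adapter's update is applied with a factor of 128 / 64 = 2.0, and merging adds `scale * (B @ A)` to each targeted weight. The toy example below illustrates that arithmetic on small nested-list matrices. It is not the code used by `combine_peft_weights.py`, and the function names are hypothetical.

```python
def lora_scale(rank: int, alpha: float) -> float:
    """Standard LoRA scaling factor (assumed: alpha / rank)."""
    return alpha / rank


def merge_lora(weight, lora_a, lora_b, scale):
    """Return weight + scale * (lora_b @ lora_a) for nested-list matrices.

    weight: (out, in), lora_a: (rank, in), lora_b: (out, rank).
    """
    rank = len(lora_a)
    merged = [row[:] for row in weight]  # copy so the input stays untouched
    for i in range(len(weight)):
        for j in range(len(weight[0])):
            delta = sum(lora_b[i][k] * lora_a[k][j] for k in range(rank))
            merged[i][j] += scale * delta
    return merged


scale = lora_scale(rank=64, alpha=128)  # 2.0 for this adapter

# Toy rank-1 update on a 2x2 weight, for illustration only.
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]
lora_b = [[0.5], [0.0]]
merged = merge_lora(weight, lora_a, lora_b, scale)
print(merged)  # [[2.0, 2.0], [0.0, 1.0]]
```

Once the update is folded into the weights this way, inference no longer needs the adapter matrices, which is why the merged model is loaded with an empty `lora_weights_path`.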
## Citation
```bibtex
@article{fan2026unidg,
title={Large-Scale Universal Defect Generation: Foundation Models and Datasets},
author={Fan, Yuanting and Liu, Jun and Gao, Bin-Bin and Chen, Xiaochen and Lin, Yuhuan and Dai, Zhewei and Zhan, Jiawei and Wang, Chengjie},
journal={arXiv preprint arXiv:2604.08915},
year={2026}
}
```