mamungtai-sat pormungtai committed on
Commit 5d0bada · 1 Parent(s): 9bc4f6e

Add Character Studio app, registry, requirements, docs (#1)

- Add Character Studio app, registry, requirements, docs (5a7069b0de1924567bb47c3a83a5cf6319c6b401)


Co-authored-by: pormungtailaw <pormungtai@users.noreply.huggingface.co>

Files changed (6)
  1. README.md +56 -7
  2. README_TH.md +120 -0
  3. app.py +177 -0
  4. models.json +79 -0
  5. pipeline_manager.py +320 -0
  6. requirements.txt +18 -0
README.md CHANGED
@@ -1,15 +1,64 @@
  ---
  title: Character Studio
- emoji: 🚀
- colorFrom: red
- colorTo: gray
+ emoji: 🎭
+ colorFrom: blue
+ colorTo: indigo
  sdk: gradio
- sdk_version: 6.15.2
- python_version: '3.12'
+ sdk_version: 5.9.1
  app_file: app.py
  pinned: false
  license: apache-2.0
- short_description: Multi-model character generator (SD1.5 / SDXL / FLUX) on Zer
+ short_description: Multi-model character generator on ZeroGPU
  ---
  
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # 🎭 Character Studio
+
+ A Hugging Face **ZeroGPU** Space that bundles many image models behind one UI for
+ character generation. Pick a model from an **editable registry**, type a prompt,
+ optionally drop a **reference image**, and generate.
+
+ ## Features
+ - **Editable model registry** — add / remove models by editing `models.json`, no code change.
+ - **Multiple base families** — SD1.5, SDXL, FLUX. Each model declares its own `base`.
+ - **Multiple input modes** — `txt2img`, `img2img`, `IP-Adapter` (style/subject), `Face identity` (FaceID).
+ - **Custom sources** — HF repos, full `.safetensors` checkpoints, and Civitai download URLs.
+
+ ## Hardware
+ Set the Space hardware to **ZeroGPU** (Nvidia, dynamic). Free tier works; pipelines
+ are cached on CPU and moved to GPU only during a generation call.
+
+ ## Secrets / environment variables (Settings → Variables and secrets)
+ - `HF_TOKEN` — needed only for **gated** models (e.g. FLUX.1-dev). Optional otherwise.
+ - `CIVITAI_TOKEN` — needed only if a registry entry pulls from a Civitai download URL.
+
+ ## Adding / removing models
+ See **README_TH.md** for the full Thai field guide. Quick version: each entry in
+ `models.json` looks like:
+
+ ```json
+ {
+   "id": "my-model",
+   "label": "My Model (SDXL)",
+   "base": "sdxl",
+   "type": "checkpoint",
+   "repo_id": "author/repo-on-hf",
+   "single_file_url": null,
+   "default_steps": 30,
+   "default_guidance": 6.0,
+   "enabled": true
+ }
+ ```
+
+ For a LoRA, set `"type": "lora"`, keep `repo_id` as the **base checkpoint**, and add
+ either `lora_repo_id` (+ optional `lora_weight_name`) or `lora_url` (Civitai), plus
+ `lora_scale`. After editing, click **🔄 Reload models** in the UI.
+
+ ## Notes
+ - IP-Adapter and Face identity modes are available for **SD1.5 / SDXL** only; FLUX
+   supports `txt2img` / `img2img`.
+ - Face identity uses InsightFace (`buffalo_l`) + IP-Adapter-FaceID and needs a clear face.
+ - Only one large model is held in memory at a time; switching models reloads.
+
+ ## Responsible use
+ This tool is for original character art and authorized creative work. Do not use the
+ Face identity feature to depict real people without their consent.
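Reviewer sketch (not part of this commit): the registry rules stated in the README could be checked before a reload. `validate_entry` is a hypothetical helper, and the rules it encodes are assumptions read off the README text: a checkpoint needs `repo_id` or `single_file_url`, and a LoRA needs a base `repo_id` plus either `lora_repo_id` or `lora_url`.

```python
# Hypothetical helper (not in the commit): sanity-check one models.json entry.
VALID_BASES = {"sd15", "sdxl", "flux"}
VALID_TYPES = {"checkpoint", "lora"}


def validate_entry(entry):
    """Return a list of human-readable problems; an empty list means the entry looks OK."""
    problems = []
    # Fields every entry is assumed to need.
    for key in ("id", "label", "base", "type"):
        if not entry.get(key):
            problems.append(f"missing field: {key}")
    if entry.get("base") not in VALID_BASES:
        problems.append(f"unknown base: {entry.get('base')!r}")
    if entry.get("type") not in VALID_TYPES:
        problems.append(f"unknown type: {entry.get('type')!r}")
    # A checkpoint must point at some weights.
    if entry.get("type") == "checkpoint":
        if not (entry.get("repo_id") or entry.get("single_file_url")):
            problems.append("checkpoint needs repo_id or single_file_url")
    # A LoRA needs both a base checkpoint and the LoRA weights.
    if entry.get("type") == "lora":
        if not entry.get("repo_id"):
            problems.append("lora needs repo_id (the base checkpoint)")
        if not (entry.get("lora_repo_id") or entry.get("lora_url")):
            problems.append("lora needs lora_repo_id or lora_url")
    return problems
```

Running this over every entry before clicking Reload would surface a missing `lora_url` as a readable message rather than a failure at pipeline build time.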
README_TH.md ADDED
@@ -0,0 +1,120 @@
+ # 🎭 Character Studio: Guide (translated from Thai)
+
+ A Space that bundles several character-generation models behind one UI, running on **ZeroGPU**.
+ Pick a model from the list → type a prompt → (optionally) add a reference image → press Generate.
+
+ ---
+
+ ## 1) Deploying to Hugging Face
+
+ 1. Create a new Space: https://huggingface.co/new-space
+    - **SDK = Gradio**
+    - **Hardware = ZeroGPU** (Nvidia, dynamic)
+ 2. Upload every file in this folder (`app.py`, `pipeline_manager.py`, `models.json`,
+    `requirements.txt`, `README.md`) to the root of the Space
+    - via the web UI (drag and drop) or via git:
+      ```bash
+      git clone https://huggingface.co/spaces/<user>/<space-name>
+      # copy the files from this folder in, then
+      git add . && git commit -m "init character studio" && git push
+      ```
+ 3. Go to **Settings → Variables and secrets** and set (only what you need):
+    - `HF_TOKEN`: only for gated models such as FLUX.1-dev
+    - `CIVITAI_TOKEN`: only when downloading from a Civitai link
+
+ ---
+
+ ## 2) Adding / removing / disabling models (edit `models.json` only)
+
+ Edit `models.json`, then click **🔄 Reload models** in the UI (or restart the Space).
+
+ ### Structure of each model entry
+
+ | field | meaning |
+ |---|---|
+ | `id` | unique ID (ASCII letters and hyphens) |
+ | `label` | name shown in the UI |
+ | `base` | `"sd15"` / `"sdxl"` / `"flux"`. **Very important**: determines which modes are available |
+ | `type` | `"checkpoint"` (full model) or `"lora"` |
+ | `repo_id` | HF repo (for a checkpoint) or the **base checkpoint** (for a LoRA) |
+ | `single_file_url` | direct `.safetensors` link, e.g. Civitai (can be used instead of repo_id) |
+ | `lora_repo_id` / `lora_weight_name` | for a LoRA hosted on HF |
+ | `lora_url` | for a LoRA from Civitai (download link) |
+ | `lora_scale` | LoRA weight, e.g. 0.8 |
+ | `trigger` | trigger word automatically prepended to the prompt |
+ | `recommended_prompt` | example prompt (shown as the placeholder) |
+ | `negative_prompt` | default negative prompt |
+ | `default_steps` / `default_guidance` | defaults applied when this model is selected |
+ | `enabled` | `true`/`false`; disables the entry temporarily without deleting it |
+
+ ### Example: checkpoint from Civitai (SD1.5)
+ ```json
+ {
+   "id": "asian-realistic-v6",
+   "label": "AsianRealistic SDLife V6 (SD1.5)",
+   "base": "sd15",
+   "type": "checkpoint",
+   "repo_id": null,
+   "single_file_url": "https://civitai.com/api/download/models/130072",
+   "default_steps": 28,
+   "default_guidance": 6.5,
+   "enabled": true
+ }
+ ```
+ > `CIVITAI_TOKEN` must also be set in Secrets.
+
+ ### Example: LoRA from Civitai (applied on top of an SD1.5 base)
+ ```json
+ {
+   "id": "asian-girls-face",
+   "label": "Asian Girls Face (LoRA)",
+   "base": "sd15",
+   "type": "lora",
+   "repo_id": "stable-diffusion-v1-5/stable-diffusion-v1-5",
+   "lora_url": "https://civitai.com/api/download/models/67980",
+   "lora_scale": 0.8,
+   "enabled": true
+ }
+ ```
+
+ ### Example: model hosted on HF (SDXL)
+ ```json
+ {
+   "id": "my-sdxl",
+   "label": "My SDXL model",
+   "base": "sdxl",
+   "type": "checkpoint",
+   "repo_id": "author/my-sdxl-repo",
+   "default_steps": 30,
+   "default_guidance": 6.0,
+   "enabled": true
+ }
+ ```
+
+ **Removing a model** = delete its block from the `models` array, or set `"enabled": false`.
+
+ ---
+
+ ## 3) Reference-image modes (Input mode)
+
+ | Mode | What it does | Works with base |
+ |---|---|---|
+ | Text → Image | generates from the prompt alone | every base |
+ | Image → Image | transforms an existing image (adjust denoise) | every base |
+ | IP-Adapter | pulls style/composition from the image | sd15, sdxl |
+ | Face identity | locks the face from the reference image (FaceID) | sd15, sdxl |
+
+ > FLUX supports only txt2img / img2img (IP-Adapter/FaceID for FLUX is not included in this version).
+
+ ---
+
+ ## 4) Things to know about ZeroGPU
+ - Large models are held one at a time; switching models means a reload (the first run is a bit slow).
+ - Each generate call is limited to about 120 seconds of GPU time (adjust via `@spaces.GPU(duration=...)`).
+ - Civitai models and full checkpoints are large, so the first download takes a while; be patient.
+
+ ---
+
+ ## 5) Responsible use
+ This tool is for original character creation and authorized work.
+ **Do not use Face identity mode to create images of real people without their consent.**
app.py ADDED
@@ -0,0 +1,177 @@
+ """
+ Character Studio — a ZeroGPU Hugging Face Space.
+
+ A multi-model character generator: pick a model from an editable registry,
+ type a prompt, optionally drop a reference image, and generate. Supports
+ SD1.5 / SDXL / FLUX bases and txt2img / img2img / IP-Adapter / FaceID modes.
+
+ Add or remove models by editing models.json (no code change needed), then
+ click "🔄 Reload models" or restart the Space.
+ """
+
+ import random
+ import traceback
+
+ import spaces  # must be imported before torch on ZeroGPU
+ import gradio as gr
+
+ import pipeline_manager as pm
+
+ MAX_SEED = 2**31 - 1
+
+
+ # ---------------------------------------------------------------------------
+ # Registry helpers
+ # ---------------------------------------------------------------------------
+ def load_models():
+     return pm.load_registry()
+
+
+ MODELS = load_models()
+
+
+ def model_choices(models):
+     return [(m["label"], m["id"]) for m in models]
+
+
+ def modes_for(models, model_id):
+     m = pm.get_model(models, model_id)
+     if not m:
+         return [("Text → Image", "txt2img")]
+     return [(pm.MODE_LABELS[k], k) for k in pm.SUPPORTED_MODES[m["base"]]]
+
+
+ # ---------------------------------------------------------------------------
+ # GPU generation
+ # ---------------------------------------------------------------------------
+ @spaces.GPU(duration=120)
+ def generate(model_id, mode, prompt, negative_prompt, ref_image,
+              steps, guidance, denoise, ip_scale, width, height, seed, randomize):
+     models = load_models()
+     cfg = pm.get_model(models, model_id)
+     if cfg is None:
+         raise gr.Error("ไม่พบโมเดลที่เลือก โปรด Reload models / Selected model not found.")
+
+     if randomize or seed is None or int(seed) < 0:
+         seed = random.randint(0, MAX_SEED)
+
+     try:
+         img = pm.run_generation(
+             cfg=cfg, mode=mode, prompt=prompt, negative_prompt=negative_prompt,
+             ref_image=ref_image, steps=steps, guidance=guidance, denoise=denoise,
+             ip_scale=ip_scale, width=width, height=height, seed=seed,
+         )
+     except Exception as e:
+         traceback.print_exc()
+         raise gr.Error(str(e))
+
+     status = f"✅ {cfg['label']} · {pm.MODE_LABELS.get(mode, mode)} · seed {seed}"
+     return img, seed, status
+
+
+ # ---------------------------------------------------------------------------
+ # UI callbacks
+ # ---------------------------------------------------------------------------
+ def on_model_change(model_id):
+     models = load_models()
+     cfg = pm.get_model(models, model_id)
+     if not cfg:
+         return gr.update(), gr.update(), gr.update(), gr.update(), gr.update()
+     choices = modes_for(models, model_id)
+     return (
+         gr.update(choices=choices, value=choices[0][1]),  # mode radio
+         gr.update(placeholder=cfg.get("recommended_prompt", "")),  # prompt
+         gr.update(value=cfg.get("negative_prompt", "")),  # negative
+         gr.update(value=cfg.get("default_steps", 28)),  # steps
+         gr.update(value=cfg.get("default_guidance", 6.0)),  # guidance
+     )
+
+
+ def reload_registry():
+     global MODELS
+     MODELS = load_models()
+     choices = model_choices(MODELS)
+     first = choices[0][1] if choices else None
+     return gr.update(choices=choices, value=first), f"🔄 โหลดแล้ว {len(MODELS)} โมเดล"  # Thai: "Loaded N models"
+
+
+ # ---------------------------------------------------------------------------
+ # Layout (mirrors the FLUX LoRA DLC reference UI)
+ # ---------------------------------------------------------------------------
+ CSS = """
+ #gen-btn {height: 100%; font-size: 1.3rem; font-weight: 700;}
+ .card {border-radius: 14px;}
+ footer {visibility: hidden;}
+ """
+
+ with gr.Blocks(css=CSS, theme=gr.themes.Soft(primary_hue="blue"),
+                title="Character Studio") as demo:
+     gr.Markdown("## 🎭 Character Studio — multi-model character generator (ZeroGPU)")
+
+     with gr.Row():
+         prompt = gr.Textbox(
+             label="Edit Prompt", lines=2, scale=4,
+             placeholder="✦ เลือกโมเดลแล้วพิมพ์ prompt / Choose a model and type the prompt",
+         )
+         gen_btn = gr.Button("Generate", variant="primary", scale=1, elem_id="gen-btn")
+
+     with gr.Row(equal_height=False):
+         # ---- left: model picker ----
+         with gr.Column(scale=1):
+             with gr.Group():
+                 gr.Markdown("### 🧩 เลือกโมเดล / Models")
+                 model_radio = gr.Radio(
+                     choices=model_choices(MODELS),
+                     value=MODELS[0]["id"] if MODELS else None,
+                     label=None, container=False,
+                 )
+                 reload_btn = gr.Button("🔄 Reload models", size="sm")
+                 reload_status = gr.Markdown("")
+
+             mode_radio = gr.Radio(
+                 choices=modes_for(MODELS, MODELS[0]["id"]) if MODELS else [],
+                 value="txt2img",
+                 label="โหมดรูปต้นแบบ / Input mode",
+             )
+
+         # ---- right: output ----
+         with gr.Column(scale=1):
+             output = gr.Image(label="Generated Image", height=560, elem_classes="card")
+             status = gr.Markdown("")
+
+     # ---- advanced ----
+     with gr.Accordion("Advanced Settings", open=False):
+         with gr.Row():
+             with gr.Column():
+                 ref_image = gr.Image(label="Input image (รูปต้นแบบ)", type="pil", height=240)
+                 ip_scale = gr.Slider(0.0, 1.5, value=0.7, step=0.05,
+                                      label="Reference strength (IP-Adapter / FaceID)")
+                 denoise = gr.Slider(0.1, 1.0, value=0.65, step=0.01,
+                                     label="Denoise strength (img2img · ต่ำ = อิงรูปมาก)")
+             with gr.Column():
+                 negative_prompt = gr.Textbox(label="Negative prompt", lines=2)
+                 with gr.Row():
+                     steps = gr.Slider(1, 50, value=28, step=1, label="Steps")
+                     guidance = gr.Slider(0.0, 15.0, value=6.5, step=0.1, label="Guidance (CFG)")
+                 with gr.Row():
+                     width = gr.Slider(384, 1280, value=768, step=64, label="Width")
+                     height = gr.Slider(384, 1280, value=768, step=64, label="Height")
+                 with gr.Row():
+                     seed = gr.Number(value=-1, label="Seed (-1 = random)", precision=0)
+                     randomize = gr.Checkbox(value=True, label="Randomize seed")
+
+     # ---- wiring ----
+     model_radio.change(
+         on_model_change, inputs=model_radio,
+         outputs=[mode_radio, prompt, negative_prompt, steps, guidance],
+     )
+     reload_btn.click(reload_registry, outputs=[model_radio, reload_status])
+
+     gen_inputs = [model_radio, mode_radio, prompt, negative_prompt, ref_image,
+                   steps, guidance, denoise, ip_scale, width, height, seed, randomize]
+     gen_btn.click(generate, inputs=gen_inputs, outputs=[output, seed, status])
+     prompt.submit(generate, inputs=gen_inputs, outputs=[output, seed, status])
+
+
+ if __name__ == "__main__":
+     demo.queue(max_size=12).launch()
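Reviewer sketch (not part of this commit): the seed rule inside `generate` can be pulled into a small pure function so it is testable without a GPU. `resolve_seed` and its `rng` parameter are hypothetical names introduced for illustration; the condition itself mirrors the one in `app.py`.

```python
import random

MAX_SEED = 2**31 - 1  # same bound as app.py


def resolve_seed(seed, randomize, rng=random):
    """Return the seed to use for a generation.

    A fresh random seed is drawn when randomize is set, or when the given
    seed is missing or negative; otherwise the given seed is kept.
    """
    if randomize or seed is None or int(seed) < 0:
        return rng.randint(0, MAX_SEED)
    return int(seed)
```

With the UI defaults (seed `-1`, "Randomize seed" ticked) every call draws a new seed, so reproducing an image means unticking the box and reusing the seed shown in the status line.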
models.json ADDED
@@ -0,0 +1,79 @@
+ {
+   "_comment": "Editable model registry. Add/remove entries freely, then restart the Space (or click 🔄 Reload models). Each entry must keep these fields. See README_TH.md for the field guide.",
+
+   "models": [
+     {
+       "id": "sd15-realistic-base",
+       "label": "SD1.5 · Realistic Base",
+       "base": "sd15",
+       "type": "checkpoint",
+       "repo_id": "stable-diffusion-v1-5/stable-diffusion-v1-5",
+       "single_file_url": null,
+       "trigger": "",
+       "recommended_prompt": "RAW photo, a beautiful woman, detailed skin, soft lighting, 50mm, depth of field",
+       "negative_prompt": "(worst quality, low quality:1.4), deformed, extra fingers, watermark, text",
+       "default_steps": 28,
+       "default_guidance": 6.5,
+       "enabled": true
+     },
+     {
+       "id": "sd15-asian-girls-face-lora",
+       "label": "Asian Girls Face (LoRA · SD1.5)",
+       "base": "sd15",
+       "type": "lora",
+       "repo_id": "stable-diffusion-v1-5/stable-diffusion-v1-5",
+       "lora_url": "https://civitai.com/api/download/models/67980",
+       "lora_repo_id": null,
+       "lora_weight_name": null,
+       "lora_scale": 0.8,
+       "trigger": "",
+       "recommended_prompt": "RAW photo, asian girl, pretty face, natural skin texture, cinematic light",
+       "negative_prompt": "(worst quality, low quality:1.4), deformed, watermark, text",
+       "default_steps": 28,
+       "default_guidance": 6.5,
+       "enabled": true
+     },
+     {
+       "id": "sdxl-base",
+       "label": "SDXL · Base 1.0",
+       "base": "sdxl",
+       "type": "checkpoint",
+       "repo_id": "stabilityai/stable-diffusion-xl-base-1.0",
+       "single_file_url": null,
+       "trigger": "",
+       "recommended_prompt": "cinematic photo of a beautiful woman, 35mm, highly detailed, soft natural light",
+       "negative_prompt": "lowres, bad anatomy, worst quality, watermark, text",
+       "default_steps": 30,
+       "default_guidance": 6.0,
+       "enabled": true
+     },
+     {
+       "id": "flux-schnell",
+       "label": "FLUX.1 · Schnell (fast, open)",
+       "base": "flux",
+       "type": "checkpoint",
+       "repo_id": "black-forest-labs/FLUX.1-schnell",
+       "single_file_url": null,
+       "trigger": "",
+       "recommended_prompt": "a photorealistic portrait of a young woman, studio light, sharp focus, ultra detailed",
+       "negative_prompt": "",
+       "default_steps": 4,
+       "default_guidance": 0.0,
+       "enabled": true
+     },
+     {
+       "id": "flux-dev",
+       "label": "FLUX.1 · Dev (gated · needs HF token)",
+       "base": "flux",
+       "type": "checkpoint",
+       "repo_id": "black-forest-labs/FLUX.1-dev",
+       "single_file_url": null,
+       "trigger": "",
+       "recommended_prompt": "a photorealistic portrait of a young woman, golden hour, 85mm, bokeh, ultra detailed",
+       "negative_prompt": "",
+       "default_steps": 25,
+       "default_guidance": 3.5,
+       "enabled": false
+     }
+   ]
+ }
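Reviewer sketch (not part of this commit): `get_model` in `pipeline_manager.py` returns the first entry whose `id` matches, so a duplicated `id` in the registry would make the later entry unreachable. The hypothetical helper `duplicate_ids` below shows one way to detect that from the raw JSON text; it assumes the top-level `models` array used in this file.

```python
import json
from collections import Counter


def duplicate_ids(registry_text):
    """Return the sorted ids that appear more than once in a registry JSON string."""
    data = json.loads(registry_text)
    # Count every id across the models array (entries without an id count as None).
    counts = Counter(m.get("id") for m in data.get("models", []))
    return sorted(i for i, n in counts.items() if n > 1 and i is not None)
```

An empty result means every `id` is unique, which is what the field guide asks for.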
pipeline_manager.py ADDED
@@ -0,0 +1,320 @@
+ """
+ pipeline_manager.py
+ -------------------
+ Loads diffusion pipelines from an editable registry (models.json) and runs
+ generation across multiple base families (SD1.5 / SDXL / FLUX) and multiple
+ input modes (txt2img / img2img / IP-Adapter / Face identity).
+
+ Designed for Hugging Face ZeroGPU: pipelines are built/cached on CPU and moved
+ to CUDA inside the @spaces.GPU-decorated caller (see app.py). Nothing here calls
+ .cuda() at import time.
+ """
+
+ import os
+ import json
+ import gc
+ import hashlib
+ import urllib.request
+ from pathlib import Path
+
+ import torch
+
+ # ---------------------------------------------------------------------------
+ # Constants / paths
+ # ---------------------------------------------------------------------------
+ HERE = Path(__file__).parent
+ REGISTRY_PATH = HERE / "models.json"
+ DOWNLOAD_DIR = Path(os.environ.get("CS_CACHE_DIR", "/tmp/cs_models"))
+ DOWNLOAD_DIR.mkdir(parents=True, exist_ok=True)
+
+ CIVITAI_TOKEN = os.environ.get("CIVITAI_TOKEN", "").strip()
+ HF_TOKEN = os.environ.get("HF_TOKEN", "").strip() or None
+
+ DTYPE = torch.bfloat16 if torch.cuda.is_available() else torch.float32
+ # SD1.5 / SDXL are most stable in float16; FLUX prefers bfloat16.
+ DTYPE_SD = torch.float16
+
+ DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
+
+ # Modes supported per base family. Used by the UI to gate options.
+ SUPPORTED_MODES = {
+     "sd15": ["txt2img", "img2img", "ip_adapter", "face_id"],
+     "sdxl": ["txt2img", "img2img", "ip_adapter", "face_id"],
+     "flux": ["txt2img", "img2img"],
+ }
+
+ MODE_LABELS = {
+     "txt2img": "Text → Image",
+     "img2img": "Image → Image (denoise)",
+     "ip_adapter": "IP-Adapter (style / subject)",
+     "face_id": "Face identity (FaceID)",
+ }
+
+ # ---------------------------------------------------------------------------
+ # Registry
+ # ---------------------------------------------------------------------------
+ def load_registry():
+     """Read models.json and return the list of enabled model configs."""
+     with open(REGISTRY_PATH, "r", encoding="utf-8") as f:
+         data = json.load(f)
+     models = [m for m in data.get("models", []) if m.get("enabled", True)]
+     return models
+
+
+ def get_model(models, model_id):
+     for m in models:
+         if m["id"] == model_id:
+             return m
+     return None
+
+
+ # ---------------------------------------------------------------------------
+ # Download helpers (Civitai / arbitrary URL → local cache)
+ # ---------------------------------------------------------------------------
+ def _download_url(url):
+     """Download a (Civitai or other) URL to the local cache and return the path."""
+     if not url:
+         return None
+     fname = hashlib.sha1(url.encode()).hexdigest()[:16] + ".safetensors"
+     dest = DOWNLOAD_DIR / fname
+     if dest.exists() and dest.stat().st_size > 1_000_000:
+         return str(dest)
+
+     dl_url = url
+     if "civitai.com" in url and CIVITAI_TOKEN and "token=" not in url:
+         sep = "&" if "?" in url else "?"
+         dl_url = f"{url}{sep}token={CIVITAI_TOKEN}"
+
+     req = urllib.request.Request(dl_url, headers={"User-Agent": "Mozilla/5.0"})
+     print(f"[download] {url} -> {dest}")
+     with urllib.request.urlopen(req) as resp, open(dest, "wb") as out:
+         while True:
+             chunk = resp.read(1 << 20)
+             if not chunk:
+                 break
+             out.write(chunk)
+     return str(dest)
+
+
+ # ---------------------------------------------------------------------------
+ # Pipeline cache
+ # ---------------------------------------------------------------------------
+ # Keyed by model id. Stores the base txt2img pipeline (CPU). Adapters are loaded
+ # on demand and tracked via the `_cs_adapter` attribute on the pipe.
+ _PIPE_CACHE = {}
+ _FACE_APP = None  # lazy insightface FaceAnalysis
+
+
+ def _free_cache(keep_id=None):
+     """Evict cached pipelines except keep_id to bound memory (simple LRU-ish)."""
+     for k in list(_PIPE_CACHE.keys()):
+         if k != keep_id:
+             del _PIPE_CACHE[k]
+     gc.collect()
+     if torch.cuda.is_available():
+         torch.cuda.empty_cache()
+
+
+ def _build_base_pipeline(cfg):
+     """Construct the txt2img pipeline for a model config (on CPU)."""
+     base = cfg["base"]
+     common = dict(token=HF_TOKEN)
+
+     if base == "sd15":
+         from diffusers import StableDiffusionPipeline
+         if cfg.get("single_file_url"):
+             local = _download_url(cfg["single_file_url"])
+             pipe = StableDiffusionPipeline.from_single_file(
+                 local, torch_dtype=DTYPE_SD, safety_checker=None
+             )
+         else:
+             pipe = StableDiffusionPipeline.from_pretrained(
+                 cfg["repo_id"], torch_dtype=DTYPE_SD, safety_checker=None, **common
+             )
+
+     elif base == "sdxl":
+         from diffusers import StableDiffusionXLPipeline
+         if cfg.get("single_file_url"):
+             local = _download_url(cfg["single_file_url"])
+             pipe = StableDiffusionXLPipeline.from_single_file(local, torch_dtype=DTYPE_SD)
+         else:
+             pipe = StableDiffusionXLPipeline.from_pretrained(
+                 cfg["repo_id"], torch_dtype=DTYPE_SD, **common
+             )
+
+     elif base == "flux":
+         from diffusers import FluxPipeline
+         pipe = FluxPipeline.from_pretrained(cfg["repo_id"], torch_dtype=DTYPE, **common)
+
+     else:
+         raise ValueError(f"Unknown base family: {base}")
+
+     # Apply LoRA if this entry is a LoRA model.
+     if cfg.get("type") == "lora":
+         scale = float(cfg.get("lora_scale", 0.8))
+         if cfg.get("lora_repo_id"):
+             kwargs = {}
+             if cfg.get("lora_weight_name"):
+                 kwargs["weight_name"] = cfg["lora_weight_name"]
+             pipe.load_lora_weights(cfg["lora_repo_id"], **kwargs)
+         elif cfg.get("lora_url"):
+             local = _download_url(cfg["lora_url"])
+             pipe.load_lora_weights(local)
+         try:
+             pipe.fuse_lora(lora_scale=scale)
+         except Exception as e:  # noqa
+             print(f"[lora] fuse skipped: {e}")
+
+     pipe.set_progress_bar_config(disable=True)
+     pipe._cs_adapter = None  # track loaded IP-Adapter / FaceID state
+     return pipe
+
+
+ def get_pipeline(cfg):
+     """Return a cached base pipeline for the model, building it if needed."""
+     mid = cfg["id"]
+     if mid not in _PIPE_CACHE:
+         _free_cache(keep_id=None)  # one big model at a time on ZeroGPU
+         print(f"[pipeline] building {mid} ({cfg['base']})")
+         _PIPE_CACHE[mid] = _build_base_pipeline(cfg)
+     return _PIPE_CACHE[mid]
+
+
+ # ---------------------------------------------------------------------------
+ # Adapter management (IP-Adapter / FaceID)
+ # ---------------------------------------------------------------------------
+ _IP_ADAPTER_SPECS = {
+     "sd15": {
+         "ip_adapter": dict(repo="h94/IP-Adapter", subfolder="models",
+                            weight_name="ip-adapter-plus_sd15.bin"),
+         "face_id": dict(repo="h94/IP-Adapter-FaceID", subfolder=None,
+                         weight_name="ip-adapter-faceid_sd15.bin",
+                         image_encoder_folder=None),
+     },
+     "sdxl": {
+         "ip_adapter": dict(repo="h94/IP-Adapter", subfolder="sdxl_models",
+                            weight_name="ip-adapter-plus_sdxl_vit-h.bin"),
+         "face_id": dict(repo="h94/IP-Adapter-FaceID", subfolder=None,
+                         weight_name="ip-adapter-faceid_sdxl.bin",
+                         image_encoder_folder=None),
+     },
+ }
+
+
+ def _ensure_adapter(pipe, base, mode):
+     """Load the right IP-Adapter for `mode`, unloading any previous one."""
+     want = mode if mode in ("ip_adapter", "face_id") else None
+     if pipe._cs_adapter == want:
+         return
+     try:
+         pipe.unload_ip_adapter()
+     except Exception:
+         pass
+     pipe._cs_adapter = None
+     if want is None:
+         return
+     spec = _IP_ADAPTER_SPECS[base][want]
+     kwargs = {k: v for k, v in spec.items() if k != "repo"}
+     pipe.load_ip_adapter(spec["repo"], **kwargs)
+     pipe._cs_adapter = want
+
+
+ def _get_face_app():
+     global _FACE_APP
+     if _FACE_APP is None:
+         from insightface.app import FaceAnalysis
+         app = FaceAnalysis(name="buffalo_l",
+                            providers=["CUDAExecutionProvider", "CPUExecutionProvider"])
+         app.prepare(ctx_id=0, det_size=(640, 640))
+         _FACE_APP = app
+     return _FACE_APP
+
+
+ def _face_embeds(image):
+     """Return a torch tensor of FaceID embeddings for the largest face."""
+     import numpy as np
+     import cv2
+     app = _get_face_app()
+     arr = cv2.cvtColor(np.array(image.convert("RGB")), cv2.COLOR_RGB2BGR)
+     faces = app.get(arr)
+     if not faces:
+         raise ValueError("ไม่พบใบหน้าในรูปต้นแบบ / No face detected in the reference image.")
+     faces = sorted(faces, key=lambda f: (f.bbox[2] - f.bbox[0]) * (f.bbox[3] - f.bbox[1]))
+     emb = torch.from_numpy(faces[-1].normed_embedding)  # [512]
+     # diffusers IP-Adapter-FaceID expects [2, 1, 1, 512]: [neg, pos] for CFG.
+     emb = emb.unsqueeze(0).unsqueeze(0).unsqueeze(0)  # [1, 1, 1, 512]
+     return torch.cat([torch.zeros_like(emb), emb], dim=0).to(DTYPE_SD)
+
+
+ # ---------------------------------------------------------------------------
+ # Generation
+ # ---------------------------------------------------------------------------
+ def run_generation(cfg, mode, prompt, negative_prompt, ref_image,
+                    steps, guidance, denoise, ip_scale, width, height, seed):
+     """Run one generation. MUST be called inside a @spaces.GPU context."""
+     base = cfg["base"]
+     if mode not in SUPPORTED_MODES[base]:
+         # Thai message: "Mode '<mode>' cannot be used with base <BASE> (supported: ...)"
+         raise ValueError(
+             f"โหมด '{MODE_LABELS.get(mode, mode)}' ใช้กับ base {base.upper()} ไม่ได้ "
+             f"(รองรับ: {', '.join(MODE_LABELS[m] for m in SUPPORTED_MODES[base])})"
+         )
+
+     pipe = get_pipeline(cfg)
+     pipe = pipe.to(DEVICE)
+
+     generator = None
+     if seed is not None and int(seed) >= 0:
+         generator = torch.Generator(device=DEVICE).manual_seed(int(seed))
+
+     full_prompt = prompt
+     if cfg.get("trigger"):
+         full_prompt = f"{cfg['trigger']}, {prompt}".strip(", ")
+
+     call = dict(
+         prompt=full_prompt,
+         num_inference_steps=int(steps),
+         generator=generator,
+         width=int(width),
+         height=int(height),
+     )
+
+     # FLUX uses `guidance_scale` differently and has no negative prompt.
+     call["guidance_scale"] = float(guidance)
+     if base != "flux":
+         call["negative_prompt"] = negative_prompt or None
+
+     # ----- mode wiring -----
+     if mode == "txt2img":
+         _ensure_adapter(pipe, base, None)
+
+     elif mode == "img2img":
+         if base != "flux":
+             _ensure_adapter(pipe, base, None)
+         if ref_image is None:
+             raise ValueError("img2img ต้องอัปโหลดรูปต้นแบบก่อน / Upload a reference image first.")
+         from diffusers import AutoPipelineForImage2Image
+         i2i = AutoPipelineForImage2Image.from_pipe(pipe).to(DEVICE)
+         call.pop("width"); call.pop("height")
+         call["image"] = ref_image.convert("RGB")
+         call["strength"] = float(denoise)
+         out = i2i(**call).images[0]
+         return out
+
+     elif mode == "ip_adapter":
+         if ref_image is None:
+             raise ValueError("IP-Adapter ต้องอัปโหลดรูปต้นแบบก่อน / Upload a reference image first.")
+         _ensure_adapter(pipe, base, "ip_adapter")
+         pipe.set_ip_adapter_scale(float(ip_scale))
+         call["ip_adapter_image"] = ref_image.convert("RGB")
+
+     elif mode == "face_id":
+         if ref_image is None:
+             raise ValueError("Face identity ต้องอัปโหลดรูปใบหน้าก่อน / Upload a face image first.")
+         _ensure_adapter(pipe, base, "face_id")
+         pipe.set_ip_adapter_scale(float(ip_scale))
+         embeds = _face_embeds(ref_image).to(DEVICE)
+         call["ip_adapter_image_embeds"] = [embeds]
+
+     out = pipe(**call).images[0]
+     return out
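Reviewer sketch (not part of this commit): the two string-level steps inside `_download_url` (cache file naming and Civitai token appending) can be isolated as pure functions and tested without any network access. `cache_filename` and `with_civitai_token` are hypothetical names introduced here; the logic is copied from the download helper.

```python
import hashlib


def cache_filename(url):
    """Name the cached file: first 16 hex chars of SHA-1(url) plus the extension."""
    return hashlib.sha1(url.encode()).hexdigest()[:16] + ".safetensors"


def with_civitai_token(url, token):
    """Append token=... to a Civitai URL that does not already carry one.

    Uses '&' when the URL already has a query string and '?' otherwise.
    Non-Civitai URLs and empty tokens leave the URL unchanged.
    """
    if "civitai.com" in url and token and "token=" not in url:
        sep = "&" if "?" in url else "?"
        return f"{url}{sep}token={token}"
    return url
```

Because the cache key is derived from the original URL and the download log prints `url` rather than the token-bearing `dl_url`, the token does not end up in the file name or the log line.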
requirements.txt ADDED
@@ -0,0 +1,18 @@
+ # ZeroGPU provides the CUDA torch build; do not pin torch hard.
+ spaces
+ torch
+ torchvision
+ diffusers>=0.31.0
+ transformers>=4.44.0
+ accelerate>=0.33.0
+ peft>=0.12.0
+ safetensors>=0.4.3
+ sentencepiece
+ protobuf
+ huggingface_hub>=0.25.0
+ Pillow
+ numpy
+ opencv-python-headless
+ # Face-identity mode (IP-Adapter FaceID). Heavy; comment out if you don't use Face mode.
+ insightface==0.7.3
+ onnxruntime