Spaces: mamungtai-sat/character-studio


pormungtai committed 26 days ago

Commit 0446524

1 Parent(s): a1a4f4b

Fix scene truncation: front-load scene/lighting/composition in build_prompt + tighter compact-tag Typhoon output (fewer tokens, keep location) (#33)


- Fix scene truncation: front-load scene/lighting/composition in build_prompt + tighter compact-tag Typhoon output (fewer tokens, keep location) (84c945dd6150562415f83c5db317982bf6c1a838)

Co-authored-by: pormungtailaw <pormungtai@users.noreply.huggingface.co>

Files changed (2)

app.py +4 -2
pipeline_manager.py +11 -9

app.py CHANGED

@@ -166,8 +166,10 @@ def build_prompt(subject, age, ethnicity, skin, face, body, hair, eyes, outfit,
     if age and str(age).strip():
         who = f"{who} อายุ {str(age).strip()} ปี"
     parts.append(who)
-    # skin tone + face shape sit right after "who" (core identity), then the rest.
-    for v in (skin, face, body, hair, eyes, outfit, pose, expression, scene, lighting):
+    # Priority order for CLIP's 77-token budget: compositional anchors first
+    # (location/lighting/outfit/pose), fine appearance details last (least harmful
+    # if truncated). Skin texture realism is carried by the model's style_prefix anyway.
+    for v in (scene, lighting, outfit, pose, expression, body, hair, skin, face, eyes):
         if v and str(v).strip():
             parts.append(str(v).strip())
     thai = ", ".join(parts)
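
Reviewer note: the effect of the reordering can be illustrated with a small standalone sketch. It is not code from this commit. `approx_tokens` and `truncate_parts` are hypothetical helpers, the field values are invented sample data, and the word-based counter is only a rough stand-in for CLIP's BPE tokenizer (real truncation also happens per token, not per part). Under those assumptions, with a deliberately small budget, the old order loses the scene while the new order keeps it:

```python
def approx_tokens(text):
    # Crude stand-in for CLIP's BPE tokenizer (assumption): one token per
    # whitespace-separated word plus one per comma. Real counts will differ.
    return len(text.replace(",", " , ").split())


def truncate_parts(parts, budget):
    """Keep leading parts for as long as the joined prompt fits the budget."""
    kept = []
    for part in parts:
        candidate = ", ".join(kept + [part])
        if approx_tokens(candidate) > budget:
            break  # everything from here on would be cut off
        kept.append(part)
    return kept


# Invented sample values, one per build_prompt field.
fields = {
    "skin": "light tan skin", "face": "oval face", "body": "slim build",
    "hair": "long black hair", "eyes": "brown eyes",
    "outfit": "white linen shirt", "pose": "standing",
    "expression": "soft smile", "scene": "night market street",
    "lighting": "warm lantern light",
}
old_order = ("skin", "face", "body", "hair", "eyes",
             "outfit", "pose", "expression", "scene", "lighting")
new_order = ("scene", "lighting", "outfit", "pose", "expression",
             "body", "hair", "skin", "face", "eyes")

budget = 20  # small on purpose so the cut-off is visible; CLIP's limit is 77
old_kept = truncate_parts(["woman"] + [fields[k] for k in old_order], budget)
new_kept = truncate_parts(["woman"] + [fields[k] for k in new_order], budget)

print("old:", ", ".join(old_kept))
print("new:", ", ".join(new_kept))
```

With the old order the prompt is cut before the outfit, pose, scene and lighting are reached; with the new order the parts that fall off the end are appearance details such as eye colour.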

pipeline_manager.py CHANGED

@@ -162,15 +162,17 @@ def translate_prompt(text, engine):
             return tok.batch_decode(out, skip_special_tokens=True)[0].strip()
         # typhoon: ask the LLM to rewrite as a clean English image prompt
         msgs = [
-            {"role": "system", "content": "You convert Thai text-to-image prompts "
-             "into a single concise, vivid English prompt for a PHOTOREALISTIC Stable "
-             "Diffusion model. Describe it as a real candid photograph: keep the subject, "
-             "clothing, pose, and scene, and add realistic photographic detail (natural "
-             "skin texture and pores, real hair strands, lifelike eyes, soft natural "
-             "light). NEVER use illustration/painting/anime/CG words such as 'masterpiece', "
-             "'best quality', 'artstation', 'render', '3d', 'anime' or 'painting'. "
-             "Output ONLY the English prompt as a comma-separated phrase — no quotes, "
-             "no explanation."},
+            {"role": "system", "content": "You convert Thai text-to-image prompts into "
+             "an English prompt for a PHOTOREALISTIC Stable Diffusion model. Output a "
+             "COMPACT comma-separated list of English tags / short phrases (booru-tag "
+             "style) — NOT full sentences. Omit articles and filler words (a, an, the, "
+             "with, that is). Keep it short to fit a 77-token limit, but INCLUDE EVERY "
+             "detail from the input — especially the location/scene, camera framing "
+             "(e.g. full body), clothing and pose; never drop the setting. Treat it as a "
+             "real candid photograph (natural skin texture, real hair, lifelike eyes, "
+             "natural light). NEVER use illustration/painting/anime/CG words such as "
+             "'masterpiece', 'best quality', 'render', '3d', 'anime' or 'painting'. "
+             "Output ONLY the comma-separated tags — no quotes, no explanation."},
             {"role": "user", "content": text},
         ]
         chat = tok.apply_chat_template(msgs, add_generation_prompt=True, tokenize=False)
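
Reviewer note: the system prompt asks the LLM for compact tags, but a model may still return quotes, articles, repeated tags or banned style words. A post-processing pass along the following lines could enforce the same rules on the reply. This is a hedged sketch, not part of the commit: `clean_tags`, `BANNED` and `FILLER` are hypothetical names, and the word lists are copied from the new system prompt above.

```python
import re

# Style words the system prompt forbids, and filler words it asks to omit.
BANNED = ("masterpiece", "best quality", "render", "3d", "anime", "painting")
FILLER = {"a", "an", "the", "with"}


def clean_tags(raw):
    """Normalise an LLM reply into a compact, de-duplicated tag string."""
    tags, seen = [], set()
    # Drop surrounding whitespace and quotes before splitting on commas.
    for chunk in raw.strip().strip("\"'").split(","):
        words = [w for w in chunk.split() if w.lower() not in FILLER]
        tag = " ".join(words)
        key = tag.lower()
        if not tag or key in seen:
            continue  # skip empty or repeated tags (case-insensitive)
        if any(re.search(r"\b" + re.escape(b) + r"\b", key) for b in BANNED):
            continue  # skip tags containing a banned style word
        seen.add(key)
        tags.append(tag)
    return ", ".join(tags)


reply = '"a woman, the night market, full body, masterpiece, Full Body, white linen shirt"'
print(clean_tags(reply))
```

Filtering after generation is a complement to the prompt, not a replacement: it cannot restore a setting the model dropped, which is why the prompt itself insists on keeping the location.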