fffiloni committed · Commit 4dab514 · verified · 1 Parent(s): 6125e5e

Upload 5 files

Files changed (3)
  1. CHANGELOG.md +28 -0
  2. README.md +10 -0
  3. app.py +207 -1
CHANGELOG.md CHANGED
@@ -1,5 +1,11 @@
 # Changelog
 
+## v5.2
+
+- Fix Phase 5 generated Space runtime dependency conflict by pinning `huggingface_hub>=0.34.0,<1.0.0` in target Space `requirements.txt`.
+- Add a Pi instruction not to remove the `huggingface_hub` compatibility pin.
+
+
 ## V5
 
 - Added Phase 5: `model_id` → model metadata analysis → Pi-adapted Gradio template → private Space → live API validation.
@@ -34,3 +40,25 @@
 - Added a new `pi_gist_recipe` worker payload.
 - The wrapper still performs independent final validation through the live Gradio API before declaring success.
 - Saved artifacts include `generated/GOAL.md`, optional `generated/PI_SUMMARY.md`, Pi logs, traces, API schema, and API test result.
+
+
+## v5.1
+
+- Fixed Phase 5 Pi invocation for Pi 0.73.x: use `pi -p` instead of removed `--prompt`.
+- No architecture changes; Phase 5 remains wrapper-owned for Hub operations and live API validation.
+
+
+## V6
+
+- Added Phase 6 Runtime Recommender.
+- Adds a no-build HF Job that analyzes model metadata, estimated file sizes, task/library, risks, and recommends CPU Basic / CPU Upgrade / ZeroGPU candidate / manual review.
+- Writes `model_analysis.json`, `runtime_recommendation.json`, `state.json`, `events.jsonl`, and `report.md` to the Bucket.
+
+## V7 — LongCat article reproduction pass
+
+- Added Phase 7: LongCat article-style reproduction workflow.
+- Adds a dedicated HF Job worker that asks Pi to adapt a LongCat Space scaffold using the HF Spaces gist.
+- Creates the target Space privately.
+- Requests `zero-a10g` first, with optional fixed GPU fallback (`l40sx1` by default).
+- Validates a cheap `/health` endpoint live via `gradio_client` before marking success.
+- Stores hardware attempts, model analysis, generated files, Pi logs, traces, and report in the bucket.
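
Review note: the V7 entry describes a "preferred hardware first, optional fixed GPU fallback" flow and says hardware attempts are stored. The worker code is not part of this diff, so the following is only a minimal sketch of how such a flow could look. The function name `request_hardware_with_fallback`, the injected `request` callable, and the shape of the returned dict are all assumptions made for illustration; only the hardware names `zero-a10g` and `l40sx1` come from the changelog.

```python
from typing import Callable, Optional


def request_hardware_with_fallback(
    request: Callable[[str], None],
    preferred: str = "zero-a10g",
    fallback: Optional[str] = "l40sx1",
    allow_fallback: bool = False,
) -> dict:
    """Try `preferred`, then `fallback` if allowed, recording every attempt.

    `request` is a hypothetical callable that asks the Hub for one hardware
    flavor and raises on failure; it is injected so the logic stays testable.
    """
    candidates = [preferred]
    if allow_fallback and fallback and fallback != preferred:
        candidates.append(fallback)

    attempts = []
    for hardware in candidates:
        try:
            request(hardware)
        except Exception as exc:  # broad on purpose: the real error type is not shown in the diff
            attempts.append({"hardware": hardware, "ok": False, "error": str(exc)})
            continue
        attempts.append({"hardware": hardware, "ok": True, "error": None})
        return {"selected": hardware, "attempts": attempts}

    # Every candidate failed; the caller decides how to report this.
    return {"selected": None, "attempts": attempts}
```

In a real worker, `request` could plausibly wrap `huggingface_hub.HfApi().request_space_hardware(repo_id=..., hardware=hardware)`; whether the actual worker does so is not visible in this commit. The sketch defaults `allow_fallback` to `False` so that billable hardware is only requested when the caller opts in.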
README.md CHANGED
@@ -140,3 +140,13 @@ Phase 4 asks Pi to follow the HF Spaces Agent Quickstart gist and use the `hf` C
 ## V5
 
 Adds Phase 5: model-card analysis for simple Transformers text pipeline models. Recommended first test: `sshleifer/tiny-gpt2`. The Space remains private and success is still gated by wrapper-owned live API validation.
+
+
+## Phase 6
+
+Adds a no-build runtime recommender Job that analyzes model metadata and writes `runtime_recommendation.json` to the Bucket.
+
+
+## Phase 7 — LongCat article reproduction
+
+Phase 7 attempts an article-style LongCat Space build: private target Space, Pi-guided app adaptation, ZeroGPU first, fixed GPU fallback when explicitly enabled, and live `/health` API validation. Full video generation remains a manual-review step until model-specific runtime validation is complete.
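
Review note: the Phase 6 recommender itself is not included in this commit. The changelog lists four outcomes (CPU Basic, CPU Upgrade, ZeroGPU candidate, manual review), and the Phase 6 tab text in `app.py` says small text models go through Phase 5, Diffusers models become ZeroGPU candidates, and large/custom/gated models are marked for manual review. A rule set along those lines might be sketched as below. The `ModelFacts` fields and every size threshold are hypothetical, chosen only to make the example concrete.

```python
from dataclasses import dataclass

# Hypothetical cut-offs; the commit does not state the real thresholds.
CPU_BASIC_MAX_GB = 1.0
CPU_UPGRADE_MAX_GB = 4.0
MANUAL_REVIEW_MIN_GB = 40.0


@dataclass
class ModelFacts:
    """Illustrative summary of model metadata (field names are assumptions)."""
    library: str            # e.g. "transformers" or "diffusers"
    total_size_gb: float    # estimated size of the weight files
    gated: bool = False
    custom_code: bool = False


def recommend_runtime(facts: ModelFacts) -> str:
    """Map model facts to one of the four outcomes named in the changelog."""
    # Risky cases are checked first so they never get an automatic build.
    if facts.gated or facts.custom_code or facts.total_size_gb >= MANUAL_REVIEW_MIN_GB:
        return "manual review"
    if facts.library == "diffusers":
        return "ZeroGPU candidate"
    if facts.total_size_gb <= CPU_BASIC_MAX_GB:
        return "CPU Basic"
    if facts.total_size_gb <= CPU_UPGRADE_MAX_GB:
        return "CPU Upgrade"
    return "ZeroGPU candidate"
```

Checking the manual-review conditions before anything else keeps the gate conservative: a gated or custom-code model is never promoted to an automatic build just because it is small.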
app.py CHANGED
@@ -14,6 +14,7 @@ from src.jobs import (
     launch_hello_job,
     launch_pi_gist_recipe_job,
     launch_pi_model_card_job,
+    launch_runtime_recommender_job,
     launch_pi_space_smoke_job,
 )
 from src.runs import make_run_id, validate_run_id
@@ -21,7 +22,7 @@ from src.security import redact
 
 
 APP_DESCRIPTION = f"""
-# Agentic Space Factory — V5 Model Card
+# Agentic Space Factory — V6 Runtime Recommender
 
 This version validates the two critical foundations:
 
@@ -31,6 +32,8 @@ Phase 2: HF OAuth → HF Job → private target Space → file upload → live G
 Phase 3: HF OAuth → HF Job → Pi modifies app.py → private target Space → live API validation → Pi traces
 Phase 4: HF OAuth → HF Job → Pi reads gist → uses hf CLI → private Space → wrapper live API validation
 Phase 5: HF OAuth → HF Job → model-card analysis → Pi adapts template → private model Space → live API validation
+Phase 6: HF OAuth → HF Job → model-card/runtime analysis → runtime/hardware recommendation → Bucket report
+Phase 7: HF OAuth → HF Job → LongCat article-style Space → ZeroGPU attempt → fixed GPU fallback → live health API validation
 ```
 
 Configured bucket: `{settings.bucket_uri}`
@@ -81,6 +84,71 @@ def propose_model_run_id() -> str:
     return make_run_id("model")
 
 
+def propose_runtime_run_id() -> str:
+    return make_run_id("runtime")
+
+
+def propose_longcat_run_id() -> str:
+    return make_run_id("longcat")
+
+
+def launch_longcat_article_job_ui(
+    requested_run_id: str,
+    model_id: str,
+    target_space_name: str,
+    pi_model: str,
+    preferred_hardware: str,
+    allow_fixed_gpu_fallback: bool,
+    fallback_hardware: str,
+    profile: gr.OAuthProfile | None,
+    oauth_token: gr.OAuthToken | None,
+) -> tuple[str, str, str, str, str, str]:
+    username = _profile_username(profile)
+    token = _token_value(oauth_token)
+    if not username or not token:
+        raise gr.Error("Please sign in with Hugging Face first. OAuth profile/token is missing.")
+
+    run_id = validate_run_id(requested_run_id or propose_longcat_run_id())
+    result = launch_longcat_article_job(
+        token=token,
+        username=username,
+        target_slug=target_space_name,
+        model_id=model_id,
+        pi_model=pi_model,
+        preferred_space_hardware=preferred_hardware,
+        fallback_space_hardware=fallback_hardware,
+        allow_fixed_gpu_fallback=allow_fixed_gpu_fallback,
+        run_id=run_id,
+    )
+    job_url = result.get("job_url") or ""
+    target_space = result.get("target_space") or ""
+    target_url = result.get("target_space_url") or ""
+    summary = json.dumps(result, indent=2)
+    return run_id, result["job_id"], job_url, target_space, target_url, summary
+
+
+def launch_runtime_recommender_job_ui(
+    requested_run_id: str,
+    model_id: str,
+    profile: gr.OAuthProfile | None,
+    oauth_token: gr.OAuthToken | None,
+) -> tuple[str, str, str, str]:
+    username = _profile_username(profile)
+    token = _token_value(oauth_token)
+    if not username or not token:
+        raise gr.Error("Please sign in with Hugging Face first. OAuth profile/token is missing.")
+
+    run_id = validate_run_id(requested_run_id or propose_runtime_run_id())
+    result = launch_runtime_recommender_job(
+        token=token,
+        username=username,
+        model_id=model_id,
+        run_id=run_id,
+    )
+    job_url = result.get("job_url") or ""
+    summary = json.dumps(result, indent=2)
+    return run_id, result["job_id"], job_url, summary
+
 
 def launch_pi_model_card_job_ui(
     requested_run_id: str,
@@ -245,6 +313,144 @@ def build_demo() -> gr.Blocks:
         demo.load(fn=get_login_status, inputs=None, outputs=login_status)
 
 
+        with gr.Tab("Phase 7 — LongCat article reproduction"):
+            gr.Markdown(
+                """
+                This phase attempts to reproduce the article-style workflow for `meituan-longcat/LongCat-Video-Avatar-1.5`.
+
+                It creates a **private** target Space, asks Pi to adapt a LongCat app scaffold while following the HF Spaces gist, requests `zero-a10g` first, and optionally falls back to a fixed GPU hardware if ZeroGPU is unavailable/quota-limited.
+
+                Safety: the Space remains private, publication is never automatic, and the wrapper validates a cheap `/health` endpoint first. Full video generation may still require manual review and real GPU/runtime tuning.
+                """
+            )
+            with gr.Row():
+                longcat_run_id_box = gr.Textbox(label="Run ID", value=propose_longcat_run_id, interactive=True)
+                new_longcat_run_btn = gr.Button("Generate new run id")
+            new_longcat_run_btn.click(fn=propose_longcat_run_id, inputs=None, outputs=longcat_run_id_box)
+
+            longcat_model_id_box = gr.Textbox(
+                label="Model ID",
+                value="meituan-longcat/LongCat-Video-Avatar-1.5",
+                info="Default is the model from the article. You can override for controlled experiments.",
+            )
+            longcat_target_space_name = gr.Textbox(
+                label="Target Space name",
+                placeholder="e.g. space-factory-longcat-v1",
+                info="Use a fresh name. The Space is created under your username and remains private.",
+            )
+            longcat_pi_model_box = gr.Textbox(
+                label="Pi model",
+                value="moonshotai/Kimi-K2.5",
+                info="Model used by Pi through Hugging Face Inference Providers.",
+            )
+            with gr.Row():
+                longcat_preferred_hw = gr.Dropdown(
+                    label="Preferred Space hardware",
+                    choices=["zero-a10g", "l40sx1", "a10g-large", "a100-large", "h200"],
+                    value="zero-a10g",
+                    info="The worker requests this first. Use zero-a10g to try ZeroGPU.",
+                )
+                longcat_allow_fallback = gr.Checkbox(
+                    label="Allow fixed GPU fallback",
+                    value=True,
+                    info="If ZeroGPU request fails, request the fallback hardware below. This may incur billing.",
+                )
+                longcat_fallback_hw = gr.Dropdown(
+                    label="Fallback Space hardware",
+                    choices=["l40sx1", "a10g-large", "a100-large", "h200", "t4-medium"],
+                    value="l40sx1",
+                    info="Used only if preferred hardware request fails and fallback is enabled.",
+                )
+
+            launch_longcat_btn = gr.Button("Run LongCat article reproduction", variant="primary")
+            phase7_job_id_box = gr.Textbox(label="Job ID", interactive=True)
+            phase7_job_url_box = gr.Textbox(label="Job URL", interactive=False)
+            phase7_target_space_box = gr.Textbox(label="Target Space", interactive=False)
+            phase7_target_url_box = gr.Textbox(label="Target Space URL", interactive=False)
+            phase7_launch_result = gr.Code(label="Launch result", language="json")
+
+            launch_longcat_btn.click(
+                fn=launch_longcat_article_job_ui,
+                inputs=[
+                    longcat_run_id_box,
+                    longcat_model_id_box,
+                    longcat_target_space_name,
+                    longcat_pi_model_box,
+                    longcat_preferred_hw,
+                    longcat_allow_fallback,
+                    longcat_fallback_hw,
+                ],
+                outputs=[
+                    longcat_run_id_box,
+                    phase7_job_id_box,
+                    phase7_job_url_box,
+                    phase7_target_space_box,
+                    phase7_target_url_box,
+                    phase7_launch_result,
+                ],
+            )
+
+            phase7_refresh_btn = gr.Button("Refresh Phase 7 run status")
+            with gr.Tab("Phase 7 state"):
+                phase7_state = gr.Code(label="state.json", language="json")
+            with gr.Tab("Phase 7 events"):
+                phase7_events = gr.Code(label="events.jsonl", language="json")
+            with gr.Tab("Phase 7 report"):
+                phase7_report = gr.Markdown()
+            with gr.Tab("Phase 7 job"):
+                phase7_job_info = gr.Code(label="Job info/logs", language="json")
+
+            phase7_refresh_btn.click(
+                fn=refresh_run_ui,
+                inputs=[longcat_run_id_box, phase7_job_id_box],
+                outputs=[phase7_state, phase7_events, phase7_report, phase7_job_info],
+            )
+
+        with gr.Tab("Phase 6 — Runtime recommender"):
+            gr.Markdown(
+                """
+                This phase does **not** create a Space. It analyzes a `model_id` and writes a runtime/hardware recommendation into the Bucket.
+
+                Use it as a gate before auto-building a Space: small text models can go through Phase 5, Diffusers models become ZeroGPU candidates, and large/custom/gated models are marked for manual review.
+                """
+            )
+            with gr.Row():
+                runtime_run_id_box = gr.Textbox(label="Run ID", value=propose_runtime_run_id, interactive=True)
+                new_runtime_run_btn = gr.Button("Generate new run id")
+            new_runtime_run_btn.click(fn=propose_runtime_run_id, inputs=None, outputs=runtime_run_id_box)
+
+            runtime_model_id_box = gr.Textbox(
+                label="Model ID",
+                value="sshleifer/tiny-gpt2",
+                info="Try `sshleifer/tiny-gpt2` for CPU Basic, or a Diffusers text-to-image model to see a ZeroGPU candidate recommendation.",
+            )
+            launch_runtime_btn = gr.Button("Analyze runtime recommendation", variant="primary")
+            phase6_job_id_box = gr.Textbox(label="Job ID", interactive=False)
+            phase6_job_url_box = gr.Textbox(label="Job URL", interactive=False)
+            phase6_launch_result = gr.Code(label="Launch result", language="json")
+
+            launch_runtime_btn.click(
+                fn=launch_runtime_recommender_job_ui,
+                inputs=[runtime_run_id_box, runtime_model_id_box],
+                outputs=[runtime_run_id_box, phase6_job_id_box, phase6_job_url_box, phase6_launch_result],
+            )
+
+            phase6_refresh_btn = gr.Button("Refresh Phase 6 run status")
+            with gr.Tab("Phase 6 state"):
+                phase6_state = gr.Code(label="state.json", language="json")
+            with gr.Tab("Phase 6 events"):
+                phase6_events = gr.Code(label="events.jsonl", language="json")
+            with gr.Tab("Phase 6 report"):
+                phase6_report = gr.Markdown()
+            with gr.Tab("Phase 6 job"):
+                phase6_job_info = gr.Code(label="Job info/logs", language="json")
+
+            phase6_refresh_btn.click(
+                fn=refresh_run_ui,
+                inputs=[runtime_run_id_box, phase6_job_id_box],
+                outputs=[phase6_state, phase6_events, phase6_report, phase6_job_info],
+            )
+
         with gr.Tab("Phase 5 — Model card → private Space"):
             gr.Markdown(
                 """