metadata
title: Agentic Space Factory
emoji: 🏭
colorFrom: blue
colorTo: purple
sdk: gradio
app_file: app.py
python_version: '3.11'
pinned: false
hf_oauth: true
hf_oauth_expiration_minutes: 480
hf_oauth_scopes:
  - read-repos
  - write-repos
  - manage-repos
  - gated-repos
  - inference-api
  - jobs
  - read-billing

Agentic Space Factory — V10 Universal Model-Card Builder

This version validates the safe foundation for a Hugging Face-native “Agentic Space Factory”.

The foundation consists of two phases:

Phase 1:
Gradio Space OAuth user
→ launch Hugging Face Job with the user's OAuth token
→ mount private Storage Bucket
→ write run state/events/report
→ read run status back in the orchestrator UI

Phase 2.1:
Gradio Space OAuth user
→ launch Hugging Face Job
→ create private target Gradio Space in the user's namespace
→ upload app.py / requirements.txt / README.md
→ validate the live Space through gradio_client
→ write run state/events/report to the Bucket
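
Both phases start by launching a Job with the user's OAuth token. The sketch below shows one way to assemble that request so the token travels only as a secret and never appears in the command line. It is a minimal illustration, not the Space's actual code: `build_job_request`, the `worker.py` entry point, the `--run-id` flag, and the `HF_TOKEN` secret name are all hypothetical. The keyword names are chosen to mirror `huggingface_hub.run_job` (`image`, `command`, `flavor`, `timeout`, `secrets`); the bucket mount is left out because this README does not show how it is configured.

```python
from typing import Any


def build_job_request(oauth_token: str, run_id: str) -> dict[str, Any]:
    """Assemble keyword arguments for a Job launch (hypothetical helper)."""
    if not oauth_token:
        raise ValueError("an OAuth token is required to launch a Job")
    return {
        "image": "python:3.12",
        # The command carries only non-sensitive values such as the run id.
        "command": ["python", "worker.py", "--run-id", run_id],
        "flavor": "cpu-basic",
        "timeout": "15m",
        # The token is handed over as a secret, never as a CLI argument.
        "secrets": {"HF_TOKEN": oauth_token},
    }


request = build_job_request("hf_example_token", "run-001")
```

Inside a Space, a dict like this could be expanded into a `huggingface_hub.run_job(...)` call; keeping the assembly in a pure function makes it easy to check that the token never leaks into `command`.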

The configured bucket is:

hf://buckets/fffiloni/space-factory-runs

What this version does

  • Enables Hugging Face OAuth in a Gradio Space.
  • Requests the jobs scope, plus repo/inference scopes needed by later phases.
  • Launches CPU Hugging Face Jobs using huggingface_hub.run_job.
  • Mounts fffiloni/space-factory-runs as /output in the Job.
  • Passes the OAuth token to the Job as an encrypted secret, not as a CLI argument.
  • Phase 1 writes these files in the bucket:
runs/<run_id>/state.json
runs/<run_id>/events.jsonl
runs/<run_id>/report.md
  • Phase 2.1 additionally creates a private target Space and stores:
runs/<run_id>/target_space.json
runs/<run_id>/generated/app.py
runs/<run_id>/generated/requirements.txt
runs/<run_id>/generated/README.md
runs/<run_id>/tests/api_schema.json
runs/<run_id>/tests/test_result.json
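
As an illustration of the Phase 1 file layout, the sketch below writes `state.json`, `events.jsonl`, and `report.md` under `runs/<run_id>/`. The function name and the field names inside each file are assumptions; the README only specifies the file paths. A temporary directory stands in for the `/output` mount.

```python
import json
import tempfile
import time
from pathlib import Path


def write_run_artifacts(mount: Path, run_id: str, status: str, events: list[dict]) -> Path:
    """Write state.json, events.jsonl and report.md under runs/<run_id>/."""
    run_dir = mount / "runs" / run_id
    run_dir.mkdir(parents=True, exist_ok=True)

    # state.json: a single snapshot of the run (field names are hypothetical).
    state = {"run_id": run_id, "status": status, "updated_at": time.time()}
    (run_dir / "state.json").write_text(json.dumps(state, indent=2), encoding="utf-8")

    # events.jsonl: one JSON object per line, appended in order.
    with (run_dir / "events.jsonl").open("a", encoding="utf-8") as fh:
        for event in events:
            fh.write(json.dumps(event) + "\n")

    # report.md: a short human-readable summary.
    report = f"# Run {run_id}\n\nStatus: {status}\n"
    (run_dir / "report.md").write_text(report, encoding="utf-8")
    return run_dir


# A temporary directory stands in for the mounted bucket.
demo_dir = write_run_artifacts(
    Path(tempfile.mkdtemp()),
    "run-001",
    "succeeded",
    [{"event": "job_started"}, {"event": "job_finished"}],
)
```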

What this version does not do yet

  • It does not run Pi yet.
  • It does not analyze model cards yet.
  • It does not configure ZeroGPU yet.
  • It does not publish anything publicly.
  • It does not overwrite existing target Spaces.

These are intentionally deferred to later increments, once the OAuth → Jobs → Bucket → private Space creation → live API validation chain is confirmed.

Configuration

Default values are in src/config.py.

You can override them with Space variables:

SPACE_FACTORY_BUCKET_SOURCE=fffiloni/space-factory-runs
SPACE_FACTORY_BUCKET_MOUNT=/output
SPACE_FACTORY_JOB_FLAVOR=cpu-basic
SPACE_FACTORY_JOB_TIMEOUT=15m
SPACE_FACTORY_JOB_IMAGE=python:3.12

For Phase 2.1, a 15-minute timeout is usually enough for a tiny Gradio Space. Increase it if Space builds are slow.
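
The override mechanism could look roughly like the sketch below. The actual contents of `src/config.py` are not shown in this README, so the `FactoryConfig` class and `load_config` function are hypothetical, and the sketch assumes the values listed above are also the defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed defaults, taken from the example values listed above.
DEFAULTS = {
    "SPACE_FACTORY_BUCKET_SOURCE": "fffiloni/space-factory-runs",
    "SPACE_FACTORY_BUCKET_MOUNT": "/output",
    "SPACE_FACTORY_JOB_FLAVOR": "cpu-basic",
    "SPACE_FACTORY_JOB_TIMEOUT": "15m",
    "SPACE_FACTORY_JOB_IMAGE": "python:3.12",
}


@dataclass(frozen=True)
class FactoryConfig:
    bucket_source: str
    bucket_mount: str
    job_flavor: str
    job_timeout: str
    job_image: str


def load_config(env: Optional[Mapping[str, str]] = None) -> FactoryConfig:
    """Read each setting from the environment, falling back to its default."""
    source = os.environ if env is None else env

    def pick(name: str) -> str:
        # Empty strings are treated as "unset" so a blank variable keeps the default.
        return source.get(name) or DEFAULTS[name]

    return FactoryConfig(
        bucket_source=pick("SPACE_FACTORY_BUCKET_SOURCE"),
        bucket_mount=pick("SPACE_FACTORY_BUCKET_MOUNT"),
        job_flavor=pick("SPACE_FACTORY_JOB_FLAVOR"),
        job_timeout=pick("SPACE_FACTORY_JOB_TIMEOUT"),
        job_image=pick("SPACE_FACTORY_JOB_IMAGE"),
    )


default_config = load_config({})
slow_build_config = load_config({"SPACE_FACTORY_JOB_TIMEOUT": "30m"})
```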

Local notes

OAuth injection only works inside a Hugging Face Space with hf_oauth: true. For local UI development, the app can render, but launching a Job requires a real OAuth token passed by Gradio in a Space.

Security posture

  • No global admin token is required.
  • The user's OAuth token is used only to launch the Job and is passed to the Job as a secret.
  • The worker script never prints the token.
  • The target bucket should remain private.
  • Phase 2.1 target Spaces are private by default.
  • Raw traces and future Pi sessions must stay private by default.

Phase 3 — Pi smoke test

This phase installs @mariozechner/pi-coding-agent inside the HF Job, configures Pi to use Hugging Face Inference Providers with the user's OAuth token, and asks Pi to make one safe edit to a generated Gradio app before the private target Space is created.

Expected run artifacts:

runs/<run_id>/generated/app.py
runs/<run_id>/logs/pi_output.txt
runs/<run_id>/traces/raw/*.jsonl
runs/<run_id>/traces/redacted/*.jsonl
runs/<run_id>/tests/test_result.json

The target Space remains private by default. Success is only declared after the live Gradio API returns the expected Pi-modified output.
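
The `traces/raw` and `traces/redacted` pair in the artifact list implies a redaction pass over trace lines. The sketch below shows one possible approach; it is an assumption, not the worker's actual logic. The `hf_` prefix pattern is a heuristic for token-like strings, and `redact_line` is a hypothetical helper.

```python
import re

# Heuristic: strings that start with "hf_" followed by at least 8 token-like characters.
TOKEN_PATTERN = re.compile(r"hf_[A-Za-z0-9_]{8,}")


def redact_line(line: str, known_secrets: tuple[str, ...] = ()) -> str:
    """Mask known secret values first, then anything that looks like a token."""
    for secret in known_secrets:
        if secret:  # skip empty strings, which would otherwise match everywhere
            line = line.replace(secret, "[REDACTED]")
    return TOKEN_PATTERN.sub("[REDACTED]", line)


raw_line = '{"auth": "Bearer hf_abcdefgh12345678"}'
redacted_line = redact_line(raw_line)
```

Passing the exact token value through `known_secrets` covers the case where a secret does not match the pattern.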

Phase 4

Phase 4 (the Pi gist recipe) asks Pi to follow the HF Spaces Agent Quickstart gist and use the hf CLI inside an HF Job to create a private Space and upload its files. The wrapper then validates the live Gradio API independently before reporting success.
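
Wrapper-owned validation means the success decision does not depend on what the agent reports. A minimal sketch of that idea follows; `validate_live_output` and the result fields are hypothetical, and the `predict` callable is injected so the example runs without network access. In a real run it could wrap a `gradio_client` call against the private Space.

```python
from typing import Callable


def validate_live_output(predict: Callable[[str], str], probe: str, expected_marker: str) -> dict:
    """Call the live API once and record whether the expected marker appears."""
    try:
        output = str(predict(probe))
    except Exception as exc:
        # Any client or network failure counts as a failed validation.
        return {"passed": False, "error": f"{type(exc).__name__}: {exc}"}
    return {"passed": expected_marker in output, "output": output}


def fake_predict(text: str) -> str:
    """Stand-in for the live Space endpoint."""
    return f"[pi-edited] {text}"


def broken_predict(text: str) -> str:
    """Stand-in for an endpoint that is not reachable yet."""
    raise ConnectionError("Space is still building")


good_result = validate_live_output(fake_predict, "hello", "[pi-edited]")
bad_result = validate_live_output(broken_predict, "hello", "[pi-edited]")
```

A result dict of this shape could then be serialized to `runs/<run_id>/tests/test_result.json`.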

V5

Adds Phase 5: model-card analysis for simple Transformers text pipeline models. Recommended first test: sshleifer/tiny-gpt2. The Space remains private and success is still gated by wrapper-owned live API validation.

Phase 6

Adds a no-build runtime recommender Job that analyzes model metadata and writes runtime_recommendation.json to the Bucket.

Phase 8 — LongCat article reproduction

Phase 8 attempts an article-style LongCat Space build: private target Space, Pi-guided app adaptation, ZeroGPU first, fixed GPU fallback when explicitly enabled, and live /health API validation. Full video generation remains a manual-review step until model-specific runtime validation is complete.

V8 LongCat robust changes

V8 focuses on the issues discovered during the first LongCat runs:

  • validates a cheap HTTP GET /health route before falling back to gradio_client;
  • collects best-effort Space runtime/log diagnostics into the Bucket;
  • treats request_space_hardware as best-effort because OAuth tokens may create/write Spaces but still fail on paid hardware changes;
  • stops retrying hardware on clear 401/auth failures and marks manual hardware action as required;
  • uploads the entire Pi workspace recursively, so generated packages such as longcat_video/ are preserved;
  • defaults the Pi model field to Qwen/Qwen3-Coder-Next, while keeping it editable;
  • adds an implementation mode: full-inference-attempt or safe-scaffold.
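
The best-effort hardware behavior described above (retry on transient failures, stop immediately on a clear 401, flag manual action) can be sketched as follows. `HardwareRequestError`, `request_hardware_best_effort`, and the result fields are hypothetical; in a real worker the injected `request` callable would wrap the actual hardware request and translate its HTTP error into a status code.

```python
from typing import Callable


class HardwareRequestError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed request."""

    def __init__(self, status_code: int, message: str = "") -> None:
        super().__init__(message or f"HTTP {status_code}")
        self.status_code = status_code


def request_hardware_best_effort(request: Callable[[], None], max_attempts: int = 3) -> dict:
    """Try a hardware change without letting its failure abort the run."""
    last_error = ""
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        try:
            request()
            return {"status": "applied", "attempts": attempts, "manual_action_required": False}
        except HardwareRequestError as exc:
            last_error = str(exc)
            if exc.status_code == 401:
                # Retrying cannot fix an auth failure; hand over to the user.
                return {
                    "status": "auth_failed",
                    "attempts": attempts,
                    "manual_action_required": True,
                    "error": last_error,
                }
    return {
        "status": "failed",
        "attempts": attempts,
        "manual_action_required": True,
        "error": last_error,
    }


def always_unauthorized() -> None:
    raise HardwareRequestError(401, "Unauthorized")


result = request_hardware_best_effort(always_unauthorized)
```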

V9 LongCat full-inference gate

V9 keeps the robust V8 health validation but changes the meaning of success for LongCat-style runs. A Space that only boots and exposes /health is no longer treated as a full inference reproduction.

Key behavior:

  • default implementation mode is full-inference-gated;
  • Pi is instructed not to silently replace generation with a docs-only placeholder;
  • if real inference cannot be wired, Pi should produce TECHNICAL_BLOCKERS.json;
  • the worker writes inference_gate.json with a status such as technical_blocker, health_only, or full_inference_candidate_health_passed;
  • Pi must investigate SDPA, xformers, and HF Kernels flash-attn alternatives before declaring flash-attn a hard blocker;
  • hardware changes remain best-effort because OAuth tokens may create/write Spaces but fail on paid hardware changes.

This phase is designed to distinguish “bootable scaffold” from “functional model reproduction”.
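
One way to express the gate is a small classifier that writes `inference_gate.json`. The decision order below is an assumption, as are the `inference_wired` input and the `health_failed` status; only `technical_blocker`, `health_only`, and `full_inference_candidate_health_passed` come from the list above.

```python
import json
import tempfile
from pathlib import Path


def classify_inference_gate(workspace: Path, health_passed: bool, inference_wired: bool) -> dict:
    """Derive a gate status and write it to inference_gate.json."""
    if (workspace / "TECHNICAL_BLOCKERS.json").exists():
        # Pi reported that real inference could not be wired.
        status = "technical_blocker"
    elif health_passed and inference_wired:
        status = "full_inference_candidate_health_passed"
    elif health_passed:
        # The Space boots, but that alone is not a full reproduction.
        status = "health_only"
    else:
        status = "health_failed"  # hypothetical status, not named in this README

    gate = {"status": status, "health_passed": health_passed}
    (workspace / "inference_gate.json").write_text(json.dumps(gate, indent=2), encoding="utf-8")
    return gate


# Temporary directories stand in for the Pi workspace.
scaffold_dir = Path(tempfile.mkdtemp())
scaffold_gate = classify_inference_gate(scaffold_dir, health_passed=True, inference_wired=False)

blocked_dir = Path(tempfile.mkdtemp())
(blocked_dir / "TECHNICAL_BLOCKERS.json").write_text("[]", encoding="utf-8")
blocked_gate = classify_inference_gate(blocked_dir, health_passed=True, inference_wired=True)
```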

V10 Universal builder

Phase 10 accepts any Hugging Face model card URL or owner/model ID, launches Pi in an HF Job, creates a private Space, and classifies the result with a full-inference gate or technical blockers.
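
Accepting either a URL or an owner/model ID implies a normalization step before the Job is launched. The sketch below is one possible version; `normalize_model_ref`, the accepted host names, and the rejection rules are assumptions rather than the builder's actual behavior.

```python
from urllib.parse import urlparse

# Assumed host names for model card URLs.
ALLOWED_HOSTS = {"huggingface.co", "www.huggingface.co", "hf.co"}
# URL prefixes that point at other repo types, not models.
NON_MODEL_PREFIXES = {"spaces", "datasets"}


def normalize_model_ref(ref: str) -> str:
    """Turn a model card URL or an owner/model ID into 'owner/model'."""
    ref = ref.strip()
    if ref.startswith(("http://", "https://")):
        parsed = urlparse(ref)
        if parsed.netloc not in ALLOWED_HOSTS:
            raise ValueError(f"not a Hugging Face URL: {ref}")
        parts = [p for p in parsed.path.split("/") if p]
        if parts and parts[0] in NON_MODEL_PREFIXES:
            raise ValueError(f"not a model URL: {ref}")
        # Extra segments such as /tree/main are ignored.
        parts = parts[:2]
    else:
        parts = ref.split("/")

    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected owner/model, got: {ref}")
    return "/".join(parts)


from_url = normalize_model_ref("https://huggingface.co/sshleifer/tiny-gpt2/tree/main")
from_id = normalize_model_ref("sshleifer/tiny-gpt2")
```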