
Architecture

Agentic Space Factory is a Hugging Face-native implementation of the local agent workflow described in the ZeroGPU Spaces article.

User
  → Gradio orchestrator Space with HF OAuth
  → ephemeral HF Job
  → Pi coding agent + HF Inference Providers model
  → generated private target Space
  → Storage Bucket run record
  → live validation job when hardware is ready

Components

Orchestrator Space

The public UI has two workflows:

  1. Build from model card: starts an HF Job that analyzes a model card and asks Pi to generate a private Gradio Space.
  2. Validate existing Space: starts a separate HF Job that smoke-tests a generated Space after hardware has been configured, measures latency, and stores the output artifact.

The orchestrator never stores a global admin token. It uses the signed-in user's HF OAuth token.

HF Jobs

Jobs do the long-running work: installing Pi, generating code, creating/uploading the target Space, checking runtime state, and running live validations.

The builder job is allowed to create a private Space and upload generated files. Hardware assignment is attempted on a best-effort basis only. If ZeroGPU or fixed-GPU assignment fails because of quota, billing, or OAuth limits, the run is marked as requiring manual hardware.

Pi + coding model

Pi runs inside the Job and uses a model such as Qwen/Qwen3-Coder-Next through Hugging Face Inference Providers. It receives a strict goal:

  • generate a Gradio app from the model card;
  • keep the Space private;
  • add /health and generation endpoints where possible;
  • never report placeholder output as full inference;
  • record the blockers in writing if full inference is impossible.

Storage Bucket

Every run writes to the configured Bucket:

runs/<run_id>/state.json
runs/<run_id>/events.jsonl
runs/<run_id>/report.md
runs/<run_id>/generated/
runs/<run_id>/tests/
runs/<run_id>/artifacts/
runs/<run_id>/traces/
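
The layout above can be sketched as a small helper that derives every key from a run ID and appends events one JSON object per line. This is a minimal sketch, not the project's code: `run_paths` and `append_event` are hypothetical names, the event fields are invented for illustration, and a local directory stands in for the Bucket.

```python
import json
from pathlib import Path, PurePosixPath
from typing import Dict


def run_paths(run_id: str) -> Dict[str, str]:
    """Return the keys used for one run, mirroring the layout listed above."""
    base = PurePosixPath("runs") / run_id
    files = ["state.json", "events.jsonl", "report.md"]
    dirs = ["generated", "tests", "artifacts", "traces"]
    paths = {name: str(base / name) for name in files}
    # Directory-like prefixes keep a trailing slash, as in the listing.
    paths.update({name: f"{base / name}/" for name in dirs})
    return paths


def append_event(events_file: Path, event: dict) -> None:
    """Append one event as a single JSON line (hypothetical event schema)."""
    events_file.parent.mkdir(parents=True, exist_ok=True)
    with events_file.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")
```

Keeping `events.jsonl` append-only means a job that is interrupted mid-run still leaves a readable partial history next to `state.json`.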

Target Space

Generated Spaces are private by default. When ZeroGPU is selected, the builder tries it first and then falls back to a fixed GPU if one is configured. If both attempts fail, the Space can still be configured manually in its Settings and then validated with the second workflow.
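
The best-effort hardware order can be sketched as a loop over candidate flavors that degrades to the manual path. This is an illustrative sketch under stated assumptions: `assign_hardware` and the injected `request_hardware` callable are hypothetical, the `hardware_assigned` status is invented for the example, and only `manual_hardware_required` comes from this document. In a real job the callable might wrap `HfApi.request_space_hardware` from `huggingface_hub`.

```python
from typing import Callable, Optional, Sequence, Tuple


def assign_hardware(
    request_hardware: Callable[[str], None],
    candidates: Sequence[str],
) -> Tuple[str, Optional[str]]:
    """Try each hardware flavor in order and return (status, flavor).

    `request_hardware` is a hypothetical callable that raises when the
    request is rejected (for example quota, billing, or OAuth limits).
    """
    for flavor in candidates:
        try:
            request_hardware(flavor)
        except Exception:
            # Best effort: a rejected flavor is not fatal, try the next one.
            continue
        return "hardware_assigned", flavor  # hypothetical status name
    # Nothing worked: leave the Space for manual configuration.
    return "manual_hardware_required", None
```

Passing the candidates as an ordered list, for example a ZeroGPU flavor followed by one fixed-GPU flavor, keeps the "ZeroGPU first, optional fallback" policy in configuration rather than in control flow.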

Result lifecycle

Build from model card
  → generated private Space
  → ZeroGPU/fixed GPU best-effort
  → health/API gate
  → manual_hardware_required or candidate status

Validate existing Space
  → call /generate or configured endpoint
  → verify output type
  → measure latency
  → save artifact
  → full_inference_success when output is valid
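
The validation steps above can be sketched as one function that times the call, checks the output type, and writes the artifact only on success. This is a hedged sketch rather than the project's implementation: `validate_space`, the injected `call_endpoint` callable, and the `validation_failed` status are assumptions made for illustration; `full_inference_success` is the status named in this document.

```python
import time
from pathlib import Path
from typing import Any, Callable


def validate_space(
    call_endpoint: Callable[[], Any],
    expected_type: type,
    artifact_path: Path,
) -> dict:
    """Call the endpoint, verify the output type, time it, save the artifact."""
    start = time.perf_counter()
    try:
        output = call_endpoint()  # hypothetical wrapper around /generate
    except Exception as exc:
        return {"status": "validation_failed", "error": str(exc)}
    latency_s = time.perf_counter() - start

    if not isinstance(output, expected_type):
        return {
            "status": "validation_failed",  # hypothetical status name
            "latency_s": latency_s,
            "error": f"expected {expected_type.__name__}, got {type(output).__name__}",
        }

    # Persist the artifact only once the output has passed the type check.
    artifact_path.parent.mkdir(parents=True, exist_ok=True)
    if isinstance(output, bytes):
        artifact_path.write_bytes(output)
    else:
        artifact_path.write_text(str(output), encoding="utf-8")
    return {
        "status": "full_inference_success",
        "latency_s": latency_s,
        "artifact": str(artifact_path),
    }
```

Injecting `call_endpoint` keeps the sketch independent of any particular client library; the returned dictionary could be merged into the run's `state.json`.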

Known limits

  • Automatic paid hardware assignment through OAuth may fail; manual hardware selection is supported.
  • ZeroGPU may be unavailable because of quota or namespace limits.
  • Models that need multi-GPU, Docker-only setups, ComfyUI, custom CUDA/FlashAttention builds, external API keys, or gated-model access may require manual intervention or result in technical_blocker.