# Architecture

Agentic Space Factory is a Hugging Face-native implementation of the local agent workflow described in the ZeroGPU Spaces article.

```text
User
  → Gradio orchestrator Space with HF OAuth
  → ephemeral HF Job
  → Pi coding agent + HF Inference Providers model
  → generated private target Space
  → Storage Bucket run record
  → live validation job when hardware is ready
```

## Components

### Orchestrator Space

The public UI has two workflows:

1. **Build from model card**: starts an HF Job that analyzes a model card and asks Pi to generate a private Gradio Space.
2. **Validate existing Space**: starts a separate HF Job that smoke-tests a generated Space after hardware has been configured, measures latency, and stores the output artifact.

The orchestrator never stores a global admin token. It uses the signed-in user's HF OAuth token.

### HF Jobs

Jobs do the long-running work: installing Pi, generating code, creating/uploading the target Space, checking runtime state, and running live validations.

The builder job is allowed to create a private Space and upload generated files. Hardware assignment is attempted on a best-effort basis only. If ZeroGPU or fixed-GPU assignment fails because of quota, billing, or OAuth limits, the run is marked as requiring manual hardware.

### Pi + coding model

Pi runs inside the Job and uses a model such as `Qwen/Qwen3-Coder-Next` through Hugging Face Inference Providers. It receives a strict goal:

- generate a Gradio app from the model card;
- keep the Space private;
- add `/health` and generation endpoints where possible;
- do not mark placeholders as full inference;
- write blockers if full inference is impossible.

### Storage Bucket

Every run writes to the configured Bucket:

```text
runs//state.json
runs//events.jsonl
runs//report.md
runs//generated/
runs//tests/
runs//artifacts/
runs//traces/
```

### Target Space

Generated Spaces are private by default.
The builder attempts ZeroGPU first when selected, then an optional fixed-GPU fallback. If both fail, the Space can still be configured manually in Settings, then validated with the second workflow.

## Result lifecycle

```text
Build from model card
  → generated private Space
  → ZeroGPU/fixed GPU best-effort
  → health/API gate
  → manual_hardware_required or candidate status

Validate existing Space
  → call /generate or configured endpoint
  → verify output type
  → measure latency
  → save artifact
  → full_inference_success when output is valid
```

## Known limits

- Automatic paid hardware assignment through OAuth may fail; manual hardware selection is supported.
- ZeroGPU may be unavailable because of quota or namespace limits.
- Multi-GPU, Docker-only, ComfyUI, custom CUDA/FlashAttention, external API keys, or gated models may require manual intervention or produce `technical_blocker`.
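
The best-effort hardware fallback and the status names used in this document can be sketched as a small piece of decision logic. This is a minimal illustration under stated assumptions, not the project's code: `assign_hardware`, `build_status`, `validation_status`, and the labels `zero-gpu` and `fixed-gpu` are hypothetical names, and the rule that a failed health gate also yields `manual_hardware_required` is an assumption rather than documented behavior.

```python
from typing import Callable, Optional, Sequence, Tuple

# Each attempt is (label, function); the function returns True on success.
# Labels are illustrative only, not real hardware flavor identifiers.
Attempt = Tuple[str, Callable[[], bool]]


def assign_hardware(attempts: Sequence[Attempt]) -> Optional[str]:
    """Try each hardware option in order and return the first label that succeeds."""
    for label, try_assign in attempts:
        try:
            if try_assign():
                return label
        except Exception:
            # Quota, billing, or OAuth failures: fall through to the next option.
            continue
    return None  # every attempt failed; hardware must be set manually


def build_status(blocker: Optional[str], hardware: Optional[str], health_ok: bool) -> str:
    """Map a build outcome to one of the build statuses named in this document."""
    if blocker:
        return "technical_blocker"
    if hardware is not None and health_ok:
        return "candidate"
    # Assumption: missing hardware or a failed gate both require manual action.
    return "manual_hardware_required"


def validation_status(current: str, output_valid: bool) -> str:
    """Promote a run to full_inference_success only when the live output is valid."""
    return "full_inference_success" if output_valid else current


# Usage: ZeroGPU fails (simulated), the fixed-GPU fallback succeeds.
attempts = [("zero-gpu", lambda: False), ("fixed-gpu", lambda: True)]
hardware = assign_hardware(attempts)
print(build_status(None, hardware, health_ok=True))
```

Keeping the attempts as an ordered list of callables makes the fallback order explicit and lets a run skip ZeroGPU entirely when it was not selected.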