| # Architecture |
|
|
| Agentic Space Factory is a Hugging Face-native implementation of the local agent workflow described in the ZeroGPU Spaces article. |
|
|
| ```text |
| User |
| → Gradio orchestrator Space with HF OAuth |
| → ephemeral HF Job |
| → Pi coding agent + HF Inference Providers model |
| → generated private target Space |
| → Storage Bucket run record |
| → live validation job when hardware is ready |
| ``` |
|
|
| ## Components |
|
|
| ### Orchestrator Space |
|
|
| The public UI has two workflows: |
|
|
| 1. **Build from model card**: starts an HF Job that analyzes a model card and asks Pi to generate a private Gradio Space. |
| 2. **Validate existing Space**: starts a separate HF Job that smoke-tests a generated Space after hardware has been configured, measures latency, and stores the output artifact. |
|
|
| The orchestrator never stores a global admin token. It uses the signed-in user's HF OAuth token. |
|
|
| ### HF Jobs |
|
|
| Jobs do the long-running work: installing Pi, generating code, creating/uploading the target Space, checking runtime state, and running live validations. |
|
|
| The builder job may create a private Space and upload generated files. Hardware assignment is best-effort: if ZeroGPU or fixed-GPU assignment fails because of quota, billing, or OAuth limits, the run is marked `manual_hardware_required`. |
|
|
| ### Pi + coding model |
|
|
| Pi runs inside the Job and uses a model such as `Qwen/Qwen3-Coder-Next` through Hugging Face Inference Providers. It receives a strict goal: |
|
|
| - generate a Gradio app from the model card; |
| - keep the Space private; |
| - add `/health` and generation endpoints where possible; |
| - do not mark placeholders as full inference; |
| - write blockers if full inference is impossible. |
|
|
| ### Storage Bucket |
|
|
| Every run writes to the configured Bucket: |
|
|
| ```text |
| runs/<run_id>/state.json |
| runs/<run_id>/events.jsonl |
| runs/<run_id>/report.md |
| runs/<run_id>/generated/ |
| runs/<run_id>/tests/ |
| runs/<run_id>/artifacts/ |
| runs/<run_id>/traces/ |
| ``` |
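
As an illustration of the layout above, the following minimal sketch builds the object paths for one run and serializes a single `events.jsonl` record. The helper names (`run_layout`, `event_line`) and the event fields are hypothetical; only the path layout comes from this document, and no Bucket client is shown.

```python
import json
from pathlib import PurePosixPath

# File and directory names copied from the layout listing; the helpers are illustrative.
RUN_FILES = ("state.json", "events.jsonl", "report.md")
RUN_DIRS = ("generated", "tests", "artifacts", "traces")


def run_layout(run_id: str) -> dict[str, str]:
    """Return the Bucket-relative paths for one run, keyed by entry name."""
    if not run_id or "/" in run_id:
        raise ValueError("run_id must be a non-empty string without '/'")
    prefix = PurePosixPath("runs") / run_id
    layout = {name: str(prefix / name) for name in RUN_FILES}
    # Directory-like prefixes keep a trailing slash, as in the listing.
    layout.update({name: str(prefix / name) + "/" for name in RUN_DIRS})
    return layout


def event_line(kind: str, **fields) -> str:
    """Serialize one event as a single JSON Lines record (no trailing newline)."""
    return json.dumps({"kind": kind, **fields}, sort_keys=True)
```

A caller would append `event_line(...) + "\n"` to the object at `run_layout(run_id)["events.jsonl"]` using whatever storage client the deployment provides.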
|
|
| ### Target Space |
|
|
| Generated Spaces are private by default. The builder attempts ZeroGPU first when selected, then an optional fixed-GPU fallback. If both fail, the Space can still be configured manually in Settings, then validated with the second workflow. |
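
The fallback order described above can be sketched as a loop over an ordered list of hardware flavors. This is a sketch under assumptions, not the project's implementation: `assign_hardware`, the `request_hardware` callable, and the `hardware_assigned` status are hypothetical names. In practice the callable might wrap `HfApi.request_space_hardware` from `huggingface_hub`, and the flavor strings are supplied by the caller.

```python
from typing import Callable, Sequence


def assign_hardware(
    request_hardware: Callable[[str], None],
    flavors: Sequence[str],
) -> dict:
    """Try each flavor in order; fall back to manual setup if all fail.

    `flavors` is ordered by preference, e.g. the ZeroGPU flavor first and
    an optional fixed-GPU flavor second.
    """
    errors = []
    for flavor in flavors:
        try:
            request_hardware(flavor)  # may raise on quota, billing, or OAuth limits
        except Exception as exc:  # best-effort: record the failure and try the next flavor
            errors.append(f"{flavor}: {exc}")
            continue
        return {"status": "hardware_assigned", "hardware": flavor, "errors": errors}
    # Nothing succeeded (or nothing was requested): the user configures hardware in Settings.
    return {"status": "manual_hardware_required", "hardware": None, "errors": errors}
```

Injecting the request function keeps the fallback logic testable without network access and leaves the choice of flavors to the orchestrator's configuration.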
|
|
| ## Result lifecycle |
|
|
| ```text |
| Build from model card |
| → generated private Space |
| → ZeroGPU/fixed GPU best-effort |
| → health/API gate |
| → manual_hardware_required or candidate status |
| |
| Validate existing Space |
| → call /generate or configured endpoint |
| → verify output type |
| → measure latency |
| → save artifact |
| → full_inference_success when output is valid |
| ``` |
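
The validation steps in the second flow can be sketched as one function that calls the endpoint, checks the output type, times the call, and saves the artifact. The function name, the injected callables, and the `validation_failed` status are assumptions made for illustration; only `full_inference_success` appears in this document.

```python
import time
from typing import Any, Callable


def validate_space(
    call_endpoint: Callable[[], Any],
    expected_type: type,
    save_artifact: Callable[[Any], str],
) -> dict:
    """Run one smoke test and return a small result record."""
    started = time.perf_counter()
    try:
        output = call_endpoint()  # e.g. a request to /generate or the configured endpoint
    except Exception as exc:
        latency = time.perf_counter() - started
        return {"status": "validation_failed", "error": str(exc), "latency_s": latency}
    latency = time.perf_counter() - started

    # Verify the output type before anything is stored.
    if not isinstance(output, expected_type):
        found = type(output).__name__
        return {
            "status": "validation_failed",
            "error": f"expected {expected_type.__name__}, got {found}",
            "latency_s": latency,
        }

    artifact_path = save_artifact(output)  # returns where the artifact was written
    return {
        "status": "full_inference_success",
        "latency_s": latency,
        "artifact": artifact_path,
    }
```

Only a type-checked output reaches `save_artifact`, so a failed run never leaves a misleading artifact behind.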
|
|
| ## Known limits |
|
|
| - Automatic paid hardware assignment through OAuth may fail; manual hardware selection is supported. |
| - ZeroGPU may be unavailable because of quota or namespace limits. |
| - Models that need multi-GPU hardware, Docker-only setups, ComfyUI, custom CUDA/FlashAttention builds, external API keys, or gated-model access may require manual intervention or produce a `technical_blocker` result. |
|
|