# Architecture
Agentic Space Factory is a Hugging Face-native implementation of the local agent workflow described in the ZeroGPU Spaces article.
```text
User
→ Gradio orchestrator Space with HF OAuth
→ ephemeral HF Job
→ Pi coding agent + HF Inference Providers model
→ generated private target Space
→ Storage Bucket run record
→ live validation job when hardware is ready
```
## Components
### Orchestrator Space
The public UI has two workflows:
1. **Build from model card**: starts an HF Job that analyzes a model card and asks Pi to generate a private Gradio Space.
2. **Validate existing Space**: starts a separate HF Job that smoke-tests a generated Space after hardware has been configured, measures latency, and stores the output artifact.
The orchestrator never stores a global admin token. It uses the signed-in user's HF OAuth token.
### HF Jobs
Jobs do the long-running work: installing Pi, generating code, creating/uploading the target Space, checking runtime state, and running live validations.
The builder job is allowed to create a private Space and upload generated files. Hardware assignment is attempted on a best-effort basis only. If ZeroGPU or fixed-GPU assignment fails because of quota, billing, or OAuth limits, the run is marked as requiring manual hardware.
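The best-effort hardware policy described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: `assign_hardware`, `HardwareResult`, the injected `request` callable, and the `hardware_assigned` status are hypothetical names; only `manual_hardware_required` comes from this document.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class HardwareResult:
    # "hardware_assigned" is a hypothetical status name;
    # "manual_hardware_required" is the status named in this document.
    status: str
    flavor: Optional[str] = None   # flavor that was accepted, if any
    errors: Tuple[str, ...] = ()   # one message per failed attempt


def assign_hardware(
    request: Callable[[str], None],
    candidates: Sequence[str],
) -> HardwareResult:
    """Try each hardware flavor in order and never raise on failure.

    `request` is a hypothetical callable that asks for one flavor and
    raises on quota, billing, or OAuth errors.
    """
    errors = []
    for flavor in candidates:
        try:
            request(flavor)
        except Exception as exc:  # best effort: record the failure, try the next flavor
            errors.append(f"{flavor}: {exc}")
            continue
        return HardwareResult("hardware_assigned", flavor, tuple(errors))
    # Every attempt failed: the run continues and is flagged for manual setup.
    return HardwareResult("manual_hardware_required", None, tuple(errors))
```

In a real job, `request` could wrap a call such as `huggingface_hub.HfApi.request_space_hardware`, with the ZeroGPU flavor listed first and the optional fixed-GPU flavor second. Collecting the error messages instead of raising keeps the build alive and leaves something to write into the run record.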
### Pi + coding model
Pi runs inside the Job and uses a model such as `Qwen/Qwen3-Coder-Next` through Hugging Face Inference Providers. It receives a strict goal:
- generate a Gradio app from the model card;
- keep the Space private;
- add `/health` and generation endpoints where possible;
- do not mark placeholders as full inference;
- write blockers if full inference is impossible.
### Storage Bucket
Every run writes to the configured Bucket:
```text
runs/<run_id>/state.json
runs/<run_id>/events.jsonl
runs/<run_id>/report.md
runs/<run_id>/generated/
runs/<run_id>/tests/
runs/<run_id>/artifacts/
runs/<run_id>/traces/
```
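The layout above can be reproduced with a small path helper. The sketch below is illustrative only: it writes to a local directory standing in for the Bucket, and `run_paths`, `record_event`, and the JSON field names (`ts`, `status`, `message`) are assumptions rather than the project's actual schema.

```python
import json
import time
from pathlib import Path


def run_paths(root: Path, run_id: str) -> dict:
    """Map logical names to the per-run locations listed above."""
    base = root / "runs" / run_id
    return {
        "state": base / "state.json",
        "events": base / "events.jsonl",
        "report": base / "report.md",
        "generated": base / "generated",
        "tests": base / "tests",
        "artifacts": base / "artifacts",
        "traces": base / "traces",
    }


def record_event(root: Path, run_id: str, status: str, message: str) -> None:
    """Append one event line and overwrite the state snapshot.

    Field names are hypothetical; a local directory stands in for the Bucket.
    """
    paths = run_paths(root, run_id)
    paths["state"].parent.mkdir(parents=True, exist_ok=True)
    event = {"ts": time.time(), "status": status, "message": message}
    # events.jsonl is append-only: one JSON object per line.
    with paths["events"].open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")
    # state.json always reflects the latest status only.
    state = {"run_id": run_id, "status": status}
    paths["state"].write_text(json.dumps(state, indent=2), encoding="utf-8")
```

Keeping an append-only event log next to a single overwritten state file means the UI can read the current status cheaply while the full history remains available for debugging.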
### Target Space
Generated Spaces are private by default. The builder attempts ZeroGPU first when selected, then an optional fixed-GPU fallback. If both fail, the Space can still be configured manually in Settings, then validated with the second workflow.
## Result lifecycle
```text
Build from model card
→ generated private Space
→ ZeroGPU/fixed GPU best-effort
→ health/API gate
→ manual_hardware_required or candidate status

Validate existing Space
→ call /generate or configured endpoint
→ verify output type
→ measure latency
→ save artifact
→ full_inference_success when output is valid
```
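The validation half of the lifecycle can be sketched as one function. This is a rough illustration, not the validator itself: `validate_once`, the injected `call_endpoint` callable, and the `validation_failed` status are hypothetical; `full_inference_success` is the status named above.

```python
import time
from typing import Any, Callable, Dict


def validate_once(
    call_endpoint: Callable[[], Any],
    expected_type: type,
) -> Dict[str, Any]:
    """Call the endpoint once, time it, and check the output type.

    `call_endpoint` is a hypothetical zero-argument callable that hits
    /generate (or the configured endpoint) and returns the decoded output.
    """
    start = time.perf_counter()
    try:
        output = call_endpoint()
    except Exception as exc:
        # "validation_failed" is an assumed status name, not from the source.
        return {
            "status": "validation_failed",
            "error": str(exc),
            "latency_s": time.perf_counter() - start,
        }
    latency = time.perf_counter() - start
    valid = isinstance(output, expected_type)
    return {
        "status": "full_inference_success" if valid else "validation_failed",
        "latency_s": latency,
        "output": output,
    }
```

The returned dictionary carries the latency and the raw output, so the caller can save the artifact and record the status in the same step.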
## Known limits
- Automatic paid hardware assignment through OAuth may fail; manual hardware selection is supported.
- ZeroGPU may be unavailable because of quota or namespace limits.
- Multi-GPU, Docker-only, ComfyUI, custom CUDA/FlashAttention, external API keys, or gated models may require manual intervention or produce `technical_blocker`.