# Architecture
Agentic Space Factory is a Hugging Face-native implementation of the local agent workflow described in the ZeroGPU Spaces article.
```text
User
→ Gradio orchestrator Space with HF OAuth
→ ephemeral HF Job
→ Pi coding agent + HF Inference Providers model
→ generated private target Space
→ Storage Bucket run record
→ live validation job when hardware is ready
```
## Components
### Orchestrator Space
The public UI has two workflows:
1. **Build from model card**: starts an HF Job that analyzes a model card and asks Pi to generate a private Gradio Space.
2. **Validate existing Space**: starts a separate HF Job that smoke-tests a generated Space after hardware has been configured, measures latency, and stores the output artifact.
The orchestrator never stores a global admin token. It uses the signed-in user's HF OAuth token.
### HF Jobs
Jobs do the long-running work: installing Pi, generating code, creating/uploading the target Space, checking runtime state, and running live validations.
The builder job is allowed to create a private Space and upload generated files. Hardware assignment is attempted on a best-effort basis only. If ZeroGPU or fixed-GPU assignment fails because of quota, billing, or OAuth limits, the run is marked as requiring manual hardware.
### Pi + coding model
Pi runs inside the Job and uses a model such as `Qwen/Qwen3-Coder-Next` through Hugging Face Inference Providers. It receives a strict goal:
- generate a Gradio app from the model card;
- keep the Space private;
- add `/health` and generation endpoints where possible;
- do not mark placeholders as full inference;
- write blockers if full inference is impossible.
### Storage Bucket
Every run writes to the configured Bucket:
```text
runs/<run_id>/state.json
runs/<run_id>/events.jsonl
runs/<run_id>/report.md
runs/<run_id>/generated/
runs/<run_id>/tests/
runs/<run_id>/artifacts/
runs/<run_id>/traces/
```
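The layout above can be sketched as a small path helper plus two writers. This is a minimal illustration that assumes the Bucket is reachable as a local directory; `run_paths`, `append_event`, and `write_state` are hypothetical names, and the real project may write to the Bucket through a different client.

```python
import json
from pathlib import Path


def run_paths(root: Path, run_id: str) -> dict[str, Path]:
    """Return the per-run locations from the layout, keyed by short name."""
    base = root / "runs" / run_id
    return {
        "state": base / "state.json",
        "events": base / "events.jsonl",
        "report": base / "report.md",
        "generated": base / "generated",
        "tests": base / "tests",
        "artifacts": base / "artifacts",
        "traces": base / "traces",
    }


def append_event(paths: dict[str, Path], event: dict) -> None:
    """Append one JSON object per line to events.jsonl."""
    paths["events"].parent.mkdir(parents=True, exist_ok=True)
    with paths["events"].open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")


def write_state(paths: dict[str, Path], state: dict) -> None:
    """Overwrite state.json with the latest snapshot of the run."""
    paths["state"].parent.mkdir(parents=True, exist_ok=True)
    paths["state"].write_text(json.dumps(state, indent=2), encoding="utf-8")
```

Keeping `events.jsonl` append-only and `state.json` as a single overwritten snapshot is one plausible reading of the two file names; the event and state fields themselves are not specified here.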
### Target Space
Generated Spaces are private by default. The builder attempts ZeroGPU first when selected, then an optional fixed-GPU fallback. If both fail, the Space can still be configured manually in Settings, then validated with the second workflow.
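The fallback order described here can be sketched as a loop that tries each hardware flavor in turn and never raises. This is an illustration under stated assumptions, not the builder's actual code: `assign_hardware_best_effort` is a hypothetical helper, and `request` stands in for whatever call performs the assignment (for example, a thin wrapper around `HfApi.request_space_hardware` from `huggingface_hub`).

```python
from typing import Callable, Optional, Sequence


def assign_hardware_best_effort(
    request: Callable[[str], None],
    candidates: Sequence[str],
) -> tuple[Optional[str], list[str]]:
    """Try each hardware flavor in order and return the first that succeeds.

    Returns (assigned_flavor, errors). assigned_flavor is None when every
    attempt failed, which the caller can map to manual_hardware_required.
    """
    errors: list[str] = []
    for flavor in candidates:
        try:
            request(flavor)
        except Exception as exc:  # quota, billing, or OAuth limits
            errors.append(f"{flavor}: {exc}")
            continue
        return flavor, errors
    return None, errors
```

A caller might pass `["zero-a10g", "t4-small"]` as `candidates` (illustrative flavor names) and record the collected errors in the run report so that the manual step in Settings has context.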
## Result lifecycle
```text
Build from model card
  → generated private Space
  → ZeroGPU/fixed GPU best-effort
  → health/API gate
  → manual_hardware_required or candidate status
Validate existing Space
  → call /generate or configured endpoint
  → verify output type
  → measure latency
  → save artifact
  → full_inference_success when output is valid
```
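The validation half of the lifecycle can be sketched as a single timed call followed by a type check. This is a minimal illustration: `validate_once` and `call_endpoint` are hypothetical names, the endpoint call is injected as a plain callable, and `validation_failed` is a placeholder for whatever status the real job records when the output is not valid.

```python
import time
from typing import Any, Callable


def validate_once(call_endpoint: Callable[[], Any], expected_type: type) -> dict:
    """Call the endpoint once, measure latency, and verify the output type."""
    started = time.perf_counter()
    try:
        output = call_endpoint()
        error = None
    except Exception as exc:  # a failed call is recorded, not raised
        output, error = None, str(exc)
    latency_s = time.perf_counter() - started

    valid = error is None and isinstance(output, expected_type)
    return {
        # "validation_failed" is a placeholder, not a status named in this document.
        "status": "full_inference_success" if valid else "validation_failed",
        "latency_s": latency_s,
        "output_type": type(output).__name__,
        "error": error,
    }
```

The returned record could be written alongside the saved output under `runs/<run_id>/artifacts/`; saving the artifact itself is left out of this sketch.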
## Known limits
- Automatic paid hardware assignment through OAuth may fail; manual hardware selection is supported.
- ZeroGPU may be unavailable because of quota or namespace limits.
- Models that need multi-GPU, Docker-only, or ComfyUI setups, custom CUDA/FlashAttention builds, external API keys, or gated model access may require manual intervention or end the run as `technical_blocker`.