agentic-space-factory-etheroi

agentic-space-factory-etheroi / docs /ARCHITECTURE.md

fffiloni

Upload 4 files

9c6bd0f verified 27 days ago

	# Architecture

	Agentic Space Factory is a Hugging Face-native implementation of the local agent workflow described in the ZeroGPU Spaces article.

	```text
	User
	→ Gradio orchestrator Space with HF OAuth
	→ ephemeral HF Job
	→ Pi coding agent + HF Inference Providers model
	→ generated private target Space
	→ Storage Bucket run record
	→ live validation job when hardware is ready
	```

	## Components

	### Orchestrator Space

	The public UI has two workflows:

	1. Build from model card — starts an HF Job that analyzes a model card and asks Pi to generate a private Gradio Space.
	2. Validate existing Space — starts a separate HF Job that smoke-tests a generated Space after hardware has been configured, measures latency, and stores the output artifact.

	The orchestrator never stores a global admin token. It uses the signed-in user's HF OAuth token.

	### HF Jobs

	Jobs do the long-running work: installing Pi, generating code, creating/uploading the target Space, checking runtime state, and running live validations.

	The builder job is allowed to create a private Space and upload generated files. Hardware assignment is attempted on a best-effort basis only. If ZeroGPU or fixed-GPU assignment fails because of quota, billing, or OAuth limits, the run is marked as requiring manual hardware.

	### Pi + coding model

	Pi runs inside the Job and uses a model such as `Qwen/Qwen3-Coder-Next` through Hugging Face Inference Providers. It receives a strict goal:

	- generate a Gradio app from the model card;
	- keep the Space private;
	- add `/health` and generation endpoints where possible;
	- do not report placeholder output as full inference;
	- write blockers if full inference is impossible.
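
As a rough illustration of how these rules could be handed to the agent, the sketch below renders them into a plain-text goal. The function name `build_goal`, the `Target model:` header, and the numbered layout are assumptions made for this example; the document does not specify how the goal is actually passed to Pi.

```python
# Rules paraphrased from the list above. The prompt layout is hypothetical.
GOAL_RULES = (
    "Generate a Gradio app from the model card.",
    "Keep the Space private.",
    "Add /health and generation endpoints where possible.",
    "Do not report placeholder output as full inference.",
    "Write blockers if full inference is impossible.",
)


def build_goal(model_id: str, rules=GOAL_RULES) -> str:
    """Render the target model and the rules as one plain-text goal."""
    lines = [f"Target model: {model_id}", "Rules:"]
    lines += [f"{i}. {rule}" for i, rule in enumerate(rules, start=1)]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_goal("Qwen/Qwen3-Coder-Next"))
```

Keeping the rules in one tuple means the same list can be written into the run report, so a reviewer can see which constraints the agent was given.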

	### Storage Bucket

	Every run writes to the configured Bucket:

	```text
	runs/<run_id>/state.json
	runs/<run_id>/events.jsonl
	runs/<run_id>/report.md
	runs/<run_id>/generated/
	runs/<run_id>/tests/
	runs/<run_id>/artifacts/
	runs/<run_id>/traces/
	```
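
A minimal sketch of this layout, assuming a local directory stands in for the Bucket. Only the sub-paths come from the listing above. The helper names `init_run` and `append_event`, the fields in `state.json`, and the event schema are hypothetical.

```python
import json
import tempfile
import time
from pathlib import Path

# Names taken from the layout above.
RUN_FILES = ("state.json", "events.jsonl", "report.md")
RUN_DIRS = ("generated", "tests", "artifacts", "traces")


def init_run(root: Path, run_id: str) -> Path:
    """Create runs/<run_id>/ with the documented files and folders."""
    run_dir = root / "runs" / run_id
    for name in RUN_DIRS:
        (run_dir / name).mkdir(parents=True, exist_ok=True)
    for name in RUN_FILES:
        (run_dir / name).touch()
    # Assumed initial state; the real schema is not documented here.
    state = {"run_id": run_id, "status": "started"}
    (run_dir / "state.json").write_text(json.dumps(state), encoding="utf-8")
    return run_dir


def append_event(run_dir: Path, kind: str, **fields) -> None:
    """Append one JSON object per line to events.jsonl."""
    event = {"ts": time.time(), "kind": kind, **fields}
    with (run_dir / "events.jsonl").open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")


if __name__ == "__main__":
    run_dir = init_run(Path(tempfile.mkdtemp()), "demo-run")
    append_event(run_dir, "job_started", job="builder")
    print(sorted(p.name for p in run_dir.iterdir()))
```

An append-only `events.jsonl` next to a single overwritable `state.json` lets a run be replayed step by step while the latest status can still be read in one call.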

	### Target Space

	Generated Spaces are private by default. When ZeroGPU is selected, the builder tries it first and then falls back to an optional fixed GPU. If both attempts fail, hardware can still be assigned manually in the Space's Settings, and the Space can then be validated with the second workflow.
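
The best-effort hardware order can be sketched as a loop that tries each option and never raises. This is an illustration under assumptions: the `request` callable is injected, and in a real job it might wrap something like `HfApi.request_space_hardware` from `huggingface_hub`. The flavor strings and the `hardware_assigned` status are example values; only `manual_hardware_required` appears in this document.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Optional, Sequence

MANUAL_HARDWARE_REQUIRED = "manual_hardware_required"  # status named in this document
HARDWARE_ASSIGNED = "hardware_assigned"  # hypothetical status name


@dataclass
class HardwareResult:
    status: str
    flavor: Optional[str] = None
    errors: list[str] = field(default_factory=list)


def assign_hardware(
    request: Callable[[str], None],
    preferences: Sequence[str],
) -> HardwareResult:
    """Try each flavor in order; record failures instead of raising."""
    errors: list[str] = []
    for flavor in preferences:
        try:
            request(flavor)
            return HardwareResult(HARDWARE_ASSIGNED, flavor, errors)
        except Exception as exc:  # e.g. quota, billing, or OAuth limits
            errors.append(f"{flavor}: {exc}")
    return HardwareResult(MANUAL_HARDWARE_REQUIRED, None, errors)


if __name__ == "__main__":
    def fake_request(flavor: str) -> None:
        # Stand-in for a real API call; pretend ZeroGPU quota is exhausted.
        if flavor == "zero-a10g":
            raise RuntimeError("quota exceeded")

    print(assign_hardware(fake_request, ["zero-a10g", "t4-small"]))
```

Collecting the error messages rather than discarding them gives the run report something concrete to show when manual hardware selection is needed.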

	## Result lifecycle

	```text
	Build from model card
	→ generated private Space
	→ ZeroGPU/fixed GPU best-effort
	→ health/API gate
	→ manual_hardware_required or candidate status

	Validate existing Space
	→ call /generate or configured endpoint
	→ verify output type
	→ measure latency
	→ save artifact
	→ full_inference_success when output is valid
	```
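
The validation steps above (call the endpoint, verify the output type, measure latency) might look like the following sketch. It assumes the endpoint call is wrapped in a zero-argument callable, for instance around a `gradio_client` request to `/generate`. The `validation_failed` status and the `ValidationResult` shape are hypothetical, and saving the artifact is left out.

```python
from __future__ import annotations

import time
from dataclasses import dataclass
from typing import Any, Callable

FULL_INFERENCE_SUCCESS = "full_inference_success"  # status named in this document
VALIDATION_FAILED = "validation_failed"  # hypothetical status name


@dataclass
class ValidationResult:
    status: str
    latency_s: float
    detail: str = ""


def validate_space(call_endpoint: Callable[[], Any], expected_type: type) -> ValidationResult:
    """Call the endpoint once, time it, and check the output type."""
    start = time.perf_counter()
    try:
        output = call_endpoint()
    except Exception as exc:  # network errors, timeouts, Space not running
        return ValidationResult(VALIDATION_FAILED, time.perf_counter() - start, str(exc))
    latency = time.perf_counter() - start
    if isinstance(output, expected_type):
        return ValidationResult(FULL_INFERENCE_SUCCESS, latency)
    detail = f"unexpected output type: {type(output).__name__}"
    return ValidationResult(VALIDATION_FAILED, latency, detail)


if __name__ == "__main__":
    # Fake endpoint returning bytes, standing in for a generated image.
    print(validate_space(lambda: b"\x89PNG", bytes))
```

A type check alone is a weak gate. A stricter validator would also inspect the content, for example by rejecting empty or placeholder output.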

	## Known limits

	- Automatic paid hardware assignment through OAuth may fail; manual hardware selection is supported.
	- ZeroGPU may be unavailable because of quota or namespace limits.
	- Models that need multi-GPU setups, Docker-only deployment, ComfyUI, custom CUDA or FlashAttention builds, external API keys, or gated-model access may require manual intervention or produce a `technical_blocker` result.