---
title: Agentic Space Factory
emoji: 🏭
colorFrom: blue
colorTo: purple
sdk: gradio
app_file: app.py
python_version: "3.11"
pinned: false
hf_oauth: true
hf_oauth_expiration_minutes: 480
hf_oauth_scopes:
  - read-repos
  - write-repos
  - manage-repos
  - gated-repos
  - inference-api
  - jobs
  - read-billing
---

# Agentic Space Factory — V9 LongCat Full-Inference Gate

This version validates the safe foundation for a Hugging Face-native “Agentic Space Factory”.

The foundation consists of two phases; later phases are described in their own sections below:

```text
Phase 1:
Gradio Space OAuth user
→ launch Hugging Face Job with the user's OAuth token
→ mount private Storage Bucket
→ write run state/events/report
→ read run status back in the orchestrator UI

Phase 2.1:
Gradio Space OAuth user
→ launch Hugging Face Job
→ create private target Gradio Space in the user's namespace
→ upload app.py / requirements.txt / README.md
→ validate the live Space through gradio_client
→ write run state/events/report to the Bucket
```

The configured bucket is:

```text
hf://buckets/fffiloni/space-factory-runs
```

## What this version does

- Enables Hugging Face OAuth in a Gradio Space.
- Requests the `jobs` scope, plus repo/inference scopes needed by later phases.
- Launches CPU Hugging Face Jobs using `huggingface_hub.run_job`.
- Mounts `fffiloni/space-factory-runs` as `/output` in the Job.
- Passes the OAuth token to the Job as an encrypted secret, not as a CLI argument.
- Phase 1 writes these files in the bucket:

```text
runs/<run_id>/state.json
runs/<run_id>/events.jsonl
runs/<run_id>/report.md
```

- Phase 2.1 additionally creates a private target Space and stores:

```text
runs/<run_id>/target_space.json
runs/<run_id>/generated/app.py
runs/<run_id>/generated/requirements.txt
runs/<run_id>/generated/README.md
runs/<run_id>/tests/api_schema.json
runs/<run_id>/tests/test_result.json
```
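As a rough illustration of the Phase 1 files listed above, a worker could write them under the mounted bucket path like this. This is a minimal sketch, not the project's worker script: the function name `write_run_artifacts` and the JSON field names are assumptions.

```python
import json
import time
from pathlib import Path


def write_run_artifacts(mount: str, run_id: str, status: str, events: list) -> Path:
    """Write state.json, events.jsonl and report.md under runs/<run_id>/."""
    run_dir = Path(mount) / "runs" / run_id
    run_dir.mkdir(parents=True, exist_ok=True)

    # state.json: a single snapshot of the run (field names are assumptions).
    state = {"run_id": run_id, "status": status, "updated_at": time.time()}
    (run_dir / "state.json").write_text(json.dumps(state, indent=2), encoding="utf-8")

    # events.jsonl: one JSON object per line, appended so history is kept.
    with (run_dir / "events.jsonl").open("a", encoding="utf-8") as fh:
        for event in events:
            fh.write(json.dumps(event) + "\n")

    # report.md: a short human-readable summary.
    report = f"# Run {run_id}\n\n- Status: {status}\n- Events: {len(events)}\n"
    (run_dir / "report.md").write_text(report, encoding="utf-8")
    return run_dir
```

Inside the Job, `mount` would be the bucket mount point (`/output` by default), so the orchestrator UI can read the same paths back from the bucket.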

## What this version does not do yet

- Phases 1 and 2.1 do not run Pi.
- They do not analyze model cards.
- They do not configure ZeroGPU.
- Nothing is published publicly.
- Existing target Spaces are never overwritten.

Pi, model-card analysis, and ZeroGPU are intentionally left to the later phases described below, which build on the OAuth → Jobs → Bucket → private Space creation → live API validation chain.

## Configuration

Default values are in `src/config.py`.

You can override them with Space variables:

```bash
SPACE_FACTORY_BUCKET_SOURCE=fffiloni/space-factory-runs
SPACE_FACTORY_BUCKET_MOUNT=/output
SPACE_FACTORY_JOB_FLAVOR=cpu-basic
SPACE_FACTORY_JOB_TIMEOUT=15m
SPACE_FACTORY_JOB_IMAGE=python:3.12
```

For Phase 2.1, a 15-minute timeout is usually enough for a tiny Gradio Space. Increase it if Space builds are slow.
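The override mechanism can be sketched as a small loader that reads each variable and falls back to a default. This is an illustration only: the real defaults live in `src/config.py`, and the sketch assumes they match the values shown above. The names `FactoryConfig`, `DEFAULTS`, and `load_config` are hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class FactoryConfig:
    bucket_source: str
    bucket_mount: str
    job_flavor: str
    job_timeout: str
    job_image: str


# Assumed defaults, copied from the example values in this README.
DEFAULTS = {
    "SPACE_FACTORY_BUCKET_SOURCE": "fffiloni/space-factory-runs",
    "SPACE_FACTORY_BUCKET_MOUNT": "/output",
    "SPACE_FACTORY_JOB_FLAVOR": "cpu-basic",
    "SPACE_FACTORY_JOB_TIMEOUT": "15m",
    "SPACE_FACTORY_JOB_IMAGE": "python:3.12",
}


def load_config(env=None) -> FactoryConfig:
    """Build the config from Space variables, using defaults for unset or empty ones."""
    env = os.environ if env is None else env

    def get(name: str) -> str:
        value = str(env.get(name, "")).strip()
        return value or DEFAULTS[name]

    return FactoryConfig(
        bucket_source=get("SPACE_FACTORY_BUCKET_SOURCE"),
        bucket_mount=get("SPACE_FACTORY_BUCKET_MOUNT"),
        job_flavor=get("SPACE_FACTORY_JOB_FLAVOR"),
        job_timeout=get("SPACE_FACTORY_JOB_TIMEOUT"),
        job_image=get("SPACE_FACTORY_JOB_IMAGE"),
    )
```

For example, setting `SPACE_FACTORY_JOB_TIMEOUT=30m` as a Space variable would change only `job_timeout` and leave the other fields at their defaults.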

## Local notes

OAuth token injection only works inside a Hugging Face Space with `hf_oauth: true`.
During local UI development the app still renders, but launching a Job is not possible, because Gradio only supplies a real OAuth token when the app runs in a Space.

## Security posture

- No global admin token is required.
- The user's OAuth token is used only to launch the Job and is passed to the Job as a secret.
- The worker script never prints the token.
- The target bucket should remain private.
- Phase 2.1 target Spaces are private by default.
- Raw traces and future Pi sessions must stay private by default.
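The "secret, not CLI argument" rule can be illustrated by building the keyword arguments for `huggingface_hub.run_job` separately from the call. This is a hedged sketch: `worker.py`, the `--run-id` flag, and the secret name `HF_TOKEN` are assumptions, and the bucket mount argument is deliberately left out because its exact form is not shown in this README.

```python
def build_job_kwargs(oauth_token: str, run_id: str) -> dict:
    """Return keyword arguments for run_job with the token kept out of the command."""
    if not oauth_token:
        raise ValueError("an OAuth token is required to launch a Job")
    return {
        "image": "python:3.12",
        # The command carries only non-sensitive values (hypothetical script and flag).
        "command": ["python", "worker.py", "--run-id", run_id],
        "flavor": "cpu-basic",
        "timeout": "15m",
        # The token travels as a Job secret; the worker reads it from its environment.
        "secrets": {"HF_TOKEN": oauth_token},
        # The same user token authenticates the launch itself, so no admin token is needed.
        "token": oauth_token,
    }


# Usage inside the Space (not executed here):
#   from huggingface_hub import run_job
#   job = run_job(**build_job_kwargs(oauth_token, run_id))
```

Keeping the token out of `command` matters because command lines tend to end up in job listings and logs, whereas secrets are not meant to be displayed.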


## Phase 3 — Pi smoke test

This phase installs `@mariozechner/pi-coding-agent` inside the HF Job, configures Pi with Hugging Face Inference Providers using the OAuth token, and asks Pi to make one safe edit to a generated Gradio app before creating the private target Space.

Expected run artifacts:

```text
runs/<run_id>/generated/app.py
runs/<run_id>/logs/pi_output.txt
runs/<run_id>/traces/raw/*.jsonl
runs/<run_id>/traces/redacted/*.jsonl
runs/<run_id>/tests/test_result.json
```

The target Space remains private by default. Success is only declared after the live Gradio API returns the expected Pi-modified output.
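One way the `traces/raw` to `traces/redacted` step could work is sketched below. It is an assumption, not the project's actual redaction logic: it replaces any secret values the worker already knows, plus anything that looks like an `hf_`-prefixed token. The pattern is deliberately broad and may over-redact harmless identifiers.

```python
import re
from pathlib import Path

# Assumed token shape: "hf_" followed by at least 8 token-like characters.
TOKEN_PATTERN = re.compile(r"hf_[A-Za-z0-9_.\-]{8,}")


def redact_text(text: str, known_secrets=()) -> str:
    """Replace known secret values and token-like strings with a placeholder."""
    for secret in known_secrets:
        if secret:
            text = text.replace(secret, "[REDACTED]")
    return TOKEN_PATTERN.sub("[REDACTED]", text)


def redact_traces(run_dir: str, known_secrets=()) -> int:
    """Copy traces/raw/*.jsonl to traces/redacted/ with secrets removed."""
    raw_dir = Path(run_dir) / "traces" / "raw"
    out_dir = Path(run_dir) / "traces" / "redacted"
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for src in sorted(raw_dir.glob("*.jsonl")):
        cleaned = redact_text(src.read_text(encoding="utf-8"), known_secrets)
        (out_dir / src.name).write_text(cleaned, encoding="utf-8")
        count += 1
    return count
```

Even with redaction in place, the raw directory still contains the unredacted traces, which is why the bucket itself should stay private.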


## Phase 4

Phase 4 asks Pi to follow the HF Spaces Agent Quickstart gist and use the `hf` CLI inside an HF Job to create/upload a private Space. The wrapper independently validates the live Gradio API before reporting success.


## Phase 5

Phase 5 adds model-card analysis for simple Transformers text pipeline models. Recommended first test: `sshleifer/tiny-gpt2`. The Space remains private and success is still gated by wrapper-owned live API validation.


## Phase 6

Adds a no-build runtime recommender Job that analyzes model metadata and writes `runtime_recommendation.json` to the Bucket.


## Phase 8 — LongCat article reproduction

Phase 8 attempts an article-style LongCat Space build: private target Space, Pi-guided app adaptation, ZeroGPU first, fixed GPU fallback when explicitly enabled, and live `/health` API validation. Full video generation remains a manual-review step until model-specific runtime validation is complete.
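The "ZeroGPU first, fixed GPU only when explicitly enabled" ordering might look like the sketch below. The hardware identifiers `zero-a10g` and `a10g-small`, the `HardwareAuthError` class, and the injected `request` callable are assumptions for illustration; in the worker, `request` would presumably wrap `HfApi.request_space_hardware` and translate a 401 response into the auth error.

```python
from typing import Callable, List


class HardwareAuthError(Exception):
    """Hypothetical marker for a 401/auth failure on a hardware request."""


def hardware_candidates(allow_fixed_gpu: bool, fixed_gpu: str = "a10g-small") -> List[str]:
    """ZeroGPU is always tried first; a fixed GPU is added only on explicit opt-in."""
    candidates = ["zero-a10g"]
    if allow_fixed_gpu:
        candidates.append(fixed_gpu)
    return candidates


def request_hardware_best_effort(candidates: List[str], request: Callable[[str], None]) -> dict:
    """Try each candidate in order; never raise, and stop early on auth failures."""
    attempts = []
    for hardware in candidates:
        try:
            request(hardware)
        except HardwareAuthError as exc:
            # Retrying other flavors with the same token would fail the same way.
            attempts.append({"hardware": hardware, "ok": False, "error": str(exc)})
            break
        except Exception as exc:
            # Any other failure: record it and move on to the next candidate.
            attempts.append({"hardware": hardware, "ok": False, "error": str(exc)})
            continue
        attempts.append({"hardware": hardware, "ok": True})
        return {"hardware": hardware, "manual_action_required": False, "attempts": attempts}
    return {"hardware": None, "manual_action_required": True, "attempts": attempts}
```

Returning a result dictionary instead of raising lets the run continue to validation and reporting even when the hardware change has to be done by hand.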

## V8 LongCat robustness changes

V8 focuses on the issues discovered during the first LongCat runs:

- validates a cheap HTTP `GET /health` route before falling back to `gradio_client`;
- collects best-effort Space runtime/log diagnostics into the Bucket;
- treats `request_space_hardware` as best-effort because OAuth tokens may create/write Spaces but still fail on paid hardware changes;
- stops retrying hardware on clear 401/auth failures and marks manual hardware action as required;
- uploads the entire Pi workspace recursively, so generated packages such as `longcat_video/` are preserved;
- defaults the Pi model field to `Qwen/Qwen3-Coder-Next`, while keeping it editable;
- adds an implementation mode: `full-inference-attempt` or `safe-scaffold`.
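The health-first validation order from the list above can be sketched with the standard library. This is a simplified illustration, not the wrapper's actual code: `http_health` and `validate_space` are hypothetical names, sending the token as a bearer header for a private Space is an assumption, and the `gradio_client` check is represented by an injected `fallback` callable so the sketch stays dependency-free.

```python
import urllib.request
from typing import Callable, Optional


def http_health(base_url: str, token: Optional[str] = None, timeout: float = 10.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200."""
    try:
        request = urllib.request.Request(base_url.rstrip("/") + "/health")
        if token:
            # Assumption: a private Space accepts the user's token as a bearer header.
            request.add_header("Authorization", f"Bearer {token}")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError):
        # Network errors, HTTP errors, timeouts, and malformed URLs all count as "not healthy".
        return False


def validate_space(
    base_url: str,
    token: Optional[str] = None,
    health: Callable[[str, Optional[str]], bool] = http_health,
    fallback: Optional[Callable[[], bool]] = None,
) -> dict:
    """Try the cheap HTTP check first, then the heavier client-based check."""
    if health(base_url, token):
        return {"ok": True, "method": "http_health"}
    if fallback is not None and fallback():
        return {"ok": True, "method": "gradio_client"}
    return {"ok": False, "method": None}
```

Recording which method succeeded makes it easier to tell, from the bucket artifacts alone, whether the Space exposed `/health` or only answered through the client.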


## V9 LongCat full-inference gate

V9 keeps the robust V8 health validation but changes the meaning of success for LongCat-style runs. A Space that only boots and exposes `/health` is no longer treated as a full inference reproduction.

Key behavior:

- default implementation mode is `full-inference-gated`;
- Pi is instructed not to silently replace generation with a docs-only placeholder;
- if real inference cannot be wired, Pi should produce `TECHNICAL_BLOCKERS.json`;
- the worker writes `inference_gate.json` with a status such as `technical_blocker`, `health_only`, or `full_inference_candidate_health_passed`;
- Pi must investigate SDPA, xformers, and HF Kernels flash-attn alternatives before declaring flash-attn a hard blocker;
- hardware changes remain best-effort because OAuth tokens may create/write Spaces but fail on paid hardware changes.
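A possible shape for the gate decision is sketched below. The three status strings come from the list above; everything else is an assumption, including the `inference_wired` flag (how the worker actually detects real inference is not described here), the `health_failed` status, and the fields written to `inference_gate.json`.

```python
import json
from pathlib import Path


def write_inference_gate(workspace: str, health_passed: bool, inference_wired: bool) -> dict:
    """Decide the gate status and write inference_gate.json into the workspace."""
    root = Path(workspace)
    has_blockers = (root / "TECHNICAL_BLOCKERS.json").exists()

    if has_blockers:
        # Pi reported that real inference could not be wired.
        status = "technical_blocker"
    elif not health_passed:
        status = "health_failed"  # hypothetical status, not named in this README
    elif inference_wired:
        status = "full_inference_candidate_health_passed"
    else:
        # The Space boots and answers /health, but that alone is not a reproduction.
        status = "health_only"

    gate = {
        "status": status,
        "health_passed": health_passed,
        "has_blockers_file": has_blockers,
    }
    (root / "inference_gate.json").write_text(json.dumps(gate, indent=2), encoding="utf-8")
    return gate
```

Checking the blockers file first means an honest "could not wire inference" report always wins over a passing health check.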

This phase is designed to distinguish “bootable scaffold” from “functional model reproduction”.