# Feature Demo: F001 - Core Environment Loop

> **Generated:** 2026-03-24T21:36:32Z
> **Context source:** spec + discovery only (implementation not read)
> **Feature entry:** [FEATURES.json (F001)](./FEATURES.json)

---

## What This Feature Does

F001 turns the SQL environment from a non-functional loop into a usable episode flow: an agent can reset into a question, explore schema/data with structured actions, run SQL safely, and terminate with an answer or budget exhaustion.

From a user perspective, this should feel predictable and teachable: fast query feedback, clear errors when a query/action is invalid, and clean episode boundaries.

---

## What Is Already Proven

### Verified in This Demo Run

- Server startup works locally via `uv run uvicorn server.app:app --host 127.0.0.1 --port 8011` (startup/shutdown logs captured).
- The environment currently fails at `/reset` in this workspace because the required Spider DB file is missing (`FileNotFoundError` for `student_assessment`).
- Downloader CLI is present and runnable (`--help` works).
- Downloader input hardening rejects unsafe DB identifiers (e.g. `../bad`).
- Full local test suite passes (`25 passed`).

### Previously Verified Evidence

- `specs/FEATURES.json` (`features[].id == F001`) records verification evidence: `uv run pytest tests/ -v`, 25/25 passed, verifier `approved` at `2026-03-24T21:27:31Z`.
- `specs/F001-IMPLEMENTATION_SPEC.md` Section 10 states user-value behavior for reset/step lifecycle and structured actions.

---

## What Still Needs User Verification

- Provision `data/databases/student_assessment/student_assessment.sqlite` successfully in your environment.
- Re-run live `/reset` and `/step` API calls after DB provisioning to confirm end-to-end episode behavior (DESCRIBE/SAMPLE/QUERY/ANSWER).

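
Before retrying `/reset`, it can help to confirm that the database file actually sits where this demo expects it. The following is a minimal pre-flight sketch, not the project's own code: `expected_db_path` and `is_provisioned` are hypothetical helper names, the `data/databases/<db_id>/<db_id>.sqlite` layout is taken from the path listed above, and the identifier pattern is only inferred from the downloader's error message ("Only letters, numbers, and underscores are allowed"), so the real check may differ.

```python
import re
from pathlib import Path

# Assumption: pattern inferred from the downloader's error message;
# the actual validation in scripts/download_spider_databases.py may differ.
_DB_ID_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def expected_db_path(db_id: str, root: Path = Path("data/databases")) -> Path:
    """Return where the SQLite file is expected to live (hypothetical helper)."""
    if not _DB_ID_PATTERN.fullmatch(db_id):
        raise ValueError(
            "Invalid db_id. Only letters, numbers, and underscores are allowed."
        )
    # Layout assumed from the path in this demo: <root>/<db_id>/<db_id>.sqlite
    return root / db_id / f"{db_id}.sqlite"


def is_provisioned(db_id: str, root: Path = Path("data/databases")) -> bool:
    """Return True if the expected SQLite file exists and is non-empty."""
    path = expected_db_path(db_id, root)
    return path.is_file() and path.stat().st_size > 0


if __name__ == "__main__":
    db_id = "student_assessment"
    status = "ready" if is_provisioned(db_id) else "missing"
    print(f"{expected_db_path(db_id)}: {status}")
```

Run from the repository root, this reports whether `student_assessment` is ready or missing. An existence-and-size check does not prove the file is a valid Spider database, so a live `/reset` call remains the real confirmation.
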

---

## Quickstart / Verification Steps

> Run these commands to see the feature in action:

```bash
uv run uvicorn server.app:app --host 127.0.0.1 --port 8011
uv run python scripts/download_spider_databases.py --db-id student_assessment
uv run pytest tests/ -v
```

If `/reset` fails with missing DB, complete the DB download/provisioning first, then retry API interactions.

---

## Live Local Proof

### Start the Environment Server

This confirms the feature surface is exposed on a local API endpoint.

```bash
uv run uvicorn server.app:app --host 127.0.0.1 --port 8011
```

```text
INFO: Started server process [26402]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:8011 (Press CTRL+C to quit)
INFO: Shutting down
INFO: Waiting for application shutdown.
INFO: Application shutdown complete.
INFO: Finished server process [26402]
bash tool terminated command after exceeding timeout 8000 ms
```

The API process starts successfully and advertises the expected local URL.

### Attempt Reset Without Database Provisioning (Proof Boundary)

This shows the current environment boundary in this workspace: reset cannot complete until DB assets are present.

```bash
uv run python - <<'PY'
import asyncio

import httpx
from server.app import app

transport = httpx.ASGITransport(app=app)

async def main():
    async with httpx.AsyncClient(transport=transport, base_url="http://local") as client:
        try:
            await client.post('/reset', json={})
        except Exception as exc:
            print(type(exc).__name__)
            print(str(exc))

asyncio.run(main())
PY
```

```text
Loaded tokenizer: mistralai/Mistral-7B-Instruct-v0.1
FileNotFoundError
Database 'student_assessment' not found in /Users/hjerp/Projects/sql-env-F001-core-environment-loop/data/databases
```

The failure is explicit and actionable (missing DB), not a crash or opaque error.

---

## Existing Evidence

- Verification record source: `specs/FEATURES.json` → `features[F001].verification_evidence`.
- Verification spec source: `specs/F001-VERIFICATION_SPEC.md` (unit/integration/API/E2E scenarios and edge-case checklist).

---

## Manual Verification Checklist

1. Download/provision Spider DB files so `student_assessment.sqlite` exists under `data/databases/student_assessment/`.
2. Start server: `uv run uvicorn server.app:app --host 127.0.0.1 --port 8011`.
3. POST `/reset` and confirm `done=false`, question present, and schema table names visible.
4. POST `/step` with `DESCRIBE` and `QUERY` actions; confirm step/budget updates and readable results.
5. POST invalid `QUERY` (non-SELECT) and verify clear error in observation.
6. POST `ANSWER` and verify terminal `done=true` with reward behavior.

---

## Edge Cases Exercised

### Unsafe Database Identifier Rejected

```bash
uv run python scripts/download_spider_databases.py --db-id "../bad"
```

```text
ValueError: Invalid db_id. Only letters, numbers, and underscores are allowed.
```

This confirms input hardening against path-traversal style DB IDs.

### Upstream Database URL Failure Is Surfaced Clearly

```bash
uv run python scripts/download_spider_databases.py --db-id student_assessment
```

```text
RuntimeError: Failed to download 'student_assessment' from Spider raw URL: HTTP Error 404: Not Found
```

This demonstrates an explicit failure mode for data provisioning when upstream URL resolution fails.

---

## Test Evidence (Optional)

> Supplementary proof that the feature works correctly across scenarios.

| Test Suite | Tests | Status |
|---|---|---|
| Smoke / contract regression (`tests/test_smoke.py`) | 25 | All passed |

Representative command:

```bash
uv run pytest tests/ -v
```

```text
============================= test session starts ==============================
...
collected 25 items
...
============================== 25 passed in 6.27s ==============================
```

---

## Feature Links

- Implementation spec: `specs/F001-IMPLEMENTATION_SPEC.md`
- Verification spec: `specs/F001-VERIFICATION_SPEC.md`

---

*Demo generated by `feature-demo` agent. Re-run with `/feature-demo F001` to refresh.*