# Tiny Narrator Submission Packet

## One-line pitch

Tiny Narrator turns an article into a guided screen-reader experience using small, inspectable model paths.

## Short description

Tiny Narrator is a custom Gradio Server app with a Reader route and a Generate route. Reader looks like an article/blog reader until screen-reader mode is switched on: it builds an internal semantic reading queue, narrates headings and paragraphs, describes generated images, summarizes the current section, speaks with Kokoro, and keeps a visible transcript of the spoken path with reader position, runtime, and latency. Generate creates a short article from a user topic and attaches a Klein thumbnail receipt.

The prototype is designed for a live hackathon demo: every model-facing path has a deterministic fallback, runtime readiness is labeled in the UI, and the repo exposes machine-readable evidence for model size, setup, demo flow, and accessibility behavior.

## Award targets

- Tiny Titan: `/api/model-budget` reports every model role at or below 4B parameters.
- Llama Champion: the reader-brain path targets a GGUF model through a llama.cpp OpenAI-compatible endpoint.
- Off-Brand: the visible UI is custom HTML, CSS, and JavaScript served by `gr.Server`.
- Field Notes: `FIELD_NOTES.md` and the evidence APIs document model choices, fallbacks, latency, and accessibility decisions.

## Model stack

| Role | Model | Size | Runtime |
| --- | --- | ---: | --- |
| Reader brain | `nvidia/NVIDIA-Nemotron-3-Nano-4B-GGUF` | 3.97B | `llama.cpp` |
| Image understanding | `openbmb/MiniCPM-V-4.6` | 1B | OpenAI-compatible chat completions |
| Speech | `hexgrad/Kokoro-82M` | 82M | Python |
| Image generation | `black-forest-labs/FLUX.2-klein-4B` | 4B | Modal-hosted Klein (bundled fallback when not configured) |

## Demo flow

1. Open the article and show that the app is a custom article interface, not a stock chatbot page.
2. Turn on screen-reader mode and press `Space` or `Next` to narrate the first semantic node.
3. Use `Heading`, `Image`, and `Summary` to navigate by article meaning instead of by raw page order.
4. Show the reader-first session panel: current item, live narration, transcript, model stack, and latency.
5. Point to the model stack panel for model id, runtime, parameter count, and Tiny Titan pass status.
6. Open the Generate route, enter a topic, and show the generated article plus `black-forest-labs/FLUX.2-klein-4B` thumbnail receipt.
7. Mention that `/api/demo-script` exposes the judge runbook and API evidence checks as structured data.

## Evidence endpoints

- `/api/health`: app identity, custom frontend marker, llama.cpp base URL, and model manifest.
- `/api/model-budget`: Tiny Titan parameter proof with numeric `params_billion` and per-model pass values.
- `/api/runtime-setup`: app command, copyable model runtime commands rendered in the UI, environment values, and fallback paths.
- `/api/runtime-status`: online or fallback-ready status for each model path.
- `/api/accessibility-audit`: semantic queue, keyboard navigation, reader cursor state, shortcut safety, live narration, alt text, transcript, user control, and fallback evidence.
- `/api/demo-script`: repeatable judge runbook and API checks.
- `/api/image-descriptions`: generated article image descriptions plus prompt, seed, model, asset URL, and fallback status receipts.
- `/api/generate-article`: topic-to-article generation using the reader-brain path plus Klein thumbnail provenance.
- `/api/submission-readiness`: one pass/fail rollup for model budget, award evidence, custom frontend, runtime setup, runtime status, accessibility, image receipts, and demo API checks.
- `/api/evidence-bundle`: copyable JSON bundle containing schema version, generation time, core judge evidence receipts, runtime status, and readiness rollup.

The checks in `/api/demo-script` include copyable curl and PowerShell-friendly `curl.exe` commands generated from `PUBLIC_BASE_URL`, and the POST checks include sample JSON bodies for `/api/reader-brain`, `/api/speak`, and `/api/generate-article`.

The accessibility audit also documents reader-mode details that matter during judging: the active item is exposed as a reader cursor with focus, visible outline, stable id, and `aria-current`; global shortcuts ignore form controls so voice, speed, and auto-advance settings remain usable while reader mode is active; reader controls expose `aria-keyshortcuts` plus visible Repeat and Stop commands.

The image receipts distinguish live Modal Klein inference from bundled fallback assets. When `KLEIN_MODAL_ENDPOINT` is configured and reachable, generated thumbnails use real `black-forest-labs/FLUX.2-klein-4B` inference through Modal. When the worker is unavailable, bundled SVG assets are used with explicit fallback runtime metadata.

Image descriptions use `openbmb/MiniCPM-V-4.6` through an OpenAI-compatible `/v1/chat/completions` endpoint when `MINICPM_VISION_BASE_URL` and `MINICPM_VISION_API_KEY` are configured. The same endpoint is also the first reader-brain fallback when llama.cpp is unavailable, before the app falls back to deterministic narration. Without MiniCPM, cached deterministic alt text keeps screen-reader mode stable.

The submission-readiness panel and API give judges a compact checklist for the whole build, so the live demo can move from individual receipts to an overall readiness view.

`/api/evidence-bundle` returns formatted JSON for quick judging notes. Transcript copy actions label clipboard-unavailable fallback states in the visible live narration region.

## Reliability notes

The demo remains navigable when local models are unavailable. The reader brain falls back from llama.cpp to MiniCPM-V-4.6 when configured, then to deterministic narration. Generated images start with meaningful HTML alt text, image descriptions fall back to cached alt text, speech falls back to browser speech plus transcript, and image generation falls back to bundled article assets when the Modal Klein worker is not configured or unreachable. Fallback states are labeled instead of hidden.