---
title: Sofía — Educational Companion (Spanish)
colorFrom: blue
colorTo: pink
sdk: gradio
sdk_version: 6.18.0
python_version: '3.12'
app_file: app.py
pinned: false
license: mit
short_description: Local-first Spanish voice companion for kids
tags:
  - gradio
  - build-small-hackathon
  - backyard ai
  - backyard-ai
  - off the grid
  - off-the-grid
  - well tuned
  - well-tuned
  - off brand
  - off-brand
  - sharing is caring
  - sharing-is-caring
  - field notes
  - field-notes
  - best use of modal
  - best-use-of-modal
  - modal
  - zerogpu
  - qwen2.5
  - qlora
  - conversational
  - education
  - spanish
  - kids
  - voice assistant
  - voice-assistant
  - text-to-speech
  - speech-to-text
  - track:backyard
  - sponsor:modal
  - achievement:offgrid
  - achievement:welltuned
  - achievement:offbrand
  - achievement:sharing
  - achievement:fieldnotes
models:
  - build-small-hackathon/sofia-qwen2.5-7b
---

# Sofía — local-first educational companion for young kids

A voice companion for a ~3-year-old: warm conversation, curated stories and songs with real audio, and learning activities (counting, colors, animals), plus a parent panel. Built for the **Build Small Hackathon** (track **Backyard AI**).

## The story behind Sofía

Here's why we built this, in our own words:

> My baby girl has always been curious about every single thing. From the
> moment she turned maybe 8 or 9 months, it was amusing how fast a learner she
> is. Except for walking (because it seems to be more fun rolling than using
> feet and legs), eating, speaking, imitating moves, jumping, potty training,
> sleeping — it was all "easy" for her. And imagine being working-from-home
> parents with a toddler in hyperspeed 24/7 :) — some days were like heaven,
> and other days felt like Ares was in my living room having the battle of his
> life. Not because of bad behavior, but because of the constant educational
> incentives she demanded.
>
> When she turned 2, she started asking these amazing questions about daily
> life — why is the sun like that? How are forks made?
> Are cats and dogs siblings? That's when I had this idea of making a companion for her: to chat
> about this kind of topic, feed her curiosity, and also help my wife
> complement our homeschool activities. This is basically a tool for parents
> devoted to their children's learning path — for the days when we're a bit
> off and don't have 100% of our creativity, but we don't let ourselves give
> less. This stayed as an idea until the Build Small Hackathon appeared.
>
> My wife said: *"I would love maths if I had this when I was a child —
> learning could be so much fun."*
>
> The next step is to build a little 3D-printed "robot" with a Raspberry Pi and a small touch screen, so it's not on the phone.
> The long-term idea is that she gets an agent/robot companion for life.
> We would really love to keep building Sofía, so please give us some hearts and feedback.

## Submission Snapshot

| | |
|---|---|
| **Live Space** | [build-small-hackathon/sofia-educational-companion](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion) |
| **Demo video** | [YouTube](https://www.youtube.com/watch?v=Zeesmd69dbs) |
| **Social post** | [X / Twitter](https://x.com/estebanbarac/status/2066658868566917173) |
| **Track** | Backyard AI — an educational companion for our ~3-year-old daughter |
| **Fine-tuned model** | [`build-small-hackathon/sofia-qwen2.5-7b`](https://huggingface.co/build-small-hackathon/sofia-qwen2.5-7b) — QLoRA on top of Qwen2.5-7B-Instruct |
| **Training** | QLoRA on **Modal** (A10G), see [`finetune/`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/tree/main/finetune) |
| **Open trace** | [`trace/`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/tree/main/trace) — 12 end-to-end turns of the real pipeline |
| **Field Notes** | [`FIELD_NOTES.md`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/blob/main/FIELD_NOTES.md) |

## TL;DR for judges

- **Backyard AI:** a
real, specific problem — our ~3-year-old's nonstop stream of "why?" questions and activities, and the days we (working-from-home parents) can't give her 100% of our creativity. Sofía is a voice companion that chats, tells curated stories/songs, and runs small learning activities (counting, colors, animals) — **to complement, never replace, parent time**.
- **Off the Grid (local-first):** the fine-tuned `Qwen2.5-7B-Instruct` runs **inside this Space** via `transformers` on dynamic ZeroGPU (`@spaces.GPU`). No external inference APIs at any point.
- **Well-Tuned (fine-tuned):** [`build-small-hackathon/sofia-qwen2.5-7b`](https://huggingface.co/build-small-hackathon/sofia-qwen2.5-7b) — a QLoRA fine-tune trained end-to-end on **Modal** (A10G, 72 steps, loss 2.51 → 0.14). It teaches **persona, style and safety**, never facts: those always come from curated `content/`.
- **Off-Brand (custom UI):** fully custom, voice-first, kid-friendly frontend (`frontend/`) served via `gr.Server` — push-to-talk, an animated character with moods, inline story/song playback.
- **Sharing is Caring (open trace):** [`trace/`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/tree/main/trace) — 12 real end-to-end turns (intent router → curated content → safety guard → fine-tuned LLM), including a turn where the model **refuses to invent a story** and offers a curated one instead.
- **Best Use of Modal:** the entire QLoRA fine-tune (dataset → training → merge → publish) ran on Modal — see [`finetune/`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/tree/main/finetune).
- **Field Notes:** [`FIELD_NOTES.md`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/blob/main/FIELD_NOTES.md) — what we learned pivoting from llama.cpp to ZeroGPU, why a small fine-tuned model beats a bigger generic one for a 3-year-old, and the per-visitor ZeroGPU quota that nearly looked like a bug but wasn't.
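
The turn pipeline named in the open-trace bullet (intent router → curated content → safety guard → fine-tuned LLM) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code in `app.py`: every function name, the keyword lists, the sample story, the fallback line, and the stubbed `generate_reply` are hypothetical, and the exact order of the checks in the real pipeline may differ.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for the curated content/ files and the safety/ blocklist.
CURATED_STORIES = {"la tortuga": "Había una vez una tortuga muy curiosa..."}
BLOCKED_TOPICS = ("arma", "droga")
FALLBACK = "Mejor hablemos de otra cosa. ¿Quieres un cuento?"


@dataclass
class Turn:
    intent: str  # "story", "chat" or "blocked"
    reply: str


def is_safe(text: str) -> bool:
    """Naive blocked-topic check, usable on both input and output."""
    lowered = text.lower()
    return not any(topic in lowered for topic in BLOCKED_TOPICS)


def route_intent(text: str) -> str:
    """Toy keyword router; a real one would also cover songs and activities."""
    return "story" if "cuento" in text.lower() else "chat"


def generate_reply(prompt: str) -> str:
    """Stub standing in for the fine-tuned LLM (conversational glue only)."""
    return "¡Qué buena pregunta! ¿Tú qué piensas?"


def handle_turn(text: str) -> Turn:
    if not is_safe(text):  # safety guard on the child's input
        return Turn("blocked", FALLBACK)
    if route_intent(text) == "story":
        # Curated content is returned verbatim; the model never invents it.
        body = next(iter(CURATED_STORIES.values()))
        return Turn("story", body)
    reply = generate_reply(text)
    if not is_safe(reply):  # safety guard on the model's output
        return Turn("blocked", FALLBACK)
    return Turn("chat", reply)
```

The point of the design is that the story branch never reaches the model at all, so a curated story cannot be paraphrased or invented on the way out.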
## Architecture idea (the most important part)

The LLM is **only the conversational glue**. Facts and content (stories, songs, activities) come from `content/`, **curated by the parents** — the model never invents them. This removes most opportunities for hallucination, reduces the risk of inappropriate content, and is an honest fit for the kind of small model the judges reward.

```
frontend/  (custom kid-friendly UI, voice-first, gr.Server)
   │
   ▼
app.py  (gradio.Server / FastAPI)
 ├─ llm/       fine-tuned (QLoRA) Qwen2.5-7B-Instruct, transformers + ZeroGPU (@spaces.GPU)
 ├─ content/   source of truth (curated): stories.json, songs.json, learning.json, activities.json
 ├─ voice/     faster-whisper (STT, push-to-talk) + Kokoro (TTS, CPU)
 ├─ safety/    blocked-topic filter (input and output)
 └─ parental/  activity log + memory (sqlite)
```

## What Sofía does

- Warm conversation in Spanish, short sentences, one question per turn.
- Push-to-talk (no always-on listening yet) with an animated character that changes mood (idle / listening / thinking / playing).
- **Stories** (10), curated, with pre-rendered audio and inline playback.
- **Songs** (6), curated, with inline playback.
- **Learning activities**: numbers, colors and animals (`content/learning.json`), plus counting and recognition (`content/activities.json`).
- Sofía changes color on request.
- Memory of recent turns for conversational continuity.
- **Parent panel** (behind a gate): recent activity and structured session memory.

## Merit mapping

| # | Merit | Status in Sofía |
|---|---|---|
| 1 | **Off the Grid** (local-first) | ✅ Fine-tuned `Qwen2.5-7B-Instruct` runs inside the Space via `transformers` + ZeroGPU (`zero-a10g`), no external APIs. |
| 2 | **Well-Tuned** (fine-tuned) | ✅ QLoRA published at [`build-small-hackathon/sofia-qwen2.5-7b`](https://huggingface.co/build-small-hackathon/sofia-qwen2.5-7b); `MODEL_ID` in `llm/engine.py` points there. Details: [`finetune/MODEL_CARD.md`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/blob/main/finetune/MODEL_CARD.md). |
| 3 | **Off-Brand** (custom UI) | ✅ Custom kid-friendly frontend (`frontend/`), voice-first, served by `gr.Server`. |
| 4 | **Llama Champion** (llama.cpp) | |
| 5 | **Sharing is Caring** (open trace) | ✅ [`trace/`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/tree/main/trace) — 12 end-to-end turns of the real pipeline, with the exact prompt sent to the model at each step. |
| 6 | **Field Notes** | ✅ [`FIELD_NOTES.md`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/blob/main/FIELD_NOTES.md). |

**Bonus — Best Use of Modal:** the entire QLoRA fine-tune for merit 2 (dataset, training, merge, and publishing) ran end-to-end on Modal (A10G). See [`finetune/`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/tree/main/finetune).

## Running locally

```bash
pip install -r requirements.txt
python app.py   # http://localhost:7860
```

Note: locally `@spaces.GPU` is a no-op, and `.to("cuda")` needs a real GPU with enough VRAM for Qwen2.5-7B in bf16 (~15GB). The Space (ZeroGPU) provides 48GB per call; on a smaller local GPU, loading the model may fail, and running it on CPU is very slow.

### Running locally with Ollama (no large GPU)

```bash
ollama serve &                 # if not already running
ollama pull qwen2.5:7b         # once
pip install ollama
LUMI_LLM_BACKEND=ollama python app.py
```

See `CLAUDE.md` for the full list of environment variables (`LUMI_LLM_BACKEND`, `LUMI_OLLAMA_MODEL`, `LUMI_SHARE`).

## Model

[`build-small-hackathon/sofia-qwen2.5-7b`](https://huggingface.co/build-small-hackathon/sofia-qwen2.5-7b) — a QLoRA fine-tune of `Qwen/Qwen2.5-7B-Instruct` (merged, full weights), trained to always stay in the "Sofía" persona, present curated content **verbatim** (no paraphrasing or inventing), and refuse/redirect unsafe topics.
It runs on the Space's dynamic GPU (ZeroGPU, ~48GB VRAM per call). Full details: [`finetune/MODEL_CARD.md`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/blob/main/finetune/MODEL_CARD.md).

## Honest notes

- STT for a 3-year-old is hard: we use push-to-talk and a limited vocabulary.
- Sofía complements, never replaces, parent time. The parent panel logs activity.
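
Since the parent panel depends on that activity log, here is a minimal sketch of how such a log could be kept in sqlite, which is the storage the architecture diagram names for `parental/`. The table name, columns, and helper functions are hypothetical and chosen only for illustration; the actual schema in the repository may differ.

```python
import sqlite3
from datetime import datetime, timezone


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the activity log. The schema is an illustrative assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS activity (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               ts TEXT NOT NULL,      -- UTC timestamp, ISO 8601
               kind TEXT NOT NULL,    -- e.g. "story", "song", "chat"
               detail TEXT NOT NULL   -- e.g. the story title
           )"""
    )
    return conn


def log_activity(conn: sqlite3.Connection, kind: str, detail: str) -> None:
    """Append one entry; the connection context manager commits or rolls back."""
    ts = datetime.now(timezone.utc).isoformat()
    with conn:
        conn.execute(
            "INSERT INTO activity (ts, kind, detail) VALUES (?, ?, ?)",
            (ts, kind, detail),
        )


def recent_activity(conn: sqlite3.Connection, limit: int = 20) -> list:
    """Return the newest entries first, as (kind, detail) tuples."""
    rows = conn.execute(
        "SELECT kind, detail FROM activity ORDER BY id DESC LIMIT ?",
        (limit,),
    )
    return rows.fetchall()
```

Passing a file path instead of the default `:memory:` would keep the log across restarts, which is what a parent panel would presumably need.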