---
title: Sofía — Educational Companion (Spanish)
colorFrom: blue
colorTo: pink
sdk: gradio
sdk_version: 6.18.0
python_version: '3.12'
app_file: app.py
pinned: false
license: mit
short_description: Local-first Spanish voice companion for kids
tags:
- gradio
- build-small-hackathon
- backyard ai
- backyard-ai
- off the grid
- off-the-grid
- well tuned
- well-tuned
- off brand
- off-brand
- sharing is caring
- sharing-is-caring
- field notes
- field-notes
- best use of modal
- best-use-of-modal
- modal
- zerogpu
- qwen2.5
- qlora
- conversational
- education
- spanish
- kids
- voice assistant
- voice-assistant
- text-to-speech
- speech-to-text
- track:backyard
- sponsor:modal
- achievement:offgrid
- achievement:welltuned
- achievement:offbrand
- achievement:sharing
- achievement:fieldnotes
models:
- build-small-hackathon/sofia-qwen2.5-7b
---

# Sofía — local-first educational companion for young kids

A voice companion for a ~3-year-old: warm conversation, curated stories and
songs with real audio, and learning activities (counting, colors, animals),
plus a parent panel. Built for the **Build Small Hackathon** (track
**Backyard AI**).

## The story behind Sofía

Here's why we built this, in our own words:

> My baby girl has always been curious about every single thing. From the
> moment she turned maybe 8 or 9 months, it was amusing how fast a learner she
> is. Except for walking (because it seems to be more fun rolling than using
> feet and legs), eating, speaking, imitating moves, jumping, potty training,
> sleeping — it was all "easy" for her. And imagine being working-from-home
> parents with a toddler in hyperspeed 24/7 :) — some days were like heaven,
> and other days felt like Ares was in my living room having the battle of his
> life. Not because of bad behavior, but because of the constant educational
> incentives she demanded.
>
> When she turned 2, she started asking these amazing questions about daily
> life — why is the sun like that? How are forks made? Are cats and dogs
> siblings? That's when I had this idea of making a companion for her: to chat
> about this kind of topic, feed her curiosity, and also help my wife
> complement our homeschool activities. This is basically a tool for parents
> devoted to their children's learning path — for the days when we're a bit
> off and don't have 100% of our creativity, but we don't let ourselves give
> less. This stayed as an idea until Build Small Hackathon appeared.
>
> My wife said: *"I would love maths if I had this when I was a child —
> learning could be so much fun."*
>
> The next step is to build a little 3D-printed "robot" with a Raspberry Pi and a small touch screen, so it's not on the phone.
> The long-term idea is that she gets an agent/robot companion for life.
> We would really love to keep building Sofía, so please leave some hearts and feedback.


## Submission Snapshot

| | |
|---|---|
| **Live Space** | [build-small-hackathon/sofia-educational-companion](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion) |
| **Demo video** | [YouTube](https://www.youtube.com/watch?v=Zeesmd69dbs) |
| **Social post** | [X / Twitter](https://x.com/estebanbarac/status/2066658868566917173) |
| **Track** | Backyard AI — an educational companion for our ~3-year-old daughter |
| **Fine-tuned model** | [`build-small-hackathon/sofia-qwen2.5-7b`](https://huggingface.co/build-small-hackathon/sofia-qwen2.5-7b) — QLoRA on top of Qwen2.5-7B-Instruct |
| **Training** | QLoRA on **Modal** (A10G), see [`finetune/`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/tree/main/finetune) |
| **Open trace** | [`trace/`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/tree/main/trace) — 12 end-to-end turns of the real pipeline |
| **Field Notes** | [`FIELD_NOTES.md`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/blob/main/FIELD_NOTES.md) |

## TL;DR for judges

- **Backyard AI:** a real, specific problem — our ~3-year-old's nonstop stream
  of "why?" questions and activities, and the days we (working-from-home parents) can't give
  her 100% of our creativity. Sofía is a voice companion that chats, tells
  curated stories/songs, and runs small learning activities (counting, colors,
  animals) — **to complement, never replace, parent time**.
- **Off the Grid (local-first):** the fine-tuned `Qwen2.5-7B-Instruct` runs
  **inside this Space** via `transformers` on dynamic ZeroGPU (`@spaces.GPU`).
  No external inference APIs at any point.
- **Well-Tuned (fine-tuned):** [`build-small-hackathon/sofia-qwen2.5-7b`](https://huggingface.co/build-small-hackathon/sofia-qwen2.5-7b)
  — a QLoRA fine-tune trained end-to-end on **Modal** (A10G, 72 steps, loss
  2.51 → 0.14). It teaches **persona, style and safety**, never facts: those
  always come from curated `content/`.
- **Off-Brand (custom UI):** fully custom, voice-first, kid-friendly frontend
  (`frontend/`) served via `gr.Server` — push-to-talk, an animated character
  with moods, inline story/song playback.
- **Sharing is Caring (open trace):** [`trace/`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/tree/main/trace) — 12 real end-to-end
  turns (intent router → curated content → safety guard → fine-tuned LLM),
  including a turn where the model **refuses to invent a story** and offers a
  curated one instead.
- **Best Use of Modal:** the entire QLoRA fine-tune (dataset → training →
  merge → publish) ran on Modal — see [`finetune/`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/tree/main/finetune).
- **Field Notes:** [`FIELD_NOTES.md`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/blob/main/FIELD_NOTES.md) — what we learned
  pivoting from llama.cpp to ZeroGPU, why a small fine-tuned model beats a
  bigger generic one for a 3-year-old, and the per-visitor ZeroGPU quota that
  nearly looked like a bug but wasn't.

## Architecture idea (the most important part)

The LLM is **only the conversational glue**. Facts and content (stories,
songs, activities) come from `content/`, **curated by the parents** — the
model never invents them. This removes most opportunities for hallucination,
reduces the risk of inappropriate content, and is an honest fit for a small
model, which is what this hackathon rewards.

```
frontend/ (custom kid-friendly UI, voice-first, gr.Server)
                                 │
                                 ▼
app.py (gradio.Server / FastAPI)
  ├─ llm/       fine-tuned (QLoRA) Qwen2.5-7B-Instruct, transformers + ZeroGPU (@spaces.GPU)
  ├─ content/   source of truth (curated): stories.json, songs.json, learning.json, activities.json
  ├─ voice/     faster-whisper (STT, push-to-talk) + Kokoro (TTS, CPU)
  ├─ safety/    blocked-topic filter (input and output)
  └─ parental/  activity log + memory (sqlite)
```
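
As a rough illustration of the flow above (intent router, curated content, safety guard, LLM as glue), here is a minimal sketch. It is not the code in `app.py`: every name in it (`STORIES`, `BLOCKED`, `route`, `llm_glue`, `handle`) is hypothetical, the story text and blocked words are made-up placeholders, and the model call is replaced by a stub so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for content/stories.json (the real schema may differ).
STORIES = {
    "la tortuga": "Había una vez una tortuga que caminaba despacito.",
}
# Hypothetical blocked-topic list; the real filter lives in safety/.
BLOCKED = {"arma", "sangre"}
REDIRECT = "Mejor hablemos de otra cosa. ¿Quieres un cuento?"


@dataclass
class Turn:
    intent: str
    reply: str


def is_safe(text: str) -> bool:
    """Return True if no blocked word appears in the text."""
    words = {w.strip(".,;:!?¡¿").lower() for w in text.split()}
    return words.isdisjoint(BLOCKED)


def route(utterance: str) -> str:
    """Toy keyword router: story requests vs. free conversation."""
    return "story" if "cuento" in utterance.lower() else "chat"


def llm_glue(intent: str, curated: Optional[str]) -> str:
    """Stub for the fine-tuned model: it only wraps curated text."""
    if intent == "story" and curated is None:
        # Refuse to invent a story and offer a curated one instead.
        return "Ese cuento no me lo sé. ¿Te cuento el de la tortuga?"
    if intent == "story":
        return f"¡Claro! {curated}"
    return "¡Qué buena pregunta! ¿Tú qué piensas?"


def handle(utterance: str) -> Turn:
    if not is_safe(utterance):  # safety guard on the input
        return Turn("blocked", REDIRECT)
    intent = route(utterance)
    curated = None
    if intent == "story":
        text = utterance.lower()
        curated = next((s for title, s in STORIES.items() if title in text), None)
    reply = llm_glue(intent, curated)
    if not is_safe(reply):  # safety guard on the output
        return Turn("blocked", REDIRECT)
    return Turn(intent, reply)
```

The point of the design is visible in `llm_glue`: the story text always comes from the curated store, and when nothing matches, the reply offers a curated title rather than generating a new story.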

## What Sofía does

- Warm conversation in Spanish, short sentences, one question per turn.
- Push-to-talk (no always-on listening yet) with an animated character that
  changes mood (idle / listening / thinking / playing).
- **Stories** (10), curated, with pre-rendered audio and inline playback.
- **Songs** (6), curated, with inline playback.
- **Learning activities**: numbers, colors and animals (`content/learning.json`),
  plus counting and recognition (`content/activities.json`).
- Sofía changes color on request.
- Memory of recent turns for conversational continuity.
- **Parent panel** (behind a gate): recent activity and structured session
  memory.
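
For readers curious how a parent-panel activity log could sit on top of sqlite, here is a small sketch using only the standard library. The table name, columns, and function names are assumptions made for illustration; the actual schema in `parental/` may look different.

```python
import sqlite3
import time


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the activity database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS activity ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "ts REAL NOT NULL, "
        "kind TEXT NOT NULL, "
        "detail TEXT NOT NULL)"
    )
    return conn


def log_activity(conn: sqlite3.Connection, kind: str, detail: str) -> None:
    """Record one event, e.g. kind='story', detail='la tortuga'."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO activity (ts, kind, detail) VALUES (?, ?, ?)",
            (time.time(), kind, detail),
        )


def recent_activity(conn: sqlite3.Connection, limit: int = 20) -> list:
    """Return the newest events first, as (kind, detail) tuples."""
    rows = conn.execute(
        "SELECT kind, detail FROM activity ORDER BY id DESC LIMIT ?",
        (limit,),
    )
    return rows.fetchall()
```

A parent view would then call `recent_activity(conn)` to list what was played or discussed most recently.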

## Merit mapping

| # | Merit | Status in Sofía |
|---|---|---|
| 1 | **Off the Grid** (local-first) | ✅ Fine-tuned `Qwen2.5-7B-Instruct` runs inside the Space via `transformers` + ZeroGPU (`zero-a10g`), no external APIs. |
| 2 | **Well-Tuned** (fine-tuned) | ✅ QLoRA published at [`build-small-hackathon/sofia-qwen2.5-7b`](https://huggingface.co/build-small-hackathon/sofia-qwen2.5-7b); `MODEL_ID` in `llm/engine.py` points there. Details: [`finetune/MODEL_CARD.md`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/blob/main/finetune/MODEL_CARD.md). |
| 3 | **Off-Brand** (custom UI) | ✅ Custom kid-friendly frontend (`frontend/`), voice-first, served by `gr.Server`. |
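| 4 | **Llama Champion** (llama.cpp) | Not claimed. We pivoted from llama.cpp to ZeroGPU; see [`FIELD_NOTES.md`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/blob/main/FIELD_NOTES.md). |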
| 5 | **Sharing is Caring** (open trace) | ✅ [`trace/`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/tree/main/trace) — 12 end-to-end turns of the real pipeline, with the exact prompt sent to the model at each step. |
| 6 | **Field Notes** | ✅ [`FIELD_NOTES.md`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/blob/main/FIELD_NOTES.md). |

**Bonus — Best Use of Modal:** the entire QLoRA fine-tune for merit 2
(dataset, training, merge, and publishing) ran end-to-end on Modal (A10G). See
[`finetune/`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/tree/main/finetune).

## Running locally

```bash
pip install -r requirements.txt
python app.py   # http://localhost:7860
```

Note: locally `@spaces.GPU` is a no-op, and `.to("cuda")` needs a real GPU with
enough VRAM for Qwen2.5-7B in bf16 (~15GB). The Space (ZeroGPU) provides 48GB
per call. On a smaller local GPU the model may fail to load, and on CPU it is
likely to be very slow.
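
The ~15GB figure follows from simple arithmetic on the weights alone. The sketch below assumes the 7.61B parameter count listed on the upstream Qwen2.5-7B model card and 2 bytes per bf16 parameter; activations and the KV cache need additional memory on top of this.

```python
# Assumption: parameter count as listed on the Qwen2.5-7B model card.
PARAMS = 7.61e9
BYTES_PER_PARAM_BF16 = 2  # bf16 stores each weight in 16 bits


def weights_gb(params: float, bytes_per_param: int) -> float:
    """Memory needed for the weights only, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


bf16_gb = weights_gb(PARAMS, BYTES_PER_PARAM_BF16)
print(f"bf16 weights: {bf16_gb:.1f} GB")
```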

### Running locally with Ollama (no large GPU)

```bash
ollama serve &              # if not already running
ollama pull qwen2.5:7b       # once

pip install ollama
LUMI_LLM_BACKEND=ollama python app.py
```

See `CLAUDE.md` for the full list of environment variables
(`LUMI_LLM_BACKEND`, `LUMI_OLLAMA_MODEL`, `LUMI_SHARE`).
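
A backend switch like this is commonly read from the environment at startup. The sketch below is only a guess at the shape of that logic: the variable names come from this README, but the default backend name `transformers`, the default model tag, and the `llm_config` helper are assumptions, so check `CLAUDE.md` and `llm/engine.py` for the real behavior.

```python
import os
from typing import Mapping, Optional

# Assumption: these are the only two backends; the names are illustrative.
KNOWN_BACKENDS = {"transformers", "ollama"}


def llm_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve the LLM backend settings from environment variables."""
    env = os.environ if env is None else env
    backend = env.get("LUMI_LLM_BACKEND", "transformers").strip().lower()
    if backend not in KNOWN_BACKENDS:
        raise ValueError(f"Unknown LUMI_LLM_BACKEND: {backend!r}")
    config = {"backend": backend}
    if backend == "ollama":
        # Assumed default: the tag pulled in the commands above.
        config["model"] = env.get("LUMI_OLLAMA_MODEL", "qwen2.5:7b")
    return config
```

Passing a plain dict instead of `os.environ` keeps the helper easy to test, for example `llm_config({"LUMI_LLM_BACKEND": "ollama"})`.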

## Model

[`build-small-hackathon/sofia-qwen2.5-7b`](https://huggingface.co/build-small-hackathon/sofia-qwen2.5-7b)
— a QLoRA fine-tune of `Qwen/Qwen2.5-7B-Instruct` (merged, full weights),
trained to always stay in the "Sofía" persona, present curated content
**verbatim** (no paraphrasing or inventing), and refuse/redirect unsafe
topics. Runs on the Space's dynamic GPU (ZeroGPU, ~48GB VRAM per call). Full
details: [`finetune/MODEL_CARD.md`](https://huggingface.co/spaces/build-small-hackathon/sofia-educational-companion/blob/main/finetune/MODEL_CARD.md).

## Honest notes

- STT for a 3-year-old is hard: we use push-to-talk and a limited vocabulary.
- Sofía complements, never replaces, parent time. The parent panel logs
  activity.