---
title: Micro RPG Engine
emoji: 🍄
colorFrom: purple
colorTo: indigo
sdk: gradio
sdk_version: 5.50.0
app_file: app.py
pinned: true
license: apache-2.0
short_description: A whole RPG world generated live by a small 1B-4B model.
tags:
  - small-models-hackathon
  - track:wood
  - thousand-token-wood
  - achievement:offbrand
  - off-brand
  - rpg
  - text-adventure
  - qwen
  - minicpm
---

<!-- SUBMISSION:
  Demo video: https://youtu.be/-XfaAcRHH28
  Social post: https://www.linkedin.com/posts/luiz-felipe-barbedo-94188215a_buildsmall-smallmodels-llm-share-7472417718395301889-ZSKJ/
  Track: Thousand Token Wood (entertainment/whimsical)
  ⚠️ Confirm the exact slug of the track tag in the build-small-hackathon org template.
-->

> **🎥 Demo video:** https://youtu.be/-XfaAcRHH28 &nbsp;•&nbsp; **📣 Social post:** https://www.linkedin.com/posts/luiz-felipe-barbedo-94188215a_buildsmall-smallmodels-llm-share-7472417718395301889-ZSKJ/

# 🍄 Micro RPG Engine

A text RPG where a **small language model (1B–4B)** generates *everything* in real
time — the world, NPCs, dialogue, combat, the shop, random events. There is no
pre-written content. **No AI, no game.** Every playthrough is unique.

> Hugging Face Small Models Hackathon — **Track 2**

## The technical bet

The hard part with small models isn't writing pretty prose. It's **narrative
consistency**: remembering your HP, your inventory, and the goblin you already
killed. A generic "RPG-themed chatbot" loses the plot in three turns.

Our approach makes the **Python engine the source of truth**, not the model:

```
                 ┌─────────────────────────────────────────┐
   player input  │  GameEngine (turn loop)                  │
  ───────────────▶                                          │
                 │  1. build context from GameState  ───────┼──▶  System prompt
                 │  2. call the 1B-4B model                  │     + authoritative
                 │  3. parse output  ◀──────────────────────┼─────  state snapshot
                 │     ├─ <narrative> → shown to player      │
                 │     └─ <state> tags → VALIDATED & applied │
                 │  4. GameState mutates (HP, gold, items)   │
                 └─────────────────────────────────────────┘
```
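Steps 1 and 2 of the diagram can be sketched as follows. This is a minimal illustration, not the project's code: the `State` class, `build_context`, `run_turn`, and the snapshot wording are hypothetical stand-ins for what `GameState` and `GameEngine` do, and the model is any callable that maps a prompt string to text.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class State:
    """Hypothetical stand-in for the real GameState."""
    hp: int = 30
    max_hp: int = 30
    gold: int = 5
    items: list[str] = field(default_factory=list)


def build_context(state: State) -> str:
    """Step 1: serialize the authoritative state into a prompt snapshot."""
    items = ", ".join(state.items) or "none"
    return f"[STATE] HP {state.hp}/{state.max_hp} | Gold {state.gold} | Items: {items}"


def run_turn(state: State, player_input: str, model: Callable[[str], str]) -> str:
    """Step 2: call the model with a snapshot rebuilt from scratch this turn."""
    prompt = f"{build_context(state)}\nPlayer: {player_input}"
    return model(prompt)
```

Because the snapshot is rebuilt from `State` on every call, the prompt always reflects the engine's numbers, never what the model said on a previous turn.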

The model never *remembers* the numbers — it receives them, fresh, every turn, and
may only *propose* deltas (`HP: -10`, `ITEM_ADD: Rusty Sword`) through a strict tag
protocol. The parser clamps and validates every change against the real state. The
model handles imagination; Python handles bookkeeping. That's what keeps a 1.5B
model coherent across a long dungeon crawl.
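The validation step could look roughly like the sketch below. Only the `<narrative>` and `<state>` tags and the `HP` and `ITEM_ADD` keys come from this README; the closing-tag form, the `GOLD` and `ITEM_REMOVE` keys, the `State` class, and the specific clamping rules are assumptions made for illustration, and the real `engine/parser.py` may differ.

```python
import re
from dataclasses import dataclass, field


@dataclass
class State:
    """Hypothetical stand-in for the real GameState."""
    hp: int = 30
    max_hp: int = 30
    gold: int = 5
    items: list[str] = field(default_factory=list)


STATE_BLOCK = re.compile(r"<state>(.*?)</state>", re.DOTALL)


def apply_deltas(state: State, model_output: str) -> list[str]:
    """Apply the deltas the model proposed; return the lines that were rejected."""
    rejected: list[str] = []
    match = STATE_BLOCK.search(model_output)
    if match is None:
        return rejected  # no mechanics proposed this turn
    for raw in match.group(1).splitlines():
        line = raw.strip()
        if not line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip().upper(), value.strip()
        try:
            if key == "HP":
                # Clamp to the legal range instead of trusting the model.
                state.hp = max(0, min(state.max_hp, state.hp + int(value)))
            elif key == "GOLD":
                if state.gold + int(value) < 0:
                    raise ValueError("cannot spend gold the player lacks")
                state.gold += int(value)
            elif key == "ITEM_ADD" and value:
                state.items.append(value)
            elif key == "ITEM_REMOVE":
                state.items.remove(value)  # raises ValueError if not owned
            else:
                raise ValueError("unknown or empty tag")
        except ValueError:
            rejected.append(line)
    return rejected
```

In this sketch a bad delta is dropped rather than aborting the turn, so one hallucinated line does not cost the player the narrative that came with it.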

## Run locally

```bash
pip install -r requirements.txt
python app.py
```

By default the app loads the model locally with `transformers`. To run without a
local GPU, set a Hugging Face token and switch the backend to the serverless
Inference API:

```powershell
# Windows PowerShell
$env:HF_TOKEN = "hf_..."
$env:MICRORPG_BACKEND = "inference_api"
python app.py
```

## Configuration (env vars)

| Variable              | Default                       | Meaning                                     |
|-----------------------|-------------------------------|---------------------------------------------|
| `MICRORPG_MODEL`      | `Qwen/Qwen3-4B-Instruct-2507` | Model repo id                               |
| `MICRORPG_BACKEND`    | `transformers`                | `transformers` \| `inference_api` \| `mock` |
| `HF_TOKEN`            | (none)                        | Token for the Inference API backend         |
| `MICRORPG_MAX_TOKENS` | `512`                         | Max new tokens per turn                     |

Set `MICRORPG_BACKEND=mock` to run the full engine with a deterministic fake model
(no weights, no network) — handy for testing the parser and UI.
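For reference, here is one way such settings could be read and checked at startup. The variable names and defaults are taken from the table above; the `Settings` class, the `load_settings` helper, and the rule that `inference_api` refuses to start without a token are assumptions of this sketch, not a description of how `app.py` actually does it.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

VALID_BACKENDS = ("transformers", "inference_api", "mock")


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the env-var configuration."""
    model: str
    backend: str
    hf_token: Optional[str]
    max_tokens: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the configuration from `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    backend = env.get("MICRORPG_BACKEND", "transformers")
    if backend not in VALID_BACKENDS:
        raise ValueError(f"MICRORPG_BACKEND must be one of {VALID_BACKENDS}")
    token = env.get("HF_TOKEN") or None
    # Assumption: fail early rather than at the first network call.
    if backend == "inference_api" and token is None:
        raise ValueError("HF_TOKEN is required for the inference_api backend")
    return Settings(
        model=env.get("MICRORPG_MODEL", "Qwen/Qwen3-4B-Instruct-2507"),
        backend=backend,
        hf_token=token,
        max_tokens=int(env.get("MICRORPG_MAX_TOKENS", "512")),
    )
```

Passing a plain dict instead of `os.environ` keeps the helper easy to test, for example `load_settings({"MICRORPG_BACKEND": "mock"})`.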

## Fine-tuning (the "Well-Tuned" quest)

The hard skill for a small model here is emitting the strict three-block tag format
with valid mechanics on every turn. We teach it with a **parser-validated synthetic
dataset**: `build_dataset.py` generates RPG turns in the exact protocol, runs
**every single one through the real engine parser**, and keeps only those that parse
and apply cleanly. Every example that reaches the training set is therefore
well-formed by construction.
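The filtering idea can be illustrated with a toy version. Everything here is a simplified assumption: `toy_validate` is a regex stand-in for the real engine parser and checks only two blocks, and the `prompt` / `completion` field names are illustrative rather than the actual schema written by `build_dataset.py`.

```python
import json
import re
from typing import Callable, Iterable

# Simplified shape check: a narrative block followed by a state block.
TURN = re.compile(r"^<narrative>.+</narrative>\s*<state>.*</state>$", re.DOTALL)


def toy_validate(completion: str) -> bool:
    """Hypothetical stand-in for running a turn through the engine parser."""
    return TURN.match(completion.strip()) is not None


def filter_turns(
    candidates: Iterable[dict], validate: Callable[[str], bool]
) -> list[dict]:
    """Keep only the candidate turns whose completion passes validation."""
    return [row for row in candidates if validate(row["completion"])]


def to_jsonl(rows: Iterable[dict]) -> str:
    """Serialize rows as JSON Lines, one training example per line."""
    return "\n".join(json.dumps(row, ensure_ascii=False) for row in rows)
```

The point of the design is that the validator used to filter training data is the same code path that judges the model at play time, so the model is never trained on a turn the engine would reject.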

```bash
pip install -r requirements-train.txt          # GPU / Colab
python -m finetune.build_dataset --n 1200       # offline, no model needed
python -m finetune.train \
    --model Qwen/Qwen3-4B-Instruct-2507 \
    --out finetune/out/qwen3-4b-microrpg        # LoRA; the adapter is only a few MB
```

Play with your fine-tuned model by pointing the engine at the adapter:

```powershell
# Windows PowerShell
$env:MICRORPG_ADAPTER = "finetune/out/qwen3-4b-microrpg"
python app.py
```

The dataset is model-agnostic: point `--model` at a MiniCPM checkpoint, or at a
Llama model for the **Llama Champion** quest. Add `--load-4bit` to train with QLoRA
on a small GPU.

## Project layout

```
app.py              Gradio UI + glue
style.css           Custom theme (parchment / arcane)
engine/
  game_state.py     GameState: HP, gold, inventory, location, NPCs, quest log
  prompts.py        System prompt + the tag protocol the model must follow
  llm.py            Model backends (transformers / inference API / mock)
  parser.py         Splits narrative from mechanics, validates deltas
  engine.py         GameEngine: the turn loop
finetune/
  build_dataset.py  Parser-validated synthetic turns → train.jsonl / eval.jsonl
  train.py          LoRA SFT (TRL/PEFT); produces a small adapter
tests/
  test_parser.py    Parser/engine smoke tests (run with mock backend)
```

## License

Apache-2.0.