---
title: Micro RPG Engine
emoji: 🍄
colorFrom: purple
colorTo: indigo
sdk: gradio
sdk_version: 5.50.0
app_file: app.py
pinned: true
license: apache-2.0
short_description: A whole RPG world generated live by a small 1B-4B model.
tags:
  - small-models-hackathon
  - track:wood
  - thousand-token-wood
  - achievement:offbrand
  - off-brand
  - rpg
  - text-adventure
  - qwen
  - minicpm
---

> **🎥 Demo video:** https://youtu.be/-XfaAcRHH28  •  **📣 Social post:** https://www.linkedin.com/posts/luiz-felipe-barbedo-94188215a_buildsmall-smallmodels-llm-share-7472417718395301889-ZSKJ/

# 🍄 Micro RPG Engine

A text RPG where a **small language model (1B–4B)** generates *everything* in real time: the world, NPCs, dialogue, combat, the shop, random events. There is no pre-written content. **No AI, no game.** Every playthrough is unique.

> Hugging Face Small Models Hackathon, **Track 2**

## The technical bet

The hard part with small models isn't writing pretty prose; it's **narrative consistency**: not forgetting your HP, your inventory, or that you already killed the goblin. A generic "RPG-themed chatbot" loses the plot in three turns.

Our approach makes the **Python engine the source of truth**, not the model:

```
                   ┌─────────────────────────────────────────┐
  player input     │         GameEngine (turn loop)          │
 ───────────────▶  │                                         │
                   │  1. build context from GameState ───────┼──▶  System prompt
                   │  2. call the 1B-4B model                │     + authoritative
                   │  3. parse output ◀──────────────────────┼─────  state snapshot
                   │       ├─ narrative → shown to player    │
                   │       └─ tags      → VALIDATED & applied│
                   │  4. GameState mutates (HP, gold, items) │
                   └─────────────────────────────────────────┘
```

The model never *remembers* the numbers. It receives them, fresh, every turn, and may only *propose* deltas (`HP: -10`, `ITEM_ADD: Rusty Sword`) through a strict tag protocol. The parser clamps and validates every change against the real state. The model handles imagination; Python handles bookkeeping. That's what keeps a 1.5B model coherent across a long dungeon crawl.

## Run locally

```bash
pip install -r requirements.txt
python app.py
```

By default it loads the model locally with `transformers`. To run without a local GPU, set a Hugging Face token and switch the backend to the serverless Inference API:

```powershell
# Windows PowerShell
$env:HF_TOKEN = "hf_..."
$env:MICRORPG_BACKEND = "inference_api"
python app.py
```

## Configuration (env vars)

| Variable              | Default                       | Meaning                                     |
|-----------------------|-------------------------------|---------------------------------------------|
| `MICRORPG_MODEL`      | `Qwen/Qwen3-4B-Instruct-2507` | Model repo id                               |
| `MICRORPG_BACKEND`    | `transformers`                | `transformers` \| `inference_api` \| `mock` |
| `HF_TOKEN`            | (none)                        | Token for the Inference API backend         |
| `MICRORPG_MAX_TOKENS` | `512`                         | Max new tokens per turn                     |

Set `MICRORPG_BACKEND=mock` to run the full engine with a deterministic fake model (no weights, no network), which is handy for testing the parser and UI.

## Fine-tuning (the "Well-Tuned" quest)

The hard skill for a small model here is emitting the strict three-block tag format with valid mechanics, every turn. We teach it with a **parser-validated synthetic dataset**: `build_dataset.py` generates RPG turns in the exact protocol, then runs **every single one through the real engine parser** and keeps only those that parse and apply cleanly. Every training example is therefore well-formed by construction.
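The validate-then-keep idea described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `State`, `apply_deltas`, `keep_valid`, and the `GOLD` / `ITEM_REMOVE` tags are hypothetical stand-ins, and the real `engine/parser.py` and `finetune/build_dataset.py` may use a different tag grammar and different rules. Only the `HP: -10` and `ITEM_ADD: Rusty Sword` forms come from this README.

```python
import re
from dataclasses import dataclass, field


# Hypothetical stand-in for the real GameState in engine/game_state.py.
@dataclass
class State:
    hp: int = 30
    max_hp: int = 30
    gold: int = 10
    items: list = field(default_factory=list)


# Assumed tag grammar: one "KEY: value" pair per line.
TAG_RE = re.compile(r"^(HP|GOLD|ITEM_ADD|ITEM_REMOVE):\s*(.+)$")


def apply_deltas(state, mechanics):
    """Apply proposed deltas to `state`.

    Returns False as soon as a line is malformed or contradicts the state.
    The state may be partially mutated on failure, so callers that need
    atomicity should pass in a copy.
    """
    for raw in mechanics.strip().splitlines():
        match = TAG_RE.match(raw.strip())
        if match is None:
            return False  # unknown tag or broken syntax
        key, value = match.group(1), match.group(2).strip()
        if key in ("HP", "GOLD"):
            try:
                delta = int(value)
            except ValueError:
                return False  # the model proposed a non-numeric delta
            if key == "HP":
                # Clamp rather than reject: HP stays within [0, max_hp].
                state.hp = max(0, min(state.max_hp, state.hp + delta))
            elif state.gold + delta < 0:
                return False  # cannot spend gold the player does not have
            else:
                state.gold += delta
        elif key == "ITEM_ADD":
            state.items.append(value)
        else:  # ITEM_REMOVE
            if value not in state.items:
                return False  # cannot remove an item that was never held
            state.items.remove(value)
    return True


def keep_valid(samples):
    """Keep only samples whose mechanics apply cleanly to a fresh state."""
    return [s for s in samples if apply_deltas(State(), s)]


candidates = [
    "HP: -10\nITEM_ADD: Rusty Sword",  # well-formed
    "HP: minus ten",                   # not an integer
    "ITEM_REMOVE: Dragon Egg",         # item the player never had
    "HP: +999\nGOLD: 5",               # valid; HP is clamped to max_hp
]
valid = keep_valid(candidates)
print(len(valid), "of", len(candidates), "kept")  # prints: 2 of 4 kept
```

Because the same function both validates generated samples and applies deltas at play time, the training data cannot drift from what the engine accepts. That is the point of filtering through the parser instead of a separate format checker.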
```bash
pip install -r requirements-train.txt          # GPU / Colab
python -m finetune.build_dataset --n 1200      # offline, no model needed
python -m finetune.train \
    --model Qwen/Qwen3-4B-Instruct-2507 \
    --out finetune/out/qwen3-4b-microrpg       # LoRA, ~few MB adapter
```

Play with your fine-tuned model by pointing the engine at the adapter:

```powershell
# Windows PowerShell
$env:MICRORPG_ADAPTER = "finetune/out/qwen3-4b-microrpg"
python app.py
```

The dataset is model-agnostic: swap `--model` for MiniCPM, or a Llama for the **Llama Champion** quest. Add `--load-4bit` for QLoRA on a small GPU.

## Project layout

```
app.py                  Gradio UI + glue
style.css               Custom theme (parchment / arcane)
engine/
  game_state.py         GameState: HP, gold, inventory, location, NPCs, quest log
  prompts.py            System prompt + the tag protocol the model must follow
  llm.py                Model backends (transformers / inference API / mock)
  parser.py             Splits narrative from mechanics, validates deltas
  engine.py             GameEngine: the turn loop
finetune/
  build_dataset.py      Parser-validated synthetic turns → train.jsonl / eval.jsonl
  train.py              LoRA SFT (TRL/PEFT); produces a small adapter
tests/
  test_parser.py        Parser/engine smoke tests (run with mock backend)
```

## License

Apache-2.0.