---
base_model: lifeart/danetki-qwen3-0.6b
language:
- ru
license: apache-2.0
library_name: mlc-llm
tags:
- danetki
- lateral-thinking
- russian
- qwen3
- qlora
- sft
- mlc-llm
- webgpu
- in-browser
pipeline_tag: text-generation
---

# Danetki MLC: Qwen3-0.6B for "Данетки" (WebGPU / in-browser)

This is the **MLC/WebGPU quantized** version of [lifeart/danetki-qwen3-0.6b](https://huggingface.co/lifeart/danetki-qwen3-0.6b), a Qwen3-0.6B model fine-tuned with QLoRA for playing [Данетки](https://en.wikipedia.org/wiki/Lateral_thinking_puzzle) (Russian lateral thinking puzzles).

The model is compiled with [MLC-LLM](https://github.com/mlc-ai/mlc-llm) for **in-browser inference via WebGPU** using the [`@mlc-ai/web-llm`](https://github.com/mlc-ai/web-llm) library. It runs entirely client-side with no server required.

## Model Details

| Property | Value |
|---|---|
| **Source model** | [lifeart/danetki-qwen3-0.6b](https://huggingface.co/lifeart/danetki-qwen3-0.6b) |
| **Base model** | [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B) |
| **Fine-tuning method** | QLoRA (r=16, alpha=32, dropout=0.15) |
| **Quantization** | q4f16_1 (4-bit weights, 16-bit activations) |
| **Runtime** | MLC-LLM WebGPU WASM |
| **Context length** | 4096 tokens |
| **Language** | Russian |
| **Task** | Classification (Да / Нет / Не важно) |

## Task

Given a puzzle context (condition + answer) and a player's question, respond with **exactly one word**:

- **Да** (Yes)
- **Нет** (No)
- **Не важно** (Irrelevant)

### System Prompt

```
Ты ведущий игры Данетки. Тебе дан контекст и вопрос игрока. Отвечай ТОЛЬКО одним словом: Да, Нет, или Не важно.
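```

In English: "You are the host of a Danetki game. You are given the context and a player's question. Answer with ONLY one word: Да, Нет, or Не важно." The prompt itself must be sent in Russian, exactly as shown.

### Input Format

```
Контекст: {condition}

Разгадка: {answer}

Вопрос: {question}
```

The field labels stay in Russian: `Контекст` (context, the puzzle condition), `Разгадка` (solution), and `Вопрос` (the player's question). The fields are separated by blank lines, as in the usage example.

## Usage with web-llm

```typescript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Register the custom model: weights and the compiled WASM library
// are both fetched from this repository.
const engine = await CreateMLCEngine("danetki-qwen3-0.6B-q4f16_1", {
  appConfig: {
    model_list: [{
      model: "https://huggingface.co/lifeart/danetki-mlc/resolve/main/",
      model_id: "danetki-qwen3-0.6B-q4f16_1",
      model_lib: "https://huggingface.co/lifeart/danetki-mlc/resolve/main/Qwen3-0.6B-q4f16_1-ctx4k_cs1k-webgpu.wasm",
    }],
  },
});

const reply = await engine.chat.completions.create({
  messages: [
    {
      role: "system",
      content: "Ты ведущий игры Данетки. Тебе дан контекст и вопрос игрока. Отвечай ТОЛЬКО одним словом: Да, Нет, или Не важно.",
    },
    {
      role: "user",
      // Context, solution, and question, separated by blank lines.
      content: "Контекст: Человек зашёл в бар и попросил стакан воды. Бармен достал пистолет и направил на него. Человек поблагодарил и ушёл.\n\nРазгадка: У человека была икота, и бармен испугал его, чтобы она прошла.\n\nВопрос: Человек хотел пить воду?",
    },
  ],
  // Low temperature and a small token budget suit a one-word classification.
  temperature: 0.1,
  max_tokens: 10,
});

console.log(reply.choices[0].message.content); // "Нет"
```

## Training Details

See the full training details on the [source model card](https://huggingface.co/lifeart/danetki-qwen3-0.6b).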

Key highlights:

- **~183K training examples** from scraped Данетки Q&A, reformatted DaNetQA, and context-perturbation augmentation
- **Rebalanced class distribution**: ~36% Да / ~36% Нет / ~28% Не важно
- Cleaned "Не важно" (НВ) labels: removed uncertainty responses ("не знаю", "некорректно") that had been misclassified as irrelevant
- **Completion-only loss** with prompt/completion format: only assistant classification tokens contribute to the gradient
- Explicit chatml template override (no `<think>` tags) to match MLC production inference
- Label smoothing (0.05) for regularization
- LR 2e-5, cosine schedule, bf16, effective batch size 16, early stopping (patience=5), best checkpoint by eval_loss

### Benchmark

190 test questions across 25 puzzles (variant B, with facts for long answers):

| Class | Precision | Recall | F1 |
|-------|-----------|--------|-----|
| Да | 71.8% | 91.4% | 80.4% |
| Нет | 77.8% | 59.3% | 67.3% |
| НВ | 83.3% | 70.0% | 76.1% |
| **Overall** | | **75.8%** | **74.6%** |

## Model Files

This repository contains the MLC-LLM compiled model:

- `params_shard_*.bin`: quantized model weight shards (q4f16_1)
- `Qwen3-0.6B-q4f16_1-ctx4k_cs1k-webgpu.wasm`: WebGPU WASM runtime
- `mlc-chat-config.json`: MLC-LLM configuration
- `tokenizer.json`, `tokenizer_config.json`: tokenizer files

## Limitations

- Only responds in Russian
- Limited to three response categories: Да / Нет / Не важно
- The model is biased toward "Да" (recall 91.4% for Да vs 59.3% for Нет and 70.0% for Не важно)
- Quality depends on how well the puzzle context covers the player's question
- May produce incorrect answers for ambiguous or edge-case questions
- Requires a WebGPU-capable browser for in-browser inference
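
Some of these limitations can be guarded against in application code. The sketch below is one possible approach, not part of this repository or of web-llm: `buildUserPrompt`, `normalizeVerdict`, and `hasWebGPU` are hypothetical helper names. It assumes the input format described above, treats any reply that does not start with one of the three labels as unusable, and relies on `navigator.gpu` being the WebGPU entry point exposed by supporting browsers.

```typescript
// Hypothetical helpers for illustration; these names are not defined
// by the model card or by @mlc-ai/web-llm.
type Verdict = "Да" | "Нет" | "Не важно";

// Build the user message in the documented input format
// (fields separated by blank lines, as in the usage example).
function buildUserPrompt(condition: string, answer: string, question: string): string {
  return `Контекст: ${condition}\n\nРазгадка: ${answer}\n\nВопрос: ${question}`;
}

// Map raw model output onto one of the three labels.
// Returns null when the text starts with none of them, so the caller
// can retry or show a fallback instead of trusting a malformed reply.
function normalizeVerdict(raw: string): Verdict | null {
  const text = raw.trim().toLowerCase().replace(/[.!"«»]/g, "");
  if (text.startsWith("да")) return "Да";
  if (text.startsWith("нет")) return "Нет";
  if (text.startsWith("не важно") || text.startsWith("неважно")) return "Не важно";
  return null;
}

// Feature-detect WebGPU before loading the model. Reading navigator via
// globalThis keeps the check safe in environments that have no navigator.
function hasWebGPU(): boolean {
  const nav = (globalThis as { navigator?: object }).navigator;
  return nav !== undefined && "gpu" in nav;
}
```

A caller might check `hasWebGPU()` before calling `CreateMLCEngine`, pass the result of `buildUserPrompt(...)` as the user message, and run `normalizeVerdict(reply.choices[0].message.content ?? "")` on the reply. The prefix match is deliberately loose (for example, it would accept "даже"), so a stricter exact match may be preferable where false positives matter.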