---
base_model: lifeart/danetki-qwen3-0.6b
language:
  - ru
license: apache-2.0
library_name: mlc-llm
tags:
  - danetki
  - lateral-thinking
  - russian
  - qwen3
  - qlora
  - sft
  - mlc-llm
  - webgpu
  - in-browser
pipeline_tag: text-generation
---

# Danetki MLC — Qwen3-0.6B for "Данетки" (WebGPU / in-browser)

This is the **MLC/WebGPU quantized** version of [lifeart/danetki-qwen3-0.6b](https://huggingface.co/lifeart/danetki-qwen3-0.6b) — a Qwen3-0.6B model fine-tuned with QLoRA for playing [Данетки](https://en.wikipedia.org/wiki/Lateral_thinking_puzzle) (Russian lateral thinking puzzles).

The model is compiled with [MLC-LLM](https://github.com/mlc-ai/mlc-llm) for **in-browser inference via WebGPU** using the [`@mlc-ai/web-llm`](https://github.com/mlc-ai/web-llm) library. It runs entirely client-side with no server required.

## Model Details

| Property | Value |
|---|---|
| **Source model** | [lifeart/danetki-qwen3-0.6b](https://huggingface.co/lifeart/danetki-qwen3-0.6b) |
| **Base model** | [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B) |
| **Fine-tuning method** | QLoRA (r=16, alpha=32, dropout=0.15) |
| **Quantization** | q4f16_1 (4-bit weights, 16-bit activations) |
| **Runtime** | MLC-LLM WebGPU WASM |
| **Context length** | 4096 tokens |
| **Language** | Russian |
| **Task** | Classification (Да / Нет / Не важно) |

## Task

Given a puzzle context (condition + answer) and a player's question, respond with **exactly one word**:

- **Да** (Yes)
- **Нет** (No)
- **Не важно** (Irrelevant)

### System Prompt

```
Ты ведущий игры Данетки. Тебе дан контекст и вопрос игрока. Отвечай ТОЛЬКО одним словом: Да, Нет, или Не важно.
```

### Input Format

```
Контекст: {condition}

Разгадка: {answer}

Вопрос: {question}
```

## Usage with web-llm

```typescript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

const engine = await CreateMLCEngine("danetki-qwen3-0.6B-q4f16_1", {
  appConfig: {
    model_list: [{
      model: "https://huggingface.co/lifeart/danetki-mlc/resolve/main/",
      model_id: "danetki-qwen3-0.6B-q4f16_1",
      model_lib:
        "https://huggingface.co/lifeart/danetki-mlc/resolve/main/Qwen3-0.6B-q4f16_1-ctx4k_cs1k-webgpu.wasm",
    }],
  },
});

const reply = await engine.chat.completions.create({
  messages: [
    {
      role: "system",
      content:
        "Ты ведущий игры Данетки. Тебе дан контекст и вопрос игрока. Отвечай ТОЛЬКО одним словом: Да, Нет, или Не важно.",
    },
    {
      role: "user",
      content:
        "Контекст: Человек зашёл в бар и попросил стакан воды. Бармен достал пистолет и направил на него. Человек поблагодарил и ушёл.\n\nРазгадка: У человека была икота, и бармен испугал его, чтобы она прошла.\n\nВопрос: Человек хотел пить воду?",
    },
  ],
  temperature: 0.1,
  max_tokens: 10,
});

console.log(reply.choices[0].message.content); // "Нет"
```

## Training Details

See the full training details on the [source model card](https://huggingface.co/lifeart/danetki-qwen3-0.6b).

Key highlights:
- **~183K training examples** from scraped Данетки Q&A, DaNetQA reformatted, and context perturbation augmentation
- **Rebalanced class distribution**: ~36% Да / ~36% Нет / ~28% Не важно
- Cleaned NV labels: removed uncertainty responses ("не знаю", "некорректно") misclassified as irrelevant
- **Completion-only loss** with prompt/completion format — only assistant classification tokens contribute to the gradient
- Explicit chatml template override (no `<think>` tags) to match MLC production inference
- Label smoothing (0.05) for regularization
- LR 2e-5, cosine schedule, bf16, effective batch size 16, early stopping (patience=5), best checkpoint by eval_loss

### Benchmark

190 test questions across 25 puzzles (variant B — with facts for long answers):

| Class | Precision | Recall | F1 |
|-------|-----------|--------|-----|
| Да | 71.8% | 91.4% | 80.4% |
| Нет | 77.8% | 59.3% | 67.3% |
| НВ | 83.3% | 70.0% | 76.1% |
| **Overall** | | **75.8%** | **74.6%** |

## Model Files

This repository contains the MLC-LLM compiled model:

- `params_shard_*.bin` — quantized model weight shards (q4f16_1)
- `Qwen3-0.6B-q4f16_1-ctx4k_cs1k-webgpu.wasm` — WebGPU WASM runtime
- `mlc-chat-config.json` — MLC-LLM configuration
- `tokenizer.json`, `tokenizer_config.json` — tokenizer files

## Limitations

- Only responds in Russian
- Limited to three response categories: Да / Нет / Не важно
- The model has a slight "Да" bias (recall 91% vs 64% for other classes)
- Quality depends on how well the puzzle context covers the player's question
- May produce incorrect answers for ambiguous or edge-case questions
- Requires a WebGPU-capable browser for in-browser inference