Danetki Qwen3-0.6B — Fine-tuned for "Данетки" (lateral thinking puzzles)
A Qwen3-0.6B model fine-tuned with QLoRA for playing Данетки — a Russian lateral thinking puzzle game. The host describes a mysterious situation and players ask yes/no questions to figure out what happened. This model acts as the host, answering player questions.
Try it in your browser — runs entirely client-side via WebGPU, no server required.
Task
Given a puzzle context (condition + answer) and a player's question, respond with exactly one of three answers:
| Response | Meaning |
|---|---|
| Да | Yes |
| Нет | No |
| Не важно | Irrelevant |
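Because the output space is just these three labels, generated text can be mapped back onto them before scoring or display. The helper below is a minimal sketch of that step; the name `normalize_response` and the punctuation it strips are illustrative assumptions, not part of the model's own tooling.

```python
from typing import Optional

# The three labels the host model is expected to emit.
LABELS = ("Да", "Нет", "Не важно")


def normalize_response(raw: str) -> Optional[str]:
    """Map raw generated text to one of LABELS, or None if nothing matches.

    Hypothetical helper: collapses whitespace, strips surrounding
    punctuation, and compares case-insensitively.
    """
    cleaned = " ".join(raw.split()).strip(".!?,\"'«» ").casefold()
    for label in LABELS:
        if cleaned == label.casefold():
            return label
    return None


print(normalize_response(" Нет.\n"))
print(normalize_response("не  важно"))
```

Returning `None` for anything else makes off-format generations easy to count instead of silently coercing them into a class.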
Quick Start
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "lifeart/danetki-qwen3-0.6b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
# Override chat template to standard chatml (no <think> tags)
CHAT_TEMPLATE = (
"{%- for message in messages %}"
"{{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>\\n' }}"
"{%- endfor %}"
"{%- if add_generation_prompt %}"
"{{- '<|im_start|>assistant\\n' }}"
"{%- endif %}"
)
tokenizer.chat_template = CHAT_TEMPLATE
messages = [
    {
        "role": "system",
        "content": "Ты ведущий игры Данетки. Тебе дан контекст и вопрос игрока. Отвечай ТОЛЬКО одним словом: Да, Нет, или Не важно."
    },
    {
        "role": "user",
        "content": (
            "Контекст: Человек зашёл в бар и попросил стакан воды. "
            "Бармен достал пистолет и направил на него. "
            "Человек поблагодарил и ушёл.\n\n"
            "Разгадка: У человека была икота, и бармен испугал его, "
            "чтобы она прошла.\n\n"
            "Вопрос: Человек хотел пить воду?"
        )
    }
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=10, temperature=0.1, do_sample=True)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response) # "Нет"
Important: Override `tokenizer.chat_template` with standard ChatML (as shown above). The model was trained with this template; Qwen3's default template includes `<think>` tags, which are incompatible with it.
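To see what the overridden template produces, it can help to render a conversation by hand. The function below is a pure-Python sketch that mirrors the Jinja template from the Quick Start; `render_chatml` is a hypothetical name used only for inspection, and real inference should still go through `tokenizer.apply_chat_template`.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Mirror of the ChatML Jinja template from Quick Start (inspection only)."""
    parts = []
    for message in messages:
        # Each turn is wrapped as <|im_start|>role\ncontent<|im_end|>\n
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open the assistant turn so the model continues with its answer.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


print(render_chatml([{"role": "user", "content": "Вопрос: Человек хотел пить воду?"}]))
```

Note that the rendered prompt contains no `<think>` block, so the first generated tokens are the answer itself.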
Input Format
System Prompt
Ты ведущий игры Данетки. Тебе дан контекст и вопрос игрока. Отвечай ТОЛЬКО одним словом: Да, Нет, или Не важно.
User Message
Контекст: {condition}
Разгадка: {answer}
Вопрос: {question}
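The two pieces above can be assembled programmatically. The sketch below assumes the three fields are separated by blank lines, as in the Quick Start example; `build_messages` is an illustrative helper name. The system prompt is kept in Russian because that is the string the model was trained on (in English: "You are the host of the Danetki game. You are given a context and a player's question. Answer with ONLY one word: Yes, No, or Irrelevant.").

```python
SYSTEM_PROMPT = (
    "Ты ведущий игры Данетки. Тебе дан контекст и вопрос игрока. "
    "Отвечай ТОЛЬКО одним словом: Да, Нет, или Не важно."
)


def build_messages(condition: str, answer: str, question: str) -> list:
    """Build the system + user messages for one player question.

    Hypothetical helper; the blank-line separators follow the Quick Start code.
    """
    user_content = (
        f"Контекст: {condition}\n\n"
        f"Разгадка: {answer}\n\n"
        f"Вопрос: {question}"
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_content},
    ]


messages = build_messages(
    condition="Человек зашёл в бар и попросил стакан воды.",
    answer="У человека была икота.",
    question="Человек хотел пить воду?",
)
print(messages[1]["content"])
```

The returned list can be passed directly to `tokenizer.apply_chat_template`.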
Model Details
| Property | Value |
|---|---|
| Base model | Qwen/Qwen3-0.6B |
| Fine-tuning method | QLoRA (4-bit quantized base + LoRA adapters, merged) |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.15 |
| LoRA targets | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Language | Russian |
| License | Apache 2.0 |
| Task | Classification — Да / Нет / Не важно |
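As a rough sense of scale for the adapter settings in the table, the following sketch counts the LoRA weights implied by rank 16 on the seven target projections. The layer dimensions are assumptions taken from the publicly documented Qwen3-0.6B configuration, not from this card, and should be checked against the model's `config.json`.

```python
# Assumed Qwen3-0.6B dimensions (NOT stated in this card; verify in config.json).
HIDDEN = 1024
Q_OUT = 16 * 128   # attention heads * head_dim
KV_OUT = 8 * 128   # key/value heads * head_dim
INTERMEDIATE = 3072
NUM_LAYERS = 28

# Values from the Model Details table.
RANK = 16
ALPHA = 32

# (in_features, out_features) of each targeted projection in one decoder layer.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, Q_OUT),
    "k_proj": (HIDDEN, KV_OUT),
    "v_proj": (HIDDEN, KV_OUT),
    "o_proj": (Q_OUT, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds A (rank x in) and B (out x rank): rank * (in + out) weights."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in TARGET_SHAPES.values())
total = per_layer * NUM_LAYERS
scaling = ALPHA / RANK  # factor applied to the low-rank update

print(f"adapter weights per layer: {per_layer:,}")
print(f"adapter weights total:     {total:,}")
print(f"LoRA scaling (alpha / r):  {scaling}")
```

Under those assumed dimensions the adapters come to roughly 10 million weights. Since the published checkpoint is merged, they are no longer separate tensors in the released model.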
Training Details
Training Data
~183K examples (train) / ~28K (val) from:
- Scraped Данетки Q&A pairs — question-answer pairs parsed from puzzle threads on danetka.ru, where real players asked questions and hosts responded. Responses were classified into Да/Нет/Не важно categories.
- DaNetQA reformatted to Данетки style — the Russian DaNetQA reading comprehension benchmark, reformatted into the puzzle host chat format.
- Context perturbation augmentation — augmented examples with perturbed contexts to improve robustness.
Data Quality
- Rebalanced class distribution: ~36% Да / ~36% Нет / ~28% Не важно
- Cleaned Не важно (НВ) labels: removed uncertainty responses ("не знаю" / "I don't know", "некорректно" / "incorrect") that had been misclassified as irrelevant
- Removed cross-puzzle negatives (caused label noise)
- Stratified Нет downsampling to match Да count
- Не важно (НВ) oversampling (target 25%)
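The last two steps can be sketched as follows. This is a simplified illustration, not the author's pipeline: it uses plain random downsampling rather than stratified sampling, oversamples with replacement, and the `rebalance` name and `(text, label)` example format are assumptions.

```python
import random
from collections import defaultdict


def rebalance(examples, nv_target=0.25, seed=0):
    """Rebalance (text, label) pairs; returns a new shuffled list.

    Hypothetical sketch: random (not stratified) Нет downsampling,
    then Не важно oversampling with replacement.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for example in examples:
        by_label[example[1]].append(example)

    yes, no, nv = by_label["Да"], by_label["Нет"], by_label["Не важно"]

    # Downsample Нет so it does not exceed the Да count.
    if len(no) > len(yes):
        no = rng.sample(no, len(yes))

    # Solve n_nv / (n_nv + n_rest) = nv_target for n_nv.
    n_rest = len(yes) + len(no)
    wanted = round(nv_target * n_rest / (1 - nv_target))
    if nv and len(nv) < wanted:
        nv = nv + rng.choices(nv, k=wanted - len(nv))

    balanced = yes + no + nv
    rng.shuffle(balanced)
    return balanced


data = [("q", "Да")] * 30 + [("q", "Нет")] * 60 + [("q", "Не важно")] * 5
balanced = rebalance(data)
print(len(balanced))
```

Oversampling with replacement duplicates examples, so it is usually applied to the training split only.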
Training Configuration
| Parameter | Value |
|---|---|
| Learning rate | 2e-5 |
| LR schedule | Cosine with 10% warmup |
| Weight decay | 0.05 |
| Label smoothing | 0.05 |
| Precision | bf16 |
| Batch size | 8 (per device) x 2 (gradient accumulation) = 16 effective |
| Max sequence length | 512 |
| Loss | Completion-only with prompt/completion format |
| Chat template | Standard chatml (no <think> tags) |
| Early stopping | Patience 5, best checkpoint by eval_loss |
| Hardware | NVIDIA L40S (HF Jobs) |
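For readers unfamiliar with the schedule named in the table, here is a minimal pure-Python sketch of a cosine learning-rate schedule with linear warmup, using the peak rate and warmup fraction listed above. It follows the common half-cosine form; the exact scheduler implementation used in training is not stated in this card, and `lr_at` is an illustrative name.

```python
import math


def lr_at(step, total_steps, peak_lr=2e-5, warmup_ratio=0.10):
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Effective batch size from the table: per-device batch * gradient accumulation.
effective_batch = 8 * 2

for step in (0, 50, 100, 550, 1000):
    print(step, lr_at(step, total_steps=1000))
```

The `total_steps=1000` value is only for illustration; the real step count depends on the number of epochs, which the card does not state.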
Benchmark Results
Evaluated on 190 test questions across 25 puzzles (greedy decoding):
| Configuration | Correct | Accuracy | Macro F1 |
|---|---|---|---|
| Baseline (no facts) | 138/190 | 73% | 71.7% |
| With facts injection | 144/190 | 76% | 74.6% |
| Delta | +6 | +3pp | +2.9pp |
Per-class metrics (with facts):
| Class | Precision | Recall | F1 |
|---|---|---|---|
| Да | 71.8% | 91.4% | 80.4% |
| Нет | 77.8% | 59.3% | 67.3% |
| Не важно | 83.3% | 70.0% | 76.1% |
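The F1 and macro F1 figures follow directly from the precision and recall columns. The short check below recomputes them from the rounded values in the table, so small rounding differences against the original evaluation are possible.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall) per class, copied from the table above.
per_class = {
    "Да": (0.718, 0.914),
    "Нет": (0.778, 0.593),
    "Не важно": (0.833, 0.700),
}

f1_scores = {label: f1(p, r) for label, (p, r) in per_class.items()}

# Macro F1 is the unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1_scores.values()) / len(f1_scores)

# Accuracy with facts injection: 144 correct out of 190.
accuracy = 144 / 190

for label, score in f1_scores.items():
    print(f"{label}: F1 = {score:.1%}")
print(f"Macro F1 = {macro_f1:.1%}, accuracy = {accuracy:.0%}")
```

Because macro F1 weights all three classes equally, the lower Нет recall pulls it below the overall accuracy.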
Known Limitations
- Slight "Да" bias: with facts injection, recall is 91.4% for Да versus 59.3% for Нет and 70.0% for Не важно
- Only responds in Russian
- Quality depends on how well the puzzle context covers the player's question
Framework Versions
- PEFT 0.13.0+
- TRL 0.12.0+
- Transformers 4.45.0+
- PyTorch 2.1.0+
Also Available As
- Web demo — play the full game in your browser (WebGPU)
- MLC quantized model (q4f16_1) — for in-browser inference via @mlc-ai/web-llm
Citation
@misc{danetki-qwen3-0.6b,
  title={Danetki Qwen3-0.6B: Fine-tuned for Russian Lateral Thinking Puzzles},
  author={lifeart},
  year={2025},
  url={https://huggingface.co/lifeart/danetki-qwen3-0.6b}
}