Instructions to use lifeart/danetki-qwen3-0.6b with libraries, inference providers, notebooks, and local apps.

Libraries

How to use lifeart/danetki-qwen3-0.6b with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="lifeart/danetki-qwen3-0.6b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("lifeart/danetki-qwen3-0.6b")
model = AutoModelForCausalLM.from_pretrained("lifeart/danetki-qwen3-0.6b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use lifeart/danetki-qwen3-0.6b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "lifeart/danetki-qwen3-0.6b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "lifeart/danetki-qwen3-0.6b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/lifeart/danetki-qwen3-0.6b

SGLang

How to use lifeart/danetki-qwen3-0.6b with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "lifeart/danetki-qwen3-0.6b" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "lifeart/danetki-qwen3-0.6b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "lifeart/danetki-qwen3-0.6b" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "lifeart/danetki-qwen3-0.6b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use lifeart/danetki-qwen3-0.6b with Docker Model Runner:
```
docker model run hf.co/lifeart/danetki-qwen3-0.6b
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

Danetki Qwen3-0.6B — Fine-tuned for "Данетки" (lateral thinking puzzles)

A Qwen3-0.6B model fine-tuned with QLoRA for playing Данетки — a Russian lateral thinking puzzle game. The host describes a mysterious situation and players ask yes/no questions to figure out what happened. This model acts as the host, answering player questions.

Try it in your browser — runs entirely client-side via WebGPU, no server required.

Task

Given a puzzle context (condition + answer) and a player's question, the model responds with exactly one of three labels:

Response	Meaning
Да	Yes
Нет	No
Не важно	Irrelevant
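Generated text may carry stray whitespace or punctuation, so it can help to normalize it back to one of the three labels. The sketch below is one way to do that; `LABELS` and `parse_label` are hypothetical names introduced here for illustration and are not part of the model or of any library.

```python
from typing import Optional

# Canonical labels keyed by their lowercase form (assumed normalization scheme).
LABELS = {"да": "Да", "нет": "Нет", "не важно": "Не важно"}


def parse_label(raw: str) -> Optional[str]:
    """Map raw generated text to a canonical label, or None if it is not one."""
    # Trim whitespace and trailing punctuation, lowercase, collapse inner spaces.
    cleaned = " ".join(raw.strip().strip(".!?\"'").lower().split())
    return LABELS.get(cleaned)


print(parse_label(" Нет.\n"))
print(parse_label("не  важно"))
print(parse_label("maybe"))
```

Returning `None` for anything outside the label set lets the caller decide whether to retry or to treat the answer as invalid.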

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "lifeart/danetki-qwen3-0.6b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Override chat template to standard chatml (no <think> tags)
CHAT_TEMPLATE = (
    "{%- for message in messages %}"
    "{{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>\\n' }}"
    "{%- endfor %}"
    "{%- if add_generation_prompt %}"
    "{{- '<|im_start|>assistant\\n' }}"
    "{%- endif %}"
)
tokenizer.chat_template = CHAT_TEMPLATE

messages = [
    {
        "role": "system",
        "content": "Ты ведущий игры Данетки. Тебе дан контекст и вопрос игрока. Отвечай ТОЛЬКО одним словом: Да, Нет, или Не важно."
    },
    {
        "role": "user",
        "content": (
            "Контекст: Человек зашёл в бар и попросил стакан воды. "
            "Бармен достал пистолет и направил на него. "
            "Человек поблагодарил и ушёл.\n\n"
            "Разгадка: У человека была икота, и бармен испугал его, "
            "чтобы она прошла.\n\n"
            "Вопрос: Человек хотел пить воду?"
        )
    }
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=10, temperature=0.1, do_sample=True)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)  # expected: "Нет" (sampling is enabled, so output is not strictly deterministic)

Important: Override tokenizer.chat_template with the standard ChatML template (as shown above). The model was trained with this template; Qwen3's default template includes <think> tags, which are incompatible with it.

Input Format

System Prompt

Ты ведущий игры Данетки. Тебе дан контекст и вопрос игрока. Отвечай ТОЛЬКО одним словом: Да, Нет, или Не важно.

User Message

Контекст: {condition}

Разгадка: {answer}

Вопрос: {question}
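The system prompt and user message format above can be assembled with a small helper. This is a minimal sketch: `SYSTEM_PROMPT` and `build_messages` are hypothetical names, and the blank-line separators between fields are assumed to match the Quick Start example.

```python
# System prompt copied verbatim from the "System Prompt" section.
SYSTEM_PROMPT = (
    "Ты ведущий игры Данетки. Тебе дан контекст и вопрос игрока. "
    "Отвечай ТОЛЬКО одним словом: Да, Нет, или Не важно."
)


def build_messages(condition: str, answer: str, question: str) -> list:
    """Build the chat messages for one player question about one puzzle."""
    user = (
        f"Контекст: {condition}\n\n"
        f"Разгадка: {answer}\n\n"
        f"Вопрос: {question}"
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user},
    ]


messages = build_messages(
    condition="Человек зашёл в бар и попросил стакан воды.",
    answer="У человека была икота.",
    question="Человек хотел пить воду?",
)
print(messages[1]["content"])
```

The resulting list can be passed to `tokenizer.apply_chat_template` in the same way as the `messages` list in the Quick Start.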

Model Details

Property	Value
Base model	Qwen/Qwen3-0.6B
Fine-tuning method	QLoRA (4-bit quantized base + LoRA adapters, merged)
LoRA rank (r)	16
LoRA alpha	32
LoRA dropout	0.15
LoRA targets	q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Language	Russian
License	Apache 2.0
Task	Classification — Да / Нет / Не важно
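To give a feel for what the LoRA settings in the table imply, the sketch below computes the standard LoRA scaling factor (alpha / r) and the number of extra parameters an adapter adds to a single projection matrix. The layer dimensions are hypothetical placeholders chosen for illustration, not values read from the model config.

```python
LORA_R = 16       # LoRA rank from the table
LORA_ALPHA = 32   # LoRA alpha from the table


def lora_extra_params(d_in: int, d_out: int, r: int = LORA_R) -> int:
    """Parameters added by one adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# In standard LoRA the low-rank update is multiplied by alpha / r.
scaling = LORA_ALPHA / LORA_R
print("scaling:", scaling)

# Hypothetical square projection of width 1024 (an assumption for illustration).
print("extra params:", lora_extra_params(1024, 1024))
```

With these settings the update is scaled by 2.0, and each adapted matrix gains 16 parameters per input and output dimension.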

Training Details

Training Data

~183K examples (train) / ~28K (val) from:

Scraped Данетки Q&A pairs — question-answer pairs parsed from puzzle threads on danetka.ru, where real players asked questions and hosts responded. Responses were classified into Да/Нет/Не важно categories.
DaNetQA reformatted to Данетки style — the Russian DaNetQA reading comprehension benchmark, reformatted into the puzzle host chat format.
Context perturbation augmentation — augmented examples with perturbed contexts to improve robustness.

Data Quality

Rebalanced class distribution: ~36% Да / ~36% Нет / ~28% Не важно
Cleaned Не важно (НВ) labels: removed uncertainty responses ("не знаю" / "I don't know", "некорректно" / "incorrect") that had been misclassified as irrelevant
Removed cross-puzzle negatives (they caused label noise)
Stratified Нет downsampling to match the Да count
НВ oversampling (target 25%)
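The last two steps can be sketched roughly as follows. This is not the author's pipeline: the `rebalance` function, the `label` field, and the use of plain random sampling (rather than stratified sampling) are all assumptions made for illustration.

```python
import random
from collections import Counter


def rebalance(examples, nv_target=0.25, seed=0):
    """Downsample Нет to the Да count, then oversample Не важно to a target share."""
    rng = random.Random(seed)
    by_label = {"Да": [], "Нет": [], "Не важно": []}
    for ex in examples:
        by_label[ex["label"]].append(ex)
    yes, no, nv = by_label["Да"], by_label["Нет"], by_label["Не важно"]

    # Downsample Нет so it does not exceed the number of Да examples.
    if len(no) > len(yes):
        no = rng.sample(no, len(yes))

    # Oversample Не важно (with replacement) until it makes up nv_target of the total.
    want_nv = round((len(yes) + len(no)) * nv_target / (1 - nv_target))
    if nv and len(nv) < want_nv:
        nv = nv + rng.choices(nv, k=want_nv - len(nv))

    out = yes + no + nv
    rng.shuffle(out)
    return out


data = [{"label": "Да"}] * 30 + [{"label": "Нет"}] * 60 + [{"label": "Не важно"}] * 5
print(Counter(ex["label"] for ex in rebalance(data)))
```

In this toy input, Нет is cut from 60 to 30 and Не важно is grown from 5 to 20, which is 25% of the 80 resulting examples.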

Training Configuration

Parameter	Value
Learning rate	2e-5
LR schedule	Cosine with 10% warmup
Weight decay	0.05
Label smoothing	0.05
Precision	bf16
Batch size	8 (per device) x 2 (gradient accumulation) = 16 effective
Max sequence length	512
Loss	Completion-only with prompt/completion format
Chat template	Standard chatml (no `<think>` tags)
Early stopping	Patience 5, best checkpoint by eval_loss
Hardware	NVIDIA L40S (HF Jobs)
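For readers unfamiliar with the schedule named in the table, the sketch below computes a cosine learning-rate curve with 10% warmup and a peak of 2e-5. It assumes linear warmup from zero and decay to zero, which is a common convention but is not confirmed by the table; `lr_at` is a hypothetical helper name.

```python
import math

PEAK_LR = 2e-5        # learning rate from the table
WARMUP_RATIO = 0.10   # 10% warmup from the table


def lr_at(step: int, total_steps: int) -> float:
    """Learning rate at a given optimizer step (assumed linear warmup, cosine decay to 0)."""
    warmup_steps = int(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak.
        return PEAK_LR * step / max(1, warmup_steps)
    # Cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 50, 100, 550, 1000):
    print(step, lr_at(step, total_steps=1000))
```

Each step here is one optimizer update, that is, 16 examples at the effective batch size listed above.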

Benchmark Results

Evaluated on 190 test questions across 25 puzzles (greedy decoding):

Configuration	Correct	Accuracy	Macro F1
Baseline (no facts)	138/190	73%	71.7%
With facts injection	144/190	76%	74.6%
Delta	+6	+3pp	+2.9pp

Per-class metrics (with facts):

Class	Precision	Recall	F1
Да	71.8%	91.4%	80.4%
Нет	77.8%	59.3%	67.3%
Не важно	83.3%	70.0%	76.1%
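The per-class F1 values and the macro F1 reported for the facts-injection configuration can be re-derived from the precision and recall columns. The sketch below does this with the numbers from the table; the variable and function names are illustrative.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %) per class, copied from the table above.
per_class = {
    "Да": (71.8, 91.4),
    "Нет": (77.8, 59.3),
    "Не важно": (83.3, 70.0),
}

f1_scores = {label: f1(p, r) for label, (p, r) in per_class.items()}
macro_f1 = sum(f1_scores.values()) / len(f1_scores)

for label, score in f1_scores.items():
    print(f"{label}: {score:.1f}%")
print(f"Macro F1: {macro_f1:.1f}%")   # about 74.6%
print(f"Accuracy: {144 / 190:.1%}")   # 144 of 190 correct
```

Macro F1 weights the three classes equally, so the weaker Нет class pulls it below the overall accuracy.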

Known Limitations

Bias toward "Да": recall is 91.4% for Да versus 59.3% for Нет and 70.0% for Не важно
Only responds in Russian
Quality depends on how well the puzzle context covers the player's question

Framework Versions

PEFT 0.13.0+
TRL 0.12.0+
Transformers 4.45.0+
PyTorch 2.1.0+

Also Available As

Web demo — play the full game in your browser (WebGPU)
MLC quantized model (q4f16_1) — for in-browser inference via @mlc-ai/web-llm

Citation

@misc{danetki-qwen3-0.6b,
  title={Danetki Qwen3-0.6B: Fine-tuned for Russian Lateral Thinking Puzzles},
  author={lifeart},
  year={2025},
  url={https://huggingface.co/lifeart/danetki-qwen3-0.6b}
}

Safetensors
Model size: 0.8B params
Tensor type: BF16

Model tree for lifeart/danetki-qwen3-0.6b
Base model: Qwen/Qwen3-0.6B-Base
Finetuned: Qwen/Qwen3-0.6B
Finetuned (1006): this model
Finetunes: 1 model
Quantizations: 1 model