NOESIS-Qwen3-0.6B-Darwin-Ekitil-Sozkz-KZ-BF16
DARE-TIES merge of three Kazakh/Russian Qwen3-600M models, producing a stronger KK/RU specialist.
Released as part of the NOESIS Professional Multilingual Dubbing Automation Platform (framework: DHCF-FNO -- Deterministic Hybrid Control Framework for Frozen Neural Operators).
- Founder: Ilia Bolotnikov
- Organization: AMAImedia.com
- X (Twitter): @AMAImediacom
- LinkedIn: Ilia Bolotnikov
- Telegram: @djbionicl
- NOESIS version: v14.7
- Release date: 2026-04
Model summary
| Property | Value |
|---|---|
| Architecture | Qwen3ForCausalLM |
| Parameters | ~600M |
| Hidden size | 1 280 |
| Layers | 28 |
| Vocab size | 64 000 (custom KK/RU tokenizer) |
| Precision | BF16 |
| Disk footprint | ~1.3 GB |
| Merge method | DARE-TIES (RNG seed 1729) |
| Primary language | Kazakh (KK) |
| Secondary language | Russian (RU) |
Source models
| Model | Role | Weight | Density |
|---|---|---|---|
| ekitil-core-qwen3-600m-kkru-base-v1 | Base (foundation) | -- | -- |
| ekitil-qwen3-600m-kk (step 4500) | KK specialist | 0.45 | 0.53 |
| ekitil-qwen3-600m-kkru (step 2000) | KK+RU generalist | 0.35 | 0.53 |
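The base model provides the foundation weights; each task vector is the difference between a fine-tuned checkpoint and that base. DARE randomly drops a fraction (1 - density) of each task vector's entries and rescales the survivors by 1/density. TIES then elects a majority sign per parameter and discards entries that disagree with it before the weighted task vectors are summed into the merged model.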
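The drop, rescale, and sign-election steps can be illustrated with a minimal pure-Python sketch on flattened tensors. This is not the script used to build this model: the function names (`dare`, `sign`, `dare_ties_merge`) are hypothetical, the elected sign is assumed to be the sign of the summed weighted deltas, and no weight normalization is applied after the sum. Real merge tools may differ on each of these points.

```python
import random


def dare(delta, density, rng):
    """DARE step: keep each entry with probability `density`, rescale kept entries by 1/density."""
    return [x / density if rng.random() < density else 0.0 for x in delta]


def sign(x):
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def dare_ties_merge(base, finetuned, weights, densities, seed=1729):
    """Merge one flattened tensor (a list of floats) from several checkpoints into `base`."""
    rng = random.Random(seed)
    deltas = []
    for ft, w, d in zip(finetuned, weights, densities):
        # Task vector: fine-tuned weights minus base weights.
        task_vector = [f - b for f, b in zip(ft, base)]
        deltas.append([w * x for x in dare(task_vector, d, rng)])

    merged = []
    for i, b in enumerate(base):
        column = [delta[i] for delta in deltas]
        # Assumption: the elected sign is the sign of the summed weighted deltas.
        elected = sign(sum(column))
        # Sum only the entries whose sign agrees with the elected sign.
        merged.append(b + sum(x for x in column if elected != 0 and sign(x) == elected))
    return merged


# Toy usage with the weights and densities listed in the source-model table.
base = [0.10, -0.20, 0.30, 0.00]
kk = [0.15, -0.10, 0.25, 0.05]
kkru = [0.05, -0.30, 0.35, 0.02]
merged = dare_ties_merge(base, [kk, kkru], weights=[0.45, 0.35], densities=[0.53, 0.53])
print(merged)
```

With density 1.0 the DARE step keeps every entry unchanged, which makes the sign-election behaviour easy to check by hand on small inputs.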
Why this merge
Each source model captures a different aspect of the Kazakh language:
- ekitil-core -- balanced KK+RU base
- ekitil-kk -- maximally Kazakh-specialized (4 500 training steps)
- ekitil-kkru -- bilingual KK+RU (2 000 training steps)
DARE-TIES combines all three while suppressing conflicting parameter updates, with the goal of better KK coverage than any single source model provides.
How to use
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "AMAImedia/NOESIS-Qwen3-0.6B-Darwin-Ekitil-Sozkz-KZ-BF16"

# Load the custom tokenizer and the BF16 weights.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Build a chat-formatted prompt and sample a reply.
messages = [{"role": "user", "content": "Salem! Qazaq tilinde soylesesik."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
Note: vocab_size=64000 reflects a custom Kazakh tokenizer, not the standard Qwen3 vocabulary of 151 936 tokens. The model is not compatible with standard Qwen3 tokenizer pipelines.
NOESIS context
In NOESIS, this model provides a Kazakh-language domain boost (KK x10 weight multiplier) for the DUB-LM and CHAT specialists during knowledge distillation.
KD note: vocab_size=64000 is incompatible with the NOESIS student vocabulary (151 936). Soft-label extraction requires a custom cross-vocab projection layer.
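To make the vocabulary mismatch concrete, here is a minimal sketch of one possible fixed (non-learned) projection. It is an illustration only, not the NOESIS projection layer: `build_token_map` and `project_probs` are hypothetical names, tokens are matched by exact string equality, and teacher probability mass on unmatched tokens is simply dropped before renormalizing. In practice the two token-to-id dictionaries could come from each tokenizer's `get_vocab()`.

```python
def build_token_map(teacher_vocab, student_vocab):
    """Map teacher token ids to student token ids for tokens with identical strings."""
    return {
        t_id: student_vocab[tok]
        for tok, t_id in teacher_vocab.items()
        if tok in student_vocab
    }


def project_probs(teacher_probs, token_map, student_size):
    """Move teacher probability mass onto matching student ids, then renormalize."""
    out = [0.0] * student_size
    for t_id, s_id in token_map.items():
        out[s_id] += teacher_probs[t_id]
    total = sum(out)
    # If no teacher token matched, return all zeros rather than dividing by zero.
    return [p / total for p in out] if total > 0 else out


# Toy vocabularies (hypothetical): only "a" and "b" exist on both sides.
teacher_vocab = {"a": 0, "b": 1, "c": 2}
student_vocab = {"b": 0, "x": 1, "a": 2, "y": 3}

token_map = build_token_map(teacher_vocab, student_vocab)
student_probs = project_probs([0.5, 0.25, 0.25], token_map, len(student_vocab))
print(student_probs)
```

Dropping unmatched mass is lossy, and exact string matching ignores tokens that one tokenizer splits differently from the other; a learned projection or a sequence-level alignment would address those cases at the cost of more machinery.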
Provenance
A merge_provenance.json file ships alongside the model weights with the full merge trace:
source models, weights, densities, DARE-TIES parameters, and RNG seed.
License
This model is released under the Apache License 2.0, inherited from the upstream Qwen3
base and ekitil model checkpoints. See the LICENSE file for the full license text.
Citation
@misc{noesis_darwin_kz,
title = {NOESIS-Qwen3-0.6B-Darwin-Ekitil-Sozkz-KZ-BF16},
author = {Bolotnikov, Ilia},
year = {2026},
publisher = {AMAImedia},
url = {https://amaimedia.com}
}