NOESIS-Qwen3-0.6B-Darwin-Ekitil-Sozkz-KZ-BF16

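DARE-TIES merge of three Kazakh/Russian (KK/RU) Qwen3 checkpoints of roughly 600M parameters each (one base, one KK specialist, and one KK+RU generalist), producing a single KK/RU specialist model.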

Released as part of the NOESIS Professional Multilingual Dubbing Automation Platform (framework: DHCF-FNO -- Deterministic Hybrid Control Framework for Frozen Neural Operators).

Founder: Ilia Bolotnikov
Organization: AMAImedia.com
X (Twitter): @AMAImediacom
LinkedIn: Ilia Bolotnikov
Telegram: @djbionicl
NOESIS version: v14.7
Release date: 2026-04

Model summary

Property	Value
Architecture	`Qwen3ForCausalLM`
Parameters	~600M
Hidden size	1 280
Layers	28
Vocab size	64 000 (custom KK/RU tokenizer)
Precision	BF16
Disk footprint	~1.3 GB
Merge method	DARE-TIES (RNG seed 1729)
Primary language	Kazakh (KK)
Secondary language	Russian (RU)

Source models

Model	Role	Weight	Density
`ekitil-core-qwen3-600m-kkru-base-v1`	Base (foundation)	--	--
`ekitil-qwen3-600m-kk` (step 4500)	KK specialist	0.45	0.53
`ekitil-qwen3-600m-kkru` (step 2000)	KK+RU generalist	0.35	0.53

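The base model supplies the reference weights; each task vector is the difference between a source model and this base. DARE randomly drops a fraction (1 - density) of each task vector's entries and rescales the survivors by 1/density, so at density 0.53 roughly 47% of the entries are zeroed. TIES then elects a majority sign per parameter and sums only the entries that agree with it into the merged model.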
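The procedure described above can be sketched on toy data. This is a minimal illustration in pure Python, not the pipeline used to build this model: the function names (`dare`, `ties_sum`, `dare_ties`) and the 4-parameter "models" are hypothetical, real merges operate on full tensors, and production implementations may add steps (such as weight normalization) that are omitted here. Only the weights, densities, and seed value are taken from this card, and Python's `random` module will not reproduce the actual merge mask.

```python
import random


def dare(task_vector, density, rng):
    """DARE: keep each entry with probability `density`, rescale kept entries by 1/density."""
    return [v / density if rng.random() < density else 0.0 for v in task_vector]


def ties_sum(base, deltas, weights):
    """TIES-style step: elect one sign per parameter, add only the agreeing weighted deltas."""
    merged = []
    for i, b in enumerate(base):
        contribs = [w * d[i] for w, d in zip(weights, deltas)]
        total = sum(contribs)
        sign = (total > 0) - (total < 0)  # elected sign: +1, -1, or 0 on an exact tie
        merged.append(b + sum(c for c in contribs if c * sign > 0))
    return merged


def dare_ties(base, models, weights, densities, seed=1729):
    """Build task vectors against `base`, sparsify each with DARE, then combine with TIES."""
    rng = random.Random(seed)
    deltas = []
    for model, density in zip(models, densities):
        task_vector = [m - b for m, b in zip(model, base)]
        deltas.append(dare(task_vector, density, rng))
    return ties_sum(base, deltas, weights)


# Toy 4-parameter "models" (made-up values); weights and densities follow the table above.
base = [0.0, 0.0, 0.0, 0.0]
kk = [0.2, -0.1, 0.4, 0.0]
kkru = [0.1, 0.3, -0.2, 0.5]
merged = dare_ties(base, [kk, kkru], weights=[0.45, 0.35], densities=[0.53, 0.53])
print(merged)
```

Where the two task vectors disagree in sign for a parameter, only the side that wins the weighted vote contributes, which is how conflicting updates are suppressed.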

Why this merge

Each source model captures a different aspect of the Kazakh language:

ekitil-core -- balanced KK+RU base
ekitil-kk -- maximally Kazakh-specialized (4 500 training steps)
ekitil-kkru -- bilingual KK+RU (2 000 training steps)

DARE-TIES combines all three while suppressing conflicting parameter updates, with the goal of better KK coverage than any single source model.

How to use

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "AMAImedia/NOESIS-Qwen3-0.6B-Darwin-Ekitil-Sozkz-KZ-BF16"

# Load the custom 64 000-token tokenizer that ships with this repository.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # weights are stored in BF16
    device_map="auto",           # requires the `accelerate` package
)

# Transliterated Kazakh prompt, roughly: "Hello! Let's talk in Kazakh."
messages = [{"role": "user", "content": "Salem! Qazaq tilinde soylesesik."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))

Note: vocab_size=64000 -- this model uses a custom Kazakh tokenizer, not the standard Qwen3 vocabulary (151 936 entries). Load the tokenizer from this repository; standard Qwen3 tokenizer pipelines are not compatible.
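A quick guard against pairing the model with the wrong tokenizer might look like the sketch below. The helper name `check_vocab_match` is hypothetical; it relies only on `len(tokenizer)` and `model.config.vocab_size`, both of which are available on Hugging Face tokenizers and models. A tokenizer smaller than the embedding table is tolerated, since embedding tables are sometimes padded, but a larger one would produce out-of-range token ids.

```python
def check_vocab_match(tokenizer, model):
    """Raise if the tokenizer can emit ids beyond the model's embedding table."""
    tok_size = len(tokenizer)
    model_size = model.config.vocab_size
    if tok_size > model_size:
        raise ValueError(
            f"tokenizer has {tok_size} tokens but the model expects at most "
            f"{model_size}; load the tokenizer from the same repository as the model"
        )
    return tok_size, model_size
```

Calling `check_vocab_match(tokenizer, model)` right after loading both objects should fail fast if a standard Qwen3 tokenizer was loaded by mistake.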

NOESIS context

In NOESIS, this model provides a Kazakh-language domain boost (KK x10 weight multiplier) for the DUB-LM and CHAT specialists during knowledge distillation.

KD note: vocab_size=64000 is incompatible with NOESIS student vocab (151 936). Soft label extraction requires a custom cross-vocab projection layer.
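To make the vocabulary mismatch concrete, here is one simple way such a projection could work; it is an assumption for illustration, not the NOESIS projection layer, which this card does not describe. The sketch maps teacher tokens to student tokens by exact string match and renormalizes the probability mass that survives. The function names and the toy vocabularies are hypothetical.

```python
def build_token_map(teacher_vocab, student_vocab):
    """Map teacher token ids to student token ids for tokens with identical strings."""
    return {
        t_id: student_vocab[tok]
        for tok, t_id in teacher_vocab.items()
        if tok in student_vocab
    }


def project_probs(teacher_probs, token_map, student_vocab_size):
    """Move teacher probability mass onto matching student ids, then renormalize.

    Returns the projected distribution and the fraction of teacher mass that
    had a matching student token (a rough measure of how lossy the mapping is).
    """
    student_probs = [0.0] * student_vocab_size
    for t_id, p in enumerate(teacher_probs):
        s_id = token_map.get(t_id)
        if s_id is not None:
            student_probs[s_id] += p
    covered = sum(student_probs)
    if covered == 0.0:
        raise ValueError("no shared tokens carry probability mass")
    return [p / covered for p in student_probs], covered


# Toy vocabularies: only "a" and "c" exist on both sides.
teacher_vocab = {"a": 0, "b": 1, "c": 2}
student_vocab = {"z": 0, "c": 1, "a": 5}
token_map = build_token_map(teacher_vocab, student_vocab)
probs, covered = project_probs([0.5, 0.25, 0.25], token_map, student_vocab_size=6)
print(covered, probs)
```

Exact matching discards the mass on teacher-only tokens, so a low `covered` value signals that a learned or subword-aware projection would be needed instead.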

Provenance

A merge_provenance.json file ships alongside the model weights with the full merge trace: source models, weights, densities, DARE-TIES parameters, and RNG seed.
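Since the schema of the file is not documented here, the sketch below inspects it without assuming any key names: it prints an outline of keys and value types. The helper names `outline` and `print_provenance` are hypothetical; `hf_hub_download` is the standard `huggingface_hub` download function and needs network access.

```python
import json


def outline(obj, depth=0, max_depth=2):
    """Return indented 'key: type' lines for nested dicts, without assuming a schema."""
    lines = []
    if isinstance(obj, dict) and depth < max_depth:
        for key, value in obj.items():
            lines.append("  " * depth + f"{key}: {type(value).__name__}")
            lines.extend(outline(value, depth + 1, max_depth))
    return lines


def print_provenance(repo_id, filename="merge_provenance.json"):
    """Download the provenance file from the Hub and print its key outline."""
    from huggingface_hub import hf_hub_download  # imported lazily; needs network access

    path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(path, encoding="utf-8") as fh:
        print("\n".join(outline(json.load(fh))))
```

Calling `print_provenance("AMAImedia/NOESIS-Qwen3-0.6B-Darwin-Ekitil-Sozkz-KZ-BF16")` would fetch the file and list its top two levels of keys.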

License

This model is released under the Apache License 2.0, inherited from the upstream Qwen3 base and ekitil model checkpoints. See the LICENSE file for the full license text.

Citation

@misc{noesis_darwin_kz,
  title     = {NOESIS-Qwen3-0.6B-Darwin-Ekitil-Sozkz-KZ-BF16},
  author    = {Bolotnikov, Ilia},
  year      = {2026},
  publisher = {AMAImedia},
  url       = {https://amaimedia.com}
}
