NOESIS-Qwen3-0.6B-Darwin-Ekitil-Sozkz-KZ-BF16

DARE-TIES merge of three Kazakh/Russian Qwen3-600M models, producing a stronger KK/RU specialist.

Released as part of the NOESIS Professional Multilingual Dubbing Automation Platform (framework: DHCF-FNO -- Deterministic Hybrid Control Framework for Frozen Neural Operators).


Model summary

| Property | Value |
| --- | --- |
| Architecture | Qwen3ForCausalLM |
| Parameters | ~600M |
| Hidden size | 1 280 |
| Layers | 28 |
| Vocab size | 64 000 (custom KK/RU tokenizer) |
| Precision | BF16 |
| Disk footprint | ~1.3 GB |
| Merge method | DARE-TIES (RNG seed 1729) |
| Primary language | Kazakh (KK) |
| Secondary language | Russian (RU) |

Source models

| Model | Role | Weight | Density |
| --- | --- | --- | --- |
| ekitil-core-qwen3-600m-kkru-base-v1 | Base (foundation) | -- | -- |
| ekitil-qwen3-600m-kk (step 4500) | KK specialist | 0.45 | 0.53 |
| ekitil-qwen3-600m-kkru (step 2000) | KK+RU generalist | 0.35 | 0.53 |

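The base model provides the foundation architecture and the reference weights from which each task vector (fine-tuned weights minus base weights) is computed. DARE randomly drops a fraction (1 - density) of each task vector's entries and rescales the survivors by 1/density; with density 0.53, roughly 47% of the entries are dropped. TIES then elects a majority sign per parameter and sums only the entries that agree with it into the merged model.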
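
The drop, rescale, sign-election and summation steps can be illustrated with a minimal pure-Python sketch on flat lists of floats. This is not the script used to build this model: the function names (`dare`, `dare_ties_merge`) are hypothetical, the sign is elected from the weighted sum of the pruned task vectors, and any normalization by the sum of weights is omitted. The toy inputs are made up; only the weights, density and seed are taken from the tables above.

```python
import random


def dare(delta, density, rng):
    """Keep each entry with probability `density`; rescale survivors by 1/density."""
    return [d / density if rng.random() < density else 0.0 for d in delta]


def sign(x):
    """Return -1, 0 or 1."""
    return (x > 0) - (x < 0)


def dare_ties_merge(base, finetuned, weights, densities, seed=1729):
    """Simplified DARE-TIES merge over flat parameter lists (illustrative only)."""
    rng = random.Random(seed)
    deltas = []
    for ft, w, density in zip(finetuned, weights, densities):
        # Task vector: fine-tuned weights minus base weights.
        delta = [f - b for f, b in zip(ft, base)]
        # DARE pruning, then scaling by the merge weight.
        deltas.append([w * d for d in dare(delta, density, rng)])

    merged = []
    for i, b in enumerate(base):
        column = [delta[i] for delta in deltas]
        # TIES: elect the sign carrying the larger total (weighted) mass.
        elected = sign(sum(column))
        # Sum only the entries that agree with the elected sign.
        merged.append(b + sum(c for c in column if sign(c) == elected))
    return merged


# Toy example with made-up 4-parameter "models".
base = [0.10, -0.20, 0.30, 0.00]
kk = [0.30, -0.10, 0.10, 0.40]
kkru = [0.20, -0.40, 0.50, -0.10]
print(dare_ties_merge(base, [kk, kkru], weights=[0.45, 0.35], densities=[0.53, 0.53]))
```

Because DARE is random, the result depends on the seed and on the order in which random numbers are drawn, so this sketch will not reproduce the published weights even with seed 1729.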


Why this merge

Each source model captures a different aspect of the Kazakh language:

  • ekitil-core -- balanced KK+RU base
  • ekitil-kk -- maximally Kazakh-specialized (4 500 training steps)
  • ekitil-kkru -- bilingual KK+RU (2 000 training steps)

DARE-TIES combines all three while suppressing conflicting parameter updates, with the goal of better KK coverage than any single source model provides.


How to use

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "AMAImedia/NOESIS-Qwen3-0.6B-Darwin-Ekitil-Sozkz-KZ-BF16"

# Load the custom 64 000-token KK/RU tokenizer from the model repository.
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Load the weights in BF16; device_map="auto" requires the accelerate package.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Build a single-turn chat prompt and append the generation prefix.
messages = [{"role": "user", "content": "Salem! Qazaq tilinde soylesesik."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
# Sample up to 256 new tokens.
out = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))

Note: vocab_size=64000. The model uses a custom Kazakh tokenizer, not the standard Qwen3 vocabulary of 151 936 tokens, so it is not compatible with standard Qwen3 tokenizer pipelines.
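
A pipeline can guard against pairing the wrong tokenizer with this model by comparing sizes before generation. The helper below is a hypothetical sketch, not part of the model repository; it only catches the case where a tokenizer can emit ids beyond the embedding matrix, and it cannot detect two different vocabularies that happen to have the same size.

```python
def check_vocab_match(tokenizer_size: int, embedding_rows: int) -> None:
    """Raise if the tokenizer can emit ids that the embedding matrix cannot index."""
    if tokenizer_size > embedding_rows:
        raise ValueError(
            f"Tokenizer has {tokenizer_size} tokens but the model embeds only "
            f"{embedding_rows}; load the tokenizer that ships with the model."
        )


# With real objects, the two sizes could be obtained as:
#   check_vocab_match(len(tokenizer), model.get_input_embeddings().num_embeddings)
check_vocab_match(64_000, 64_000)  # the model's own tokenizer: passes silently
```

Calling `check_vocab_match(151_936, 64_000)`, the standard Qwen3 size against this model, raises `ValueError`.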


NOESIS context

In NOESIS, this model provides a Kazakh-language domain boost (KK x10 weight multiplier) for the DUB-LM and CHAT specialists during knowledge distillation.

KD note: vocab_size=64000 is incompatible with the NOESIS student vocabulary (151 936). Soft-label extraction therefore requires a custom cross-vocab projection layer.
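
The card does not describe how the NOESIS projection layer works. As one possible illustration only, the sketch below projects a teacher probability vector onto a student vocabulary by exact token-string match and renormalizes the result. The function names and toy vocabularies are hypothetical, and teacher probability mass on tokens without a match is simply discarded, which a real projection layer would probably handle more carefully.

```python
def build_token_map(teacher_vocab, student_vocab):
    """Map teacher ids to student ids for tokens whose strings match exactly."""
    return {
        teacher_id: student_vocab[token]
        for token, teacher_id in teacher_vocab.items()
        if token in student_vocab
    }


def project_probs(teacher_probs, token_map, student_vocab_size):
    """Move teacher probability mass onto student ids, then renormalize."""
    student = [0.0] * student_vocab_size
    for teacher_id, student_id in token_map.items():
        student[student_id] += teacher_probs[teacher_id]
    total = sum(student)
    if total == 0.0:
        raise ValueError("No teacher probability mass falls on shared tokens.")
    return [p / total for p in student]


# Toy vocabularies (token string -> id); real ones come from the tokenizers.
teacher_vocab = {"a": 0, "b": 1, "c": 2}
student_vocab = {"b": 0, "x": 1, "a": 2, "y": 3}
token_map = build_token_map(teacher_vocab, student_vocab)
print(project_probs([0.5, 0.25, 0.25], token_map, len(student_vocab)))
```

With real tokenizers, the two dictionaries could be taken from `tokenizer.get_vocab()` on the teacher and student sides.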


Provenance

A merge_provenance.json file ships alongside the model weights with the full merge trace: source models, weights, densities, DARE-TIES parameters, and RNG seed.
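
The JSON schema of the provenance file is not documented here, so the sketch below assumes no key names: it loads the file and lists whatever top-level fields it finds. The helper names `load_provenance` and `describe` are hypothetical.

```python
import json
from pathlib import Path


def load_provenance(path):
    """Parse a merge provenance file and return its contents."""
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)


def describe(record):
    """Return one 'key: type' line per top-level field, sorted by key."""
    if not isinstance(record, dict):
        return [f"(top level): {type(record).__name__}"]
    return [f"{key}: {type(value).__name__}" for key, value in sorted(record.items())]


# Usage, assuming the file has been downloaded next to the script:
#   for line in describe(load_provenance("merge_provenance.json")):
#       print(line)
```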


License

This model is released under the Apache License 2.0, inherited from the upstream Qwen3 base and ekitil model checkpoints. See the LICENSE file for the full license text.


Citation

@misc{noesis_darwin_kz,
  title     = {NOESIS-Qwen3-0.6B-Darwin-Ekitil-Sozkz-KZ-BF16},
  author    = {Bolotnikov, Ilia},
  year      = {2026},
  publisher = {AMAImedia},
  url       = {https://amaimedia.com}
}