qwen3-4b-uzbek-v2-bnb-4bit

bitsandbytes nf4 4-bit quant (~3.4 gb) of inspirebek/qwen3-4b-uzbek-v2. nvidia gpu only; easiest hf-native 4-bit load.

usage

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
tok = AutoTokenizer.from_pretrained("inspirebek/qwen3-4b-uzbek-v2-bnb-4bit")
model = AutoModelForCausalLM.from_pretrained(
    "inspirebek/qwen3-4b-uzbek-v2-bnb-4bit",
    quantization_config=bnb,
    device_map="auto",
)

quantization

  • method: bitsandbytes nf4 (4-bit normalfloat)
  • double quantization: enabled
  • compute dtype: bfloat16

datasets

stage a — fluency (continued pretraining):

stage b — instruct (sft):

⚠️ licensing note: saillab/alpaca_uzbek_taco is cc-by-nc-4.0, which restricts commercial use of derivative models. downstream users who need a fully permissive license should retrain without that subset.

sibling formats

Downloads last month
2
Safetensors
Model size
5B params
Tensor type
F32
·
F16
·
U8
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for inspirebek/qwen3-4b-uzbek-v2-bnb-4bit

Finetuned
Qwen/Qwen3-4B
Quantized
(3)
this model

Datasets used to train inspirebek/qwen3-4b-uzbek-v2-bnb-4bit

Collection including inspirebek/qwen3-4b-uzbek-v2-bnb-4bit