---
pipeline_tag: text-generation
language: uig
license: gemma
tags:
- trimmed
library_name: transformers
base_model: google/gemma-3-4b-it
base_model_relation: quantized
datasets:
- lbourdois/fineweb-2-trimming
---

# gemma-3-4b-it-uig-16384

This model is a **14.63%** smaller version of [google/gemma-3-4b-it](https://huggingface.co/google/gemma-3-4b-it), optimized for the Uyghur language by reducing the vocabulary size with the [trimming](https://huggingface.co/blog/lbourdois/introduction-to-trimming) method.

This trimmed model should perform similarly to the original model while using a vocabulary of only 16,384 tokens and a smaller memory footprint. However, it may not perform well for other languages, because tokens not commonly used in the selected language were removed from the vocabulary.

## Model Statistics

| Metric | Original | Trimmed | Reduction |
|--------|----------|---------|-----------|
| **Vocabulary size** | 262,144 tokens | 16,384 tokens | **93.75%** |
| **Model size** | 4,300,079,472 params | 3,670,770,032 params | **14.63%** |

![image](https://raw.githubusercontent.com/lbourdois/blog/refs/heads/master/assets/images/Trimming/gemma-3-4b-it-16384.png)

## Mining Dataset Statistics

- **Number of texts used for mining**: 24,729 texts
- **Dataset**: [lbourdois/fineweb-2-trimming](https://huggingface.co/datasets/lbourdois/fineweb-2-trimming)

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "alphaedge-ai/gemma-3-4b-it-uig-16384"

# Load the trimmed tokenizer and model weights.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

prompt = "Your prompt in Uyghur."
messages = [{"role": "user", "content": prompt}]

# Render the conversation with the model's chat template.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(**model_inputs, max_new_tokens=256)

# Keep only the newly generated tokens, dropping the prompt tokens.
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):]
response = tokenizer.decode(output_ids, skip_special_tokens=True)
print(response)
```

## Citations

#### Gemma 3

```bibtex
@misc{gemmateam2025gemma3technicalreport,
  title={Gemma 3 Technical Report},
  author={Gemma Team},
  year={2025},
  eprint={2503.19786},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2503.19786},
}
```

#### Trimming blog post

```bibtex
@misc{hf_blogpost_trimming,
  title={Introduction to Trimming},
  author={Loïck BOURDOIS and Tom AARSEN and Bram VANROY and Christopher AKIKI and Woojun JUNG and Manuel ROMERO and Prithiv SAKTHI},
  year={2026},
  url={https://huggingface.co/blog/lbourdois/introduction-to-trimming},
}
```

### License

This model is derived from [google/gemma-3-4b-it](https://huggingface.co/google/gemma-3-4b-it). Use of this model is governed by the [Gemma Terms of Use](https://ai.google.dev/gemma/terms). By using this model, you agree to the Gemma Terms of Use. This model is not affiliated with or endorsed by Google.
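
## Checking the Reduction Figures

The percentages in the Model Statistics table follow directly from the raw counts listed there. The sketch below recomputes them. The helper name `reduction_pct` is an illustrative assumption and is not part of the trimming tooling; only the four counts come from the table.

```python
def reduction_pct(original: int, trimmed: int) -> float:
    """Relative size reduction, in percent, going from original to trimmed."""
    return (original - trimmed) / original * 100


# Counts copied from the Model Statistics table.
ORIGINAL_VOCAB, TRIMMED_VOCAB = 262_144, 16_384
ORIGINAL_PARAMS, TRIMMED_PARAMS = 4_300_079_472, 3_670_770_032

print(f"Vocabulary reduction: {reduction_pct(ORIGINAL_VOCAB, TRIMMED_VOCAB):.2f}%")    # 93.75%
print(f"Parameter reduction:  {reduction_pct(ORIGINAL_PARAMS, TRIMMED_PARAMS):.2f}%")  # 14.63%
print(f"Parameters removed:   {ORIGINAL_PARAMS - TRIMMED_PARAMS:,}")                   # 629,309,440
```

The vocabulary shrinks far more than the model does because trimming, as described above, only removes vocabulary entries, so the rest of the parameters are counted unchanged in both columns.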