---
license: gemma
library_name: gguf
base_model: Ayodele01/Gemma-4-12B-Gemini-3.5-flash-Reasoning-Distill
tags:
- gemma
- gemma-4
- reasoning
- gguf
- llama-cpp
- fine-tuned
- flash-distill
language:
- en
pipeline_tag: text-generation
---

# Gemma-4-12B-Gemini-3.5-flash-Reasoning-Distill-GGUF

GGUF quantized versions of [Ayodele01/Gemma-4-12B-Gemini-3.5-flash-Reasoning-Distill](https://huggingface.co/Ayodele01/Gemma-4-12B-Gemini-3.5-flash-Reasoning-Distill).

## Model Description

This is Google's Gemma-4 12B instruction-tuned model, fine-tuned on the full 25,000-example synthetic reasoning dataset [WithinUsAI/gemini_3.5_flash_distilled_25k](https://huggingface.co/datasets/WithinUsAI/gemini_3.5_flash_distilled_25k) using QLoRA via Unsloth. This repository contains GGUF conversions of the merged model weights at several quantization levels.

## Available Files and Quantizations

| Filename | Quant Type | Size | Description |
|----------|------------|------|-------------|
| `Gemma-4-12B-Gemini-3.5-flash-Reasoning-Distill-bf16.gguf` | BF16 | ~24.4 GB | Full 16-bit precision (unquantized), best quality |
| `Gemma-4-12B-Gemini-3.5-flash-Reasoning-Distill-Q8_0.gguf` | Q8_0 | ~12.2 GB | High quality, minimal degradation |
| `Gemma-4-12B-Gemini-3.5-flash-Reasoning-Distill-Q5_K_M.gguf` | Q5_K_M | ~8.3 GB | Balanced (recommended) |
| `Gemma-4-12B-Gemini-3.5-flash-Reasoning-Distill-Q4_K_M.gguf` | Q4_K_M | ~7.2 GB | Good quality, smaller size |

## Usage with llama.cpp

You can run these files using [llama.cpp](https://github.com/ggerganov/llama.cpp).

```bash
# Run with llama-cli
./llama-cli -m Gemma-4-12B-Gemini-3.5-flash-Reasoning-Distill-Q5_K_M.gguf \
  -p "<|turn>user\nWhat is the sum of all prime numbers between 1 and 50?<|turn>model\n" \
  -n 512
```

## Prompt Template

Gemma-4 chat template format:

```text
<|turn>user
{ prompt }<|turn>model
```

## Training and Distillation Context

For details on evaluations, training hyperparameters, and qualitative findings, please refer to the main repository model card: [Ayodele01/Gemma-4-12B-Gemini-3.5-flash-Reasoning-Distill](https://huggingface.co/Ayodele01/Gemma-4-12B-Gemini-3.5-flash-Reasoning-Distill).
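
## Example: Building the Prompt in a Shell Script

The prompt template shown earlier can be applied with a small shell helper instead of typing the turn markers by hand. The following is a minimal sketch, not an official utility: `build_prompt` is a hypothetical function name introduced here for illustration, and it assumes the template consists of the `<|turn>user` marker, a newline, the user message, the `<|turn>model` marker, and a trailing newline, as in the `llama-cli` example on this card.

```shell
#!/bin/sh
# Hypothetical helper (not part of llama.cpp): wrap a user message in the
# turn markers from the Prompt Template section of this card.
build_prompt() {
  # \n in the format string becomes a real newline; %s inserts the message as-is.
  printf '<|turn>user\n%s<|turn>model\n' "$1"
}

# Print the formatted prompt so it can be inspected before use.
build_prompt "What is the sum of all prime numbers between 1 and 50?"
```

The result can be passed to the `-p` option, for example `./llama-cli -m Gemma-4-12B-Gemini-3.5-flash-Reasoning-Distill-Q5_K_M.gguf -p "$(build_prompt "Your question")" -n 512`. Note that shell command substitution strips the trailing newline, so the prompt then ends directly after `<|turn>model`.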