File size: 11,746 Bytes
697bfb5 41214e1 464dcba 024b3c2 41214e1 024b3c2 697bfb5 f165766 41214e1 0e95764 f165766 41214e1 f165766 41214e1 f165766 41214e1 f165766 41214e1 f165766 41214e1 f165766 41214e1 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 | ---
library_name: vllm
language:
- en
- fr
- es
- de
- it
- pt
- nl
- zh
- ja
- ko
- ar
license: apache-2.0
inference: false
base_model:
- mistralai/Ministral-3-8B-Instruct-2512
extra_gated_description: >-
If you want to learn more about how we process your personal data, please read
our <a href="https://mistral.ai/terms/">Privacy Policy</a>.
tags:
- mistral-common
- heretic
- uncensored
- decensored
- abliterated
pipeline_tag: image-text-to-text
---
This is a **Ministral-3-8B-Instruct-2512** fine-tune, produced through P-E-W's [Heretic](https://github.com/p-e-w/heretic) (v1.2.0) abliteration engine with [Magnitude-Preserving Orthogonal Ablation](https://github.com/p-e-w/heretic/pull/52) enabled.
**Note:** Results from previous attempts: [Click Here](https://huggingface.co/MuXodious/Ministral-3-8B-Instruct-2512-tainted-heresy/discussions/1#69762ea78d0b3e7429a38388)
---
<img src="https://img.shields.io/badge/RENEGADE_CHAPTER-PAPERWITCH-B85ADB?style=flat-square&labelColor=101010" align="right" width="300">
**Heretication Results**
| Score Metric | Value | Parameter | Value |
| :--- | :--- | :--- | :--- |
| **Refusals** | 8/100 | **direction_index** | per layer |
| **KL Divergence** | 0.0509 | **attn.o_proj.max_weight** | 1.97 |
| **Initial Refusals** | 91/100 | **attn.o_proj.max_weight_position** | 17.48 |
||| **attn.o_proj.min_weight** | 1.90 |
||| **attn.o_proj.min_weight_distance** | 10.79 |
||| **mlp.down_proj.max_weight** | 0.19 |
||| **mlp.down_proj.max_weight_position** | 8.56 |
||| **mlp.down_proj.min_weight** | 0.04 |
||| **mlp.down_proj.min_weight_distance** | 15.62 |
---
**Appendix**
<img src="Ministral-3-8B-Instruct-2512-BF16.gif" alt="PaCMAP projection"/>
```
» [Trial 407] Refusals: 8/100, KL divergence: 0.0509
[Trial 318] Refusals: 11/100, KL divergence: 0.0314
[Trial 253] Refusals: 14/100, KL divergence: 0.0278
[Trial 216] Refusals: 15/100, KL divergence: 0.0276
[Trial 401] Refusals: 19/100, KL divergence: 0.0255
[Trial 405] Refusals: 21/100, KL divergence: 0.0240
[Trial 149] Refusals: 31/100, KL divergence: 0.0232
[Trial 249] Refusals: 33/100, KL divergence: 0.0221
[Trial 244] Refusals: 38/100, KL divergence: 0.0214
[Trial 230] Refusals: 44/100, KL divergence: 0.0207
[Trial 153] Refusals: 46/100, KL divergence: 0.0198
[Trial 347] Refusals: 52/100, KL divergence: 0.0175
[Trial 154] Refusals: 62/100, KL divergence: 0.0160
[Trial 138] Refusals: 64/100, KL divergence: 0.0154
[Trial 392] Refusals: 65/100, KL divergence: 0.0134
[Trial 480] Refusals: 66/100, KL divergence: 0.0120
[Trial 29] Refusals: 73/100, KL divergence: 0.0113
[Trial 240] Refusals: 74/100, KL divergence: 0.0109
[Trial 612] Refusals: 75/100, KL divergence: 0.0102
[Trial 255] Refusals: 77/100, KL divergence: 0.0073
[Trial 378] Refusals: 79/100, KL divergence: 0.0059
[Trial 605] Refusals: 81/100, KL divergence: 0.0046
[Trial 1] Refusals: 82/100, KL divergence: 0.0042
[Trial 443] Refusals: 83/100, KL divergence: 0.0040
[Trial 486] Refusals: 84/100, KL divergence: 0.0038
[Trial 450] Refusals: 85/100, KL divergence: 0.0026
[Trial 343] Refusals: 86/100, KL divergence: 0.0022
[Trial 14] Refusals: 87/100, KL divergence: 0.0009
[Trial 336] Refusals: 88/100, KL divergence: 0.0008
[Trial 274] Refusals: 89/100, KL divergence: 0.0005
[Trial 418] Refusals: 90/100, KL divergence: 0.0004
[Trial 688] Refusals: 91/100, KL divergence: 0.0000
```
---
# Ministral 3 8B Instruct 2512 BF16
A balanced model in the Ministral 3 family, **Ministral 3 8B** is a powerful, efficient tiny language model with vision capabilities.
This model is the instruct post-trained version, fine-tuned for instruction tasks, making it ideal for chat and instruction based use cases.
The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware. Ministral 3 8B can even be deployed locally, capable of fitting in 24GB of VRAM in BF16, and less than 12GB of RAM/VRAM when quantized.
We provide a no-loss FP8 version [here](https://huggingface.co/mistralai/Ministral-3-8B-Instruct-2512), you can find other formats and quantizations in the [Ministral 3 - Additional Checkpoints](https://huggingface.co/collections/mistralai/ministral-3-additional-checkpoints) collection.
Learn more in our [blog post](https://mistral.ai/news/mistral-3) and [paper](https://arxiv.org/abs/2601.08584).
## Key Features
Ministral 3 8B consists of two main architectural components:
- **8.4B Language Model**
- **0.4B Vision Encoder**
The Ministral 3 8B Instruct model offers the following capabilities:
- **Vision**: Enables the model to analyze images and provide insights based on visual content, in addition to text.
- **Multilingual**: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic.
- **System Prompt**: Maintains strong adherence and support for system prompts.
- **Agentic**: Offers best-in-class agentic capabilities with native function calling and JSON outputting.
- **Edge-Optimized**: Delivers best-in-class performance at a small scale, deployable anywhere.
- **Apache 2.0 License**: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
- **Large Context Window**: Supports a 256k context window.
### Use Cases
Perfect for balanced performance in local or embedded systems, combining versatility with efficiency.
- Chat interfaces in constrained environments
- Local daily-driver AI assistant
- Image/document description and understanding
- Translation and content generation
- Specialized agentic use cases
- Fine-tuning and specialization
- And more...
Bringing advanced AI capabilities to resource-constrained environments.
## Ministral 3 Family
| Model Name | Type | Precision | Link |
|--------------------------------|--------------------|-----------|------------------------------------------------------------------------------------------|
| Ministral 3 3B Base 2512 | Base pre-trained | BF16 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-3B-Base-2512) |
| Ministral 3 3B Instruct 2512 | Instruct post-trained | BF16 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-3B-Instruct-2512) |
| Ministral 3 3B Reasoning 2512 | Reasoning capable | BF16 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-3B-Reasoning-2512) |
| Ministral 3 8B Base 2512 | Base pre-trained | BF16 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-8B-Base-2512) |
| **Ministral 3 8B Instruct 2512** | **Instruct post-trained** | **BF16** | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-8B-Instruct-2512) |
| Ministral 3 8B Reasoning 2512 | Reasoning capable | BF16 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-8B-Reasoning-2512) |
| Ministral 3 14B Base 2512 | Base pre-trained | BF16 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-14B-Base-2512) |
| Ministral 3 14B Instruct 2512 | Instruct post-trained | BF16 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-14B-Instruct-2512) |
| Ministral 3 14B Reasoning 2512 | Reasoning capable | BF16 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-14B-Reasoning-2512) |
Other formats available [here](https://huggingface.co/collections/mistralai/ministral-3-additional-checkpoints).
## Benchmark Results
We compare Ministral 3 to similar sized models.
### Reasoning
| Model | AIME25 | AIME24 | GPQA Diamond | LiveCodeBench |
|---------------------------|-------------|-------------|--------------|---------------|
| **Ministral 3 14B** | <u>0.850</u>| <u>0.898</u>| <u>0.712</u> | <u>0.646</u> |
| Qwen3-14B (Thinking) | 0.737 | 0.837 | 0.663 | 0.593 |
| | | | | |
| **Ministral 3 8B** | 0.787 | <u>0.860</u>| 0.668 | <u>0.616</u> |
| Qwen3-VL-8B-Thinking | <u>0.798</u>| <u>0.860</u>| <u>0.671</u> | 0.580 |
| | | | | |
| **Ministral 3 3B** | <u>0.721</u>| <u>0.775</u>| 0.534 | <u>0.548</u> |
| Qwen3-VL-4B-Thinking | 0.697 | 0.729 | <u>0.601</u> | 0.513 |
### Instruct
| Model | Arena Hard | WildBench | MATH Maj@1 | MM MTBench |
|---------------------------|-------------|------------|-------------|------------------|
| **Ministral 3 14B** | <u>0.551</u>| <u>68.5</u>| <u>0.904</u>| <u>8.49</u> |
| Qwen3 14B (Non-Thinking) | 0.427 | 65.1 | 0.870 | NOT MULTIMODAL |
| Gemma3-12B-Instruct | 0.436 | 63.2 | 0.854 | 6.70 |
| | | | | |
| **Ministral 3 8B** | 0.509 | <u>66.8</u>| 0.876 | <u>8.08</u> |
| Qwen3-VL-8B-Instruct | <u>0.528</u>| 66.3 | <u>0.946</u>| 8.00 |
| | | | | |
| **Ministral 3 3B** | 0.305 | <u>56.8</u>| 0.830 | 7.83 |
| Qwen3-VL-4B-Instruct | <u>0.438</u>| <u>56.8</u>| <u>0.900</u>| <u>8.01</u> |
| Qwen3-VL-2B-Instruct | 0.163 | 42.2 | 0.786 | 6.36 |
| Gemma3-4B-Instruct | 0.318 | 49.1 | 0.759 | 5.23 |
### Base
| Model | Multilingual MMLU | MATH CoT 2-Shot | AGIEval 5-shot | MMLU Redux 5-shot | MMLU 5-shot | TriviaQA 5-shot |
|---------------------|-------------------|-----------------|----------------|-------------------|-------------|-----------------|
| **Ministral 3 14B** | 0.742 | <u>0.676</u> | 0.648 | 0.820 | 0.794 | 0.749 |
| Qwen3 14B Base | <u>0.754</u> | 0.620 | <u>0.661</u> | <u>0.837</u> | <u>0.804</u>| 0.703 |
| Gemma 3 12B Base | 0.690 | 0.487 | 0.587 | 0.766 | 0.745 | <u>0.788</u> |
| | | | | | | |
| **Ministral 3 8B** | <u>0.706</u> | <u>0.626</u> | 0.591 | 0.793 | <u>0.761</u>| <u>0.681</u> |
| Qwen 3 8B Base | 0.700 | 0.576 | <u>0.596</u> | <u>0.794</u> | 0.760 | 0.639 |
| | | | | | | |
| **Ministral 3 3B** | 0.652 | <u>0.601</u> | 0.511 | 0.735 | 0.707 | 0.592 |
| Qwen 3 4B Base | <u>0.677</u> | 0.405 | <u>0.570</u> | <u>0.759</u> | <u>0.713</u>| 0.530 |
| Gemma 3 4B Base | 0.516 | 0.294 | 0.430 | 0.626 | 0.589 | <u>0.640</u> |
## License
This model is licensed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0.txt).
*You must not use this model in a manner that infringes, misappropriates, or otherwise violates any third party’s rights, including intellectual property rights.* |