---
library_name: vllm
language:
- en
- fr
- es
- de
- it
- pt
- nl
- zh
- ja
- ko
- ar
license: apache-2.0
inference: false
base_model:
- mistralai/Ministral-3-8B-Instruct-2512
extra_gated_description: >-
If you want to learn more about how we process your personal data, please read
our Privacy Policy.
tags:
- mistral-common
- heretic
- uncensored
- decensored
- abliterated
pipeline_tag: image-text-to-text
---
This is a **Ministral-3-8B-Instruct-2512** fine-tune, produced through P-E-W's [Heretic](https://github.com/p-e-w/heretic) (v1.2.0) abliteration engine with [Magnitude-Preserving Orthogonal Ablation](https://github.com/p-e-w/heretic/pull/52) enabled.
**Note:** Results from previous attempts: [Click Here](https://huggingface.co/MuXodious/Ministral-3-8B-Instruct-2512-tainted-heresy/discussions/1#69762ea78d0b3e7429a38388)
---
**Heretication Results**
| Score Metric | Value | Parameter | Value |
| :--- | :--- | :--- | :--- |
| **Refusals** | 8/100 | **direction_index** | per layer |
| **KL Divergence** | 0.0509 | **attn.o_proj.max_weight** | 1.97 |
| **Initial Refusals** | 91/100 | **attn.o_proj.max_weight_position** | 17.48 |
||| **attn.o_proj.min_weight** | 1.90 |
||| **attn.o_proj.min_weight_distance** | 10.79 |
||| **mlp.down_proj.max_weight** | 0.19 |
||| **mlp.down_proj.max_weight_position** | 8.56 |
||| **mlp.down_proj.min_weight** | 0.04 |
||| **mlp.down_proj.min_weight_distance** | 15.62 |
---
**Appendix**
```
» [Trial 407] Refusals: 8/100, KL divergence: 0.0509
[Trial 318] Refusals: 11/100, KL divergence: 0.0314
[Trial 253] Refusals: 14/100, KL divergence: 0.0278
[Trial 216] Refusals: 15/100, KL divergence: 0.0276
[Trial 401] Refusals: 19/100, KL divergence: 0.0255
[Trial 405] Refusals: 21/100, KL divergence: 0.0240
[Trial 149] Refusals: 31/100, KL divergence: 0.0232
[Trial 249] Refusals: 33/100, KL divergence: 0.0221
[Trial 244] Refusals: 38/100, KL divergence: 0.0214
[Trial 230] Refusals: 44/100, KL divergence: 0.0207
[Trial 153] Refusals: 46/100, KL divergence: 0.0198
[Trial 347] Refusals: 52/100, KL divergence: 0.0175
[Trial 154] Refusals: 62/100, KL divergence: 0.0160
[Trial 138] Refusals: 64/100, KL divergence: 0.0154
[Trial 392] Refusals: 65/100, KL divergence: 0.0134
[Trial 480] Refusals: 66/100, KL divergence: 0.0120
[Trial 29] Refusals: 73/100, KL divergence: 0.0113
[Trial 240] Refusals: 74/100, KL divergence: 0.0109
[Trial 612] Refusals: 75/100, KL divergence: 0.0102
[Trial 255] Refusals: 77/100, KL divergence: 0.0073
[Trial 378] Refusals: 79/100, KL divergence: 0.0059
[Trial 605] Refusals: 81/100, KL divergence: 0.0046
[Trial 1] Refusals: 82/100, KL divergence: 0.0042
[Trial 443] Refusals: 83/100, KL divergence: 0.0040
[Trial 486] Refusals: 84/100, KL divergence: 0.0038
[Trial 450] Refusals: 85/100, KL divergence: 0.0026
[Trial 343] Refusals: 86/100, KL divergence: 0.0022
[Trial 14] Refusals: 87/100, KL divergence: 0.0009
[Trial 336] Refusals: 88/100, KL divergence: 0.0008
[Trial 274] Refusals: 89/100, KL divergence: 0.0005
[Trial 418] Refusals: 90/100, KL divergence: 0.0004
[Trial 688] Refusals: 91/100, KL divergence: 0.0000
```
---
# Ministral 3 8B Instruct 2512 BF16
A balanced model in the Ministral 3 family, **Ministral 3 8B** is a powerful, efficient tiny language model with vision capabilities.
This model is the instruct post-trained version, fine-tuned for instruction tasks, making it ideal for chat and instruction based use cases.
The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware. Ministral 3 8B can even be deployed locally, capable of fitting in 24GB of VRAM in BF16, and less than 12GB of RAM/VRAM when quantized.
We provide a no-loss FP8 version [here](https://huggingface.co/mistralai/Ministral-3-8B-Instruct-2512), you can find other formats and quantizations in the [Ministral 3 - Additional Checkpoints](https://huggingface.co/collections/mistralai/ministral-3-additional-checkpoints) collection.
Learn more in our [blog post](https://mistral.ai/news/mistral-3) and [paper](https://arxiv.org/abs/2601.08584).
## Key Features
Ministral 3 8B consists of two main architectural components:
- **8.4B Language Model**
- **0.4B Vision Encoder**
The Ministral 3 8B Instruct model offers the following capabilities:
- **Vision**: Enables the model to analyze images and provide insights based on visual content, in addition to text.
- **Multilingual**: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic.
- **System Prompt**: Maintains strong adherence and support for system prompts.
- **Agentic**: Offers best-in-class agentic capabilities with native function calling and JSON outputting.
- **Edge-Optimized**: Delivers best-in-class performance at a small scale, deployable anywhere.
- **Apache 2.0 License**: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
- **Large Context Window**: Supports a 256k context window.
### Use Cases
Perfect for balanced performance in local or embedded systems, combining versatility with efficiency.
- Chat interfaces in constrained environments
- Local daily-driver AI assistant
- Image/document description and understanding
- Translation and content generation
- Specialized agentic use cases
- Fine-tuning and specialization
- And more...
Bringing advanced AI capabilities to resource-constrained environments.
## Ministral 3 Family
| Model Name | Type | Precision | Link |
|--------------------------------|--------------------|-----------|------------------------------------------------------------------------------------------|
| Ministral 3 3B Base 2512 | Base pre-trained | BF16 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-3B-Base-2512) |
| Ministral 3 3B Instruct 2512 | Instruct post-trained | BF16 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-3B-Instruct-2512) |
| Ministral 3 3B Reasoning 2512 | Reasoning capable | BF16 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-3B-Reasoning-2512) |
| Ministral 3 8B Base 2512 | Base pre-trained | BF16 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-8B-Base-2512) |
| **Ministral 3 8B Instruct 2512** | **Instruct post-trained** | **BF16** | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-8B-Instruct-2512) |
| Ministral 3 8B Reasoning 2512 | Reasoning capable | BF16 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-8B-Reasoning-2512) |
| Ministral 3 14B Base 2512 | Base pre-trained | BF16 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-14B-Base-2512) |
| Ministral 3 14B Instruct 2512 | Instruct post-trained | BF16 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-14B-Instruct-2512) |
| Ministral 3 14B Reasoning 2512 | Reasoning capable | BF16 | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-14B-Reasoning-2512) |
Other formats available [here](https://huggingface.co/collections/mistralai/ministral-3-additional-checkpoints).
## Benchmark Results
We compare Ministral 3 to similar sized models.
### Reasoning
| Model | AIME25 | AIME24 | GPQA Diamond | LiveCodeBench |
|---------------------------|-------------|-------------|--------------|---------------|
| **Ministral 3 14B** | 0.850| 0.898| 0.712 | 0.646 |
| Qwen3-14B (Thinking) | 0.737 | 0.837 | 0.663 | 0.593 |
| | | | | |
| **Ministral 3 8B** | 0.787 | 0.860| 0.668 | 0.616 |
| Qwen3-VL-8B-Thinking | 0.798| 0.860| 0.671 | 0.580 |
| | | | | |
| **Ministral 3 3B** | 0.721| 0.775| 0.534 | 0.548 |
| Qwen3-VL-4B-Thinking | 0.697 | 0.729 | 0.601 | 0.513 |
### Instruct
| Model | Arena Hard | WildBench | MATH Maj@1 | MM MTBench |
|---------------------------|-------------|------------|-------------|------------------|
| **Ministral 3 14B** | 0.551| 68.5| 0.904| 8.49 |
| Qwen3 14B (Non-Thinking) | 0.427 | 65.1 | 0.870 | NOT MULTIMODAL |
| Gemma3-12B-Instruct | 0.436 | 63.2 | 0.854 | 6.70 |
| | | | | |
| **Ministral 3 8B** | 0.509 | 66.8| 0.876 | 8.08 |
| Qwen3-VL-8B-Instruct | 0.528| 66.3 | 0.946| 8.00 |
| | | | | |
| **Ministral 3 3B** | 0.305 | 56.8| 0.830 | 7.83 |
| Qwen3-VL-4B-Instruct | 0.438| 56.8| 0.900| 8.01 |
| Qwen3-VL-2B-Instruct | 0.163 | 42.2 | 0.786 | 6.36 |
| Gemma3-4B-Instruct | 0.318 | 49.1 | 0.759 | 5.23 |
### Base
| Model | Multilingual MMLU | MATH CoT 2-Shot | AGIEval 5-shot | MMLU Redux 5-shot | MMLU 5-shot | TriviaQA 5-shot |
|---------------------|-------------------|-----------------|----------------|-------------------|-------------|-----------------|
| **Ministral 3 14B** | 0.742 | 0.676 | 0.648 | 0.820 | 0.794 | 0.749 |
| Qwen3 14B Base | 0.754 | 0.620 | 0.661 | 0.837 | 0.804| 0.703 |
| Gemma 3 12B Base | 0.690 | 0.487 | 0.587 | 0.766 | 0.745 | 0.788 |
| | | | | | | |
| **Ministral 3 8B** | 0.706 | 0.626 | 0.591 | 0.793 | 0.761| 0.681 |
| Qwen 3 8B Base | 0.700 | 0.576 | 0.596 | 0.794 | 0.760 | 0.639 |
| | | | | | | |
| **Ministral 3 3B** | 0.652 | 0.601 | 0.511 | 0.735 | 0.707 | 0.592 |
| Qwen 3 4B Base | 0.677 | 0.405 | 0.570 | 0.759 | 0.713| 0.530 |
| Gemma 3 4B Base | 0.516 | 0.294 | 0.430 | 0.626 | 0.589 | 0.640 |
## License
This model is licensed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0.txt).
*You must not use this model in a manner that infringes, misappropriates, or otherwise violates any third party’s rights, including intellectual property rights.*