File size: 11,746 Bytes

---
library_name: vllm
language:
- en
- fr
- es
- de
- it
- pt
- nl
- zh
- ja
- ko
- ar
license: apache-2.0
inference: false
base_model:
- mistralai/Ministral-3-8B-Instruct-2512
extra_gated_description: >-
  If you want to learn more about how we process your personal data, please read
  our <a href="https://mistral.ai/terms/">Privacy Policy</a>.
tags:
- mistral-common
- heretic
- uncensored
- decensored
- abliterated
pipeline_tag: image-text-to-text
---
This is a **Ministral-3-8B-Instruct-2512** fine-tune, produced through P-E-W's [Heretic](https://github.com/p-e-w/heretic) (v1.2.0) abliteration engine with [Magnitude-Preserving Orthogonal Ablation](https://github.com/p-e-w/heretic/pull/52) enabled.

**Note:** Results from previous attempts: [Click Here](https://huggingface.co/MuXodious/Ministral-3-8B-Instruct-2512-tainted-heresy/discussions/1#69762ea78d0b3e7429a38388)

---
<img src="https://img.shields.io/badge/RENEGADE_CHAPTER-PAPERWITCH-B85ADB?style=flat-square&labelColor=101010" align="right" width="300">

**Heretication Results**

| Score Metric | Value | Parameter | Value |
| :--- | :--- | :--- | :--- |
| **Refusals** | 8/100 | **direction_index** | per layer |
| **KL Divergence** | 0.0509  | **attn.o_proj.max_weight** | 1.97 |
| **Initial Refusals** | 91/100 | **attn.o_proj.max_weight_position** | 17.48 |
||| **attn.o_proj.min_weight** | 1.90 |
||| **attn.o_proj.min_weight_distance** | 10.79 |
||| **mlp.down_proj.max_weight** | 0.19 |
||| **mlp.down_proj.max_weight_position** | 8.56 |
||| **mlp.down_proj.min_weight** | 0.04 |
||| **mlp.down_proj.min_weight_distance** | 15.62 |


---
**Appendix**

<img src="Ministral-3-8B-Instruct-2512-BF16.gif" alt="PaCMAP projection"/>

```
 » [Trial 407] Refusals:  8/100, KL divergence: 0.0509
   [Trial 318] Refusals: 11/100, KL divergence: 0.0314
   [Trial 253] Refusals: 14/100, KL divergence: 0.0278
   [Trial 216] Refusals: 15/100, KL divergence: 0.0276
   [Trial 401] Refusals: 19/100, KL divergence: 0.0255
   [Trial 405] Refusals: 21/100, KL divergence: 0.0240
   [Trial 149] Refusals: 31/100, KL divergence: 0.0232
   [Trial 249] Refusals: 33/100, KL divergence: 0.0221
   [Trial 244] Refusals: 38/100, KL divergence: 0.0214
   [Trial 230] Refusals: 44/100, KL divergence: 0.0207
   [Trial 153] Refusals: 46/100, KL divergence: 0.0198
   [Trial 347] Refusals: 52/100, KL divergence: 0.0175
   [Trial 154] Refusals: 62/100, KL divergence: 0.0160
   [Trial 138] Refusals: 64/100, KL divergence: 0.0154
   [Trial 392] Refusals: 65/100, KL divergence: 0.0134
   [Trial 480] Refusals: 66/100, KL divergence: 0.0120
   [Trial  29] Refusals: 73/100, KL divergence: 0.0113
   [Trial 240] Refusals: 74/100, KL divergence: 0.0109
   [Trial 612] Refusals: 75/100, KL divergence: 0.0102
   [Trial 255] Refusals: 77/100, KL divergence: 0.0073
   [Trial 378] Refusals: 79/100, KL divergence: 0.0059
   [Trial 605] Refusals: 81/100, KL divergence: 0.0046
   [Trial   1] Refusals: 82/100, KL divergence: 0.0042
   [Trial 443] Refusals: 83/100, KL divergence: 0.0040
   [Trial 486] Refusals: 84/100, KL divergence: 0.0038
   [Trial 450] Refusals: 85/100, KL divergence: 0.0026
   [Trial 343] Refusals: 86/100, KL divergence: 0.0022
   [Trial  14] Refusals: 87/100, KL divergence: 0.0009
   [Trial 336] Refusals: 88/100, KL divergence: 0.0008
   [Trial 274] Refusals: 89/100, KL divergence: 0.0005
   [Trial 418] Refusals: 90/100, KL divergence: 0.0004
   [Trial 688] Refusals: 91/100, KL divergence: 0.0000
```

---
# Ministral 3 8B Instruct 2512 BF16

A balanced model in the Ministral 3 family, **Ministral 3 8B** is a powerful, efficient tiny language model with vision capabilities.

This model is the instruct post-trained version, fine-tuned for instruction tasks, making it ideal for chat and instruction based use cases.

The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware. Ministral 3 8B can even be deployed locally, capable of fitting in 24GB of VRAM in BF16, and less than 12GB of RAM/VRAM when quantized.

We provide a no-loss FP8 version [here](https://huggingface.co/mistralai/Ministral-3-8B-Instruct-2512), you can find other formats and quantizations in the [Ministral 3 - Additional Checkpoints](https://huggingface.co/collections/mistralai/ministral-3-additional-checkpoints) collection.

Learn more in our [blog post](https://mistral.ai/news/mistral-3) and [paper](https://arxiv.org/abs/2601.08584).

## Key Features
Ministral 3 8B consists of two main architectural components:
- **8.4B Language Model**
- **0.4B Vision Encoder**

The Ministral 3 8B Instruct model offers the following capabilities:
- **Vision**: Enables the model to analyze images and provide insights based on visual content, in addition to text.
- **Multilingual**: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic.
- **System Prompt**: Maintains strong adherence and support for system prompts.
- **Agentic**: Offers best-in-class agentic capabilities with native function calling and JSON outputting.
- **Edge-Optimized**: Delivers best-in-class performance at a small scale, deployable anywhere.
- **Apache 2.0 License**: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
- **Large Context Window**: Supports a 256k context window.

### Use Cases
Perfect for balanced performance in local or embedded systems, combining versatility with efficiency.
- Chat interfaces in constrained environments
- Local daily-driver AI assistant
- Image/document description and understanding
- Translation and content generation
- Specialized agentic use cases
- Fine-tuning and specialization
- And more...
  
Bringing advanced AI capabilities to resource-constrained environments.

## Ministral 3 Family

| Model Name                     | Type               | Precision | Link                                                                                     |
|--------------------------------|--------------------|-----------|------------------------------------------------------------------------------------------|
| Ministral 3 3B Base 2512       | Base pre-trained   | BF16      | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-3B-Base-2512)                |
| Ministral 3 3B Instruct 2512   | Instruct post-trained | BF16   | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-3B-Instruct-2512)            |
| Ministral 3 3B Reasoning 2512  | Reasoning capable  | BF16      | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-3B-Reasoning-2512)           |
| Ministral 3 8B Base 2512       | Base pre-trained   | BF16      | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-8B-Base-2512)                |
| **Ministral 3 8B Instruct 2512**   | **Instruct post-trained** | **BF16**    | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-8B-Instruct-2512)            |
| Ministral 3 8B Reasoning 2512  | Reasoning capable  | BF16      | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-8B-Reasoning-2512)           |
| Ministral 3 14B Base 2512      | Base pre-trained   | BF16      | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-14B-Base-2512)               |
| Ministral 3 14B Instruct 2512  | Instruct post-trained | BF16    | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-14B-Instruct-2512)           |
| Ministral 3 14B Reasoning 2512 | Reasoning capable  | BF16      | [Hugging Face](https://huggingface.co/mistralai/Ministral-3-14B-Reasoning-2512)          |

Other formats available [here](https://huggingface.co/collections/mistralai/ministral-3-additional-checkpoints).

## Benchmark Results

We compare Ministral 3 to similar sized models.

### Reasoning

| Model                     | AIME25      | AIME24      | GPQA Diamond | LiveCodeBench |
|---------------------------|-------------|-------------|--------------|---------------|
| **Ministral 3 14B**       | <u>0.850</u>| <u>0.898</u>| <u>0.712</u> | <u>0.646</u>  |
| Qwen3-14B (Thinking)      | 0.737       | 0.837       | 0.663        | 0.593         |
|                           |             |             |              |               |
| **Ministral 3 8B**        | 0.787       | <u>0.860</u>| 0.668        | <u>0.616</u>  |
| Qwen3-VL-8B-Thinking      | <u>0.798</u>| <u>0.860</u>| <u>0.671</u> | 0.580         |
|                           |             |             |              |               |
| **Ministral 3 3B**        | <u>0.721</u>| <u>0.775</u>| 0.534        | <u>0.548</u>  |
| Qwen3-VL-4B-Thinking      | 0.697       | 0.729       | <u>0.601</u> | 0.513         |

### Instruct

| Model                     | Arena Hard  | WildBench  | MATH Maj@1  | MM MTBench       |
|---------------------------|-------------|------------|-------------|------------------|
| **Ministral 3 14B**       | <u>0.551</u>| <u>68.5</u>| <u>0.904</u>| <u>8.49</u>      |
| Qwen3 14B (Non-Thinking)  | 0.427       | 65.1       | 0.870       | NOT MULTIMODAL   |
| Gemma3-12B-Instruct       | 0.436       | 63.2       | 0.854       | 6.70             |
|                           |             |            |             |                  |
| **Ministral 3 8B**        | 0.509       | <u>66.8</u>| 0.876       | <u>8.08</u>      |
| Qwen3-VL-8B-Instruct      | <u>0.528</u>| 66.3       | <u>0.946</u>| 8.00             |
|                           |             |            |             |                  |
| **Ministral 3 3B**        | 0.305       | <u>56.8</u>| 0.830       | 7.83             |
| Qwen3-VL-4B-Instruct      | <u>0.438</u>| <u>56.8</u>| <u>0.900</u>| <u>8.01</u>      |
| Qwen3-VL-2B-Instruct      | 0.163       | 42.2       | 0.786       | 6.36             |
| Gemma3-4B-Instruct        | 0.318       | 49.1       | 0.759       | 5.23             |

### Base

| Model               | Multilingual MMLU | MATH CoT 2-Shot | AGIEval 5-shot | MMLU Redux 5-shot | MMLU 5-shot | TriviaQA 5-shot |
|---------------------|-------------------|-----------------|----------------|-------------------|-------------|-----------------|
| **Ministral 3 14B** | 0.742             | <u>0.676</u>    | 0.648          | 0.820             | 0.794       | 0.749           |
| Qwen3 14B Base      | <u>0.754</u>      | 0.620           | <u>0.661</u>   | <u>0.837</u>      | <u>0.804</u>| 0.703           |
| Gemma 3 12B Base    | 0.690             | 0.487           | 0.587          | 0.766             | 0.745       | <u>0.788</u>    |
|                     |                   |                 |                |                   |             |                 |
| **Ministral 3 8B**  | <u>0.706</u>      | <u>0.626</u>    | 0.591          | 0.793             | <u>0.761</u>| <u>0.681</u>    |
| Qwen 3 8B Base      | 0.700             | 0.576           | <u>0.596</u>   | <u>0.794</u>      | 0.760       | 0.639           |
|                     |                   |                 |                |                   |             |                 |
| **Ministral 3 3B**  | 0.652             | <u>0.601</u>    | 0.511          | 0.735             | 0.707       | 0.592           |
| Qwen 3 4B Base      | <u>0.677</u>      | 0.405           | <u>0.570</u>   | <u>0.759</u>      | <u>0.713</u>| 0.530           |
| Gemma 3 4B Base     | 0.516             | 0.294           | 0.430          | 0.626             | 0.589       | <u>0.640</u>    |

## License

This model is licensed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0.txt).

*You must not use this model in a manner that infringes, misappropriates, or otherwise violates any third party’s rights, including intellectual property rights.*