This is a Ministral-3-3B-Instruct-2512 fine-tune, produced at the request of redaihf through P-E-W's Heretic (v1.3.0) abliteration engine with Arbitrary-Rank Ablation enabled.

Heretication Results

Score Metric	Value	Parameter	Value
Refusals	2/416	start_layer_index	10
KL Divergence	0.0216	end_layer_index	25
Initial Refusals	401/416	preserve_good_behavior_weight	0.9095
		steer_bad_behavior_weight	0.0001
		overcorrect_relative_weight	1.0111
		neighbor_count	8
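
To put the refusal counts above on a common scale, they can be converted to rates. This is plain arithmetic on the reported numbers; the helper name `refusal_rate` is illustrative and not part of Heretic.

```python
def refusal_rate(refused: int, total: int) -> float:
    """Fraction of evaluation prompts that were refused."""
    return refused / total


TOTAL_PROMPTS = 416  # size of the refusal test set reported above

initial_rate = refusal_rate(401, TOTAL_PROMPTS)  # before abliteration
final_rate = refusal_rate(2, TOTAL_PROMPTS)      # selected trial

print(f"initial refusal rate: {initial_rate:.1%}")
print(f"final refusal rate:   {final_rate:.1%}")
```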

Appendix

Empty system prompt.

Heretication Rituals

   [Trial 140] Refusals:  0/416, KL divergence: 3.5422
   [Trial 257] Refusals:  1/416, KL divergence: 0.0255
 » [Trial 233] Refusals:  2/416, KL divergence: 0.0216
   [Trial 149] Refusals: 10/416, KL divergence: 0.0190
   [Trial 187] Refusals: 16/416, KL divergence: 0.0172
   [Trial  90] Refusals: 21/416, KL divergence: 0.0144
   [Trial   3] Refusals: 45/416, KL divergence: 0.0136
   [Trial 295] Refusals: 98/416, KL divergence: 0.0090
   [Trial 292] Refusals: 107/416, KL divergence: 0.0075
   [Trial 202] Refusals: 165/416, KL divergence: 0.0054
   [Trial  48] Refusals: 270/416, KL divergence: 0.0048
   [Trial 274] Refusals: 284/416, KL divergence: 0.0048
   [Trial 201] Refusals: 309/416, KL divergence: 0.0024
   [Trial 221] Refusals: 380/416, KL divergence: 0.0014
   [Trial 168] Refusals: 401/416, KL divergence: 0.0000
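
The list above trades refusals against KL divergence. One way to shortlist trials is to parse the log lines and keep those under a KL budget, as in the rough sketch below. The regular expression, the function names, and the 0.03 budget are assumptions made for illustration; this is not how Heretic itself selects trials.

```python
import re

# A few lines copied from the trial log above.
LOG = """\
[Trial 140] Refusals:  0/416, KL divergence: 3.5422
[Trial 257] Refusals:  1/416, KL divergence: 0.0255
[Trial 233] Refusals:  2/416, KL divergence: 0.0216
[Trial 149] Refusals: 10/416, KL divergence: 0.0190
"""

PATTERN = re.compile(
    r"\[Trial\s+(\d+)\]\s+Refusals:\s+(\d+)/(\d+),\s+KL divergence:\s+([\d.]+)"
)


def parse_trials(log: str):
    """Return (trial_id, refusals, kl) tuples parsed from the log text."""
    return [
        (int(m.group(1)), int(m.group(2)), float(m.group(4)))
        for m in PATTERN.finditer(log)
    ]


def candidates(trials, max_kl: float):
    """Trials within the KL budget, fewest refusals first."""
    kept = [t for t in trials if t[2] <= max_kl]
    return sorted(kept, key=lambda t: (t[1], t[2]))


trials = parse_trials(LOG)
shortlist = candidates(trials, max_kl=0.03)  # hypothetical budget
print([trial_id for trial_id, _, _ in shortlist])
```

With this illustrative budget, Trial 140 is excluded because of its large KL divergence, and the remaining three trials are the same ones benchmarked on PIQA below.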

PIQA Benchmarks

PIQA benchmarks are considered in final trial selection for release.

┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Benchmark ┃ Metric               ┃  Value ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ PIQA Base │ acc,none             │ 0.7720 │
│           │ acc_stderr,none      │ 0.0098 │
│           │ acc_norm,none        │ 0.7753 │
│           │ acc_norm_stderr,none │ 0.0097 │
└───────────┴──────────────────────┴────────┘
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Benchmark ┃ Metric               ┃  Value ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ PIQA T233 │ acc,none             │ 0.7758 │
│           │ acc_stderr,none      │ 0.0097 │
│           │ acc_norm,none        │ 0.7829 │
│           │ acc_norm_stderr,none │ 0.0096 │
└───────────┴──────────────────────┴────────┘
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Benchmark ┃ Metric               ┃  Value ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ PIQA T257 │ acc,none             │ 0.7742 │
│           │ acc_stderr,none      │ 0.0098 │
│           │ acc_norm,none        │ 0.7748 │
│           │ acc_norm_stderr,none │ 0.0097 │
└───────────┴──────────────────────┴────────┘
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Benchmark ┃ Metric               ┃  Value ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ PIQA T149 │ acc,none             │ 0.7715 │
│           │ acc_stderr,none      │ 0.0098 │
│           │ acc_norm,none        │ 0.7791 │
│           │ acc_norm_stderr,none │ 0.0097 │
└───────────┴──────────────────────┴────────┘
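
To judge whether the PIQA differences above exceed measurement noise, each trial's accuracy can be compared with the base model using the reported standard errors. The sketch below treats the two runs as independent, which is a simplifying assumption (both are scored on the same PIQA items, so a paired test would be tighter); the function name is illustrative.

```python
import math


def z_score(acc_a: float, se_a: float, acc_b: float, se_b: float) -> float:
    """Accuracy difference divided by the combined standard error."""
    return (acc_a - acc_b) / math.sqrt(se_a**2 + se_b**2)


# (acc,none, acc_stderr,none) values copied from the tables above.
BASE = (0.7720, 0.0098)
TRIALS = {
    "T233": (0.7758, 0.0097),
    "T257": (0.7742, 0.0098),
    "T149": (0.7715, 0.0098),
}

for name, (acc, se) in TRIALS.items():
    z = z_score(acc, se, *BASE)
    print(f"{name}: delta={acc - BASE[0]:+.4f}, z={z:+.2f}")
```

Under this assumption every |z| is well below 1, so none of the three trials differs from the base model on PIQA by more than the reported noise.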

Ministral 3 3B Instruct 2512 BF16

The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision capabilities.

This model is the instruct post-trained version, fine-tuned for instruction tasks, making it ideal for chat and instruction-based use cases.

The Ministral 3 family is designed for edge deployment and runs on a wide range of hardware. Ministral 3 3B can even be deployed locally, fitting in 16GB of VRAM in BF16 and in less than 8GB of RAM/VRAM when quantized.

We provide a no-loss FP8 version here. You can find other formats and quantizations in the Ministral 3 - Additional Checkpoints collection.

Learn more in our blog post and paper.

Key Features

Ministral 3 3B consists of two main architectural components:

3.4B Language Model
0.4B Vision Encoder
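
As a back-of-the-envelope check on the memory figures quoted earlier, the weight footprint can be estimated from the two parameter counts above. This sketch counts weights only; the KV cache, activations, and framework overhead are not included, which is presumably why the quoted 16GB BF16 figure leaves headroom. The names below are illustrative.

```python
# Bytes needed to store one parameter in each format.
BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}

LANGUAGE_MODEL_B = 3.4  # billions of parameters
VISION_ENCODER_B = 0.4  # billions of parameters


def weight_gb(params_billion: float, dtype: str) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes), weights only."""
    return params_billion * BYTES_PER_PARAM[dtype]


total_params_billion = LANGUAGE_MODEL_B + VISION_ENCODER_B

for dtype in ("bf16", "fp8"):
    print(f"{dtype}: ~{weight_gb(total_params_billion, dtype):.1f} GB of weights")
```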

The Ministral 3 3B Instruct model offers the following capabilities:

Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text.
Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, and Arabic.
System Prompt: Maintains strong adherence and support for system prompts.
Agentic: Offers best-in-class agentic capabilities with native function calling and JSON output.
Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere.
Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
Large Context Window: Supports a 256k context window.

Use Cases

Ideal for lightweight, real-time applications on edge or low-resource devices, such as:

Image captioning
Text classification
Real-time efficient translation
Data extraction
Short content generation
Fine-tuning and specialization
And more...

It brings advanced AI capabilities to edge and distributed environments for embedded systems.

Ministral 3 Family

Model Name	Type	Precision	Link
Ministral 3 3B Base 2512	Base pre-trained	BF16	Hugging Face
Ministral 3 3B Instruct 2512	Instruct post-trained	BF16	Hugging Face
Ministral 3 3B Reasoning 2512	Reasoning capable	BF16	Hugging Face
Ministral 3 8B Base 2512	Base pre-trained	BF16	Hugging Face
Ministral 3 8B Instruct 2512	Instruct post-trained	BF16	Hugging Face
Ministral 3 8B Reasoning 2512	Reasoning capable	BF16	Hugging Face
Ministral 3 14B Base 2512	Base pre-trained	BF16	Hugging Face
Ministral 3 14B Instruct 2512	Instruct post-trained	BF16	Hugging Face
Ministral 3 14B Reasoning 2512	Reasoning capable	BF16	Hugging Face

Other formats available here.

Benchmark Results

We compare Ministral 3 to similarly sized models.

Reasoning

Model	AIME25	AIME24	GPQA Diamond	LiveCodeBench
Ministral 3 14B	0.850	0.898	0.712	0.646
Qwen3-14B (Thinking)	0.737	0.837	0.663	0.593

Ministral 3 8B	0.787	0.860	0.668	0.616
Qwen3-VL-8B-Thinking	0.798	0.860	0.671	0.580

Ministral 3 3B	0.721	0.775	0.534	0.548
Qwen3-VL-4B-Thinking	0.697	0.729	0.601	0.513

Instruct

Model	Arena Hard	WildBench	MATH Maj@1	MM MTBench
Ministral 3 14B	0.551	68.5	0.904	8.49
Qwen3 14B (Non-Thinking)	0.427	65.1	0.870	NOT MULTIMODAL
Gemma3-12B-Instruct	0.436	63.2	0.854	6.70

Ministral 3 8B	0.509	66.8	0.876	8.08
Qwen3-VL-8B-Instruct	0.528	66.3	0.946	8.00

Ministral 3 3B	0.305	56.8	0.830	7.83
Qwen3-VL-4B-Instruct	0.438	56.8	0.900	8.01
Qwen3-VL-2B-Instruct	0.163	42.2	0.786	6.36
Gemma3-4B-Instruct	0.318	49.1	0.759	5.23

Base

Model	Multilingual MMLU	MATH CoT 2-Shot	AGIEval 5-shot	MMLU Redux 5-shot	MMLU 5-shot	TriviaQA 5-shot
Ministral 3 14B	0.742	0.676	0.648	0.820	0.794	0.749
Qwen3 14B Base	0.754	0.620	0.661	0.837	0.804	0.703
Gemma 3 12B Base	0.690	0.487	0.587	0.766	0.745	0.788

Ministral 3 8B	0.706	0.626	0.591	0.793	0.761	0.681
Qwen 3 8B Base	0.700	0.576	0.596	0.794	0.760	0.639

Ministral 3 3B	0.652	0.601	0.511	0.735	0.707	0.592
Qwen 3 4B Base	0.677	0.405	0.570	0.759	0.713	0.530
Gemma 3 4B Base	0.516	0.294	0.430	0.626	0.589	0.640

License

This model is licensed under the Apache 2.0 License.

You must not use this model in a manner that infringes, misappropriates, or otherwise violates any third party’s rights, including intellectual property rights.