This is a Ministral-3-3B-Instruct-2512 fine-tune, produced at the request of redaihf through P-E-W's Heretic (v1.3.0) abliteration engine with Arbitrary-Rank Ablation enabled.


Heretication Results

Score Metric Value Parameter Value
Refusals 2/416 start_layer_index 10
KL Divergence 0.0216 end_layer_index 25
Initial Refusals 401/416 preserve_good_behavior_weight 0.9095
steer_bad_behavior_weight 0.0001
overcorrect_relative_weight 1.0111
neighbor_count 8

Appendix

Empty system prompt.

Heretication Rituals
   [Trial 140] Refusals:  0/416, KL divergence: 3.5422
   [Trial 257] Refusals:  1/416, KL divergence: 0.0255
 » [Trial 233] Refusals:  2/416, KL divergence: 0.0216
   [Trial 149] Refusals: 10/416, KL divergence: 0.0190
   [Trial 187] Refusals: 16/416, KL divergence: 0.0172
   [Trial  90] Refusals: 21/416, KL divergence: 0.0144
   [Trial   3] Refusals: 45/416, KL divergence: 0.0136
   [Trial 295] Refusals: 98/416, KL divergence: 0.0090
   [Trial 292] Refusals: 107/416, KL divergence: 0.0075
   [Trial 202] Refusals: 165/416, KL divergence: 0.0054
   [Trial  48] Refusals: 270/416, KL divergence: 0.0048
   [Trial 274] Refusals: 284/416, KL divergence: 0.0048
   [Trial 201] Refusals: 309/416, KL divergence: 0.0024
   [Trial 221] Refusals: 380/416, KL divergence: 0.0014
   [Trial 168] Refusals: 401/416, KL divergence: 0.0000
PIQA Benchmarks

PIQA benchmarks are considered in final trial selection for release.

┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Benchmark ┃ Metric               ┃  Value ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ PIQA Base │ acc,none             │ 0.7720 │
│           │ acc_stderr,none      │ 0.0098 │
│           │ acc_norm,none        │ 0.7753 │
│           │ acc_norm_stderr,none │ 0.0097 │
└───────────┴──────────────────────┴────────┘
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Benchmark ┃ Metric               ┃  Value ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ PIQA T233 │ acc,none             │ 0.7758 │
│           │ acc_stderr,none      │ 0.0097 │
│           │ acc_norm,none        │ 0.7829 │
│           │ acc_norm_stderr,none │ 0.0096 │
└───────────┴──────────────────────┴────────┘
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Benchmark ┃ Metric               ┃  Value ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ PIQA T257 │ acc,none             │ 0.7742 │
│           │ acc_stderr,none      │ 0.0098 │
│           │ acc_norm,none        │ 0.7748 │
│           │ acc_norm_stderr,none │ 0.0097 │
└───────────┴──────────────────────┴────────┘
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓
┃ Benchmark ┃ Metric               ┃  Value ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩
│ PIQA T149 │ acc,none             │ 0.7715 │
│           │ acc_stderr,none      │ 0.0098 │
│           │ acc_norm,none        │ 0.7791 │
│           │ acc_norm_stderr,none │ 0.0097 │
└───────────┴──────────────────────┴────────┘

Ministral 3 3B Instruct 2512 BF16

The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision capabilities.

This model is the instruct post-trained version, fine-tuned for instruction tasks, making it ideal for chat and instruction based use cases.

The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware. Ministral 3 3B can even be deployed locally, capable of fitting in 16GB of VRAM in BF16, and less than 8GB of RAM/VRAM when quantized.

We provide a no-loss FP8 version here, you can find other formats and quantizations in the Ministral 3 - Additional Checkpoints collection.

Learn more in our blog post and paper.

Key Features

Ministral 3 3B consists of two main architectural components:

  • 3.4B Language Model
  • 0.4B Vision Encoder

The Ministral 3 3B Instruct model offers the following capabilities:

  • Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text.
  • Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic.
  • System Prompt: Maintains strong adherence and support for system prompts.
  • Agentic: Offers best-in-class agentic capabilities with native function calling and JSON outputting.
  • Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere.
  • Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
  • Large Context Window: Supports a 256k context window.

Use Cases

Ideal for lightweight, real-time applications on edge or low-resource devices, such as:

  • Image captioning
  • Text classification
  • Real-time efficient translation
  • Data extraction
  • Short content generation
  • Fine-tuning and specialization
  • And more...

Bringing advanced AI capabilities to edge and distributed environments for embedded systems.

Ministral 3 Family

Model Name Type Precision Link
Ministral 3 3B Base 2512 Base pre-trained BF16 Hugging Face
Ministral 3 3B Instruct 2512 Instruct post-trained BF16 Hugging Face
Ministral 3 3B Reasoning 2512 Reasoning capable BF16 Hugging Face
Ministral 3 8B Base 2512 Base pre-trained BF16 Hugging Face
Ministral 3 8B Instruct 2512 Instruct post-trained BF16 Hugging Face
Ministral 3 8B Reasoning 2512 Reasoning capable BF16 Hugging Face
Ministral 3 14B Base 2512 Base pre-trained BF16 Hugging Face
Ministral 3 14B Instruct 2512 Instruct post-trained BF16 Hugging Face
Ministral 3 14B Reasoning 2512 Reasoning capable BF16 Hugging Face

Other formats available here.

Benchmark Results

We compare Ministral 3 to similar sized models.

Reasoning

Model AIME25 AIME24 GPQA Diamond LiveCodeBench
Ministral 3 14B 0.850 0.898 0.712 0.646
Qwen3-14B (Thinking) 0.737 0.837 0.663 0.593
Ministral 3 8B 0.787 0.860 0.668 0.616
Qwen3-VL-8B-Thinking 0.798 0.860 0.671 0.580
Ministral 3 3B 0.721 0.775 0.534 0.548
Qwen3-VL-4B-Thinking 0.697 0.729 0.601 0.513

Instruct

Model Arena Hard WildBench MATH Maj@1 MM MTBench
Ministral 3 14B 0.551 68.5 0.904 8.49
Qwen3 14B (Non-Thinking) 0.427 65.1 0.870 NOT MULTIMODAL
Gemma3-12B-Instruct 0.436 63.2 0.854 6.70
Ministral 3 8B 0.509 66.8 0.876 8.08
Qwen3-VL-8B-Instruct 0.528 66.3 0.946 8.00
Ministral 3 3B 0.305 56.8 0.830 7.83
Qwen3-VL-4B-Instruct 0.438 56.8 0.900 8.01
Qwen3-VL-2B-Instruct 0.163 42.2 0.786 6.36
Gemma3-4B-Instruct 0.318 49.1 0.759 5.23

Base

Model Multilingual MMLU MATH CoT 2-Shot AGIEval 5-shot MMLU Redux 5-shot MMLU 5-shot TriviaQA 5-shot
Ministral 3 14B 0.742 0.676 0.648 0.820 0.794 0.749
Qwen3 14B Base 0.754 0.620 0.661 0.837 0.804 0.703
Gemma 3 12B Base 0.690 0.487 0.587 0.766 0.745 0.788
Ministral 3 8B 0.706 0.626 0.591 0.793 0.761 0.681
Qwen 3 8B Base 0.700 0.576 0.596 0.794 0.760 0.639
Ministral 3 3B 0.652 0.601 0.511 0.735 0.707 0.592
Qwen 3 4B Base 0.677 0.405 0.570 0.759 0.713 0.530
Gemma 3 4B Base 0.516 0.294 0.430 0.626 0.589 0.640

License

This model is licensed under the Apache 2.0 License.

You must not use this model in a manner that infringes, misappropriates, or otherwise violates any third party’s rights, including intellectual property rights.

Downloads last month
105
Safetensors
Model size
4B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for MuXodious/Ministral-3-3B-Instruct-2512-ARA-heresy

Finetuned
(28)
this model
Finetunes
1 model
Quantizations
1 model

Collection including MuXodious/Ministral-3-3B-Instruct-2512-ARA-heresy

Paper for MuXodious/Ministral-3-3B-Instruct-2512-ARA-heresy