How to use from Ollama:
ollama run hf.co/GreyManul/MiniMax-M2.7-Abliterated-Heretic-GGUF-UD:IQ4_XS

I use the Youssofal/MiniMax-M2.7-Abliterated-Heretic-GGUF model, but the Q4 quantizations provided in that repository don't fit my system configuration (128GB RAM + 16GB VRAM). So I took the imatrix from unsloth (https://huggingface.co/unsloth/MiniMax-M2.7-GGUF/tree/main) and used their weights for quantization. Thanks a lot!
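The memory constraint above can be checked with rough arithmetic. The following is a minimal sketch, not a measurement: it assumes a nominal rate of 4.25 bits per weight for IQ4_XS, treats "GB" as decimal gigabytes, and ignores file metadata, KV cache, and operating system overhead. A mixed-precision file will differ from this flat estimate, and the function names are made up for illustration.

```python
def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight size in decimal gigabytes (metadata ignored)."""
    return n_params * bits_per_weight / 8 / 1e9


def headroom_gb(n_params: float, bits_per_weight: float, budget_gb: float) -> float:
    """Memory left over after loading the weights; negative means it does not fit."""
    return budget_gb - gguf_size_gb(n_params, bits_per_weight)


N_PARAMS = 229e9        # parameter count listed on the model card
BUDGET_GB = 128 + 16    # RAM + VRAM from the text above
IQ4_XS_BPW = 4.25       # assumed nominal bits per weight for IQ4_XS

if __name__ == "__main__":
    size = gguf_size_gb(N_PARAMS, IQ4_XS_BPW)
    spare = headroom_gb(N_PARAMS, IQ4_XS_BPW, BUDGET_GB)
    print(f"estimated weights: {size:.1f} GB, headroom: {spare:.1f} GB")
```

Whatever headroom remains has to cover the context cache and everything else running on the machine, so a file that is only slightly larger can already be too big.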

The advantage of unsloth's quantization is that it encodes different tensors at different precisions depending on their importance, so the more important tensors lose less information. In the original (Youssofal) GGUF repository, all tensors are encoded at the same precision, which results in a significant loss of quality compared to unsloth's GGUF.
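To illustrate the idea of per-tensor precision, here is a small sketch that computes the average bits per weight of a quantization plan. The tensor groups, parameter counts, and bit widths are hypothetical numbers chosen for the example; they are not the actual layout of this model or of unsloth's recipe.

```python
from dataclasses import dataclass


@dataclass
class TensorGroup:
    name: str
    n_params: float         # number of weights in the group
    bits_per_weight: float  # precision assigned to the group


def average_bpw(groups: list[TensorGroup]) -> float:
    """Parameter-weighted average bits per weight across all groups."""
    total_bits = sum(g.n_params * g.bits_per_weight for g in groups)
    total_params = sum(g.n_params for g in groups)
    return total_bits / total_params


# Hypothetical mixed plan: a small "important" group keeps more bits,
# the bulk of the weights gets fewer.
mixed = [
    TensorGroup("important tensors (hypothetical)", 10e9, 6.0),
    TensorGroup("remaining tensors (hypothetical)", 90e9, 4.0),
]

# Hypothetical uniform plan with the same overall budget.
uniform = [TensorGroup("all tensors", 100e9, 4.2)]

if __name__ == "__main__":
    print(f"mixed:   {average_bpw(mixed):.2f} bits per weight")
    print(f"uniform: {average_bpw(uniform):.2f} bits per weight")
```

Both plans come out at the same average, so the files would be about the same size. The difference is where the bits go: the mixed plan spends more of them on the tensors judged important and fewer on the rest.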

So I tried to combine the best of both worlds.

Format: GGUF
Model size: 229B params
Architecture: minimax-m2
Quantization: 4-bit