---
base_model:
- Youssofal/MiniMax-M2.7-Abliterated-Heretic-GGUF
library_name: gguf
pipeline_tag: text-generation
license: other
license_name: non-commercial
license_link: https://github.com/MiniMax-AI/MiniMax-M2.7/blob/main/LICENSE
tags:
- gguf
- minimax
- minimax_m2
- moe
- mixture-of-experts
- abliterated
- uncensored
- heretic
- ara
- llama-cpp
quantized_by: GreyManul
---

I'm using the Youssofal/MiniMax-M2.7-Abliterated-Heretic-GGUF model, but the Q4 quantizations provided in that repository don't fit my system configuration (128 GB RAM + 16 GB VRAM). So I took the imatrix from unsloth (https://huggingface.co/unsloth/MiniMax-M2.7-GGUF/tree/main) and used their weights for quantization. Thanks a lot!

The advantage of unsloth's quantization is that it encodes different tensors at different precisions depending on their importance, so the more important tensors are encoded with less loss. In the original (Youssofal) GGUF repository, all tensors are encoded at the same precision, which results in a significant loss of quality compared to unsloth's GGUF. So I tried to combine the best of both worlds.
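
To make the per-tensor precision idea concrete, here is a minimal sketch that compares the average bits per weight of a uniform layout with a mixed one. The tensor names, element counts, and bit widths are invented for illustration only; they are not taken from this model, from the Youssofal repository, or from unsloth's actual recipe.

```python
def average_bits_per_weight(tensors):
    """Return the size-weighted average bits per weight.

    `tensors` maps a tensor name to (element_count, bits_per_weight).
    """
    total_elements = sum(count for count, _ in tensors.values())
    total_bits = sum(count * bits for count, bits in tensors.values())
    return total_bits / total_elements


# Hypothetical layout: every tensor stored at the same precision.
uniform = {
    "attention": (1_000_000, 4.0),
    "experts": (9_000_000, 4.0),
}

# Hypothetical layout: the small "important" tensor gets more bits,
# the large bulk tensor gets fewer.
mixed = {
    "attention": (1_000_000, 6.0),
    "experts": (9_000_000, 3.5),
}

print(average_bits_per_weight(uniform))  # 4.0
print(average_bits_per_weight(mixed))    # 3.75
```

In this toy layout the mixed scheme ends up smaller overall even though the small tensor is stored at higher precision, which is the kind of trade-off described above. Real quantization formats and tensor sizes will give different numbers.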