metadata
base_model:
  - Youssofal/MiniMax-M2.7-Abliterated-Heretic-GGUF
library_name: gguf
pipeline_tag: text-generation
license: other
license_name: non-commercial
license_link: https://github.com/MiniMax-AI/MiniMax-M2.7/blob/main/LICENSE
tags:
  - gguf
  - minimax
  - minimax_m2
  - moe
  - mixture-of-experts
  - abliterated
  - uncensored
  - heretic
  - ara
  - llama-cpp
quantized_by: GreyManul

I use the Youssofal/MiniMax-M2.7-Abliterated-Heretic-GGUF model, but the Q4 quantizations provided in that repository don't fit my system configuration (128 GB RAM + 16 GB VRAM). So I took the imatrix from unsloth (https://huggingface.co/unsloth/MiniMax-M2.7-GGUF/tree/main) and used their weights for quantization. Thanks a lot!

The advantage of unsloth's quantization is that it encodes tensors at different precisions depending on their importance, so the more important tensors lose less information. In the original (Youssofal) GGUF repository, all tensors are encoded at the same precision, which results in a significant loss of quality compared to unsloth's GGUF.
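To make the trade-off concrete, here is a minimal Python sketch of how a mixed-precision layout compares with a uniform one in average bits per weight and file size. Everything in it is an assumption made for illustration: the `TensorGroup` class, the group names, the weight counts, and the bits-per-weight values are hypothetical and do not describe the actual tensors of this model or of unsloth's quantizations.

```python
from dataclasses import dataclass


@dataclass
class TensorGroup:
    name: str               # hypothetical label for a group of tensors
    n_weights: int          # number of weights in the group
    bits_per_weight: float  # average bits spent on each weight


def total_bits(groups):
    """Total number of bits needed to store all groups."""
    return sum(g.n_weights * g.bits_per_weight for g in groups)


def average_bpw(groups):
    """Size-weighted average bits per weight across all groups."""
    return total_bits(groups) / sum(g.n_weights for g in groups)


def size_gib(groups):
    """Approximate size of the quantized weights in GiB."""
    return total_bits(groups) / 8 / 2**30


def fits(groups, ram_gib, vram_gib):
    """Rough check: do the weights alone fit in RAM + VRAM?

    Ignores the KV cache, compute buffers, and OS overhead,
    so a real setup needs some headroom on top of this.
    """
    return size_gib(groups) <= ram_gib + vram_gib


if __name__ == "__main__":
    # Illustrative numbers only, not measured from any real GGUF file.
    uniform = [
        TensorGroup("attention", 10_000_000_000, 4.5),
        TensorGroup("experts", 190_000_000_000, 4.5),
    ]
    mixed = [
        TensorGroup("attention", 10_000_000_000, 6.5),  # more important: more bits
        TensorGroup("experts", 190_000_000_000, 3.5),   # less important: fewer bits
    ]
    for label, groups in (("uniform", uniform), ("mixed", mixed)):
        print(label, round(average_bpw(groups), 2), "bpw,",
              round(size_gib(groups), 1), "GiB,",
              "fits" if fits(groups, ram_gib=128, vram_gib=16) else "does not fit")
```

The point of the sketch is that a small group of important tensors can be kept at higher precision while the bulk of the weights is stored more compactly, so the overall size can shrink even though the sensitive tensors lose less.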

So I tried to combine the best of both worlds.