Do these quants suffer the same issue as the Unsloth quants?

#2
by JJ404GO - opened

Do these quants suffer the same issue as the Unsloth quants?

https://huggingface.co/unsloth/Mistral-Medium-3.5-128B-GGUF

"We’re working with Mistral on llama.cpp GGUF implementation. Testing shows that this behavior occurs regardless of who or how the model was converted GGUF. The model initially responds correctly, but over long context, does not work properly.
Mistral has now labeled GGUF support as a WIP (work in progress). The issue appears most likely to be with the current GGUF parser. Will update once resolved."

I'm assuming so. It's interesting that there's nothing about it on the llama.cpp GitHub, as far as I can tell.

We worked with Mistral and fixed our GGUFs. Sadly, it was a nasty YaRN parsing error that propagated to llama.cpp and transformers. See https://huggingface.co/mistralai/Mistral-Medium-3.5-128B/discussions/18

Hey, sorry, I've been out of commission for a few days. I'll be fixing this today.

bartowski changed discussion status to closed
