Coherence issue

#1
by jesterb32 - opened

First off, thanks for these!

Just pulled Q6_K today and ran it with the latest llama.cpp release. It seems to be generating repeated \ and that's it. I confirmed that llama.cpp itself is working. Anyone else seeing this?

be generating repeated \ and that's it

Same for me. I used Q4_K_M instead.

Hi, thanks for reporting, I'm investigating.
Edit: After running all the models under llama.cpp, I found a quantization bug in schemes that use sym=False, data_type=int_asym_dq with super_bits >= 6. So 3 quants (Q4_K_M, Q5_K_M, Q6_K) need to be re-quantized. I'll update the repo shortly. Sorry for the inconvenience.

This seems like an upstream issue with how AutoRound interacts with MTP. I'm filing an issue with them now. Meanwhile, please use one of the other, unaffected schemes instead.

Does this issue occur with the 27B AutoRound models as well?

This issue only affects MoE models, so the dense 27B is not affected!

Upstream is fixed with #1908 and I'm currently rerunning the quantization. Hold tight!

@sphaela Thanks mate, the fix works now!

This is fixed now :3

Think you can make a GGUF format of this https://huggingface.co/webhie/Qwen3.6-27B-int4-AutoRound-Code ?

I could do that, but I'd basically need to fine-tune from scratch. Working from their model means dequantizing back to F16 first and then requantizing to GGUF, which would introduce a lot of loss. If I do it, I'd start with a better dataset, maybe the Opus dataset? Or do you just want a code-specific variant?
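
For anyone curious why going through an already-quantized checkpoint tends to cost accuracy, here is a rough toy sketch in plain Python. It is not AutoRound or llama.cpp code, and it does not reproduce the real GGUF K-quant layout: the helper names, the 4-bit width, and the block sizes of 128 and 32 are all assumptions picked only for illustration. It compares quantizing the original weights directly against quantizing, dequantizing, and then requantizing onto a different block grid.

```python
import random


def quantize_dequantize(values, bits, block_size):
    """Toy asymmetric round-to-nearest quantization per block.

    Returns the dequantized floats, i.e. what the weights look like
    after a quantize -> dequantize round trip. (Hypothetical helper,
    not the scheme used by any real library.)
    """
    levels = (1 << bits) - 1
    out = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        lo, hi = min(block), max(block)
        scale = (hi - lo) / levels if hi > lo else 1.0
        for v in block:
            q = round((v - lo) / scale)   # integer code in [0, levels]
            out.append(lo + q * scale)    # dequantized value
    return out


def rms_error(a, b):
    """Root-mean-square difference between two equal-length lists."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5


rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(4096)]  # stand-in for F16 weights

# Path A: quantize the original weights straight to the target grid.
direct = quantize_dequantize(weights, bits=4, block_size=32)

# Path B: start from an existing 4-bit checkpoint with a different block
# size, dequantize it, then requantize to the target grid.
first_pass = quantize_dequantize(weights, bits=4, block_size=128)
requantized = quantize_dequantize(first_pass, bits=4, block_size=32)

err_direct = rms_error(weights, direct)
err_requant = rms_error(weights, requantized)
print(f"direct: {err_direct:.4f}  requantized: {err_requant:.4f}")
```

In this toy setup the second path carries the rounding error of the first quantization plus whatever the second grid adds, which is the kind of loss described above. Real schemes with calibration behave differently in detail, so treat the numbers as illustrative only.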

Thank you @sphaela !
