Request: Qwen3.6-35B-A3B quantization (2.5bpw)
Hi UnstableLlama,
I really appreciate your high-quality EXL3 conversions. If you have some spare compute, could you please consider converting the new "Qwen3.6-35B-A3B" model to the exl3 format at 2.5bpw?
This specific bitrate would be very helpful for those of us with limited VRAM. Thank you so much for your great work!
For sure! Thanks for the tip, I'm always happy to see which quants people actually want. I'll try to get that done tonight and uploaded tomorrow, and I'll let you know here when it's up. If you want, you can join the exllama Discord for more quant requests or general help.
Hi,
I'd like to ask if there's a lower bound below which the model is considered too "dumb": for example a delta PPL above ±0.2, some delta KLD value (??), a Same Top % below 93%, or whatever?
The use case would be all-day long-context coding, plus text and number understanding and such.
Hey ghit, I don’t have hard numbers, but Turboderp has said that 0.05 and below is considered “imperceptible” and he thinks anything 0.2 and below is good enough to use.
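For anyone who wants to measure these numbers themselves, here is a rough sketch of how mean KLD and a top-1 agreement rate could be computed from two sets of logits. It is pure Python for illustration only, not how exllamav3's own eval scripts do it. The function names are made up, and it assumes "Same Top %" means the fraction of positions where the quant and the reference pick the same most-likely token.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one position's logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; q is floored at eps to avoid division by zero."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def compare_logits(ref_logits, quant_logits):
    """Return (mean KLD, top-1 agreement) over all token positions.

    ref_logits / quant_logits: lists of per-position logit lists
    (hypothetical inputs, e.g. dumped from the unquantized and quantized model).
    """
    klds, same_top = [], 0
    for ref, quant in zip(ref_logits, quant_logits):
        klds.append(kl_divergence(softmax(ref), softmax(quant)))
        ref_top = max(range(len(ref)), key=ref.__getitem__)
        quant_top = max(range(len(quant)), key=quant.__getitem__)
        same_top += ref_top == quant_top
    n = len(klds)
    return sum(klds) / n, same_top / n

# Toy example: two positions, vocabulary of three tokens.
ref = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
quant = [[1.9, 1.1, 0.1], [0.4, 0.6, 2.4]]  # second position flips its top token
mean_kld, agreement = compare_logits(ref, quant)
print(f"mean KLD: {mean_kld:.4f}, top-1 agreement: {agreement:.0%}")
```

In practice you would run this over a few thousand tokens of text close to your actual workload (code, in this case), since the averages can differ a lot between domains.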
Also, I'm uploading a 2.49bpw quant right now; it should be done within a couple of hours.
This is a 2.46bpw quant made with the -hq argument for extra MoE precision, which brings the total up to 2.49bpw (plus head).
https://huggingface.co/UnstableLlama/Qwen3.6-35B-A3B-exl3-2.49bpw
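For a ballpark of what a given bitrate means in VRAM, the weight size is roughly parameters × bpw / 8 bytes. A quick sketch, assuming a nominal 35B parameters (taken from the model name) and ignoring the output head, KV cache, and runtime overhead, all of which add to the real footprint:

```python
def weight_size_gb(n_params: float, bpw: float) -> float:
    """Approximate weight size in decimal GB: params * bits-per-weight / 8 bits per byte."""
    return n_params * bpw / 8 / 1e9

N_PARAMS = 35e9  # assumption: nominal count from the "35B" in the model name

for bpw in (2.49, 3.0, 4.0):
    print(f"{bpw:.2f} bpw -> ~{weight_size_gb(N_PARAMS, bpw):.1f} GB of weights")
```

So the 2.49bpw quant should come to a bit under 11 GB of weights before the head and context are counted.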