Does MLX support smaller quantizations?

#1
by robbintt - opened

Does MLX support smaller quantizations?

I am using https://huggingface.co/mradermacher/Qwen3.6-28B-REAP-i1-GGUF?show_file_info=Qwen3.6-28B-REAP.i1-IQ3_XXS.gguf, which is 11.2 GB, on a 16 GB Mac mini. It's working great, but I would like to check out an MLX quant if possible.

Yes, it does. I’m working on JANG-style compression tweaks for this model. I tested them on a Gemma REAP ( https://huggingface.co/stamsam/Gemma-4-21B-REAP-JANG3M-MLX ) to make it smaller. Take a peek at it. Let me know if it works on your Mac while I work on this one.

Thank you, that will definitely fit. I will test it while I wait for that one. I think about 11.3 GB is solid; even smaller is better so I can still use a browser and terminal. But it's hard to know what's possible and what the quality tradeoffs are.
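
For a rough sense of what might fit, a back-of-the-envelope estimate is parameters × bits per weight ÷ 8. Here is a minimal sketch, assuming 28B parameters (read off the model name) and a few illustrative bit widths. The function name is hypothetical, and real files come out somewhat different because quants mix precisions across layers and carry metadata; the runtime also needs extra memory for context.

```python
def estimate_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes), ignoring overhead."""
    return n_params * bits_per_weight / 8 / 1e9

# Assumption: parameter count taken from the "28B" in the model name.
N_PARAMS = 28e9

# Illustrative average bit widths, not the exact values of any named quant.
for bits in (2, 3, 4):
    print(f"{bits}-bit: ~{estimate_size_gb(N_PARAMS, bits):.1f} GB")
```

By this estimate, an average of roughly 3 bits per weight lands near the 11 GB range on a 28B model, and 4 bits would leave very little headroom on a 16 GB machine.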

The quality of the Gemma REAP I recommended above should be way better than the Q1 you’re running, and I’ll let you know when I finish the Qwen JANG REAP. Do you use X? If so, I’ll message you there. My profile is @stamatiou, just DM.

Thanks, followed you there, got your followback.
