Gemma-4 QAT Special Unsloth quants

#1
by danielhanchen - opened
Unsloth AI org

Hey folks! We converted the Gemma-4 QAT quants differently, since a direct llama.cpp Q4_0 conversion from the BF16 QAT checkpoint loses accuracy.

E2B, for example, has a mean KLD of 0.00173 versus 0.05109 for a naive Q4_0 quantization (roughly 29x lower), and ours is even 22% smaller!

See https://unsloth.ai/docs/models/gemma-4/qat#qat-analysis
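For anyone wondering what the KLD numbers mean, here is a rough sketch, assuming "mean KLD" is the per-token KL divergence between the BF16 reference model's next-token distribution and the quantized model's, averaged over token positions. The post doesn't say exactly how it was measured, so treat `mean_kld` and `log_softmax` as hypothetical helpers for illustration, not the actual evaluation code. The last lines just check the ratio between the two quoted figures.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one position's logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def mean_kld(ref_logits, quant_logits):
    """Mean KL(P_ref || P_quant) over token positions.

    ref_logits / quant_logits: one list of vocabulary logits per position.
    Assumption: this matches what the post calls "mean KLD".
    """
    total = 0.0
    for ref, quant in zip(ref_logits, quant_logits):
        lp = log_softmax(ref)
        lq = log_softmax(quant)
        # KL(P || Q) = sum_i P_i * (log P_i - log Q_i)
        total += sum(math.exp(a) * (a - b) for a, b in zip(lp, lq))
    return total / len(ref_logits)

# Ratio of the two figures quoted in the post (naive Q4_0 vs. this quant).
naive_kld = 0.05109
unsloth_kld = 0.00173
ratio = naive_kld / unsloth_kld
print(f"naive / unsloth mean KLD: {ratio:.1f}x")
```

Lower is better: identical distributions give a KLD of exactly 0, so a smaller mean KLD means the quant's predictions stay closer to the BF16 model's.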


Does it support MTP? Please provide more examples.

Use the mmproj-F32.gguf from this model's repo. It worked for me with llama.cpp, including image input.
