kyaky
NVFP4 self-quant (llm-compressor): FP8 attn/GDN + NVFP4-W4A16 experts; beats redhat/unsloth on quality+speed+size
894cdfa verified