---
pipeline_tag: text-generation
base_model:
  - nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
---

This is an MXFP4_MOE imatrix quantization of the model NVIDIA-Nemotron-3-Nano-30B-A3B, based on the imatrix from unsloth.

As this is not yet supported in mainline llama.cpp, you'll need to compile llama.cpp with the following pull request applied in order to run it:
https://github.com/ggml-org/llama.cpp/pull/18058
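
A minimal sketch of one way to fetch and build that pull request, written as a small Python helper around `git` and `cmake`. It assumes both tools are installed and relies on GitHub's `pull/<number>/head` ref to fetch the PR. The local branch name `pr-18058`, the `llama.cpp` working directory, and the helper names are placeholders chosen for illustration. The build shown is a plain default CMake build; any backend-specific options are not covered here.

```python
import subprocess

REPO_URL = "https://github.com/ggml-org/llama.cpp"
PR_NUMBER = 18058  # the pull request linked above


def build_commands(pr_number=PR_NUMBER, workdir="llama.cpp"):
    """Return the steps as (argv, cwd) pairs; nothing is executed here."""
    branch = f"pr-{pr_number}"  # hypothetical local branch name
    return [
        # Clone mainline llama.cpp into `workdir` (assumed not to exist yet).
        (["git", "clone", REPO_URL, workdir], None),
        # GitHub exposes each pull request's head as refs/pull/<n>/head.
        (["git", "fetch", "origin", f"pull/{pr_number}/head:{branch}"], workdir),
        (["git", "checkout", branch], workdir),
        # Default CMake configure and build; add backend flags as needed.
        (["cmake", "-B", "build"], workdir),
        (["cmake", "--build", "build", "--config", "Release"], workdir),
    ]


def run(commands):
    """Execute each step in order, stopping at the first failure."""
    for argv, cwd in commands:
        subprocess.run(argv, cwd=cwd, check=True)
```

Calling `run(build_commands())` performs the clone, checkout, and build. The same five commands can also be typed directly into a shell.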