GGUF
imatrix
How to use from
Lemonade
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull coughmedicine/NVIDIA-Nemotron-3-Super-120B-A12B-Base-GGUF
Run and chat with the model
lemonade run user.NVIDIA-Nemotron-3-Super-120B-A12B-Base-GGUF-{{QUANT_TAG}}
List all available models
lemonade list

The model follows the instruct template, but it does not work with chat completions. This is purely experimental, so your mileage may vary.
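Since chat completions are reported not to work, one option is to format the prompt yourself and send it as a plain text completion. The following is a minimal sketch, not a tested recipe for this model. It assumes the model is served behind a local OpenAI-compatible server that offers a `/completions` route. The base URL, the helper names `build_completion_request` and `send_completion`, and the `max_tokens` default are all illustrative assumptions. The instruct template itself is not given here, so the prompt is left for the caller to format.

```python
import json
import urllib.request

# Assumption: a local OpenAI-compatible server. This URL is a placeholder;
# replace it with the address your server actually listens on.
BASE_URL = "http://localhost:8000/api/v1"


def build_completion_request(model: str, prompt: str, max_tokens: int = 128) -> dict:
    """Build a plain text-completion payload.

    The payload carries a single "prompt" string rather than a "messages"
    list, so the server does not apply a chat template. The caller is
    responsible for formatting the prompt with the instruct template.
    """
    return {"model": model, "prompt": prompt, "max_tokens": max_tokens}


def send_completion(payload: dict, base_url: str = BASE_URL) -> str:
    """POST the payload to the (assumed) /completions route and return the text."""
    req = urllib.request.Request(
        f"{base_url}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-style completion responses put the generated text here.
    return body["choices"][0]["text"]
```

To try it, pass the model name shown by `lemonade list` and a prompt you have already formatted with the instruct template to `build_completion_request`, then hand the result to `send_completion` while the server is running.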

Downloads last month
17
GGUF
Model size
121B params
Architecture
nemotron_h_moe

Model tree for coughmedicine/NVIDIA-Nemotron-3-Super-120B-A12B-Base-GGUF

Quantized
(4)
this model