How to use from
Hermes Agent
Start the MLX server
# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "RepublicOfKorokke/Nemotron-Cascade-2-30B-A3B-oQ3.5"
Configure Hermes
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default RepublicOfKorokke/Nemotron-Cascade-2-30B-A3B-oQ3.5
Run Hermes
hermes
Quick Links

Nemotron-Cascade-2-30B-A3B-oQ3.5

This model was quantized using oQ mixed-precision quantization.

Quantization details

  • Model type: nemotron_h
  • Bits: 3
  • Group size: 64
  • Format: MLX safetensors

Benchmark

Model File size MMLU JMMLU HELLASWAG ARC_CHALLENGE GSM8K
Nemotron-Cascade-2-30B-A3B-mlx-mxfp4 15.65 GB 45.7% 33.7% 28.3% 72.3% 82.3%
Nemotron-Cascade-2-30B-A3B-oQ3.5 13.32 GB 65.7% 61.0% 76.0% 85.7% 89.7%
Nemotron-Cascade-2-30B-A3B-oQ4 16.91 GB 68.3% 61.7% - - -

Detail

Model Benchmark Accuracy Correct Total Time(s)
Nemotron-Cascade-2-30B-A3B-mlx-mxfp4 MMLU 45.7% 137 300 946.7
Nemotron-Cascade-2-30B-A3B-mlx-mxfp4 JMMLU 33.7% 101 300 558.5
Nemotron-Cascade-2-30B-A3B-mlx-mxfp4 HELLASWAG 28.3% 85 300 647
Nemotron-Cascade-2-30B-A3B-mlx-mxfp4 GSM8K 82.3% 247 300 1781.6
Nemotron-Cascade-2-30B-A3B-mlx-mxfp4 ARC_CHALLENGE 72.3% 217 300 462.5
Nemotron-Cascade-2-30B-A3B-oQ3.5 MMLU 65.7% 197 300 696
Nemotron-Cascade-2-30B-A3B-oQ3.5 JMMLU 61.0% 183 300 294.7
Nemotron-Cascade-2-30B-A3B-oQ3.5 HELLASWAG 76.0% 228 300 314.4
Nemotron-Cascade-2-30B-A3B-oQ3.5 GSM8K 89.7% 269 300 992.6
Nemotron-Cascade-2-30B-A3B-oQ3.5 ARC_CHALLENGE 85.7% 257 300 204.5
Nemotron-Cascade-2-30B-A3B-oQ4 MMLU 68.3% 205 300 572.7
Nemotron-Cascade-2-30B-A3B-oQ4 JMMLU 61.7% 185 300 239.2
Downloads last month
25
Safetensors
Model size
4B params
Tensor type
BF16
·
U32
·
F32
·
U8
·
MLX
Hardware compatibility
Log In to add your hardware

3-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for RepublicOfKorokke/Nemotron-Cascade-2-30B-A3B-oQ3.5

Quantized
(30)
this model