Nemotron-Cascade-2-30B-A3B-oQ3.5

This model was quantized using oQ mixed-precision quantization.

Quantization details

  • Model type: nemotron_h
  • Bits: 3
  • Group size: 64
  • Format: MLX safetensors
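
To make the Bits and Group size settings concrete, here is a minimal sketch of generic group-wise affine quantization in pure Python. It is not the oQ algorithm, which this card does not describe. The helper names (`quantize_group`, `dequantize_group`, `effective_bits_per_weight`) are hypothetical, and the 16-bit scale plus 16-bit offset per group is an assumption about storage, not something stated above.

```python
def quantize_group(values, bits=3):
    """Affine-quantize one group of floats to `bits`-bit integer codes."""
    levels = (1 << bits) - 1              # 3 bits -> codes 0..7
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize_group(codes, scale, lo):
    """Map integer codes back to approximate float values."""
    return [lo + c * scale for c in codes]


def effective_bits_per_weight(bits=3, group_size=64, metadata_bits=32):
    # Assumption: every group stores a 16-bit scale and a 16-bit offset.
    return bits + metadata_bits / group_size


# One group of 64 evenly spaced example weights (illustrative data only).
weights = [i / 63 - 0.5 for i in range(64)]
codes, scale, lo = quantize_group(weights, bits=3)
restored = dequantize_group(codes, scale, lo)
worst_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Under that storage assumption, 3-bit codes with a group size of 64 cost 3 + 32/64 = 3.5 bits per weight, and the reconstruction error within a group is bounded by half the group's scale.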

Benchmark

| Model | File size | MMLU | JMMLU | HELLASWAG | ARC_CHALLENGE | GSM8K |
| --- | --- | --- | --- | --- | --- | --- |
| Nemotron-Cascade-2-30B-A3B-mlx-mxfp4 | 15.65 GB | 45.7% | 33.7% | 28.3% | 72.3% | 82.3% |
| Nemotron-Cascade-2-30B-A3B-oQ3.5 | 13.32 GB | 65.7% | 61.0% | 76.0% | 85.7% | 89.7% |
| Nemotron-Cascade-2-30B-A3B-oQ4 | 16.91 GB | 68.3% | 61.7% | - | - | - |

Detail

| Model | Benchmark | Accuracy | Correct | Total | Time (s) |
| --- | --- | --- | --- | --- | --- |
| Nemotron-Cascade-2-30B-A3B-mlx-mxfp4 | MMLU | 45.7% | 137 | 300 | 946.7 |
| Nemotron-Cascade-2-30B-A3B-mlx-mxfp4 | JMMLU | 33.7% | 101 | 300 | 558.5 |
| Nemotron-Cascade-2-30B-A3B-mlx-mxfp4 | HELLASWAG | 28.3% | 85 | 300 | 647 |
| Nemotron-Cascade-2-30B-A3B-mlx-mxfp4 | GSM8K | 82.3% | 247 | 300 | 1781.6 |
| Nemotron-Cascade-2-30B-A3B-mlx-mxfp4 | ARC_CHALLENGE | 72.3% | 217 | 300 | 462.5 |
| Nemotron-Cascade-2-30B-A3B-oQ3.5 | MMLU | 65.7% | 197 | 300 | 696 |
| Nemotron-Cascade-2-30B-A3B-oQ3.5 | JMMLU | 61.0% | 183 | 300 | 294.7 |
| Nemotron-Cascade-2-30B-A3B-oQ3.5 | HELLASWAG | 76.0% | 228 | 300 | 314.4 |
| Nemotron-Cascade-2-30B-A3B-oQ3.5 | GSM8K | 89.7% | 269 | 300 | 992.6 |
| Nemotron-Cascade-2-30B-A3B-oQ3.5 | ARC_CHALLENGE | 85.7% | 257 | 300 | 204.5 |
| Nemotron-Cascade-2-30B-A3B-oQ4 | MMLU | 68.3% | 205 | 300 | 572.7 |
| Nemotron-Cascade-2-30B-A3B-oQ4 | JMMLU | 61.7% | 185 | 300 | 239.2 |
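
The Accuracy column can be recomputed from the Correct and Total columns above. The following sketch does that and also attaches a rough 95% margin of error. The margin uses a normal approximation and assumes the 300 questions behave like an independent random sample, which this card does not state; the function names are illustrative.

```python
import math

# (model, benchmark, correct, total) copied from the Detail table.
ROWS = [
    ("Nemotron-Cascade-2-30B-A3B-mlx-mxfp4", "MMLU", 137, 300),
    ("Nemotron-Cascade-2-30B-A3B-mlx-mxfp4", "JMMLU", 101, 300),
    ("Nemotron-Cascade-2-30B-A3B-mlx-mxfp4", "HELLASWAG", 85, 300),
    ("Nemotron-Cascade-2-30B-A3B-mlx-mxfp4", "GSM8K", 247, 300),
    ("Nemotron-Cascade-2-30B-A3B-mlx-mxfp4", "ARC_CHALLENGE", 217, 300),
    ("Nemotron-Cascade-2-30B-A3B-oQ3.5", "MMLU", 197, 300),
    ("Nemotron-Cascade-2-30B-A3B-oQ3.5", "JMMLU", 183, 300),
    ("Nemotron-Cascade-2-30B-A3B-oQ3.5", "HELLASWAG", 228, 300),
    ("Nemotron-Cascade-2-30B-A3B-oQ3.5", "GSM8K", 269, 300),
    ("Nemotron-Cascade-2-30B-A3B-oQ3.5", "ARC_CHALLENGE", 257, 300),
    ("Nemotron-Cascade-2-30B-A3B-oQ4", "MMLU", 205, 300),
    ("Nemotron-Cascade-2-30B-A3B-oQ4", "JMMLU", 185, 300),
]


def accuracy_percent(correct, total):
    """Accuracy as a percentage, rounded to one decimal like the table."""
    return round(100 * correct / total, 1)


def margin_95_points(correct, total):
    """Approximate 95% margin of error in percentage points.

    Assumption: normal approximation to the binomial distribution.
    """
    p = correct / total
    return round(100 * 1.96 * math.sqrt(p * (1 - p) / total), 1)


for model, bench, correct, total in ROWS:
    acc = accuracy_percent(correct, total)
    margin = margin_95_points(correct, total)
    print(f"{model} {bench}: {acc}% +/- {margin} points")
```

Under this approximation, an accuracy near 65% measured on 300 questions carries a margin of roughly five percentage points, so gaps of two or three points between rows should be read with caution.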