Melvin56/kanana-nano-2.1b-instruct-abliterated-GGUF

Original model: huihui-ai/kanana-nano-2.1b-instruct-abliterated

All quants were made using an importance matrix (imatrix) dataset.

| Model | Size (GB) |
|---|---|
| Q2_K_S | 0.914 |
| Q2_K | 0.931 |
| Q3_K_M | 1.138 |
| Q4_K_M | 1.385 |
| Q5_K_M | 1.568 |
| Q6_K | 1.826 |
| Q8_0 | 2.223 |
| F16 | 4.177 |
| F32 | 8.342 |
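
The sizes above can be turned into a rough average bits-per-weight figure for each file. The following is a minimal sketch, not part of the original card. It assumes the F32 file stores 4 bytes per weight with negligible metadata and that "GB" means 10^9 bytes; the names `SIZES_GB`, `estimate_param_count`, and `bits_per_weight` are hypothetical helpers introduced for illustration.

```python
# Sizes in GB, copied from the size table in this card.
SIZES_GB = {
    "Q2_K_S": 0.914,
    "Q2_K": 0.931,
    "Q3_K_M": 1.138,
    "Q4_K_M": 1.385,
    "Q5_K_M": 1.568,
    "Q6_K": 1.826,
    "Q8_0": 2.223,
    "F16": 4.177,
    "F32": 8.342,
}

BYTES_PER_GB = 1e9  # assumption: decimal gigabytes


def estimate_param_count(sizes_gb=SIZES_GB):
    """Estimate the number of weights from the F32 file size.

    Assumes 4 bytes per weight and ignores file metadata.
    """
    return sizes_gb["F32"] * BYTES_PER_GB / 4


def bits_per_weight(quant, sizes_gb=SIZES_GB):
    """Average bits per weight of a file, relative to the F32 file."""
    n_params = estimate_param_count(sizes_gb)
    return sizes_gb[quant] * BYTES_PER_GB * 8 / n_params


if __name__ == "__main__":
    print(f"estimated parameters: {estimate_param_count() / 1e9:.2f}B")
    for name in SIZES_GB:
        print(f"{name:7s} {bits_per_weight(name):5.2f} bits/weight")
```

The bits-per-weight ratio does not depend on the choice of unit, since it is computed relative to the F32 file; only the parameter estimate does. The result is an average over the whole file, so it should be read as an approximation rather than the nominal bit width of a quantization type.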

| | CPU (AVX2) | CPU (ARM NEON) | Metal | cuBLAS | rocBLAS | SYCL | CLBlast | Vulkan | Kompute |
|---|---|---|---|---|---|---|---|---|---|
| K-quants | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ 🐢⁵ | ✅ 🐢⁵ | ❌ |
| I-quants | ✅ 🐢⁴ | ✅ 🐢⁴ | ✅ 🐢⁴ | ✅ | ✅ | Partial¹ | ❌ | ❌ | ❌ |

✅: feature works
🚫: feature does not work
❓: unknown, please contribute if you can test it yourself
🐢: feature is slow
¹: IQ3_S and IQ1_S, see #5886
²: Only with -ngl 0
³: Inference is 50% slower
⁴: Slower than K-quants of comparable size
⁵: Slower than cuBLAS/rocBLAS on similar cards
⁶: Only q8_0 and iq4_nl
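
The compatibility matrix and legend above can be encoded as a small lookup table when choosing a quant family for a given backend. This is only an illustrative sketch transcribed from the matrix in this card. The names `BACKENDS`, `SUPPORT`, `support`, and `usable_families` are hypothetical and do not come from llama.cpp or any other library.

```python
# Backend columns, in the order they appear in the matrix.
BACKENDS = [
    "CPU (AVX2)", "CPU (ARM NEON)", "Metal", "cuBLAS", "rocBLAS",
    "SYCL", "CLBlast", "Vulkan", "Kompute",
]

# One status per backend, transcribed from the matrix:
#   "yes"     = works
#   "slow"    = works, but flagged with the turtle marker
#   "partial" = only some types work (see footnote 1)
#   "no"      = does not work
SUPPORT = {
    "K-quants": ["yes", "yes", "yes", "yes", "yes", "yes",
                 "slow", "slow", "no"],
    "I-quants": ["slow", "slow", "slow", "yes", "yes", "partial",
                 "no", "no", "no"],
}


def support(family, backend):
    """Return the support status of a quant family on a backend."""
    return SUPPORT[family][BACKENDS.index(backend)]


def usable_families(backend, allow_slow=True):
    """List quant families that fully work on the given backend.

    "partial" entries are excluded because only some types work.
    """
    accepted = {"yes", "slow"} if allow_slow else {"yes"}
    return [fam for fam in SUPPORT if support(fam, backend) in accepted]


if __name__ == "__main__":
    for backend in BACKENDS:
        print(f"{backend:15s} {usable_families(backend)}")
```

For example, `usable_families("Metal", allow_slow=False)` keeps only K-quants, because the matrix marks I-quants on Metal as slower than K-quants of comparable size.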
Format: GGUF
Model size: 2B params
Architecture: llama