๐Ÿšจ UPLOADS IN PROGRESS ๐Ÿšจ

Some files and metadata in this repository are still being uploaded and verified.

Kimi-K2.6-GGUF

This is a GGUF release of Moonshot AI's Kimi-K2.6.

The release preserves Kimi-K2.6's native multimodal architecture and is intended as the canonical llama.cpp-compatible GGUF ladder for the original model.

Quick Benchmarks

Check Original Kimi-K2.6 Kimi-K2.6 GGUF
Official 25-prompt refusal check Pending Pending
PPL / KLD reference drift Pending Pending

Methodology & Model Notes

Kimi-K2.6 is a large sparse-MoE vision-language model in the Kimi K2 family, exposed through the KimiK25ForConditionalGeneration wrapper with a DeepSeek V3-style text stack.

This release is built directly from moonshotai/Kimi-K2.6 and is intended to provide a clean original-model GGUF ladder without altering the base refusal behavior.

Quant Benchmarks

Quant Official 25-prompt refusal check Perplexity KL divergence
Q8_0 Pending Pending Pending
Q6_K Pending Pending Pending
Q4_K_M Pending Pending Pending
Q2_K Pending Pending Pending

Files

  • Kimi-K2.6-Q8_0/: highest-fidelity quant
  • Kimi-K2.6-Q6_K/: near-lossless practical quant
  • Kimi-K2.6-Q4_K_M/: smaller general-use quant
  • Kimi-K2.6-Q2_K/: lowest standard quant in this ladder
  • mmproj-Kimi-K2.6.gguf: matching multimodal projector file for llama.cpp vision use

Running

llama-server \
  -m <quant-file.gguf> \
  --mmproj <mmproj-file.gguf> \
  -ngl 999 -c 32768 --jinja -fa

Model Architecture

Spec Value
Architecture Wrapper KimiK25ForConditionalGeneration
Text Family DeepSeek V3-style sparse MoE
Text Layers 61
Hidden Size 7168
Experts 384 routed, 8 active per token
Modality Vision-language
Base Model moonshotai/Kimi-K2.6

Disclaimer

This is the original Kimi-K2.6 model converted to GGUF. It is not an abliterated release.

Credits

License

This release inherits the base Kimi-K2.6 license.

Modified MIT License.

Downloads last month
399
GGUF
Model size
1T params
Architecture
deepseek2
Hardware compatibility
Log In to add your hardware

16-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for Youssofal/Kimi-K2.6-GGUF

Quantized
(34)
this model