Kimi K2.6 DFlash Baseinit GGUF

This repository contains base-initialized DFlash draft GGUF files for Kimi K2.6.

These are not the gated Kimi K2.6 DFlash weights trained by z-lab. They were constructed from a local moonshotai/Kimi-K2.6 checkpoint by initializing a DFlash-shaped draft model with Kimi K2.6-compatible dimensions and components taken from the base checkpoint.

Files

File                                  Quantization  Approx. size
Kimi-K2.6-DFlash-baseinit-F32.gguf    F32           13 GB
Kimi-K2.6-DFlash-baseinit-Q8_0.gguf   Q8_0          3.5 GB
Kimi-K2.6-DFlash-baseinit-Q5_0.gguf   Q5_0          2.3 GB
Kimi-K2.6-DFlash-baseinit-Q4_0.gguf   Q4_0          1.9 GB
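
As a rough sanity check on the sizes listed above, the sketch below estimates each file size from a parameter count and the standard GGML block layouts (32 weights per block: 34 bytes for Q8_0, 22 for Q5_0, 18 for Q4_0). The parameter count is an assumption inferred from the 13 GB F32 file at 4 bytes per weight; it is not stated in the GGUF metadata listed here. Real files can differ somewhat because some tensors may be left unquantized and the header adds overhead.

```python
# Bytes used per block of 32 weights in each GGML format.
BLOCK_SIZE = 32
BYTES_PER_BLOCK = {
    "F32": 4 * BLOCK_SIZE,  # 32 float32 values
    "Q8_0": 34,             # fp16 scale + 32 int8 values
    "Q5_0": 22,             # fp16 scale + 4 bytes of high bits + 16 bytes of nibbles
    "Q4_0": 18,             # fp16 scale + 16 bytes of nibbles
}


def estimated_size_gb(n_params, quant):
    """Estimate file size in decimal GB, assuming every weight is quantized."""
    return n_params * BYTES_PER_BLOCK[quant] / BLOCK_SIZE / 1e9


# Assumption: parameter count inferred from the 13 GB F32 file.
N_PARAMS = 13e9 / 4

if __name__ == "__main__":
    for quant in ("F32", "Q8_0", "Q5_0", "Q4_0"):
        print(f"{quant:5s} ~{estimated_size_gb(N_PARAMS, quant):.2f} GB")
```

Under these assumptions the estimates land close to the approximate sizes in the table, which suggests the listed files are internally consistent.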

Verification metadata

The GGUFs include DFlash/Kimi K2.6 metadata:

  • general.architecture = dflash-draft
  • general.name = Kimi-K2.6-DFlash-baseinit
  • dflash-draft.hidden_size = 7168
  • dflash-draft.num_hidden_layers = 6
  • dflash-draft.num_attention_heads = 64
  • dflash-draft.num_key_value_heads = 8
  • dflash-draft.vocab_size = 163840
  • dflash-draft.block_size = 8
  • dflash-draft.mask_token_id = 163838
  • dflash-draft.target_layer_ids = [1, 12, 24, 35, 47, 58]
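
The metadata above can be checked without any third-party tooling. The following is a minimal sketch that parses the key/value section of a GGUF header using only the Python standard library, assuming a little-endian GGUF file of version 2 or later. The helper names (`read_gguf_metadata`, `find_mismatches`) are hypothetical, and the stored value types in these particular files are an assumption; only the keys and values come from the list above.

```python
import struct

# GGUF scalar value type ids mapped to struct format codes (little-endian).
_SCALARS = {0: "B", 1: "b", 2: "H", 3: "h", 4: "I", 5: "i",
            6: "f", 7: "?", 10: "Q", 11: "q", 12: "d"}
_STRING, _ARRAY = 8, 9

# Expected values, copied from the metadata list in this README.
EXPECTED = {
    "general.architecture": "dflash-draft",
    "general.name": "Kimi-K2.6-DFlash-baseinit",
    "dflash-draft.hidden_size": 7168,
    "dflash-draft.num_hidden_layers": 6,
    "dflash-draft.num_attention_heads": 64,
    "dflash-draft.num_key_value_heads": 8,
    "dflash-draft.vocab_size": 163840,
    "dflash-draft.block_size": 8,
    "dflash-draft.mask_token_id": 163838,
    "dflash-draft.target_layer_ids": [1, 12, 24, 35, 47, 58],
}


def _read(f, fmt):
    """Read one little-endian scalar described by a struct format code."""
    size = struct.calcsize("<" + fmt)
    return struct.unpack("<" + fmt, f.read(size))[0]


def _read_string(f):
    """GGUF strings are a uint64 byte length followed by UTF-8 bytes."""
    length = _read(f, "Q")
    return f.read(length).decode("utf-8")


def _read_value(f, vtype):
    if vtype in _SCALARS:
        return _read(f, _SCALARS[vtype])
    if vtype == _STRING:
        return _read_string(f)
    if vtype == _ARRAY:
        elem_type = _read(f, "I")   # element type id
        count = _read(f, "Q")       # number of elements
        return [_read_value(f, elem_type) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {vtype}")


def read_gguf_metadata(path):
    """Return the header key/value pairs of a GGUF file as a dict."""
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        version = _read(f, "I")
        if version < 2:
            raise ValueError("GGUF v1 uses 32-bit counts; not handled here")
        _read(f, "Q")               # tensor count (unused)
        kv_count = _read(f, "Q")
        meta = {}
        for _ in range(kv_count):
            key = _read_string(f)
            meta[key] = _read_value(f, _read(f, "I"))
    return meta


def find_mismatches(meta, expected=EXPECTED):
    """Map each differing key to an (expected, actual) pair."""
    return {k: (v, meta.get(k)) for k, v in expected.items() if meta.get(k) != v}
```

Calling `find_mismatches(read_gguf_metadata("Kimi-K2.6-DFlash-baseinit-Q4_0.gguf"))` should return an empty dict if the file carries the metadata listed above. Large array values, if present, are read fully into Python lists, which is slow but adequate for a one-off check.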

Intended use

These files are intended for experimentation with Oxidize DFlash draft-model loading and benchmarking.

Example:

oxidize-bench --model Kimi-K2.6-DFlash-baseinit-Q4_0.gguf --engine dflash --draft-tokens 8 --iterations 100 --verbose

Important note

Because these files are base-initialized rather than trained, they should not be expected to match the acceptance rate or quality of a fully trained DFlash draft model. Treat them as a reproducible Kimi K2.6-compatible DFlash scaffold/experiment, not as a trained production drafter.
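
To illustrate why acceptance rate matters, the sketch below uses the standard simplified model from speculative decoding analysis: if each drafted token is accepted independently with probability alpha and the verifier stops at the first rejection, the expected number of tokens produced per verification step is (1 - alpha^(k+1)) / (1 - alpha) for k drafted tokens. The independence assumption is a simplification, and whether Oxidize or DFlash reports acceptance in exactly this way is not confirmed here; the function name is hypothetical.

```python
def expected_tokens_per_step(alpha, draft_tokens):
    """Expected tokens emitted per verification step.

    Assumes each drafted token is accepted independently with probability
    `alpha`, acceptance stops at the first rejection, and the target model
    always contributes one token of its own per step.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if alpha == 1.0:
        # Every draft token accepted, plus the target model's own token.
        return draft_tokens + 1.0
    return (1.0 - alpha ** (draft_tokens + 1)) / (1.0 - alpha)


if __name__ == "__main__":
    # 8 draft tokens, matching --draft-tokens 8 in the example command.
    for alpha in (0.0, 0.2, 0.5, 0.8):
        tokens = expected_tokens_per_step(alpha, 8)
        print(f"alpha={alpha:.1f}: ~{tokens:.2f} tokens per step")
```

Under this model, a drafter with near-zero acceptance yields about one token per step, meaning no gain over plain decoding while still paying the drafting cost. That is the regime an untrained, base-initialized drafter may fall into.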

Format: GGUF. Model size: 3B params. Architecture: dflash-draft.