Kimi K2.6 DFlash Baseinit GGUF

This repository contains base-initialized DFlash draft GGUF files for Kimi K2.6.

These are not the gated Kimi K2.6 DFlash weights trained by z-lab. They were constructed from a local moonshotai/Kimi-K2.6 checkpoint: a DFlash-shaped draft model was initialized with Kimi K2.6-compatible dimensions and with components taken from the base checkpoint.

Files

File	Quantization	Approx. size
`Kimi-K2.6-DFlash-baseinit-F32.gguf`	F32	13 GB
`Kimi-K2.6-DFlash-baseinit-Q8_0.gguf`	Q8_0	3.5 GB
`Kimi-K2.6-DFlash-baseinit-Q5_0.gguf`	Q5_0	2.3 GB
`Kimi-K2.6-DFlash-baseinit-Q4_0.gguf`	Q4_0	1.9 GB
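
The listed sizes are roughly what the ggml block layouts predict. The sketch below is a back-of-the-envelope check, not a description of how the files were produced. It assumes 1 GB = 10^9 bytes, that every tensor in a file uses the named format, and that metadata overhead is negligible. The function names are hypothetical. Under those assumptions the 13 GB F32 file implies about 3.25 billion weights, and the quantized estimates land slightly below the listed sizes, which may reflect rounding or some tensors kept at higher precision.

```python
# Bytes per block of 32 weights in ggml's legacy quantization formats:
#   Q8_0: 2-byte fp16 scale + 32 int8 values            = 34 bytes
#   Q5_0: 2-byte fp16 scale + 4 bytes high bits + 16 B  = 22 bytes
#   Q4_0: 2-byte fp16 scale + 16 bytes of 4-bit values  = 18 bytes
BITS_PER_WEIGHT = {
    "F32": 32.0,
    "Q8_0": 34 * 8 / 32,  # 8.5 bits per weight
    "Q5_0": 22 * 8 / 32,  # 5.5 bits per weight
    "Q4_0": 18 * 8 / 32,  # 4.5 bits per weight
}


def estimate_params(f32_size_gb: float) -> float:
    """Weight count implied by an F32 file size (4 bytes per weight)."""
    return f32_size_gb * 1e9 / 4


def estimate_size_gb(n_params: float, quant: str) -> float:
    """Estimated file size in GB if every weight used `quant`."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9


if __name__ == "__main__":
    n_params = estimate_params(13.0)  # 13 GB is the listed F32 size
    print(f"implied weights: ~{n_params / 1e9:.2f}B")
    for quant in BITS_PER_WEIGHT:
        print(f"{quant}: ~{estimate_size_gb(n_params, quant):.2f} GB")
```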

Verification metadata

Each GGUF file carries the following DFlash/Kimi K2.6 metadata, which can be used to verify a download:

general.architecture = dflash-draft
general.name = Kimi-K2.6-DFlash-baseinit
dflash-draft.hidden_size = 7168
dflash-draft.num_hidden_layers = 6
dflash-draft.num_attention_heads = 64
dflash-draft.num_key_value_heads = 8
dflash-draft.vocab_size = 163840
dflash-draft.block_size = 8
dflash-draft.mask_token_id = 163838
dflash-draft.target_layer_ids = [1, 12, 24, 35, 47, 58]
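
One way to check these keys without any inference runtime is to read the GGUF header directly. The following is a minimal sketch using only the Python standard library. It assumes a little-endian GGUF file of version 2 or later and reads only the key/value section, skipping tensor data. The function names are hypothetical and this is not part of Oxidize.

```python
import struct
import sys
from typing import Any, BinaryIO, Dict

# struct formats for GGUF scalar value types (little-endian).
_SCALARS = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
            6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
_STRING, _ARRAY = 8, 9


def _read(f: BinaryIO, fmt: str) -> Any:
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise ValueError("unexpected end of file")
    return struct.unpack(fmt, data)[0]


def _read_string(f: BinaryIO) -> str:
    # GGUF strings are a uint64 byte length followed by UTF-8 bytes.
    length = _read(f, "<Q")
    return f.read(length).decode("utf-8")


def _read_value(f: BinaryIO, vtype: int) -> Any:
    if vtype in _SCALARS:
        return _read(f, _SCALARS[vtype])
    if vtype == _STRING:
        return _read_string(f)
    if vtype == _ARRAY:
        # Arrays store the element type, then a uint64 count, then the items.
        item_type = _read(f, "<I")
        count = _read(f, "<Q")
        return [_read_value(f, item_type) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {vtype}")


def read_gguf_metadata(f: BinaryIO) -> Dict[str, Any]:
    """Return the metadata key/value pairs from an open GGUF file."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    version = _read(f, "<I")
    if version < 2:
        raise ValueError("GGUF v1 uses 32-bit counts; not handled here")
    _read(f, "<Q")  # tensor count, unused in this sketch
    kv_count = _read(f, "<Q")
    metadata = {}
    for _ in range(kv_count):
        key = _read_string(f)
        vtype = _read(f, "<I")
        metadata[key] = _read_value(f, vtype)
    return metadata


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as fh:
        meta = read_gguf_metadata(fh)
    for key in sorted(meta):
        if key.startswith(("general.", "dflash-draft.")):
            print(f"{key} = {meta[key]}")
```

Run it with the path to one of the files as the only argument and compare the printed keys against the list above.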

Intended use

These files are intended for experimentation with Oxidize DFlash draft-model loading and benchmarking.

Example:

oxidize-bench --model Kimi-K2.6-DFlash-baseinit-Q4_0.gguf --engine dflash --draft-tokens 8 --iterations 100 --verbose
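
To compare the four quantizations, the same command can be repeated once per file. The sketch below only builds the argument lists, reusing exactly the flags from the example above and assuming nothing else about the `oxidize-bench` command line. The helper names and the `--run` switch belong to this script and are hypothetical.

```python
import subprocess
import sys
from typing import List

QUANTS = ["F32", "Q8_0", "Q5_0", "Q4_0"]


def model_filename(quant: str) -> str:
    """File name used in this repository for a given quantization."""
    return f"Kimi-K2.6-DFlash-baseinit-{quant}.gguf"


def bench_command(quant: str, draft_tokens: int = 8,
                  iterations: int = 100) -> List[str]:
    """Argument list mirroring the example invocation for one file."""
    return [
        "oxidize-bench",
        "--model", model_filename(quant),
        "--engine", "dflash",
        "--draft-tokens", str(draft_tokens),
        "--iterations", str(iterations),
        "--verbose",
    ]


if __name__ == "__main__":
    run = "--run" in sys.argv[1:]  # switch of this script, not of oxidize-bench
    for quant in QUANTS:
        cmd = bench_command(quant)
        print(" ".join(cmd))
        if run:
            # Requires oxidize-bench on PATH and the GGUF file in the
            # current directory.
            subprocess.run(cmd, check=True)
```

Without `--run` the script only prints the commands, which makes it safe to inspect them before launching a long benchmark.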

Important note

Because these files are base-initialized rather than trained, they should not be expected to match the acceptance rate or quality of a fully trained DFlash draft model. Treat them as a reproducible, Kimi K2.6-compatible DFlash scaffold for experiments, not as a trained production drafter.
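
To see why acceptance rate matters, consider the usual simplified model of speculative decoding. It assumes each drafted token is accepted independently with probability alpha, acceptance stops at the first rejection, and the target model contributes one token per verification pass. Under those assumptions the expected number of tokens per pass with k drafted tokens is (1 - alpha^(k+1)) / (1 - alpha). Whether this model describes DFlash block drafting well is itself an assumption, and the alpha values below are illustrative inputs, not measurements of these files.

```python
def expected_tokens_per_step(alpha: float, draft_tokens: int) -> float:
    """Expected tokens emitted per target verification pass.

    Simplified model: i.i.d. per-token acceptance probability `alpha`,
    prefix acceptance, plus one token supplied by the target model.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if draft_tokens < 0:
        raise ValueError("draft_tokens must be non-negative")
    if alpha == 1.0:
        # Every drafted token is accepted, plus the target's own token.
        return float(draft_tokens + 1)
    return (1.0 - alpha ** (draft_tokens + 1)) / (1.0 - alpha)


if __name__ == "__main__":
    # Illustrative acceptance rates only; measure real values by benchmarking.
    for alpha in (0.1, 0.5, 0.8):
        tokens = expected_tokens_per_step(alpha, draft_tokens=8)
        print(f"alpha={alpha:.1f}: ~{tokens:.2f} tokens per pass")
```

In this model a drafter with a low acceptance rate yields close to one token per pass, so the cost of running the draft model is not recovered.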

Model size

3B params