Kimi K2.6 DFlash Baseinit GGUF
This repository contains base-initialized DFlash draft GGUF files for Kimi K2.6.
These are not the gated z-lab trained Kimi K2.6 DFlash weights. They were constructed from a local moonshotai/Kimi-K2.6 checkpoint by initializing a DFlash-shaped draft model with Kimi K2.6-compatible dimensions and base checkpoint components.
Files
| File | Quantization | Approx. size |
|---|---|---|
| Kimi-K2.6-DFlash-baseinit-F32.gguf | F32 | 13 GB |
| Kimi-K2.6-DFlash-baseinit-Q8_0.gguf | Q8_0 | 3.5 GB |
| Kimi-K2.6-DFlash-baseinit-Q5_0.gguf | Q5_0 | 2.3 GB |
| Kimi-K2.6-DFlash-baseinit-Q4_0.gguf | Q4_0 | 1.9 GB |
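The listed sizes are close to what the standard ggml block layouts predict. The sketch below is a rough estimate only: it assumes every tensor is quantized and scales the 13 GB F32 size by each format's bits per weight. Real files usually keep a few tensors (for example, norms) in higher precision, which would explain why the listed sizes are slightly larger than these estimates.

```python
# Bytes used to store one block of 32 weights in each format (ggml layouts).
#   F32 : 32 * 4 bytes                                  = 128 bytes
#   Q8_0: 2-byte fp16 scale + 32 int8 values            = 34 bytes
#   Q5_0: 2-byte fp16 scale + 4 bytes of high bits + 16 = 22 bytes
#   Q4_0: 2-byte fp16 scale + 16 bytes of packed nibbles = 18 bytes
BLOCK_BYTES = {"F32": 128, "Q8_0": 34, "Q5_0": 22, "Q4_0": 18}
WEIGHTS_PER_BLOCK = 32


def bits_per_weight(fmt: str) -> float:
    """Average storage cost of one weight, including the per-block scale."""
    return BLOCK_BYTES[fmt] * 8 / WEIGHTS_PER_BLOCK


def estimate_size_gb(f32_size_gb: float, fmt: str) -> float:
    """Scale the F32 file size by the format's bits per weight (assumes all tensors are quantized)."""
    return f32_size_gb * bits_per_weight(fmt) / 32.0


F32_SIZE_GB = 13.0  # approximate size taken from the table above

if __name__ == "__main__":
    for fmt in BLOCK_BYTES:
        print(f"{fmt:5s} {bits_per_weight(fmt):5.1f} bpw  ~{estimate_size_gb(F32_SIZE_GB, fmt):.2f} GB")
```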
Verification metadata
The GGUFs include DFlash/Kimi K2.6 metadata:
- general.architecture = dflash-draft
- general.name = Kimi-K2.6-DFlash-baseinit
- dflash-draft.hidden_size = 7168
- dflash-draft.num_hidden_layers = 6
- dflash-draft.num_attention_heads = 64
- dflash-draft.num_key_value_heads = 8
- dflash-draft.vocab_size = 163840
- dflash-draft.block_size = 8
- dflash-draft.mask_token_id = 163838
- dflash-draft.target_layer_ids = [1, 12, 24, 35, 47, 58]
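One way to check these keys on a downloaded file is to read the GGUF header directly. The following is a minimal sketch, not an official tool: it parses the key/value section of a GGUF file (version 2 or later, little-endian) with the standard library only and compares the result against the values listed above. It assumes the keys are stored under exactly these names and as plain integers, strings, or integer arrays; the helper names (`read_gguf_metadata`, `mismatches`) are illustrative.

```python
import struct
import sys

# GGUF scalar value type id -> little-endian struct format.
_SCALARS = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
            6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
_STRING, _ARRAY = 8, 9


def _read(f, fmt):
    """Read and unpack one fixed-size value from the stream."""
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise EOFError("unexpected end of GGUF header")
    return struct.unpack(fmt, data)[0]


def _read_string(f):
    """GGUF strings are a uint64 byte length followed by UTF-8 bytes."""
    length = _read(f, "<Q")
    return f.read(length).decode("utf-8")


def _read_value(f, vtype):
    if vtype in _SCALARS:
        return _read(f, _SCALARS[vtype])
    if vtype == _STRING:
        return _read_string(f)
    if vtype == _ARRAY:
        elem_type = _read(f, "<I")
        count = _read(f, "<Q")
        return [_read_value(f, elem_type) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {vtype}")


def read_gguf_metadata(f):
    """Return the metadata key/value pairs of a GGUF file as a dict."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    version = _read(f, "<I")
    if version < 2:
        raise ValueError("GGUF v1 uses 32-bit counts; not handled here")
    _read(f, "<Q")  # tensor count (unused)
    kv_count = _read(f, "<Q")
    meta = {}
    for _ in range(kv_count):
        key = _read_string(f)
        meta[key] = _read_value(f, _read(f, "<I"))
    return meta


# Expected values copied from the list above.
EXPECTED = {
    "general.architecture": "dflash-draft",
    "general.name": "Kimi-K2.6-DFlash-baseinit",
    "dflash-draft.hidden_size": 7168,
    "dflash-draft.num_hidden_layers": 6,
    "dflash-draft.num_attention_heads": 64,
    "dflash-draft.num_key_value_heads": 8,
    "dflash-draft.vocab_size": 163840,
    "dflash-draft.block_size": 8,
    "dflash-draft.mask_token_id": 163838,
    "dflash-draft.target_layer_ids": [1, 12, 24, 35, 47, 58],
}


def mismatches(meta, expected=EXPECTED):
    """Map each differing key to (expected, found); empty dict means all match."""
    return {k: (v, meta.get(k)) for k, v in expected.items() if meta.get(k) != v}


if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], "rb") as fh:
        bad = mismatches(read_gguf_metadata(fh))
    print("metadata OK" if not bad else f"mismatches: {bad}")
```

Saved under any name (say `check_meta.py`, a hypothetical filename), it could be run as `python check_meta.py Kimi-K2.6-DFlash-baseinit-Q4_0.gguf`.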
Intended use
These files are intended for experimentation with Oxidize DFlash draft-model loading and benchmarking.
Example:
oxidize-bench --model Kimi-K2.6-DFlash-baseinit-Q4_0.gguf --engine dflash --draft-tokens 8 --iterations 100 --verbose
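To compare the four quantizations under identical settings, the same command can be looped over the files. This is a small sketch that reuses only the flags shown in the example above; it assumes `oxidize-bench` is on `PATH` and the GGUF files are in the current directory, and it simply prints the commands if the binary is not found. The function names are illustrative.

```python
import shutil
import subprocess

# File names from the Files table.
FILES = [
    "Kimi-K2.6-DFlash-baseinit-F32.gguf",
    "Kimi-K2.6-DFlash-baseinit-Q8_0.gguf",
    "Kimi-K2.6-DFlash-baseinit-Q5_0.gguf",
    "Kimi-K2.6-DFlash-baseinit-Q4_0.gguf",
]


def build_command(model_path, draft_tokens=8, iterations=100):
    """Build the argument list for one benchmark run (flags as in the example above)."""
    return [
        "oxidize-bench",
        "--model", model_path,
        "--engine", "dflash",
        "--draft-tokens", str(draft_tokens),
        "--iterations", str(iterations),
        "--verbose",
    ]


def run_all(files=FILES):
    """Run the benchmark once per file, or list the commands if the tool is missing."""
    if shutil.which("oxidize-bench") is None:
        print("oxidize-bench not found on PATH; commands that would run:")
        for path in files:
            print("  " + " ".join(build_command(path)))
        return
    for path in files:
        subprocess.run(build_command(path), check=True)


if __name__ == "__main__":
    run_all()
```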
Important note
Because these weights are base-initialized rather than trained, they should not be expected to match the acceptance rate or output quality of a fully trained DFlash draft model. Treat them as a reproducible, Kimi K2.6-compatible DFlash scaffold for experiments, not as a trained production drafter.
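To illustrate why acceptance rate matters, a common simplification from the speculative decoding literature treats each drafted token as accepted independently with probability alpha, with verification stopping at the first rejection. Under that assumption, one verification step with k drafted tokens yields 1 + alpha + ... + alpha^k tokens on average. This is only an illustrative model: a block-based drafter such as DFlash need not follow it, and the alpha values below are hypothetical, not measurements of these files.

```python
def expected_tokens_per_step(alpha: float, k: int = 8) -> float:
    """Expected tokens produced per verification step under i.i.d. acceptance.

    alpha: assumed per-token acceptance probability (hypothetical).
    k:     number of drafted tokens per step (8 matches --draft-tokens above).
    The sum includes the one token the target model always contributes.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return sum(alpha ** i for i in range(k + 1))


if __name__ == "__main__":
    for alpha in (0.1, 0.5, 0.8):  # hypothetical acceptance rates
        print(f"alpha={alpha:.1f}: {expected_tokens_per_step(alpha):.2f} tokens/step")
```

With a low acceptance rate the result stays close to one token per step, so drafting adds cost without a speedup, which is the expected regime for an untrained draft model.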
Model tree for freakyskittle/Kimi-K2.6-DFlash-baseinit
Base model: moonshotai/Kimi-K2.6