summonsoftware's picture
Update README.md
421bcda verified
|
Raw
History Blame
848 Bytes
metadata
license: apache-2.0
library_name: gguf
tags:
  - gguf
  - qwen3.6
  - mtp
  - llama.cpp
  - coding
  - uncensored
  - speculative-decoding

Qwen3.6-27B-Uncensored-HauhauCS-Aggressive-MTP-GGUF

This repository contains a GGUF derivative of HauhauCS/Qwen3.6-27B-Uncensored-HauhauCS-Aggressive with MTP tensors grafted from 27B_MTP.gguf using the public conversion method demonstrated by havenoammo/croton-style Qwen3.6 MTP GGUF workflows.

All credit for the base model goes to the original model authors. This upload provides the MTP-grafted GGUF artifact and runtime notes for llama.cpp CUDA 13.1.

Run

llama-server.exe -m "Qwen3.6-27B-Uncensored-HauhauCS-Aggressive-MTP-Q4_K_P.gguf" --spec-type draft-mtp --spec-draft-n-max 1 --jinja -ngl 100 -c 262144 -np 1 --flash-attn on --cache-type-k q4_0 --cache-type-v q4_0 --host 127.0.0.1 --port 8033