Will you also make a Qwen 3.6 27B MTP version?

#13
by Hofmannsdream - opened

People are currently talking about multi-token prediction (MTP), which llama.cpp recently added support for, but I guess it needs a new GGUF version.

Please make the MTP version.

MTP has been merged into llama.cpp and gives a speedup of about 2x. It's very useful for local LLMs.
