Question: cannot disable thinking (reasoning) with llama.cpp

#6
by function2026 - opened

I have tried many times but cannot disable thinking when using llama-server. I added all of these flags: --reasoning off --reasoning-budget 0 --chat-template-kwargs "{"enable_thinking": false}" --jinja, but none of them had any effect. With other Qwen models, --chat-template-kwargs "{"enable_thinking": false}" alone is enough.

--chat-template-kwargs is deprecated in llama-server. You should update your build and use --reasoning on or --reasoning off, or use a Jinja chat template, or use --reasoning-format deepseek.
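
One more thing that may be worth ruling out, assuming the command was typed into a POSIX shell exactly as quoted in the question: the nested double quotes around the JSON are consumed by the shell, so llama-server would receive {enable_thinking: false}, which is not valid JSON. The sketch below uses only the Python standard library to show what the shell passes on and to build a safely quoted command line. The flag names are copied from the question; whether a particular llama-server build honors them is not verified here.

```python
import json
import shlex

# The kwargs portion of the command, exactly as written in the question.
as_typed = '--chat-template-kwargs "{"enable_thinking": false}" --jinja'

# shlex.split follows POSIX shell quoting rules, so this approximates the
# argument list the server process would actually receive.
tokens = shlex.split(as_typed)
raw_value = tokens[1]


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Build the JSON with json.dumps and let shlex.join add the shell quoting.
kwargs_json = json.dumps({"enable_thinking": False})
safe_command = shlex.join(
    ["llama-server", "--jinja", "--chat-template-kwargs", kwargs_json]
)

print("received as typed:", raw_value, "| valid JSON:", is_valid_json(raw_value))
print("safely quoted    :", safe_command)
```

If the first line reports invalid JSON, wrapping the JSON in single quotes (as in the second printed line) is the usual fix on bash or zsh. Windows cmd and PowerShell use different quoting rules, so this check does not apply there.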
