missing tensor 'blk.40.ssm_conv1d.weight'

#4
by craigandrews - opened

It seems there's a problem with unsloth/Qwen3.6-35B-A3B-MTP-GGUF:Q4_K_M: it fails to load with a missing-tensor error.

$ llama-cli --prompt '/exit' -hf unsloth/Qwen3.6-35B-A3B-MTP-GGUF:Q4_K_M

Loading model...
llama_model_load: error loading model: missing tensor 'blk.40.ssm_conv1d.weight'
llama_model_load_from_file_impl: failed to load model
common_fit_params: encountered an error while trying to fit params to free device memory: failed to load model
llama_model_load: error loading model: missing tensor 'blk.40.ssm_conv1d.weight'
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '/var/lib/llama-swap/huggingface/hub/models--unsloth--Qwen3.6-35B-A3B-MTP-GGUF/snapshots/65e39d386b2098e11630765bbe0ac8e21e50ac2f/Qwen3.6-35B-A3B-UD-Q4_K_M.gguf'
srv    load_model: failed to load model, '/var/lib/llama-swap/huggingface/hub/models--unsloth--Qwen3.6-35B-A3B-MTP-GGUF/snapshots/65e39d386b2098e11630765bbe0ac8e21e50ac2f/Qwen3.6-35B-A3B-UD-Q4_K_M.gguf'
Failed to load the model

I get the exact same error with MXFP4_MOE.

Unsloth AI org
edited May 13

Hey, you need to recompile llama.cpp from the MTP branch. See https://huggingface.co/unsloth/Qwen3.6-35B-A3B-MTP-GGUF/discussions/3

Stock mainline llama.cpp will not load these models. I re-ran all the 35B and 27B quants and confirmed that they work with the PR.
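
For anyone unsure what "recompile with the MTP branch" involves, here is a rough sketch of building llama.cpp from a pull-request branch. Several things in it are assumptions, not details from this thread: the upstream repository URL, the local branch name `mtp`, and above all `PR_NUMBER`, which is a placeholder because the PR number is not stated here (take it from the linked discussion). The script only prints the steps unless you set `RUN=1`.

```shell
#!/bin/sh
# Sketch only: print (or run) the steps to build llama.cpp from a PR branch.
# PR_NUMBER is a placeholder. This thread does not state the PR number,
# so look it up in the linked discussion and export it before running.
PR_NUMBER="${PR_NUMBER:-PR_NUMBER}"
# Assumed upstream repository; override REPO_URL if you build from a fork.
REPO_URL="${REPO_URL:-https://github.com/ggml-org/llama.cpp}"

build_steps() {
    # "mtp" is just a hypothetical local branch name for the fetched PR head.
    cat <<EOF
git clone $REPO_URL llama.cpp
cd llama.cpp
git fetch origin pull/$PR_NUMBER/head:mtp
git checkout mtp
cmake -B build
cmake --build build --config Release
EOF
}

# Dry run by default: only print the commands. Set RUN=1 to execute them.
if [ "${RUN:-0}" = "1" ]; then
    build_steps | sh -e
else
    build_steps
fi
```

After a build like this, the binaries should be under `build/bin`, so you would rerun the original command with that `llama-cli` instead of the stock one. Add whatever backend options you normally pass to the first `cmake` call.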
