Problems deploying with vLLM

#3
by xing120226 - opened

Deployment keeps failing with errors; it says something is wrong on the vision side.

Hi! Thank you for reporting this.

We identified a vision tower key-naming issue (model.language_model.visual.* → model.visual.*) and have already pushed a fix. Please re-download the model and try again.

If the issue persists after re-downloading, could you share:

- The full error message
- Your vLLM version
- Your deployment command

This should be resolved now. Sorry for the trouble!
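
For anyone who wants to check whether a local copy still carries the old naming, the prefix change described above (model.language_model.visual.* → model.visual.*) can be sketched in Python. This is a minimal illustration rather than the actual fix that was pushed: the helper name and the example tensor names are hypothetical, and it only operates on a list of key names, not on the weight files themselves.

```python
# Prefixes taken from the maintainer's description of the fix.
OLD_PREFIX = "model.language_model.visual."
NEW_PREFIX = "model.visual."


def remap_vision_keys(keys):
    """Return {old_key: new_key} for every key that still uses the old prefix.

    Hypothetical helper: keys without the old prefix are left out of the
    mapping, so an empty result means nothing needs renaming.
    """
    return {
        key: NEW_PREFIX + key[len(OLD_PREFIX):]
        for key in keys
        if key.startswith(OLD_PREFIX)
    }


if __name__ == "__main__":
    # Example tensor names are made up for illustration only.
    example_keys = [
        "model.language_model.visual.blocks.0.attn.qkv.weight",
        "model.language_model.layers.0.mlp.up_proj.weight",
    ]
    for old, new in remap_vision_keys(example_keys).items():
        print(f"{old} -> {new}")
```

An empty mapping on the key names of a freshly downloaded checkpoint would suggest the copy already has the corrected names.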


Update: the vision-tower key-naming fix from this thread is included in the current model files. For new deployments, also consider the Text-NVFP4-MTP sibling, which is the cleaner deployment target on vLLM:

sakamakismile/Qwen3.6-27B-Text-NVFP4-MTP

Update: the vision tower prefix fix has already been pushed. For new deployments, we recommend the cleaner Text-NVFP4-MTP sibling:

vllm serve sakamakismile/Qwen3.6-27B-Text-NVFP4-MTP \
    --quantization modelopt --language-model-only \
    --reasoning-parser qwen3 \
    --speculative-config '{"method":"qwen3_5_mtp","num_speculative_tokens":3}'

If you are not using vision, pass --language-model-only. With speculative decoding, 100+ tok/s is achievable (measured 132/105/106 tok/s on an RTX PRO 6000).

— Tonoken3 / Lna-Lab
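
To compare your own numbers against the tok/s figures quoted above, a rough measurement can be sketched as follows. This is not how the figures in this thread were obtained (the thread does not say). It assumes the server started by `vllm serve` is reachable through its OpenAI-compatible `/v1/completions` endpoint and that the response includes `usage.completion_tokens`; the function names, base URL, and prompt are placeholders. The timing covers the whole request, including prompt processing, so it will read somewhat lower than pure decode speed.

```python
import json
import time
import urllib.request


def tokens_per_second(completion_tokens: int, elapsed_seconds: float) -> float:
    """Generated tokens divided by wall-clock seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return completion_tokens / elapsed_seconds


def measure_once(base_url: str, model: str, prompt: str, max_tokens: int = 256) -> float:
    """Send one completion request and return end-to-end tok/s.

    Assumes an OpenAI-compatible server at base_url (hypothetical value
    supplied by the caller) that reports usage.completion_tokens.
    """
    payload = json.dumps(
        {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    ).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    elapsed = time.perf_counter() - start
    return tokens_per_second(body["usage"]["completion_tokens"], elapsed)
```

With the server running locally, a call such as `measure_once("http://localhost:8000", "sakamakismile/Qwen3.6-27B-Text-NVFP4-MTP", "Explain speculative decoding.")` would return one sample; the address here is an assumption about a default local setup, and several runs are needed for a stable figure.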
