Problem deploying with vLLM

by xing120226 - opened Apr 23

Discussion

xing120226

Apr 23

Deployment keeps failing with an error that says something is wrong on the vision side.

sakamakismile

Owner Apr 23

Hi! Thank you for reporting this.

We identified a vision tower key-naming issue (model.language_model.visual.* → model.visual.*) and have already pushed a fix. Please re-download the model and try again.
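For readers wondering what the key-naming change amounts to, here is a minimal sketch of the prefix rename described above. It is not the owner's actual fix script: the function name `remap_vision_keys` and the example tensor names are hypothetical, and only the two prefixes come from this thread.

```python
# Illustrative only: rename checkpoint keys from the old vision-tower prefix
# to the new one. The prefixes are taken from the thread; everything else
# (function name, sample keys) is a hypothetical stand-in.
OLD_PREFIX = "model.language_model.visual."
NEW_PREFIX = "model.visual."


def remap_vision_keys(tensors: dict) -> dict:
    """Return a copy of `tensors` with misplaced vision-tower keys renamed."""
    remapped = {}
    for name, value in tensors.items():
        if name.startswith(OLD_PREFIX):
            # Keep the suffix, swap only the leading prefix.
            name = NEW_PREFIX + name[len(OLD_PREFIX):]
        remapped[name] = value
    return remapped


if __name__ == "__main__":
    # Hypothetical key names; placeholder strings stand in for real tensors.
    example = {
        "model.language_model.visual.blocks.0.attn.qkv.weight": "tensor-a",
        "model.language_model.layers.0.mlp.up_proj.weight": "tensor-b",
    }
    for key in remap_vision_keys(example):
        print(key)
```

Keys that do not start with the old prefix pass through unchanged, so the language-model weights are left alone.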

If the issue persists after re-downloading, could you share:

- The full error message
- Your vLLM version
- Your deployment command

This should be resolved now. Sorry for the trouble!
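If it helps when gathering the details requested above, the following is a small sketch that prints installed package versions using only the Python standard library. The helper name `collect_report` is hypothetical, and including `torch` and `transformers` alongside `vllm` is an assumption about what is useful to report, not something the owner asked for.

```python
# Sketch: collect version info to paste into a bug report.
# `collect_report` is a hypothetical helper; the default package list is an
# assumption (only the vLLM version was explicitly requested in the thread).
import platform
from importlib import metadata


def collect_report(packages=("vllm", "torch", "transformers")) -> dict:
    """Map each package name to its installed version, or 'not installed'."""
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in collect_report().items():
        print(f"{key}: {value}")
```

Run it inside the same environment that launches vLLM, then paste the output together with the full error message and the deployment command.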

xing120226

Apr 23

This comment has been hidden (marked as Resolved)

sakamakismile

Owner Apr 26

Update — the vision-tower key-naming fix from this thread is in the file. For new deployments going forward, also consider the Text-NVFP4-MTP sibling, which is the cleaner deployment target on vLLM:

→ sakamakismile/Qwen3.6-27B-Text-NVFP4-MTP

The vision tower prefix fix has already been pushed. For new deployments, we recommend the cleaner Text-NVFP4-MTP sibling:

vllm serve sakamakismile/Qwen3.6-27B-Text-NVFP4-MTP \
    --quantization modelopt --language-model-only \
    --reasoning-parser qwen3 \
    --speculative-config '{"method":"qwen3_5_mtp","num_speculative_tokens":3}'

If you do not need vision, pass --language-model-only. With speculative decoding, 100+ tok/s is achievable (132/105/106 tok/s measured on an RTX PRO 6000).

— Tonoken3 / Lna-Lab
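The inline JSON passed to --speculative-config is easy to break with shell quoting. One possible way around that, sketched here as an assumption rather than as the owner's recommended workflow, is to build the same command in Python and let `json.dumps` and `shlex.join` handle the quoting. The function name `build_serve_command` is hypothetical; the model name, flags, and values are copied from the command in this thread.

```python
# Sketch: assemble the `vllm serve` command from the thread as an argv list.
# `build_serve_command` is a hypothetical helper; flags and values are the
# ones posted above, not independently verified against any vLLM release.
import json
import shlex

MODEL = "sakamakismile/Qwen3.6-27B-Text-NVFP4-MTP"


def build_serve_command(num_speculative_tokens: int = 3) -> list:
    """Return the argv list for the serve command posted in this thread."""
    speculative_config = {
        "method": "qwen3_5_mtp",
        "num_speculative_tokens": num_speculative_tokens,
    }
    return [
        "vllm", "serve", MODEL,
        "--quantization", "modelopt",
        "--language-model-only",
        "--reasoning-parser", "qwen3",
        # Compact JSON, matching the inline form used in the thread.
        "--speculative-config",
        json.dumps(speculative_config, separators=(",", ":")),
    ]


if __name__ == "__main__":
    # Print a shell-safe version; pass the list to subprocess.run to launch.
    print(shlex.join(build_serve_command()))
```

Passing the list directly to `subprocess.run` avoids the shell entirely, so the JSON argument reaches vLLM exactly as built.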
