Use this model with vLLM on L40 or 4090 (SM89)

by Mephisto1484 - opened Nov 14, 2025

Nov 14, 2025 • edited Nov 14, 2025

The vLLM build provided by Alibaba's team has a bug on SM89 GPUs such as the L40 or RTX 4090: it is compiled without the Marlin AWQ MoE kernel that this model requires.
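To confirm whether a machine is affected, the GPU's compute capability can be checked before rebuilding anything. The snippet below is a minimal sketch, assuming PyTorch is installed (vLLM depends on it); the helper names `is_sm89` and `local_capability` are made up for illustration and are not part of vLLM.

```python
from typing import Optional, Tuple


def is_sm89(capability: Tuple[int, int]) -> bool:
    """Return True when (major, minor) is compute capability 8.9 (SM89)."""
    return tuple(capability) == (8, 9)


def local_capability() -> Optional[Tuple[int, int]]:
    """Query GPU 0 through PyTorch; return None if torch or CUDA is unavailable."""
    try:
        import torch  # assumption: PyTorch is present in the vLLM environment
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


if __name__ == "__main__":
    cap = local_capability()
    if cap is None:
        print("No CUDA device visible (or PyTorch is not installed).")
    elif is_sm89(cap):
        print("SM89 GPU detected: this machine is in the affected group.")
    else:
        print(f"Compute capability {cap[0]}.{cap[1]}: not SM89.")
```

If the script reports SM89, the workaround described in this post applies.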

Solution: Following https://github.com/vllm-project/vllm/pull/26755 and https://github.com/vllm-project/vllm/pull/28294, modify CMakeLists.txt to add SM89 support, then recompile vLLM and install it.

The official vLLM project has already merged the fix for this bug, but Alibaba's team has not yet picked it up in their vLLM. I have not tested whether installing from the official vLLM source code avoids the issue.
