Use this model with vLLM on L40 or 4090 (SM89)
#7
by Mephisto1484 - opened
The vLLM build provided by Alibaba's team has a bug on SM89 GPUs such as the L40 or RTX 4090: it does not include the Marlin AWQ MoE kernel that this model requires.
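To confirm that a machine is actually in the affected group, the GPU's compute capability can be checked before rebuilding anything. The following is a minimal sketch, assuming PyTorch is installed; the helper names `sm_name`, `is_sm89`, and `detect_capability` are made up for illustration, and only `torch.cuda.is_available` and `torch.cuda.get_device_capability` are real PyTorch calls.

```python
from typing import Optional, Tuple


def sm_name(capability: Tuple[int, int]) -> str:
    """Format a (major, minor) compute capability as an SM label, e.g. (8, 9) -> "SM89"."""
    major, minor = capability
    return f"SM{major}{minor}"


def is_sm89(capability: Tuple[int, int]) -> bool:
    """Return True for compute capability 8.9 (the L40 / RTX 4090 case from this thread)."""
    return tuple(capability) == (8, 9)


def detect_capability(device_index: int = 0) -> Optional[Tuple[int, int]]:
    """Return the compute capability of a CUDA device, or None if it cannot be queried."""
    try:
        import torch  # imported lazily so the helpers above work without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(device_index)


if __name__ == "__main__":
    cap = detect_capability()
    if cap is None:
        print("No CUDA device detected (or PyTorch is not installed).")
    elif is_sm89(cap):
        print(f"{sm_name(cap)}: this GPU is in the group affected by the missing kernel.")
    else:
        print(f"{sm_name(cap)}: not SM89, so this particular issue should not apply.")
```

If the script reports SM89, the rebuild described in this post is likely relevant to your setup.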
Solution: following https://github.com/vllm-project/vllm/pull/26755 and https://github.com/vllm-project/vllm/pull/28294, modify CMakeLists.txt to add SM89 support, then recompile vLLM and install the resulting build.
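When editing CMakeLists.txt, it can help to first list the lines that mention Marlin and see whether 8.9 already appears on them. The sketch below is only a plain-text search aid and does not parse CMake; the function name `marlin_lines` is hypothetical, and the assumption that the relevant lines contain the word "marlin" next to an architecture list should be checked against the two pull requests, which remain the authoritative reference for the actual change.

```python
import re
from pathlib import Path
from typing import List, Tuple


def marlin_lines(cmake_text: str) -> List[Tuple[int, str, bool]]:
    """Return (line_number, stripped_line, mentions_8_9) for every line mentioning "marlin".

    This is a plain-text scan: it does not evaluate CMake variables, and an
    architecture list may cover SM89 in ways a literal "8.9" search cannot see.
    """
    results = []
    for number, line in enumerate(cmake_text.splitlines(), start=1):
        if "marlin" in line.lower():
            # Match a standalone "8.9", not part of a longer number such as "18.9" or "8.95".
            mentions_89 = re.search(r"(?<![\d.])8\.9(?!\d)", line) is not None
            results.append((number, line.strip(), mentions_89))
    return results


if __name__ == "__main__":
    # Assumes the script is run from the root of a vLLM source checkout.
    path = Path("CMakeLists.txt")
    if path.exists():
        for number, line, has_89 in marlin_lines(path.read_text(encoding="utf-8")):
            flag = "has 8.9" if has_89 else "no 8.9 "
            print(f"{number:5d} [{flag}] {line}")
    else:
        print("CMakeLists.txt not found in the current directory.")
```

Lines flagged `no 8.9` are candidates to compare against the diffs in the linked pull requests before recompiling.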
The upstream vLLM project has already merged the fix, but Alibaba's team has not yet picked it up in their vLLM build. Installing from the upstream vLLM source has not been tested, so it is unclear whether that alone avoids the issue.