Memory issues
Hi everyone. Is there any way to run this model with less than 42 GB of memory? On my 48 GB GPU, VRAM usage already sits at roughly 43 to 44 GB once the model is loaded. I am trying to fix this but keep running into memory problems; maybe I am making a mistake somewhere. If you have any suggestions, please let me know.

I am using the main GitHub repo, https://github.com/QwenLM/Qwen3-Omni, with the example script https://github.com/QwenLM/Qwen3-Omni/blob/main/web_demo.py, but with this quantized model instead of the original. The model loads, but at the inference stage it runs out of memory even with a 3-minute video. If anyone has hit the same issue and fixed it, any help would be greatly appreciated.
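To put the numbers in this post into perspective, here is a rough back-of-the-envelope sketch in plain Python. It only does arithmetic on the figures quoted above (48 GB card, 43 to 44 GB used, 3-minute video). The sampling rate of 2 frames per second is an assumption for illustration, not a value confirmed from `web_demo.py`, and the helper names are hypothetical.

```python
def vram_headroom_gb(total_gb: float, used_gb: float) -> float:
    """VRAM left for activations and cache once the weights are resident."""
    return total_gb - used_gb


def sampled_frames(duration_s: float, sample_fps: float) -> int:
    """Number of frames the video input contributes at a given sampling rate."""
    return int(duration_s * sample_fps)


TOTAL_GB = 48.0          # GPU size from the post
ASSUMED_SAMPLE_FPS = 2.0  # assumption: not taken from the demo script

for used_gb in (43.0, 44.0):  # reported usage range
    print(f"used {used_gb} GB -> headroom {vram_headroom_gb(TOTAL_GB, used_gb)} GB")

frames = sampled_frames(3 * 60, ASSUMED_SAMPLE_FPS)
print(f"3-minute video at {ASSUMED_SAMPLE_FPS} fps -> {frames} frames")
```

Under these assumptions, only about 4 to 5 GB remain for everything the inference step allocates, while the video alone contributes hundreds of frames. That would be consistent with the out-of-memory error appearing at inference time rather than at load time, though I have not verified where the memory actually goes.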