Memory issues

by Tortoise17 - opened Dec 3, 2025

Dear all, is there any way to run this model with less than 42 GB of memory? On my 48 GB GPU, VRAM usage ends up at roughly 43 to 44 GB after loading. I am trying to fix this but keep running into memory problems. Maybe I am making a mistake somewhere, so if you have any suggestions, please let me know. The repo I am using is the main GitHub repo, https://github.com/QwenLM/Qwen3-Omni, and the example I am running is https://github.com/QwenLM/Qwen3-Omni/blob/main/web_demo.py. The model I am using is this quantized one. It loads, but at the inference stage it exceeds the available memory even with a 3-minute video. If anyone has had the same issue and fixed it, any help would be greatly appreciated.
