Strange warning on first completion w/ vLLM 0.11.0
#6
by whoisjeremylam - opened
I'm not sure what these warnings mean, but I'm posting them in case they are worth mentioning:
(Worker_PP0 pid=326032) /home/ai/vllm-0.11.0/venv/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py:929: UserWarning: Input tensor shape suggests potential format mismatch: seq_len (9) < num_heads (16). This may indicate the inputs were passed in head-first format [B, H, T, ...] when head_first=False was specified. Please verify your input tensor format matches the expected shape [B, T, H, ...].
(Worker_PP0 pid=326032) return fn(*args, **kwargs)
(Worker_PP0 pid=326032) /home/ai/vllm-0.11.0/venv/lib/python3.10/site-packages/vllm/model_executor/layers/fla/ops/utils.py:105: UserWarning: Input tensor shape suggests potential format mismatch: seq_len (9) < num_heads (32). This may indicate the inputs were passed in head-first format [B, H, T, ...] when head_first=False was specified. Please verify your input tensor format matches the expected shape [B, T, H, ...].
(Worker_PP0 pid=326032) return fn(*contiguous_args, **contiguous_kwargs)
(Worker_PP0 pid=326032) /home/ai/vllm-0.11.0/venv/lib/python3.10/site-packages/vllm/distributed/parallel_state.py:516: UserWarning: The given buffer is not writable, and PyTorch does not support non-writable tensors. This means you can write to the underlying (supposedly non-writable) buffer using the tensor. You may want to copy the buffer to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /pytorch/torch/csrc/utils/tensor_new.cpp:1578.)
(Worker_PP0 pid=326032) object_tensor = torch.frombuffer(pickle.dumps(obj), dtype=torch.uint8)
(Worker_PP1 pid=326033) /home/ai/vllm-0.11.0/venv/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py:929: UserWarning: Input tensor shape suggests potential format mismatch: seq_len (9) < num_heads (16). This may indicate the inputs were passed in head-first format [B, H, T, ...] when head_first=False was specified. Please verify your input tensor format matches the expected shape [B, T, H, ...].
(Worker_PP1 pid=326033) return fn(*args, **kwargs)
(Worker_PP1 pid=326033) /home/ai/vllm-0.11.0/venv/lib/python3.10/site-packages/vllm/model_executor/layers/fla/ops/utils.py:105: UserWarning: Input tensor shape suggests potential format mismatch: seq_len (9) < num_heads (32). This may indicate the inputs were passed in head-first format [B, H, T, ...] when head_first=False was specified. Please verify your input tensor format matches the expected shape [B, T, H, ...].
(Worker_PP1 pid=326033) return fn(*contiguous_args, **contiguous_kwargs)
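For what it's worth, here is my literal reading of the two kinds of message, written as a small stdlib-only sketch. It is not the actual code from vLLM, flash-linear-attention, or PyTorch. `check_time_first` and `is_read_only` are hypothetical helpers made up for illustration, and the assumption that the check compares dimension 1 (sequence length) with dimension 2 (head count) of a `[B, T, H, ...]` tensor is inferred only from the warning text above.

```python
import pickle
import warnings


def check_time_first(shape, head_first=False):
    """Hypothetical stand-in for the shape check the warning text describes.

    `shape` is assumed to be laid out as (B, T, H, ...).
    Returns True if the heuristic fires (and emits a warning), else False.
    """
    if head_first or len(shape) < 3:
        return False
    seq_len, num_heads = shape[1], shape[2]
    if seq_len < num_heads:
        warnings.warn(
            f"seq_len ({seq_len}) < num_heads ({num_heads}): "
            "inputs may be in head-first format [B, H, T, ...]"
        )
        return True
    return False


def is_read_only(buffer):
    """Report whether a bytes-like object exposes a read-only buffer."""
    return memoryview(buffer).readonly


# A 9-token input with 16 heads trips the heuristic; a 64-token input does not.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    print(check_time_first((1, 9, 16)))   # True
    print(check_time_first((1, 64, 16)))  # False

# pickle.dumps returns immutable bytes, which is the kind of buffer the third
# warning complains about; copying into a bytearray gives a writable buffer.
payload = pickle.dumps({"example": 1})
print(is_read_only(payload))             # True
print(is_read_only(bytearray(payload)))  # False
```

If that reading is right, a short input (9 tokens here) would trigger the first two warnings whenever it is shorter than the head count, even when the layout is the expected `[B, T, H, ...]`. The third warning would follow from passing the immutable result of `pickle.dumps` to `torch.frombuffer`. I have not confirmed either point against the vLLM source.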