Strange warning on first completion w/ vLLM 0.11.0

by whoisjeremylam - opened Nov 2, 2025

I'm not sure what these warnings mean, but I'm posting them in case they are worth mentioning:

(Worker_PP0 pid=326032) /home/ai/vllm-0.11.0/venv/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py:929: UserWarning: Input tensor shape suggests potential format mismatch: seq_len (9) < num_heads (16). This may indicate the inputs were passed in head-first format [B, H, T, ...] when head_first=False was specified. Please verify your input tensor format matches the expected shape [B, T, H, ...].
(Worker_PP0 pid=326032)   return fn(*args, **kwargs)
(Worker_PP0 pid=326032) /home/ai/vllm-0.11.0/venv/lib/python3.10/site-packages/vllm/model_executor/layers/fla/ops/utils.py:105: UserWarning: Input tensor shape suggests potential format mismatch: seq_len (9) < num_heads (32). This may indicate the inputs were passed in head-first format [B, H, T, ...] when head_first=False was specified. Please verify your input tensor format matches the expected shape [B, T, H, ...].
(Worker_PP0 pid=326032)   return fn(*contiguous_args, **contiguous_kwargs)
(Worker_PP0 pid=326032) /home/ai/vllm-0.11.0/venv/lib/python3.10/site-packages/vllm/distributed/parallel_state.py:516: UserWarning: The given buffer is not writable, and PyTorch does not support non-writable tensors. This means you can write to the underlying (supposedly non-writable) buffer using the tensor. You may want to copy the buffer to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /pytorch/torch/csrc/utils/tensor_new.cpp:1578.)
(Worker_PP0 pid=326032)   object_tensor = torch.frombuffer(pickle.dumps(obj), dtype=torch.uint8)
(Worker_PP1 pid=326033) /home/ai/vllm-0.11.0/venv/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py:929: UserWarning: Input tensor shape suggests potential format mismatch: seq_len (9) < num_heads (16). This may indicate the inputs were passed in head-first format [B, H, T, ...] when head_first=False was specified. Please verify your input tensor format matches the expected shape [B, T, H, ...].
(Worker_PP1 pid=326033)   return fn(*args, **kwargs)
(Worker_PP1 pid=326033) /home/ai/vllm-0.11.0/venv/lib/python3.10/site-packages/vllm/model_executor/layers/fla/ops/utils.py:105: UserWarning: Input tensor shape suggests potential format mismatch: seq_len (9) < num_heads (32). This may indicate the inputs were passed in head-first format [B, H, T, ...] when head_first=False was specified. Please verify your input tensor format matches the expected shape [B, T, H, ...].
(Worker_PP1 pid=326033)   return fn(*contiguous_args, **contiguous_kwargs)
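
Going only by the wording of the shape warnings, they read like a heuristic that compares the second and third dimensions of a [B, T, H, ...] tensor. Below is a minimal sketch of that kind of check. It is not vLLM's or fla's actual code: the helper name `looks_head_first` and its signature are hypothetical, and the rule is an assumption reconstructed from the message text. If the real check works along these lines, a short prompt (9 tokens here) could trip it even when the layout is correct.

```python
import warnings


def looks_head_first(shape, head_first=False):
    """Hypothetical helper: flag a [B, T, H, ...] shape that looks head-first.

    Assumed rule, taken from the warning text only: if head_first is False
    and seq_len (dim 1) is smaller than num_heads (dim 2), emit a UserWarning.
    """
    if head_first or len(shape) < 3:
        return False
    seq_len, num_heads = shape[1], shape[2]
    if seq_len < num_heads:
        warnings.warn(
            "Input tensor shape suggests potential format mismatch: "
            f"seq_len ({seq_len}) < num_heads ({num_heads}).",
            UserWarning,
        )
        return True
    return False
```

Under this assumed rule, a shape like (1, 9, 16, 128) is flagged while (1, 64, 16, 128) is not, which would match the warnings appearing on a short first completion.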
