Problem when running a notebook
Dear All,
When running this cell:
generate_kwargs = {"max_new_tokens": 100, "do_sample": True, "top_p": 0.9}
output = model.generate(**inputs, **generate_kwargs)
generated_text = processor.batch_decode(output, skip_special_tokens=True)
I got this error:
Expanding inputs for image/video tokens in LLaVa-NeXT-Video should be done in processing. Please add `patch_size` and `vision_feature_select_strategy` to the model's processing config or set directly with `processor.patch_size = {{patch_size}}` and `processor.vision_feature_select_strategy = {{vision_feature_select_strategy}}`. Using processors without these attributes in the config is deprecated and will throw an error in v4.47.
RuntimeError Traceback (most recent call last)
in <cell line: 3>()
1 generate_kwargs = {"max_new_tokens": 100, "do_sample": True, "top_p": 0.9}
2
----> 3 output = model.generate(**inputs, **generate_kwargs)
4 generated_text = processor.batch_decode(output, skip_special_tokens=True)
/usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py in forward(self, hidden_states, attention_mask, position_ids, past_key_value, output_attentions, use_cache, cache_position, position_embeddings, **kwargs)
600 is_causal = True if causal_mask is None and q_len > 1 else False
601
--> 602 attn_output = torch.nn.functional.scaled_dot_product_attention(
603 query_states,
604 key_states,
RuntimeError: The expanded size of the tensor (1172) must match the existing size (21) at non-singleton dimension 3. Target sizes: [2, 32, 1, 1172]. Tensor sizes: [2, 1, 1, 21]
Oddly enough, a couple of days ago I could run this same Jupyter notebook without any problem.
Any help or hint would be much appreciated.
Hey! Are you using the demo code from the model page? Which transformers version are you using? You can try updating to the latest patch release with pip install -U transformers.
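If upgrading alone doesn't fix it, the deprecation warning above points at the likely cause: the processor is missing the attributes it needs to expand the video placeholder into the full set of vision tokens, so the attention mask (21 tokens) no longer matches the expanded sequence (1172). A minimal sketch of setting them directly, assuming the llava-hf/LLaVA-NeXT-Video-7B-hf checkpoint; the patch size of 14 and the "default" strategy are assumptions based on that model's CLIP ViT-L/14 vision tower, so verify them against your own checkpoint's config:

from transformers import LlavaNextVideoProcessor

# Checkpoint name is an assumption for illustration; use the one from your notebook.
processor = LlavaNextVideoProcessor.from_pretrained("llava-hf/LLaVA-NeXT-Video-7B-hf")

# Set the attributes the deprecation warning asks for, so the processor expands
# the video placeholder to the full number of vision tokens during preprocessing
# instead of leaving the expansion to the model at generate time.
processor.patch_size = 14  # assumption: ViT-L/14 patch size, check your model's vision config
processor.vision_feature_select_strategy = "default"

With these set, re-run your preprocessing and the generate cell; the inputs should then already contain the expanded token count that the model expects.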