Problem when running a notebook
Dear All,
When running this cell:
generate_kwargs = {"max_new_tokens": 100, "do_sample": True, "top_p": 0.9}
output = model.generate(**inputs, **generate_kwargs)
generated_text = processor.batch_decode(output, skip_special_tokens=True)
I got this error:
Expanding inputs for image/video tokens in LLaVa-NeXT-Video should be done in processing. Please add `patch_size` and `vision_feature_select_strategy` to the model's processing config or set directly with `processor.patch_size = {{patch_size}}` and `processor.vision_feature_select_strategy = {{vision_feature_select_strategy}}`. Using processors without these attributes in the config is deprecated and will throw an error in v4.47.
RuntimeError Traceback (most recent call last)
in <cell line: 3>()
1 generate_kwargs = {"max_new_tokens": 100, "do_sample": True, "top_p": 0.9}
2
----> 3 output = model.generate(**inputs, **generate_kwargs)
4 generated_text = processor.batch_decode(output, skip_special_tokens=True)
/usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py in forward(self, hidden_states, attention_mask, position_ids, past_key_value, output_attentions, use_cache, cache_position, position_embeddings, **kwargs)
600 is_causal = True if causal_mask is None and q_len > 1 else False
601
--> 602 attn_output = torch.nn.functional.scaled_dot_product_attention(
603 query_states,
604 key_states,
RuntimeError: The expanded size of the tensor (1172) must match the existing size (21) at non-singleton dimension 3. Target sizes: [2, 32, 1, 1172]. Tensor sizes: [2, 1, 1, 21]
Oddly enough, a couple of days ago I could run this same Jupyter notebook without any problem.
Any help or hint would be much appreciated.
Hey! Are you using the demo code from the model page? Which transformers version are you using? You can try updating to the latest patch release with pip install -U transformers.
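If upgrading alone doesn't fix it, the deprecation warning above points at the likely cause: the processor is missing the attributes it needs to expand the video placeholder into the full set of vision tokens, so the attention mask (21 tokens) no longer matches the expanded sequence (1172). A minimal sketch of setting them directly, assuming the llava-hf/LLaVA-NeXT-Video-7B-hf checkpoint; the patch size of 14 and the "default" strategy are assumptions based on that model's CLIP ViT-L/14 vision tower, so verify them against your own checkpoint's config:

from transformers import LlavaNextVideoProcessor

# Checkpoint name is an assumption for illustration; use the one from your notebook.
processor = LlavaNextVideoProcessor.from_pretrained("llava-hf/LLaVA-NeXT-Video-7B-hf")

# Set the attributes the deprecation warning asks for, so the processor expands
# the video placeholder to the full number of vision tokens during preprocessing
# instead of leaving the expansion to the model at generate time.
processor.patch_size = 14  # assumption: ViT-L/14 patch size, check your model's vision config
processor.vision_feature_select_strategy = "default"

With these set, re-run your preprocessing and the generate cell; the inputs should then already contain the expanded token count that the model expects.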