ClownRat committed on
Commit
f139819
1 Parent(s): 2eb90b2

Update config.json

Files changed (1)
  1. config.json +1 -1
config.json CHANGED
@@ -22,7 +22,7 @@
   "mm_vision_select_feature": "patch",
   "mm_vision_select_layer": -2,
   "mm_vision_tower": "openai/clip-vit-large-patch14-336",
-  "model_type": "mistral",
+  "model_type": "videollama2_mistral",
   "num_attention_heads": 32,
   "num_frames": 16,
   "num_hidden_layers": 32,
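
The commit gives no rationale, but a plausible reading is that loaders pick the model class from the `model_type` string, so the old value `"mistral"` would select a plain text-only Mistral class. The sketch below illustrates that dispatch using only the standard library. The `REGISTRY` mapping, its class names, and `resolve_architecture` are hypothetical stand-ins for illustration, not the actual Transformers or VideoLLaMA2 API.

```python
import json

# Hypothetical registry: maps a config's model_type to a class name.
# Real loaders keep a similar lookup; these entries are illustrative only.
REGISTRY = {
    "mistral": "MistralForCausalLM",
    "videollama2_mistral": "Videollama2MistralForCausalLM",
}


def resolve_architecture(config_text: str) -> str:
    """Return the class name selected by the config's model_type field."""
    config = json.loads(config_text)
    model_type = config["model_type"]
    if model_type not in REGISTRY:
        raise KeyError(f"unregistered model_type: {model_type!r}")
    return REGISTRY[model_type]


# Trimmed-down versions of config.json before and after this commit.
before = '{"model_type": "mistral", "num_frames": 16}'
after = '{"model_type": "videollama2_mistral", "num_frames": 16}'

print(resolve_architecture(before))
print(resolve_architecture(after))
```

Under this assumption, the same weights and vision settings resolve to different classes depending only on the one changed line.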