# AskVideos-7B-Instruct-v0.1

## Model details

**Model type:**
AskVideos-7B-Instruct-v0.1 is an open-source chatbot trained by fine-tuning a Video-LLaMA variant on additional video Q&A data.
It answers video-text queries using a frozen Vicuna 7B v1.1 LLM and a frozen BLIP-style image encoder.
A video Q-Former derives a video feature from the encoded frames, and the result is projected into the LLM's embedding space.
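
The data flow described above can be traced as a sequence of tensor shapes. This is only a rough sketch: the 16-frame count comes from this card, while the query counts and the Q-Former hidden size are assumptions borrowed from BLIP-2-style defaults, and `Dims` and `trace_shapes` are hypothetical names that are not part of the AskVideos code. The LLM hidden size of 4096 is that of a LLaMA-7B class model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dims:
    num_frames: int = 16         # frames per clip, as stated in this card
    queries_per_frame: int = 32  # assumption: BLIP-2-style query tokens per frame
    qformer_hidden: int = 768    # assumption: Q-Former hidden size
    video_queries: int = 32      # assumption: video Q-Former output tokens
    llm_hidden: int = 4096       # hidden size of a LLaMA-7B class LLM


def trace_shapes(d: Dims = Dims()) -> dict:
    """Return the tensor shape expected after each stage (no real computation)."""
    # Frozen image encoder: one set of query embeddings per frame.
    per_frame = (d.num_frames, d.queries_per_frame, d.qformer_hidden)
    # Frame tokens are concatenated along the time axis before the video Q-Former.
    flattened = (d.num_frames * d.queries_per_frame, d.qformer_hidden)
    # Video Q-Former: a fixed number of video-level tokens.
    video_tokens = (d.video_queries, d.qformer_hidden)
    # Linear projection maps the video tokens into the LLM embedding space.
    llm_inputs = (d.video_queries, d.llm_hidden)
    return {
        "per_frame": per_frame,
        "flattened": flattened,
        "video_tokens": video_tokens,
        "llm_inputs": llm_inputs,
    }


if __name__ == "__main__":
    for stage, shape in trace_shapes().items():
        print(f"{stage}: {shape}")
```

Only the last stage depends on the LLM: swapping in a model with a different hidden size would change `llm_hidden` and nothing upstream.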

**GitHub repo for demo:**
https://github.com/AskYoutubeAI/AskVideos

**Acknowledgement:**
This model is based on Video-LLaMA. Check out the original work here: https://github.com/DAMO-NLP-SG/Video-LLaMA

## License
AskVideos-7B-Instruct-v0.1 code and models are distributed under the Apache License 2.0.

## Training dataset
- 50K synthetic video Q&A pairs mined from videos.
- Trained with 16 frames sampled from a 30-second clip per Q&A pair.
- Fine-tuned from Video-LLaMA Vicuna 7B.
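
To prepare inputs that resemble the training setup, one option is to pick 16 timestamps spread across a 30-second clip. The card does not say how the frames were spaced, so the sketch below assumes uniform spacing at segment midpoints; `sample_frame_times` is a hypothetical helper, not a function from the AskVideos repository.

```python
def sample_frame_times(clip_start_s: float = 0.0,
                       clip_len_s: float = 30.0,
                       num_frames: int = 16) -> list:
    """Return num_frames timestamps (in seconds) evenly spaced over the clip.

    Assumption: each timestamp sits at the midpoint of one of num_frames
    equal-length segments, so no frame lands exactly on a clip boundary.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    if clip_len_s <= 0:
        raise ValueError("clip_len_s must be positive")
    step = clip_len_s / num_frames
    return [clip_start_s + (i + 0.5) * step for i in range(num_frames)]


if __name__ == "__main__":
    # Timestamps for a clip that starts 60 seconds into the video.
    print(sample_frame_times(clip_start_s=60.0))
```

The returned timestamps can be passed to whichever video decoder is in use to extract the corresponding frames.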