Add model card
This PR adds a model card, linking to the paper [VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction](https://huggingface.co/papers/2501.01957).
It also adds the `pipeline_tag`, so people can find the model at https://huggingface.co/models?pipeline_tag=video-text-to-text, and a link to the GitHub repository.
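For reviewers who want to check the metadata locally, here is a minimal sketch of reading the `pipeline_tag` from the card's front matter. The helper name `read_front_matter` is hypothetical, and it only handles flat `key: value` pairs between the two `---` lines; it is not a full YAML parser and is not how the Hub itself parses model cards.

```python
def read_front_matter(text: str) -> dict[str, str]:
    """Return flat key/value pairs from a leading '---' delimited block.

    Hypothetical helper for illustration only: it does not handle nested
    YAML, lists, or quoted values.
    """
    lines = text.splitlines()
    # Front matter must start on the very first line.
    if not lines or lines[0].strip() != "---":
        return {}
    metadata: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            # Closing delimiter reached: the block is complete.
            return metadata
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    # No closing delimiter found: treat the text as having no front matter.
    return {}


# The README content added by this PR.
README = """---
pipeline_tag: video-text-to-text
---
This repository contains the model of the paper [VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction](https://huggingface.co/papers/2501.01957).

Code: https://github.com/VITA-MLLM/VITA
"""

metadata = read_front_matter(README)
print(metadata["pipeline_tag"])
```

The tag value read this way is the same string used in the `pipeline_tag=` query parameter of the models URL above.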
README.md
ADDED
@@ -0,0 +1,6 @@
+---
+pipeline_tag: video-text-to-text
+---
+This repository contains the model of the paper [VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction](https://huggingface.co/papers/2501.01957).
+
+Code: https://github.com/VITA-MLLM/VITA