VITA-MLLM/VITA-1.5

Video-Text-to-Text


Add model card #2

by nielsr HF staff - opened 27 days ago

base: refs/heads/main ← from: refs/pr/2


nielsr

27 days ago

This PR adds a model card, linking to the paper VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction.

It also adds the pipeline_tag, so that people can find the model at https://huggingface.co/models?pipeline_tag=video-text-to-text, and a link to the GitHub repository.
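For readers unfamiliar with how the pipeline_tag relates to that filter URL, here is a minimal sketch. It assumes the tag lives in the YAML front matter at the top of the model card, and it uses a made-up card text, since the actual card contents are not shown in this thread. The helper names `read_pipeline_tag` and `models_filter_url` are hypothetical and not part of any Hugging Face library.

```python
from urllib.parse import urlencode

# Hypothetical model card text; the real card added by this PR is not shown here.
EXAMPLE_CARD = """---
pipeline_tag: video-text-to-text
---

# VITA-1.5
"""


def read_pipeline_tag(card_text):
    """Return the pipeline_tag value from the card's front matter, or None."""
    lines = card_text.splitlines()
    # Front matter must start on the first line with a '---' delimiter.
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of front matter
        key, sep, value = line.partition(":")
        if sep and key.strip() == "pipeline_tag":
            return value.strip().strip("'\"")
    return None


def models_filter_url(pipeline_tag):
    """Build the Hub model-listing URL filtered by the given pipeline tag."""
    return "https://huggingface.co/models?" + urlencode({"pipeline_tag": pipeline_tag})


tag = read_pipeline_tag(EXAMPLE_CARD)
print(models_filter_url(tag))
```

This only handles a flat `key: value` line, which is enough to illustrate the idea; a real card should be parsed with a proper YAML parser.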

Add model card 34ce5f2e

lxysl changed pull request status to merged 18 days ago
