Add model card

#2
by nielsr HF staff - opened

This PR adds a model card, linking to the paper VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction.

It also adds the pipeline_tag, so people can find the model at https://huggingface.co/models?pipeline_tag=video-text-to-text, and a link to the GitHub repository.
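
For readers unfamiliar with how the tag is set: on the Hub, model card metadata lives in a YAML header at the top of README.md. The snippet below is a minimal sketch of what such a header might look like, not the actual contents of this PR. The helper name `build_model_card` and the placeholder body text are hypothetical; only the `video-text-to-text` value comes from the URL above.

```python
# Hypothetical sketch: compose a README.md with a minimal YAML metadata header.
# Only the pipeline_tag value is taken from the PR description; the body text
# is a placeholder and the real model card likely carries more metadata.

def build_model_card(pipeline_tag: str, body: str) -> str:
    """Return README.md text with a YAML header that sets pipeline_tag."""
    front_matter = "\n".join(["---", f"pipeline_tag: {pipeline_tag}", "---"])
    return f"{front_matter}\n\n{body.strip()}\n"


card = build_model_card("video-text-to-text", "Model card body goes here.")
print(card)
```

The Hub reads the `pipeline_tag` key from this header to decide which task filter the model appears under.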

lxysl changed pull request status to merged
