LLaMA-VID - a YanweiLi Collection

LLaMA-VID

Updated Dec 3, 2023

LLaMA-VID checkpoints. See the project page for more details: https://llama-vid.github.io/

YanweiLi/llama-vid-7b-pretrain-224
Text Generation • Updated Dec 3, 2023 • 13

YanweiLi/llama-vid-7b-pretrain-224-video-fps-1
Text Generation • Updated Dec 3, 2023 • 9 • 2

YanweiLi/llama-vid-7b-pretrain-336
Text Generation • Updated Dec 3, 2023 • 10

YanweiLi/llama-vid-13b-pretrain-336
Text Generation • Updated Dec 3, 2023 • 11

YanweiLi/llama-vid-13b-pretrain-224-video-fps-1
Text Generation • Updated Dec 3, 2023 • 8

YanweiLi/llama-vid-7b-full-336
Text Generation • Updated Dec 2, 2023 • 10

YanweiLi/llama-vid-7b-full-224-video-fps-1
Text Generation • Updated Dec 3, 2023 • 22 • 9

YanweiLi/llama-vid-7b-full-224
Text Generation • Updated Dec 3, 2023 • 7 • 1

YanweiLi/llama-vid-13b-full-224-video-fps-1
Text Generation • Updated Dec 3, 2023 • 12 • 2

YanweiLi/llama-vid-13b-full-336
Text Generation • Updated Dec 2, 2023 • 9

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models
Paper • 2311.17043 • Published Nov 28, 2023