A collection of multimodal models for the GPU poor
- google/paligemma-3b-pt-896
  Image-Text-to-Text • Updated • 4.5k • 111
- OpenGVLab/InternVL-Chat-V1-5
  Image-Text-to-Text • Updated • 3.65k • 401
- alexshengzhili/llava-v1.5-13b-dpo
  Text Generation • Updated • 13 • 5
- llava-hf/llava-v1.6-mistral-7b-hf
  Image-Text-to-Text • Updated • 1.09M • 234