Collections of multimodal (image+text) instruction fine-tuning datasets tailored for visual language models such as LLaVA, Fuyu, or IDEFICS.
Victor Sanh PRO
VictorSanh
Recent Activity
liked a model 6 days ago: EssentialAI/rnj-1
upvoted a collection 6 days ago: rnj-1
liked a Space 16 days ago: dlouapre/eiffel-tower-llama