Ramzan Shaheen
iamramzan
AI & ML interests
GenAI, Vision & Co
Recent Activity
updated
a collection
about 14 hours ago
Shaheen Collection
liked
a dataset
1 day ago
iamramzan/Global-Population-Data
updated
a dataset
1 day ago
iamramzan/Global-Population-Data
Organizations
Collections
5
A curated list of papers on vision-language models, with the most influential ones at the top.
- Improved Baselines with Visual Instruction Tuning
  Paper • 2310.03744 • Published • 37
- DeepSeek-VL: Towards Real-World Vision-Language Understanding
  Paper • 2403.05525 • Published • 41
- Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities
  Paper • 2308.12966 • Published • 8
- LLaVA-Gemma: Accelerating Multimodal Foundation Models with a Compact Language Model
  Paper • 2404.01331 • Published • 25
spaces
3
models
None public yet