Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring Paper • 2403.09333 • Published Mar 14 • 14 • 3
Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models Paper • 2404.07973 • Published Apr 11 • 30 • 3