Multimodal Foundation Models: From Specialists to General-Purpose Assistants Paper • 2309.10020 • Published Sep 18, 2023 • 40
Language as the Medium: Multimodal Video Classification through text only Paper • 2309.10783 • Published Sep 19, 2023 • 1
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models Paper • 2403.18814 • Published Mar 27 • 44
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models Paper • 2402.19427 • Published Feb 29 • 52