- Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models
  Paper • 2404.13013 • Published • 30
- Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing
  Paper • 2404.12253 • Published • 53
- Data-Efficient Contrastive Language-Image Pretraining: Prioritizing Data Quality over Quantity
  Paper • 2403.12267 • Published
Oliver Wei
Oliver2021
AI & ML interests
None yet
Recent Activity
liked a model 9 days ago
Falconsai/nsfw_image_detection
upvoted a paper 9 days ago
NVILA: Efficient Frontier Visual Language Models
liked a model 14 days ago
mistralai/Pixtral-12B-2409
Organizations
None yet
Collections: 1
models
None public yet
datasets
None public yet