TinyLLaVA Collection TinyLLaVA: A Framework of Small-scale Large Multimodal Models • 7 items • Updated Mar 19 • 5
Vision Language Models Papers 🖼️💬📝 Collection Papers about vision-language models; the most important ones are at the top of the list. • 27 items • Updated Apr 30 • 33
Design2Code: How Far Are We From Automating Front-End Engineering? Paper • 2403.03163 • Published Mar 5 • 93
Text-to-Sticker: Style Tailoring Latent Diffusion Models for Human Expression Paper • 2311.10794 • Published Nov 17, 2023 • 24
Contrastive Feature Masking Open-Vocabulary Vision Transformer Paper • 2309.00775 • Published Sep 2, 2023 • 8
Multi-Modal Classifiers for Open-Vocabulary Object Detection Paper • 2306.05493 • Published Jun 8, 2023 • 6
One-2-3-45: Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization Paper • 2306.16928 • Published Jun 29, 2023 • 38