Playground v3: Improving Text-to-Image Alignment with Deep-Fusion Large Language Models Paper • 2409.10695 • Published Sep 16, 2024 • 2
An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion Paper • 2208.01618 • Published Aug 2, 2022 • 1
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution Paper • 2409.12191 • Published Sep 18, 2024 • 74
Imagine yourself: Tuning-Free Personalized Image Generation Paper • 2409.13346 • Published Sep 20, 2024 • 67
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models Paper • 2409.17146 • Published Sep 25, 2024 • 103
ControlNet-XS: Designing an Efficient and Effective Architecture for Controlling Text-to-Image Diffusion Models Paper • 2312.06573 • Published Dec 11, 2023 • 1
Common Diffusion Noise Schedules and Sample Steps are Flawed Paper • 2305.08891 • Published May 15, 2023 • 8