DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation • Paper • 2412.07589 • Published 8 days ago • 43
POINTS1.5: Building a Vision-Language Model towards Real World Applications • Paper • 2412.08443 • Published 7 days ago • 37
DiffMorpher: Unleashing the Capability of Diffusion Models for Image Morphing • Paper • 2312.07409 • Published Dec 12, 2023 • 21
FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition • Paper • 2312.07536 • Published Dec 12, 2023 • 16
BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs • Paper • 2307.08581 • Published Jul 17, 2023 • 27
JourneyDB: A Benchmark for Generative Image Understanding • Paper • 2307.00716 • Published Jul 3, 2023 • 19
ChessGPT: Bridging Policy Learning and Language Modeling • Paper • 2306.09200 • Published Jun 15, 2023 • 9