DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation Paper • 2412.07589 • Published 9 days ago • 44
POINTS1.5: Building a Vision-Language Model towards Real World Applications Paper • 2412.08443 • Published 8 days ago • 38
DiffMorpher: Unleashing the Capability of Diffusion Models for Image Morphing Paper • 2312.07409 • Published Dec 12, 2023 • 21
FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition Paper • 2312.07536 • Published Dec 12, 2023 • 16
BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs Paper • 2307.08581 • Published Jul 17, 2023 • 27
JourneyDB: A Benchmark for Generative Image Understanding Paper • 2307.00716 • Published Jul 3, 2023 • 19
Zero-shot spatial layout conditioning for text-to-image diffusion models Paper • 2306.13754 • Published Jun 23, 2023 • 6 • 1