Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling Paper • 2405.21048 • Published May 31, 2024 • 12
DocLLM: A layout-aware generative language model for multimodal document understanding Paper • 2401.00908 • Published Dec 31, 2023 • 181
Aligning Large Multimodal Models with Factually Augmented RLHF Paper • 2309.14525 • Published Sep 25, 2023 • 29
Dual-Stream Diffusion Net for Text-to-Video Generation Paper • 2308.08316 • Published Aug 16, 2023 • 23
TeCH: Text-guided Reconstruction of Lifelike Clothed Humans Paper • 2308.08545 • Published Aug 16, 2023 • 33