GaussianProperty: Integrating Physical Properties to 3D Gaussians with LMMs Paper • 2412.11258 • Published 3 days ago • 11
SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator Paper • 2412.12094 • Published 2 days ago • 8
ColorFlow: Retrieval-Augmented Image Sequence Colorization Paper • 2412.11815 • Published 2 days ago • 22
Evaluation Agent: Efficient and Promptable Evaluation Framework for Visual Generative Models Paper • 2412.09645 • Published 8 days ago • 30
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published 6 days ago • 53
RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation Paper • 2412.11919 • Published 2 days ago • 29
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models Paper • 2412.11605 • Published 2 days ago • 11
IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations Paper • 2412.12083 • Published 2 days ago • 12
FluxSpace: Disentangled Semantic Editing in Rectified Flow Transformers Paper • 2412.09611 • Published 6 days ago • 9
FireFlow: Fast Inversion of Rectified Flow for Image Semantic Editing Paper • 2412.07517 • Published 8 days ago • 11
ObjectMate: A Recurrence Prior for Object Insertion and Subject-Driven Generation Paper • 2412.08645 • Published 7 days ago • 11
InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption Paper • 2412.09283 • Published 6 days ago • 19
BiMediX2: Bio-Medical EXpert LMM for Diverse Medical Modalities Paper • 2412.07769 • Published 8 days ago • 25
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding Paper • 2412.09604 • Published 6 days ago • 35
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published 5 days ago • 121
FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion Paper • 2412.09626 • Published 6 days ago • 19