WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences Paper • 2406.11069 • Published Jun 16 • 13
MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos Paper • 2406.08407 • Published Jun 12 • 24
Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense? Paper • 2406.07546 • Published Jun 11 • 8