Low-level Visual Assistants! Collection Multi-purpose Assistant for Low-level Visual Perception, from [Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models] • 6 items • Updated Dec 2, 2024 • 2
Visual Scorers! Collection Variants of Visual Evaluation Models proposed by [Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-defined Levels]. Use via `model.score()`! • 10 items • Updated Dec 2, 2024 • 3
Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare Paper • 2405.19298 • Published May 29, 2024
VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation Paper • 2411.13281 • Published Nov 20, 2024 • 18
Aria: An Open Multimodal Native Mixture-of-Experts Model Paper • 2410.05993 • Published Oct 8, 2024 • 108
Q-Ground: Image Quality Grounding with Large Multi-modality Models Paper • 2407.17035 • Published Jul 24, 2024 • 1
LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding Paper • 2407.15754 • Published Jul 22, 2024 • 20