ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations Paper • 2501.14607 • Published Jan 24, 2025
PVUW 2025 Challenge Report: Advances in Pixel-level Understanding of Complex Videos in the Wild Paper • 2504.11326 • Published Apr 15, 2025 • 5
Progressive Pretext Task Learning for Human Trajectory Prediction Paper • 2407.11588 • Published Jul 16, 2024
Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation Paper • 2505.12702 • Published May 19, 2025
Seg-ReSearch: Segmentation with Interleaved Reasoning and External Search Paper • 2602.04454 • Published 5 days ago • 1
ObjEmbed: Towards Universal Multimodal Object Embeddings Paper • 2602.01753 • Published 7 days ago • 5
WeDetect: Fast Open-Vocabulary Object Detection as Retrieval Paper • 2512.12309 • Published Dec 13, 2025 • 3
IRG-MotionLLM: Interleaving Motion Generation, Assessment and Refinement for Text-to-Motion Generation Paper • 2512.10730 • Published Dec 11, 2025 • 3