A Survey of Large Language Models in Medicine: Principles, Applications, and Challenges Paper • 2311.05112 • Published Nov 9, 2023 • 1
Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning Paper • 2303.14369 • Published Mar 25, 2023
Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment Paper • 2305.12218 • Published May 20, 2023
LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference Paper • 2406.18139 • Published Jun 26, 2024 • 2
Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension Paper • 2411.13093 • Published Nov 20, 2024 • 2
QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Video Comprehension Paper • 2503.08689 • Published Mar 11 • 4