ViBe: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models • Paper • 2411.10867 • Published 8 days ago • 6
Guiding Vision-Language Model Selection for Visual Question-Answering Across Tasks, Domains, and Knowledge Types • Paper • 2409.09269 • Published Sep 14 • 7
Density Adaptive Attention-based Speech Network: Enhancing Feature Understanding for Mental Health Disorders • Paper • 2409.00391 • Published Aug 31 • 4
RoundTable: Leveraging Dynamic Schema and Contextual Autocomplete for Enhanced Query Precision in Tabular Question Answering • Paper • 2408.12369 • Published Aug 22 • 3
Evidence-backed Fact Checking using RAG and Few-Shot In-Context Learning with LLMs • Paper • 2408.12060 • Published Aug 22 • 5
Unboxing Occupational Bias: Grounded Debiasing LLMs with U.S. Labor Data • Paper • 2408.11247 • Published Aug 20 • 4
Out-of-Distribution Detection with Attention Head Masking for Multimodal Document Classification • Paper • 2408.11237 • Published Aug 20 • 5
The Brittleness of AI-Generated Image Watermarking Techniques: Examining Their Robustness Against Visual Paraphrasing Attacks • Paper • 2408.10446 • Published Aug 19 • 6