RaVL: Discovering and Mitigating Spurious Correlations in Fine-Tuned Vision-Language Models Paper • 2411.04097 • Published 15 days ago • 5
Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination Paper • 2411.03823 • Published 16 days ago • 43
CoD, Towards an Interpretable Medical Agent using Chain of Diagnosis Paper • 2407.13301 • Published Jul 18, 2024 • 54
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments Paper • 2404.07972 • Published Apr 11, 2024 • 46
Clinical Text Summarization: Adapting Large Language Models Can Outperform Human Experts Paper • 2309.07430 • Published Sep 14, 2023 • 27
CheXagent: Towards a Foundation Model for Chest X-Ray Interpretation Paper • 2401.12208 • Published Jan 22, 2024 • 22