LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents Paper • 2311.05437 • Published Nov 9, 2023 • 47
CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs Paper • 2406.18521 • Published Jun 26, 2024 • 28
Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks Paper • 2403.04814 • Published Mar 7, 2024 • 1
CYBERSECEVAL 3: Advancing the Evaluation of Cybersecurity Risks and Capabilities in Large Language Models Paper • 2408.01605 • Published Aug 2, 2024 • 1
Narrowing the Knowledge Evaluation Gap: Open-Domain Question Answering with Multi-Granularity Answers Paper • 2401.04695 • Published Jan 9, 2024 • 11