From Scores to Skills: A Cognitive Diagnosis Framework for Evaluating Financial Large Language Models Paper • 2508.13491 • Published Aug 19
MultiFinBen: A Multilingual, Multimodal, and Difficulty-Aware Benchmark for Financial LLM Evaluation Paper • 2506.14028 • Published Jun 16
FinAudio: A Benchmark for Audio Large Language Models in Financial Applications Paper • 2503.20990 • Published Mar 26