- T^2-RAGBench: Text-and-Table Benchmark for Evaluating Retrieval-Augmented Generation
  Paper • 2506.12071 • Published • 2
- MTabVQA: Evaluating Multi-Tabular Reasoning of Language Models in Visual Space
  Paper • 2506.11684 • Published • 2
- LEMUR: A Corpus for Robust Fine-Tuning of Multilingual Law Embedding Models for Retrieval
  Paper • 2602.09570 • Published • 1
- Review Arcade: On the Human Alignment and Gameability of LLM Reviews
  Paper • 2605.28897 • Published • 1
Collection of popular open-source RAG datasets for evaluation