From RAG to Agentic RAG for Faithful Islamic Question Answering Paper • 2601.07528 • Published Jan 12 • 1
Prototypicality Bias Reveals Blindspots in Multimodal Evaluation Metrics Paper • 2601.04946 • Published Jan 8
MixtureVitae: Open Web-Scale Pretraining Dataset With High Quality Instruction and Reasoning Data Built from Permissive-First Text Sources Paper • 2509.25531 • Published Sep 29, 2025 • 9
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution Paper • 2510.08697 • Published Oct 9, 2025 • 39
Distributional Semantics Tracing: A Framework for Explaining Hallucinations in Large Language Models Paper • 2510.06107 • Published Oct 7, 2025 • 3
EmbeddingGemma: Powerful and Lightweight Text Representations Paper • 2509.20354 • Published Sep 24, 2025 • 47
Adding LLMs to the psycholinguistic norming toolbox: A practical guide to getting the most out of human ratings Paper • 2509.14405 • Published Sep 17, 2025 • 2
Psycholinguistic Word Features: a New Approach for the Evaluation of LLMs Alignment with Humans Paper • 2506.22439 • Published May 29, 2025 • 3
Apertus: Democratizing Open and Compliant LLMs for Global Language Environments Paper • 2509.14233 • Published Sep 17, 2025 • 16
La Leaderboard: A Large Language Model Leaderboard for Spanish Varieties and Languages of Spain and Latin America Paper • 2507.00999 • Published Jul 1, 2025 • 1
Post: We're kick-starting the process of Transformers v5, with @ArthurZ and @cyrilvallez! v5 should be significant: we're using it as a milestone for performance optimizations, saner defaults, and a much cleaner code base worthy of 2025. Fun fact: v4.0.0-rc-1 came out on Nov 19, 2020, nearly five years ago!
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities Paper • 2507.06261 • Published Jul 7, 2025 • 67
Multilingual State Space Models for Structured Question Answering in Indic Languages Paper • 2502.01673 • Published Feb 1, 2025 • 2
Leveraging Vision-Language Pre-training for Human Activity Recognition in Still Images Paper • 2506.13458 • Published Jun 16, 2025
Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation Paper • 2504.07072 • Published Apr 9, 2025 • 9
It's the same but not the same: Do LLMs distinguish Spanish varieties? Paper • 2504.20049 • Published Apr 8, 2025
Date Fragments: A Hidden Bottleneck of Tokenization for Temporal Reasoning Paper • 2505.16088 • Published May 22, 2025 • 3