Article Open ASR Leaderboard: Trends and Insights with New Multilingual & Long-Form Tracks • 14 days ago • 19
AraLingBench A Human-Annotated Benchmark for Evaluating Arabic Linguistic Capabilities of Large Language Models Paper • 2511.14295 • Published 17 days ago • 71
Baseer: A Vision-Language Model for Arabic Document-to-Markdown OCR Paper • 2509.18174 • Published Sep 17 • 127
Qari-OCR: A High-Accuracy Model for Arabic Optical Character Collection Built on the powerful Qwen2 VL 2B and fine-tuned on an Arabic OCR dataset, Qari v0.1 is • 7 items • Updated Jun 25 • 11
Pearl Collection PEARL: A Multimodal Culturally-Aware Arabic Instruction Dataset • 4 items • Updated Oct 27 • 2
QARI-OCR: High-Fidelity Arabic Text Recognition through Multimodal Large Language Model Adaptation Paper • 2506.02295 • Published Jun 2 • 8
SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment Paper • 2507.20984 • Published Jul 28 • 56
NeuralOS: Towards Simulating Operating Systems via Neural Generative Models Paper • 2507.08800 • Published Jul 11 • 80
Distilling LLM Agent into Small Models with Retrieval and Code Tools Paper • 2505.17612 • Published May 23 • 81