Ahmed Masry PRO

ahmed-masry

https://ahmedmasryku.github.io/

Ahmed_Masry97

AI & ML interests

Multimodal Chart Understanding, Multimodal Document AI, Multimodal Vision-Language Models

Recent Activity

published a Space 12 days ago

ahmed-masry/Label-Studio

updated a Space 12 days ago

ahmed-masry/Label-Studio

updated a model about 2 months ago

ahmed-masry/UI-TARS-2B-SFT

authored a paper 5 months ago

ChartQAPro: A More Diverse and Challenging Benchmark for Chart Question Answering

Paper • 2504.05506 • Published Apr 7, 2025 • 24

authored a paper 7 months ago

AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding

Paper • 2502.01341 • Published Feb 3, 2025 • 39

authored a paper 9 months ago

BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks

Paper • 2412.04626 • Published Dec 5, 2024 • 14

authored 6 papers about 1 year ago

Chart-to-Text: A Large-Scale Benchmark for Chart Summarization

Paper • 2203.06486 • Published Mar 12, 2022

ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning

Paper • 2203.10244 • Published Mar 19, 2022

ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild

Paper • 2407.04172 • Published Jul 4, 2024 • 27

UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning

Paper • 2305.14761 • Published May 24, 2023

Do LLMs Work on Charts? Designing Few-Shot Prompts for Chart Question Answering and Summarization

Paper • 2312.10610 • Published Dec 17, 2023 • 1

ChartInstruct: Instruction Tuning for Chart Comprehension and Reasoning

Paper • 2403.09028 • Published Mar 14, 2024