Andrea Gemelli

andreagemelli

https://www.andreagemelli.me

AI & ML interests

Natural Language Processing, Computer Vision, Generative Models, Document Analysis

Recent Activity

New activity in letxbe/DocExplainer 24 days ago

Which license is it?

#1 opened 25 days ago by

plamb

updated a model 24 days ago

letxbe/DocExplainer

Visual Question Answering • Updated 24 days ago • 63

upvoted a collection 26 days ago

Holo1.5

Collection

Holo1.5 - Open Foundation Models for Computer Use Agents • 5 items • Updated 29 days ago • 33

liked a model 26 days ago

blowing-up-groundhogs/emuru

Text-to-Image • 0.7B • Updated Jul 31 • 258 • 10

authored a paper 27 days ago

Towards Reliable and Interpretable Document Question Answering via VLMs

Paper • 2509.10129 • Published Sep 12

liked 2 Spaces about 1 month ago

MCP Server Template

🛠

Pokemon Showdown

⚡

Display web content in a full-screen iframe

published a model about 1 month ago

letxbe/DocExplainer

Visual Question Answering • Updated 24 days ago • 63

liked a Space about 2 months ago

1.09k

FineWeb: decanting the web for the finest text data at scale

🍷

Generate high-quality text data for LLMs using FineWeb

commented on SmolLM3: smol, multilingual, long-context reasoner 3 months ago

Amazing 🚀

upvoted an article 3 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

Jul 8

• 695

upvoted a paper 3 months ago

How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks

Paper • 2507.01955 • Published Jul 2 • 35

upvoted a collection 5 months ago

Qwen3

Collection

84 items • Updated Aug 6 • 1.33k

liked 2 models 6 months ago

letxbe/qwen2-7b-BoundingDocs-rephrased

Image-to-Text • 8B • Updated May 8 • 3

letxbe/mistral-7b-v03-BoundingDocs-rephrased

Text Generation • 7B • Updated May 6 • 1 • 2

liked a Space 6 months ago

AlfredAgent

🏢

Generate answers using web searches and tools

upvoted a paper 6 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 200

upvoted 2 articles 7 months ago

Article

SmolVLM2: Bringing Video Understanding to Every Device

Feb 20

• 307

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

Mar 12

• 465

upvoted a collection 8 months ago

Qwen2.5-VL

Collection

Vision-language model series based on Qwen2.5 • 11 items • Updated Jul 21 • 543
