Datasets used to train SmolDocling
HuggingFaceM4
Organization Card
HuggingFaceM4 is the multimodal team at Hugging Face, working on vision-language models.
Within this organization on the Hugging Face Hub, you can access the Idefics models (version 1 IDEFICS, version 2 Idefics2, version 3 Idefics3), training datasets such as OBELICS, WebSight, The Cauldron, and Docmatix, and interactive tools to visualize the results.
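As a rough illustration of how the repositories in this organization can be reached programmatically, here is a minimal sketch. It assumes the `huggingface_hub` Python package is installed and that network access is available; the helper names `hub_url` and `iter_org_models` are hypothetical and not part of any library.

```python
from typing import Iterator

ORG = "HuggingFaceM4"


def hub_url(repo_id: str, repo_type: str = "model") -> str:
    """Build the public Hub URL for a repo id such as 'HuggingFaceM4/idefics2-8b'."""
    # Models live at the root; datasets and Spaces sit under a path prefix.
    prefix = {"model": "", "dataset": "datasets/", "space": "spaces/"}[repo_type]
    return f"https://huggingface.co/{prefix}{repo_id}"


def iter_org_models(org: str = ORG, limit: int = 10) -> Iterator[str]:
    """Yield ids of models owned by `org` (requires network and huggingface_hub)."""
    # Imported lazily so that hub_url() stays usable without the package.
    from huggingface_hub import HfApi

    for info in HfApi().list_models(author=org, limit=limit):
        yield info.id
```

Calling `list(iter_org_models(limit=5))` should return a handful of model ids from the organization, and `hub_url(model_id)` turns each one into a browsable link. Datasets can be listed the same way with `HfApi().list_datasets(author=...)`.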
Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation.
IDEFICS2 Playground
🐨 Chat with an AI assistant using text and images • 167
HuggingFaceM4/idefics2-8b
Image-Text-to-Text • 8B • Updated • 62.5k • 615
HuggingFaceM4/idefics2-8b-chatty
Image-Text-to-Text • 8B • Updated • 269 • 95
HuggingFaceM4/idefics2-8b-base
Image-Text-to-Text • 8B • Updated • 928 • 28
Spaces (16)
IDEFICS Playground
🐨 Pinned • Build error • 377
FineVision: Open Data is All You Need
📝 A new open-source dataset for training VLMs • Running • 144
Idefics3
📊 Generate text based on an image and prompt • Paused • 102
Florence 2
📉 Answer questions about images using text prompts • Running on Zero • 16
Screenshot to HTML
⚡ Convert screenshots to HTML code • Running on Zero • 906
Models (34)
HuggingFaceM4/Idefics3-8B-Llama3
Image-Text-to-Text • 8B • Updated • 30.6k • 294
HuggingFaceM4/Florence-2-DocVQA
Image-Text-to-Text • 0.8B • Updated • 1.72k • 62
HuggingFaceM4/idefics2-8b
Image-Text-to-Text • 8B • Updated • 62.5k • 615
HuggingFaceM4/idefics2-8b-base
Image-Text-to-Text • 8B • Updated • 928 • 28
HuggingFaceM4/idefics2-8b-chatty
Image-Text-to-Text • 8B • Updated • 269 • 95
HuggingFaceM4/siglip-so400m-14-364-flash-attn2-navit
Zero-Shot Image Classification • 0.9B • Updated • 21 • 1
HuggingFaceM4/siglip-so400m-14-700-flash-attn2-navit
Zero-Shot Image Classification • 0.9B • Updated • 20 • 2
HuggingFaceM4/siglip-so400m-14-384-flash-attn2-navit
Zero-Shot Image Classification • 0.9B • Updated • 31 • 1
HuggingFaceM4/idefics2-8b-chatty-AWQ
Image-Text-to-Text • 2B • Updated • 12 • 5
HuggingFaceM4/idefics2-8b-AWQ
Image-Text-to-Text • 2B • Updated • 39 • 26
Datasets (81)
HuggingFaceM4/FineVision
Viewer • Updated • 24.2M • 246k • 336
HuggingFaceM4/lmms-eval-embeddings
Updated • 759 • 1
HuggingFaceM4/DoclingMatix
Viewer • Updated • 1.27M • 5.15k • 40
HuggingFaceM4/Caltech-101
Updated • 348 • 3
HuggingFaceM4/Docmatix
Viewer • Updated • 2.55M • 19.1k • 290
HuggingFaceM4/the_cauldron
Viewer • Updated • 1.88M • 226k • 496
HuggingFaceM4/FairFace
Viewer • Updated • 195k • 1.37k • 21
HuggingFaceM4/MMBench
Viewer • Updated • 11k • 178 • 3
HuggingFaceM4/WebSight
Viewer • Updated • 2.75M • 10.4k • 366
HuggingFaceM4/debug_MMMU_mcq_to_remove
Viewer • Updated • 10.9k • 82