Datasets used to train SmolDocling
HuggingFaceM4
Team
company
WebSight is a dataset of 823,000 synthetically generated English websites, each provided as HTML/CSS code with a corresponding screenshot.
- HuggingFaceM4/WebSight (dataset)
- HuggingFaceM4/VLM_WebSight_finetuned (Text Generation, 8B)
- Screenshot to HTML: ⚡ Convert screenshots to HTML code
- Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset (Paper 2403.09029)
Collection gathering artifacts related to OBELICS
Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation.
- IDEFICS2 Playground: 🐨 Chat with an AI assistant using text and images
- HuggingFaceM4/idefics2-8b (Image-Text-to-Text, 8B)
- HuggingFaceM4/idefics2-8b-chatty (Image-Text-to-Text, 8B)
- HuggingFaceM4/idefics2-8b-base (Image-Text-to-Text, 8B)
Collection assembling all the models and spaces related to IDEFICS