Afonso Diela

afondiel

https://afondiel.github.io

AI & ML interests

AI & Robotics: vision, perception, cultural AI.

afondiel's activity

upvoted a collection 1 day ago

🪐 SmolLM

Collection

A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated Aug 18 • 200

updated a collection 2 days ago

Multimodality

Collection

2 items • Updated 2 days ago

liked a Space 2 days ago

⚡ 4M Demo

EPFL-VILAB/4M • 4M: Massively Multimodal Masked Modeling • Running on Zero • 196

liked a model 21 days ago

Svngoku/ancient-africans

Text-to-Image • Updated 22 days ago • 137 • 9

Reacted to merve's post with 🔥 21 days ago

Another great week in open ML!
Here's a small recap 🫰🏻

Model releases
⏯️ Video Language Models
AI at Meta released Vision-CAIR/LongVU_Qwen2_7B, a new state-of-the-art long video LM based on DINOv2, SigLIP, Qwen2 and Llama 3.2.

💬 Small language models
Hugging Face released HuggingFaceTB/SmolLM2-1.7B, part of a family of new smol language models with an Apache 2.0 license that come in sizes 135M, 360M and 1.7B, along with datasets.
Meta released facebook/MobileLLM-1B, part of a new family of on-device LLMs of sizes 125M, 350M and 600M.

🖼️ Image Generation
Stability AI released stabilityai/stable-diffusion-3.5-medium, a 2B model with a commercially permissive license.

🖼️💬Any-to-Any
gpt-omni/mini-omni2, the closest reproduction of GPT-4o, is released! It is a new LLM that can take image, text and audio input and output speech.

Dataset releases
🖼️ Spawning/PD12M, a new captioning dataset of 12.4 million examples generated using Florence-2