
Ivelin Ivanov PRO

ivelin

AI & ML interests

computer vision, vision-language models, multi modal transformers

Recent Activity

upvoted an article 3 days ago

π0 and π0-FAST: Vision-Language-Action Models for General Robot Control

upvoted an article 10 days ago

Open-R1: a fully open reproduction of DeepSeek-R1

updated a model about 2 months ago

ivelin/SmolVLM-Instruct-vqav2



ivelin's activity

upvoted an article 3 days ago

Article

π0 and π0-FAST: Vision-Language-Action Models for General Robot Control

4 days ago

• 76

upvoted an article 10 days ago

Article

Open-R1: a fully open reproduction of DeepSeek-R1

11 days ago

• 660

updated a model about 2 months ago

ivelin/SmolVLM-Instruct-vqav2

Updated Dec 14, 2024 • 2

liked a Space about 2 months ago

EduScape

🌍

reacted to merve's post with 😎 2 months ago

Post

2673

small but mighty 🔥
you can fine-tune SmolVLM on an L4 with a batch size of 4 and it will only take 16.4 GB of VRAM 🫰🏻 also, with gradient accumulation, the simulated batch size is 16 ✨
I made a notebook that includes all the goodies: QLoRA, gradient accumulation, and gradient checkpointing, with explanations of how they work 💝 https://github.com/huggingface/smollm/blob/main/finetuning/Smol_VLM_FT.ipynb
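
To make the batch-size arithmetic in the post concrete, here is a minimal pure-Python sketch of gradient accumulation. It is not taken from the linked notebook. The toy linear model, the data, and every function name are hypothetical, and the count of 4 accumulation steps is an assumption inferred from 16 / 4 rather than something the post states. The sketch only illustrates that averaging the gradients of four micro-batches of 4 gives the same update as one batch of 16, while only one micro-batch has to be processed at a time.

```python
# Pure-Python sketch of gradient accumulation for a 1-D linear model y = w * x
# trained with mean-squared-error loss. All names here are illustrative.

def mse_grad(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to w over one batch."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_batch_size, accumulation_steps):
    """Accumulate scaled micro-batch gradients into one effective gradient."""
    total = 0.0
    for step in range(accumulation_steps):
        start = step * micro_batch_size
        end = start + micro_batch_size
        # Dividing by the step count makes the running sum equal the mean
        # gradient over all samples seen so far in this accumulation cycle.
        total += mse_grad(w, xs[start:end], ys[start:end]) / accumulation_steps
    return total


MICRO_BATCH_SIZE = 4     # per-step batch size quoted in the post
ACCUMULATION_STEPS = 4   # assumption: inferred as 16 / 4
EFFECTIVE_BATCH_SIZE = MICRO_BATCH_SIZE * ACCUMULATION_STEPS

# Synthetic data for illustration only.
xs = [float(i) for i in range(1, EFFECTIVE_BATCH_SIZE + 1)]
ys = [3.0 * x + 1.0 for x in xs]
w = 0.5

full = mse_grad(w, xs, ys)
accum = accumulated_grad(w, xs, ys, MICRO_BATCH_SIZE, ACCUMULATION_STEPS)

print(f"effective batch size: {EFFECTIVE_BATCH_SIZE}")
print(f"gradients match: {abs(full - accum) < 1e-9}")
```

In a real training loop the optimizer step would run once per accumulation cycle, after the last micro-batch, which is why the memory cost stays at that of a batch of 4.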