small but mighty 🔥
you can fine-tune SmolVLM on an L4 with a batch size of 4 and it will only take 16.4 GB VRAM 🫰🏻 with gradient accumulation the simulated batch size goes up to 16 ✨
I made a notebook that includes all the goodies: QLoRA, gradient accumulation, and gradient checkpointing, with explanations of how they work 📖 https://github.com/huggingface/smollm/blob/main/finetuning/Smol_VLM_FT.ipynb
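if you want a quick taste before opening the notebook, here's a minimal sketch of how the three tricks fit together (the checkpoint name, LoRA target modules, and hyperparameters below are my assumptions; the notebook has the full, tested config):

```python
import torch
from transformers import (
    AutoModelForVision2Seq,
    AutoProcessor,
    BitsAndBytesConfig,
    TrainingArguments,
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "HuggingFaceTB/SmolVLM-Instruct"  # assumed checkpoint name

# QLoRA: load the frozen base model quantized to 4-bit NF4, so only the
# small LoRA adapter weights live (and train) in higher precision.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Gradient checkpointing: recompute activations during the backward pass
# instead of storing them all, trading extra compute for a big VRAM drop.
model = prepare_model_for_kbit_training(model)  # also enables checkpointing

# LoRA adapters on the attention projections (this target-module list is
# an assumption; check the notebook for the exact one).
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Gradient accumulation: 4 samples per step x 4 accumulation steps
# = a simulated batch size of 16 per optimizer update.
training_args = TrainingArguments(
    output_dir="smolvlm-ft",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    bf16=True,
)
```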