Fine-tuning options?
How can I fine-tune these models on a custom dataset?
@yukiarimo We have instructions for "instruction tuning" and "parameter-efficient finetuning" here: https://github.com/apple/corenet/tree/main/projects/openelm
@mchorton The PEFT fine-tuning configs in the mentioned repo are heavily customized for the commonsense reasoning datasets. Are there instructions on fine-tuning the models on a different dataset?
This one covers instruct tuning: https://github.com/apple/corenet/blob/main/projects/openelm/README-instruct.md
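For a custom dataset, the usual route outside corenet is the Hugging Face peft library. A minimal LoRA sketch, not the corenet configs; the target module names below are assumptions, so inspect your checkpoint's layer names before relying on them:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# OpenELM checkpoints ship custom modeling code, hence trust_remote_code
model = AutoModelForCausalLM.from_pretrained(
    "apple/OpenELM-270M", trust_remote_code=True
)

# ASSUMPTION: projection names vary by implementation; run
# print([n for n, _ in model.named_modules()]) and adjust target_modules
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["qkv_proj", "out_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, peft_config)
model.print_trainable_parameters()  # only the adapter weights should be trainable
```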
> How can I fine-tune these models on a custom dataset?
tried a full fine-tune with HuggingFace's SFTTrainer: took about 10 minutes for 3 epochs on a 4k-sample conversational dataset (OpenAssistant) on a 3090. loss looks good, and the trained model behaves as expected in my quick vibe check
code: https://github.com/geronimi73/3090_shorts/blob/main/nb_finetune-full_OpenELM-450M.ipynb
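For anyone who can't open the notebook, a minimal sketch of the same idea (trl's SFTTrainer API shifts between versions, and the dataset choice and hyperparameters here are placeholders, not the notebook's exact settings):

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

# OpenELM has no bundled tokenizer; the model card points to Llama-2's (gated repo)
model = AutoModelForCausalLM.from_pretrained("apple/OpenELM-450M", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer.pad_token = tokenizer.eos_token

# an OpenAssistant-derived dataset with a plain "text" column, roughly 10k rows
dataset = load_dataset("timdettmers/openassistant-guanaco", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # column holding the raw conversation text
    max_seq_length=1024,
    args=TrainingArguments(
        output_dir="openelm-450m-sft",
        num_train_epochs=3,
        per_device_train_batch_size=8,
        learning_rate=2e-5,
        bf16=True,
        logging_steps=10,
    ),
)
trainer.train()
```

At 450M parameters the full model fits comfortably in a 3090's 24 GB, which is why a full fine-tune (rather than LoRA) is feasible here.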
Thanks. That is what I need. Will try it later today.
Thanks g-Ronimo, how much did this cost? (roughly)
10 minutes on a 3090? Nothing, it's my own GPU at home.