Mariusz Kurman PRO

mkurman

AI & ML interests

AI Tech Lead | MD

Organizations

MedIT Solutions, BigScience Biomedical Datasets, SOWA Project

Posts 17

Just released NVAMP Loss!

βœ”οΈ modification of the cross-entropy loss function designed specifically for training LLMs.
βœ”οΈ twist on the standard cross-entropy loss by emphasizing the importance of outlier prediction errors and dynamically normalizing token-level variance.
βœ”οΈ more stable and efficient training, leading to models that generalize better.

Check it out, give it a spin, and let me know what you think!

Licensed under the Apache 2.0 license and ready to use. Happy training! 🔥🤖

https://github.com/mkurman/nvamp-loss