Elie Bakouch

eliebak

AI & ML interests

Training LLMs @ 🤗

Recent Activity

reacted to cfahlgren1's post with ❤️ about 15 hours ago
upvoted a paper about 19 hours ago
upvoted a paper 6 days ago

Posts 1

Wow, impressive 340B model by nvidia with a nice permissive license! 🚀 The technical report is full of insights and seems to use a different learning rate schedule than cosine, probably a variant of WSD. Hope to get more info on that! 👀

nvidia/nemotron-4-340b-666b7ebaf1b3867caf2f1911