Michael Pieler

MicPie

AI & ML interests

ML

Recent Activity

liked a Space about 1 month ago
k-mktr/gpu-poor-llm-arena

Organizations

MicPie's activity

Reacted to m-ric's post with 👀 4 months ago
๐—ง๐—ต๐—ฒ ๐—ต๐˜‚๐—ด๐—ฒ ๐—ฐ๐—ผ๐˜€๐˜ ๐—ผ๐—ณ ๐—ฟ๐—ฒ๐˜€๐—ฒ๐—ฎ๐—ฟ๐—ฐ๐—ต ๐—ผ๐—ป ๐—ณ๐—ฟ๐—ผ๐—ป๐˜๐—ถ๐—ฒ๐—ฟ ๐—Ÿ๐—Ÿ๐— ๐˜€ ๐Ÿ’ธ

Google DeepMind recently released a great paper that identifies optimal hyperparameters for training across different regimes: Scaling Exponents Across Parameterizations and Optimizers, with data from 10,000 training runs.

One engineer decided to quantify the price of such a large-scale experiment.

😬 And the bill is hefty: ~13M USD

This exact number should be taken with a grain of salt, since many approximations were needed to reach the final figure.
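
To see where a figure like this comes from, here is a minimal back-of-envelope sketch, assuming the common ~6·N·D rule of thumb for training FLOPs; the helper `training_cost_usd` and all hardware and price numbers below are hypothetical illustrations, not figures from the paper or the linked post.

```python
# Back-of-envelope LLM training cost estimate.
# All numbers are hypothetical; the linked blog post does far more
# detailed accounting across the paper's full sweep of runs.

def training_cost_usd(
    n_params: float,          # model parameters N
    n_tokens: float,          # training tokens D
    peak_flops: float,        # accelerator peak FLOP/s
    mfu: float,               # assumed model FLOPs utilization (0-1)
    usd_per_chip_hour: float, # assumed rental price per chip-hour
) -> float:
    """Estimate one run's cost via the ~6*N*D training-FLOPs approximation."""
    total_flops = 6 * n_params * n_tokens
    chip_seconds = total_flops / (peak_flops * mfu)
    return chip_seconds / 3600 * usd_per_chip_hour

# Example: a hypothetical 1B-parameter model trained on 20B tokens,
# on a 300 TFLOP/s chip at 40% utilization, rented at $2/chip-hour.
cost = training_cost_usd(1e9, 20e9, 300e12, 0.4, 2.0)
print(f"~${cost:,.0f} for a single run")
# A 10,000-run sweep sums this over every (model size, token count) pair.
```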

โ›”๏ธ But still this ballpark means that for this sole experiment, the price is way over what most startups or research labs could afford.

This means that open-sourcing research is more important than ever, to put everyone in the ecosystem on a roughly equal footing. Don't let OpenAI run first; they'll keep everything for themselves!

Read the full post that quantifies the paper's cost 👉 https://152334h.github.io/blog/scaling-exponents/
New activity in sfairXC/FsfairX-LLaMA3-RM-v0.1 7 months ago

Training details?

#2 opened 7 months ago by MicPie
New activity in JeanKaddour/minipile about 1 year ago

Domain and provenance annotation

#1 opened about 1 year ago by haukur
New activity in allenai/peS2o over 1 year ago