Ansh Gupta

thisisanshgupta

AI & ML interests

PyTorch | NLP

Recent Activity

liked a Space about 2 months ago
haoheliu/audioldm2-text2audio-text2music
liked a Space about 2 months ago
lamm-mit/PDF2Audio

thisisanshgupta's activity

Reacted to dylanebert's post with 🔥 3 months ago
Here's a 1-minute video tutorial on how to fine-tune unsloth/llama-3-8b-bnb-4bit with unsloth

Using Roller Coaster Tycoon peep thoughts as an example
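
For anyone who'd rather skim code than watch, here's a minimal sketch of what fine-tuning this model with unsloth typically looks like, following unsloth's documented QLoRA workflow. The dataset name is a hypothetical placeholder for the Roller Coaster Tycoon peep-thoughts data, and the hyperparameters are illustrative defaults from unsloth's example notebooks, not taken from the video itself.

```python
# Minimal unsloth QLoRA fine-tuning sketch (illustrative, not the video's exact script).
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

# Load the 4-bit quantized Llama 3 8B base model.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small fraction of weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing=True,
)

# Hypothetical dataset with a "text" column of formatted training examples.
dataset = load_dataset("your-username/peep-thoughts", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=60,
        learning_rate=2e-4,
        fp16=True,
        output_dir="outputs",
    ),
)
trainer.train()
```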
Reacted to singhsidhukuldeep's post with 🔥 3 months ago
It took just $930 to train Google's original Transformer model from 2017. 💸

Next to the $191 million Google spent on Gemini Ultra, that sounds like a bargain! 💰

Training Gemini Ultra required 50 billion petaFLOPs of compute (one petaFLOP is one quadrillion floating-point operations, so that's roughly 5 × 10^25 FLOPs in total). 🤖
Compare that to OpenAI's GPT-4, which took 21 billion petaFLOPs at a cost of $78 million. 💡

2017: Original Transformer Model: $930 [@Google] 💻
2018: BERT-Large: $3,288 [@Google] 📚
2019: RoBERTa Large: $160K [@Meta] 🌐
2020: GPT-3 (175B): $4.32M [@OpenAI] 🧠
2023: Llama 2 70B: $3.93M [@Meta] 👍
2023: GPT-4: $78.35M [@OpenAI] 🌟
Now, Gemini Ultra: $191.4M [@Google] 🚀

This forms an exponential curve! 🤯
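
As a quick, purely illustrative check, you can fit a straight line to log10(cost) versus year using the figures above; the roughly constant slope is what makes the curve exponential. The Gemini Ultra year below is approximate.

```python
# Illustrative check of the exponential trend: fit log10(cost) vs. year
# using the cost figures listed above.
import numpy as np

years = np.array([2017, 2018, 2019, 2020, 2023, 2023, 2024])
costs = np.array([930, 3_288, 160_000, 4_320_000,
                  3_930_000, 78_350_000, 191_400_000])  # USD

slope, _ = np.polyfit(years, np.log10(costs), 1)
print(f"Average growth: ~{10 ** slope:.1f}x per year")  # roughly 5x per year
```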

But why? 🤔
Compute, data, and expertise. All three come at a great cost! ⚙️📊💡

Google recently made Gemini-1.5-Flash fine-tuning free, since it's almost impossible for regular businesses to justify training a foundation model in-house! 🆓

This cost barrier will mean fewer new foundation models, less competition, and more fine-tunes! 📉🔄

Data [Stanford University's 2024 AI Index Report]: https://aiindex.stanford.edu/report/
Graphic: https://voronoiapp.com/technology/Googles-Gemini-Ultra-Cost-191M-to-Develop--1088

Many thanks to everyone spending tons of resources and open-sourcing the models! 🤗