Brian Guo (PRO)
@JoyboyBrian
21 followers · 18 following
Linked accounts: brianatnexa · JoyboyBrian · zhengyang-guo
AI & ML interests
None yet
Recent Activity
Reacted to thomwolf's post with 🚀 · 1 day ago
A Little Guide to Building Large Language Models in 2024

This is a recording of a 75-minute lecture I gave two weeks ago on how to train an LLM from scratch in 2024. I tried to keep it short and comprehensive, focusing on concepts that are crucial for training good LLMs but often hidden in tech reports.

In the lecture, I introduce the students to all the important concepts/tools/techniques for training a high-performance LLM:
* finding, preparing, and evaluating web-scale data
* understanding model parallelism and efficient training
* fine-tuning/aligning models
* fast inference

There are of course many things and details missing that I should have added; don't hesitate to tell me your most frustrating omission and I'll add it in a future part. In particular, I think I'll add more focus on how to filter topics well and extensively, and maybe more practical anecdotes and details.

Now that I've recorded it, I've been thinking this could be part 1 of a two-part series, with a second, fully hands-on video on how to run all these steps with some libraries and recipes we've released recently at HF around LLM training (and that could easily be adapted to other frameworks anyway):
* `datatrove` for all things web-scale data preparation: https://github.com/huggingface/datatrove
* `nanotron` for lightweight 4D-parallelism LLM training: https://github.com/huggingface/nanotron
* `lighteval` for fast, parallel in-training LLM evaluations: https://github.com/huggingface/lighteval

Here is the link to watch the lecture on YouTube: https://www.youtube.com/watch?v=2-SPH9hIKT8
And here is the link to the Google Slides: https://docs.google.com/presentation/d/1IkzESdOwdmwvPxIELYJi8--K3EZ98_cL6c5ZcLKSyVg/edit#slide=id.p

Enjoy, and I'm happy to hear feedback on it and what to add, correct, or extend in a second part.
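The first step the lecture covers, preparing and evaluating web-scale data, typically boils down to filtering and deduplication passes. The sketch below is a toy illustration of that idea in plain Python, not the `datatrove` API: `quality_filter`, its thresholds, and the exact-hash dedup are all illustrative assumptions (real pipelines use richer heuristics and fuzzy dedup such as MinHash/LSH).

```python
import hashlib

def quality_filter(doc: str, min_words: int = 5, max_symbol_ratio: float = 0.3) -> bool:
    """Toy heuristic filter: drop very short docs and symbol-heavy docs."""
    words = doc.split()
    if len(words) < min_words:
        return False
    symbols = sum(1 for ch in doc if not (ch.isalnum() or ch.isspace()))
    return symbols / max(len(doc), 1) <= max_symbol_ratio

def dedup(docs):
    """Exact deduplication by normalized content hash.

    Web-scale pipelines use fuzzy methods (MinHash/LSH) instead;
    this only catches exact and case/whitespace-level duplicates.
    """
    seen, out = set(), []
    for doc in docs:
        h = hashlib.sha256(doc.strip().lower().encode()).hexdigest()
        if h not in seen:
            seen.add(h)
            out.append(doc)
    return out

corpus = [
    "The quick brown fox jumps over the lazy dog.",
    "the quick brown fox jumps over the lazy dog.",  # case-level duplicate
    "$$$ ### !!!",                                   # too short, symbol-heavy
    "Large language models are trained on web-scale text corpora.",
]
cleaned = [d for d in dedup(corpus) if quality_filter(d)]
print(len(cleaned))  # 2
```

In a real pipeline these stages would be composed as streaming steps over sharded files rather than in-memory lists, which is essentially what purpose-built libraries provide.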
Reacted to thomwolf's post (same post as above) with ❤️ · 1 day ago
Organizations
Models (2) · Sort: Recently updated
- JoyboyBrian/Phi-3-mini-128k-instruct-Q4_0-GGUF · Text Generation · Updated May 4 · 44
- JoyboyBrian/gemma-2b-Q4_0-GGUF · Updated May 4 · 11 · 1
Datasets (3) · Sort: Recently updated
- JoyboyBrian/nexa-audiolm-benchmark · Viewer · Updated 9 days ago · 200 · 27
- JoyboyBrian/hp_lora_training_data · Viewer · Updated 10 days ago · 36.4k · 27
- JoyboyBrian/tests · Viewer · Updated 13 days ago · 3.74k · 39