kirch's Collections
Scotch & SOTA 🥃 Pt. 4: Pre-Training Datasets 📜
Updated Sep 25, 2023
We gotta start somewhere. These JSONL files aren't gonna train themselves.
allenai/dolma • Updated Apr 17 • 1.33k • 852
allenai/peS2o • Updated Oct 13 • 2.63k • 152
tiiuae/falcon-refinedweb • Viewer • Updated Jun 20, 2023 • 968M • 29.5k • 815
CarperAI/pilev2-dev • Preview • Updated Mar 13, 2023 • 35 • 23
AlgorithmicResearchGroup/arxiv_cplusplus_research_code • Viewer • Updated Sep 4 • 1.63M • 1.21k • 4
bigcode/the-stack • Viewer • Updated Apr 13, 2023 • 546M • 9.19k • 741
bigcode/starcoderdata • Viewer • Updated May 16, 2023 • 207M • 6.91k • 403
cerebras/SlimPajama-627B • Preview • Updated Jul 7, 2023 • 29.8k • 429
euirim/goodwiki • Viewer • Updated Sep 11, 2023 • 44.8k • 119 • 50
nampdn-ai/tiny-textbooks • Viewer • Updated Jul 3 • 420k • 88 • 147
nampdn-ai/tiny-codes • Viewer • Updated Sep 30, 2023 • 1.63M • 196 • 232
roneneldan/TinyStories • Viewer • Updated Aug 12 • 2.14M • 17.5k • 563
nampdn-ai/tiny-bridgedict • Viewer • Updated Aug 4, 2023 • 17.6k • 46 • 17
nampdn-ai/tiny-webtext • Viewer • Updated Aug 27, 2023 • 2.32M • 49 • 32