nyuuzyou
posted an update Jul 6, 2024
Just released the GitVerse Code Dataset - nyuuzyou/gitverse-code.

📊 Dataset highlights:
- 30 GB of unique code extracted from over 400 GB of analyzed data
- 9,014 repositories
- 2,804,216 unique code files
- 419 different file types
- Multilingual: various programming languages

🌐 Sourced from GitVerse, a Russian GitHub alternative launched in 2024.

Let me know your thoughts.