Just released the GitVerse Code Dataset: nyuuzyou/gitverse-code.
Dataset highlights:
- 30 GB of unique code extracted from over 400 GB of analyzed data
- 9,014 repositories
- 2,804,216 unique code files
- 419 different file types
- Multilingual: various programming languages
Sourced from GitVerse, a Russian GitHub alternative launched in 2024.
Let me know your thoughts.