reading list
GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages • Paper • 2410.23825 • Published Oct 31, 2024 • 3
translation
facebook/m2m100_418M • Text2Text Generation • Updated Feb 29, 2024 • 1.23M • 276
facebook/nllb-200-distilled-600M • Translation • Updated Feb 14, 2024 • 483k • 560
google/mt5-small • Text2Text Generation • Updated Sep 18, 2023 • 137k • 130
facebook/mbart-large-50 • Text2Text Generation • Updated Mar 28, 2023 • 26.3k • 143