Article Releasing the largest multilingual open pretraining dataset By Pclanglais • 10 days ago • 94
Large Scale Transfer Learning for Tabular Data via Language Modeling Paper • 2406.12031 • Published Jun 17 • 9
TabuLa-8B Collection Training, eval suite, and model from the paper "Large Scale Transfer Learning for Tabular Data via Language Modeling" https://arxiv.org/abs/2406.12031 • 4 items • Updated Jun 19 • 10
MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens Paper • 2406.11271 • Published Jun 17 • 20
🍃 MINT-1T Collection Data for "MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens" • 13 items • Updated Jul 24 • 54
📚 FineWeb-Edu Collection FineWeb-Edu datasets, classifier and ablation model • 5 items • Updated Jun 12 • 11
Switch-Transformers release Collection This release includes various MoE (Mixture of Experts) models based on the T5 architecture. The base models have from 8 to 256 experts. • 9 items • Updated Jul 31 • 15
Adaptive Caching for Faster Video Generation with Diffusion Transformers Paper • 2411.02397 • Published 19 days ago • 20
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning Paper • 2410.02884 • Published Oct 3 • 50
Model Depot Collection Leading generative models packaged in OpenVINO format, optimized for use on AI PCs • 50 items • Updated 27 days ago • 5
Functionary V3.2 Collection Fine-tuning Llama-3.1 using our own prompt template for function calling • 3 items • Updated Oct 16 • 1
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 10 items • Updated 3 days ago • 176