Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset Paper • 2412.02595 • Published Dec 3, 2024 • 1
Mind the Gap! Static and Interactive Evaluations of Large Audio Models Paper • 2502.15919 • Published 21 days ago • 3
Focus on what matters: Applying Discourse Coherence Theory to Cross Document Coreference Paper • 2110.05362 • Published Oct 11, 2021
Article Optimizing Pretraining Data Mixes with LLM-Estimated Utility By WillHeld • Jan 22 • 3