The Fine Line: Navigating Large Language Model Pretraining with Down-streaming Capability Analysis • Paper • 2404.01204 • Published Apr 1
AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions • Paper • 2410.20424 • Published Oct 27 • 37
Aria: An Open Multimodal Native Mixture-of-Experts Model • Paper • 2410.05993 • Published Oct 8 • 107
Data Engineering for Scaling Language Models to 128K Context • Paper • 2402.10171 • Published Feb 15 • 23