Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism Paper • 1909.08053 • Published Sep 17, 2019 • 2
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models Paper • 2405.17428 • Published May 27 • 17