license: apache-2.0

K2 Loss Spike 1

During the first K2 training phase, we encountered two loss spikes. This repository covers the first of them.

[Figure: K2 loss spike 1]

Uses

Loss spikes are still a poorly understood phenomenon. By making these spikes and the associated training details available, we hope others will use these artifacts to further the world's knowledge on this topic.

About the LLM360 Research Suite

The LLM360 Research Suite is a comprehensive set of large language model (LLM) artifacts from Amber, CrystalCoder, and K2 for academic and industry researchers to explore LLM training dynamics. Additional resources can be found at llm360.ai.