Balancing Speed and Stability: The Trade-offs of FP8 vs. BF16 Training in LLMs
Paper • 2411.08719
Formerly MDEL, we have renamed ourselves after Aurora-M, the model we deployed. Visit us here: https://huggingface.co/aurora-m