skip some flash attn patches unless explicitly enabled (#643) 895f0a0 unverified winglian committed on Sep 27, 2023
Add training callback to send predictions to WandB table (#521) 5b67ea9 unverified Glavin001 committed on Sep 13, 2023
fix eval regression caused in 13f7efaf74fcd3c4514277ccb71914c589873f6a a213d99 tmm1 committed on Aug 21, 2023
Attention mask and position id fixes for packing (#285) 2bb0b78 unverified winglian committed on Aug 12, 2023