gracefully handle length feature used for group by (#565) e7aa7b1 unverified winglian commited on Sep 13, 2023
workaround so training doesn't hang when packed dataloader batches aren't even (#461) c69faee unverified winglian commited on Aug 23, 2023
Attention mask and position id fixes for packing (#285) 2bb0b78 unverified winglian commited on Aug 12, 2023