additional logging to get maximum token length of a sequence in the dataset (#1066) [skip ci] 2f2582e unverified winglian committed on Jan 10
Efficiently get the length of the tokenized docs (#1063) 81d3845 unverified ricdomolm winglian committed on Jan 8
streaming multipack for pretraining dataset (#959) 553c80f unverified jinwonkim93 jinwonkim93@github.com winglian committed on Jan 6
Determine FSDP/deepspeed settings on device select. (#883) 71b7ea3 unverified user735 Karl-Johan Alm winglian committed on Nov 29, 2023
Threaded MultipackDistributedDataloader with prefetched samples (#759) 05bd6f1 unverified casperhansen committed on Oct 26, 2023
refactor setup trainer so we can add more hooks (#773) 6c81c61 unverified winglian committed on Oct 23, 2023
fixes for alpaca w chatml, and don't include attention_mask w mistral for flash attention (#728) 3553172 unverified winglian committed on Oct 14, 2023
Save Axolotl config as WandB artifact (#716) 490923f unverified Jan Philipp Harries committed on Oct 11, 2023
refactor to set eval_batch_size earlier if unset, so we can warn if mismatched (#662) 2642cae unverified winglian committed on Oct 3, 2023
Fix(cfg): Add validation for save_strategy and eval_strategy (#633) 383f88d unverified Nanobit committed on Sep 28, 2023
chore(callback): Remove old peft saving code (#510) d5f8589 unverified Nanobit committed on Sep 22, 2023
run eval on the first step to get a baseline (#617) 2844eb2 unverified winglian committed on Sep 22, 2023
gather/broadcast the max value of the packing efficiency automatically (#463) b15b19e unverified winglian committed on Sep 17, 2023
optionally configure sample packing for evals (#589) 21ec195 unverified winglian committed on Sep 16, 2023
fix save_steps so it doesn't get duplicated (#567) 3fbde76 unverified winglian committed on Sep 14, 2023
improve how we setup eval/save strategies and steps (#547) 36e53c7 unverified winglian committed on Sep 13, 2023
Add training callback to send predictions to WandB table (#521) 5b67ea9 unverified Glavin001 committed on Sep 13, 2023
Add support for GPTQ using native transformers/peft (#468) 3355706 unverified winglian committed on Sep 5, 2023
Added advanced DDP args (#515) 396a7a7 unverified Jan Philipp Harries committed on Aug 31, 2023