Revert "run PR e2e docker CI tests in Modal" (#1220) [skip ci] 8da1633 unverified winglian commited on Jan 26
upgrade deepspeed to 0.13.1 for mixtral fixes (#1189) [skip ci] 8a49309 unverified winglian commited on Jan 24
Remove fused-dense-lib from requirements.txt (#1087) 91502b9 unverified casperhansen commited on Jan 10
Separate AutoGPTQ dep to `pip install -e .[auto-gptq]` (#1077) 9be92d1 unverified casperhansen commited on Jan 9
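For context on the #1077 change above: auto-gptq moves out of the base requirements and into an optional extra, so it is only installed when explicitly requested. A minimal sketch of how such an extra is declared, assuming a plain setuptools `setup.py`; the package list and version pins here are illustrative, not copied from axolotl's actual setup:

```python
# Minimal sketch of an optional extra in setup.py; package names and pins are
# illustrative, not taken from axolotl's actual setup.py.
from setuptools import find_packages, setup

setup(
    name="axolotl",
    packages=find_packages(),
    install_requires=[
        # core dependencies stay here; auto-gptq is no longer part of the base install
    ],
    extras_require={
        # pulled in only with: pip install -e .[auto-gptq]
        "auto-gptq": ["auto-gptq"],
    },
)
```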
Add: mlflow for experiment tracking (#1059) [skip ci] 090c24d Johan Hansson, winglian committed on Jan 9, 2024
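The #1059 change wires mlflow in as an experiment tracker. Axolotl drives this from its YAML config rather than direct API calls, but as background, here is a minimal sketch of the mlflow tracking API the integration builds on; the tracking URI, experiment name, and logged values are placeholders, not axolotl defaults:

```python
# Minimal sketch of the mlflow tracking API; the URI, experiment name, and
# metric values below are placeholders, not axolotl defaults.
import mlflow

mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("axolotl-finetune")

with mlflow.start_run():
    mlflow.log_params({"learning_rate": 2e-5, "num_epochs": 3})
    for step in range(3):
        mlflow.log_metric("train_loss", 1.0 / (step + 1), step=step)
```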
bump transformers and update attention class map name (#1023) bcc78d8 winglian committed on Jan 3, 2024
update transformers to fix checkpoint saving (#963) f28e755 dumpmemory committed on Dec 16, 2023
update to latest transformers for mixtral support (#929) 35f9b0f winglian committed on Dec 10, 2023
update datasets version to cut down on warnings from the pyarrow arg change (#897) 6a4562a winglian committed on Nov 25, 2023
try #2: pin hf transformers and accelerate to latest release, don't reinstall pytorch (#867) 0de1457 winglian committed on Nov 16, 2023
add e2e tests for checking functionality of resume from checkpoint (#865) b3a61e8 winglian committed on Nov 16, 2023
don't compile deepspeed or bitsandbytes from source (#837) f544ab2 winglian committed on Nov 9, 2023
chore: bump transformers to v4.34.1 to fix tokenizer issue (#745) 8966a6f Nanobit committed on Oct 20, 2023
Fix(version): Update FA to work with Mistral SWA (#673) 43856c0 Nanobit committed on Oct 4, 2023
Feat: Allow usage of native Mistral FA when no sample_packing (#669) 697c50d Nanobit committed on Oct 4, 2023
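On the #669 change: packed samples rely on axolotl's patched attention, so the native Hugging Face Mistral flash-attention path only applies when sample packing is off. A minimal sketch of that gating, assuming config keys named `flash_attention` and `sample_packing`; this helper is an illustration of the idea, not the repository's actual code:

```python
# Illustrative helper only; the cfg keys mirror axolotl's YAML options
# (flash_attention, sample_packing) but this is not the repository's code.
def can_use_native_mistral_fa(cfg: dict) -> bool:
    # Packed sequences need the patched attention, so the native
    # transformers flash-attention path is only taken when packing is off.
    return bool(cfg.get("flash_attention")) and not cfg.get("sample_packing")


print(can_use_native_mistral_fa({"flash_attention": True, "sample_packing": False}))  # True
print(can_use_native_mistral_fa({"flash_attention": True, "sample_packing": True}))   # False
```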