qwerrwe / requirements.txt

Commit History

ORPO Trainer replacement (#1551)
7d1d22f · winglian

fix(packages): lock datasets version (#1545)
59ef254 · Nanobit

DBRX Model Support (#1462)
132eb74 · winglian

Pretrain multipack v2 (#1470)
5aa5097 · winglian

qwen2_moe support w multipack (#1455)
6086be8 · winglian

fix some of the edge cases for Jamba (#1452)
05b398a · winglian

strip out hacky qlora-fsdp workarounds now that qlora-fsdp fixes are upstreamed (#1428)
2a1589f · winglian

support galore once upstreamed into transformers (#1409)
dd449c5 · winglian

FSDP + QLoRA (#1378)
9b6ee83 · winglian

update flash attention for gemma support (#1368)
58b0d4b · winglian

support for DoRA w/ PEFT (#1363)
0cfdb2c · winglian

run tests again on Modal (#1289) [skip ci]
0001862 · winglian

fix: checkpoint saving with deepspeed (#1321)
5be8b55 · Nanobit

Pydantic 2.x cfg (#1239)
cc3cebf · winglian

make mlflow optional (#1317)
5894f0e · winglian

multipack for gemma (#1313)
2752d5f · winglian

Add seq2seq eval benchmark callback (#1274)
5a5d474 · LeonardoEmili

add support for https remote yamls (#1277)
9bca7db · hamel

Peft deepspeed resume (#1227)
c67fb71 · winglian

Peft LoftQ (#1222)
4cb7900 · winglian

Revert "run PR e2e docker CI tests in Modal" (#1220) [skip ci]
8da1633
unverified

winglian commited on

run PR e2e docker CI tests in Modal (#1217) [skip ci]
36d053f
unverified

winglian commited on

Update deps 202401 (#1204) [skip ci]
a01b998
unverified

winglian commited on

upgrade deepspeed to 0.13.1 for mixtral fixes (#1189) [skip ci]
8a49309
unverified

winglian commited on

Qwen2 (#1166)
f5a828a
unverified

winglian commited on

Remove fused-dense-lib from requirements.txt (#1087)
91502b9
unverified

casperhansen commited on

fix: warn user to install mamba_ssm package (#1019)
d69ba2b
unverified

Nanobit commited on

pin accelerate for deepspeed fix (#1080)
9e3f0cb
unverified

winglian commited on

Separate AutoGPTQ dep to `pip install -e .[auto-gptq]` (#1077)
9be92d1
unverified

casperhansen commited on

paired kto support (#1069)
d7057cc
unverified

winglian commited on

update peft to 0.7.0 (#1073)
768d348
unverified

marktenenholtz commited on

Add: mlflow for experiment tracking (#1059) [skip ci]
090c24d
unverified

Johan Hansson winglian commited on

Phi2 rewrite (#1058)
732851f
unverified

winglian commited on

RL/DPO (#935)
f243c21 · winglian

bump transformers and update attention class map name (#1023)
bcc78d8 · winglian

chore: Update transformers to latest (#986)
7d4185f · Nanobit

update transformers to fix checkpoint saving (#963)
f28e755 · dumpmemory

Mixtral official (#942)
7fabc4d · winglian

Update requirements.txt (#940)
9a5eb39 · tokestermw

update to latest transformers for mixtral support (#929)
35f9b0f · winglian

update datasets version to cut down the warnings due to pyarrow arg change (#897)
6a4562a · winglian

try #2: pin hf transformers and accelerate to latest release, don't reinstall pytorch (#867)
0de1457 · winglian

Feat: Add dataset loading from S3, GCS (#765)
3cc67d2 · Nanobit

add e2e tests for checking functionality of resume from checkpoint (#865)
b3a61e8 · winglian

Pin optimum package (#838)
105d0b3 · Bryan Thornbury

don't compile deepspeed or bitsandbytes from source (#837)
f544ab2 · winglian

Feat: Added Gradio support (#812)
738a057 · stillerman

fix: pin autogptq (#818)
6459ac7 · Nanobit

chore: bump transformers to v4.34.1 to fix tokenizer issue (#745)
8966a6f · Nanobit