Commits · Dovakiins/qwerrwe

lora+ support (#1352)

decb66e
unverified

winglian committed on Mar 5

add lion-pytorch optimizer (#1299) [skip ci]

1648279
unverified

Maxime

winglian committed on Feb 26

make mlflow optional (#1317)

5894f0e
unverified

winglian committed on Feb 26

Allow load_best_model_at_end to be configured for early stopping on custom evaluation datasets (#1291)

3c00f40
unverified

David Meikle committed on Feb 21

Add seq2seq eval benchmark callback (#1274)

5a5d474
unverified

LeonardoEmili committed on Feb 13

Scheduler implementation of Continual Pre-Training of Large Language Models: How to (re)warm your model? (#1273)

8430db2
unverified

jinwonkim93 committed on Feb 13

allow the optimizer prune ratio for ReLoRA to be configurable (#1287)

4b997c3
unverified

winglian committed on Feb 12

simplify handling for newer multipack patches so they can be added in a single place (#1270)

5698943
unverified

winglian committed on Feb 7

Add more save strategies for DPO training. (#1255)

13eea21
unverified

Philip May committed on Feb 6

relora: magnitude pruning of the optimizer (#1245)

8c2e05a
unverified

winglian committed on Feb 6

support for true batches with multipack (#1230)

00568c1
unverified

winglian committed on Feb 1

Fix and document test_datasets (#1228)

5787e1a
unverified

DreamGenX

winglian committed on Jan 31

FEAT: add tagging support to axolotl for DPOTrainer (#1209)

18f8119
unverified

Filippo Broggini

winglian committed on Jan 27

precompute dpo logprobs setting and fixes (#1199) [skip ci]

33e1170
unverified

winglian committed on Jan 25

fix learning rate scheduler's warnings (#1135) [skip ci]

b4ac96a
unverified

ricdomolm

winglian committed on Jan 25

more dpo fixes for dataset loading and docs (#1185) [skip ci]

5bce45f
unverified

winglian committed on Jan 24

DPO fixes v2 (#1174)

59a31fe
unverified

winglian committed on Jan 23

Phi2 multipack (#1173)

814aee6
unverified

winglian committed on Jan 23

DPO cleanup (#1126)

7523d1f
unverified

winglian

plaguss HF staff committed on Jan 23

Add mlflow callback for pushing config to mlflow artifacts (#1125)

b8e5603
unverified

JohanWork committed on Jan 22

jupyter lab fixes (#1139) [skip ci]

eaaeefc
unverified

winglian committed on Jan 22

Qwen2 (#1166)

f5a828a
unverified

winglian committed on Jan 22

Multipack simplify for Mixtral (#1142)

6910e6a
unverified

winglian committed on Jan 18

swap the data collator for evals if not using sample packing (#1076)

ead34c5
unverified

winglian committed on Jan 10

paired kto support (#1069)

d7057cc
unverified

winglian committed on Jan 9

Add: mlflow for experiment tracking (#1059) [skip ci]

090c24d
unverified

Johan Hansson

winglian committed on Jan 9

Cosine learning rate schedule - minimum learning rate (#1062)

04b978b
unverified

ricdomolm

winglian committed on Jan 9

Efficiently get the length of the tokenized docs (#1063)

81d3845
unverified

ricdomolm

winglian committed on Jan 8

Phi2 rewrite (#1058)

732851f
unverified

winglian committed on Jan 8

streaming multipack for pretraining dataset (#959)

553c80f
unverified

jinwonkim93 jinwonkim93@github.com

winglian committed on Jan 6

feat: always push checkpoint to hub if set (#1049) [skip ci]

cbdbf9e
unverified

Nanobit committed on Jan 5

RL/DPO (#935)

f243c21

winglian committed on Jan 4

use recommended setting for use_reentrant w gradient checkpointing (#1021)

4d2e842
unverified

winglian committed on Jan 2

remove landmark attn and xpos rope implementations (#1010)

70b46ca
unverified

winglian committed on Dec 28, 2023

FEAT: add tagging support to axolotl (#1004)

db9094d
unverified

Younes Belkada

winglian committed on Dec 27, 2023

fix: add lr scheduler kwargs to Trainer (#972)

13e9381
unverified

Nanobit committed on Dec 17, 2023

fix: switch to using the HuggingFace Transformers NEFT implementation (#941)

ef24342
unverified

dg-kalle committed on Dec 13, 2023

support for mamba (#915)

40a6362
unverified

winglian committed on Dec 9, 2023

Feat(wandb): Refactor to be more flexible (#767)

a1da39c
unverified

Nanobit committed on Dec 4, 2023

feature: loss watchdog for terminating training runs that are failing (#899)

58ec8b1
unverified

user735 Karl-Johan Alm committed on Dec 4, 2023

Feat: Add warmup_ratio (#893)

fb12895
unverified

Nanobit committed on Nov 25, 2023

don't train if eval split is too small (#873)

797f3dd
unverified

winglian committed on Nov 16, 2023

various bugfixes (#856)

1470650
unverified

winglian committed on Nov 15, 2023

cleanup the old multipack dataloader (#841)

1a6309c
unverified

winglian committed on Nov 12, 2023

multipack w batch sampler (#795)

641e6f7
unverified

winglian committed on Nov 8, 2023

Threaded MultipackDistributedDataloader with prefetched samples (#759)

05bd6f1
unverified

casperhansen committed on Oct 26, 2023

refactor setup trainer so we can add more hooks (#773)

6c81c61
unverified

winglian committed on Oct 23, 2023
