Commits · Dovakiins/qwerrwe

precompute dpo logprobs setting and fixes (#1199) [skip ci]

33e1170
unverified

winglian committed on Jan 25

fix learning rate scheduler's warnings (#1135) [skip ci]

b4ac96a
unverified

ricdomolm

winglian committed on Jan 25

more dpo fixes for dataset loading and docs (#1185) [skip ci]

5bce45f
unverified

winglian committed on Jan 24

DPO fixes v2 (#1174)

59a31fe
unverified

winglian committed on Jan 23

Phi2 multipack (#1173)

814aee6
unverified

winglian committed on Jan 23

DPO cleanup (#1126)

7523d1f
unverified

winglian

plaguss HF staff committed on Jan 23

Add mlflow callback for pushing config to mlflow artifacts (#1125)

b8e5603
unverified

JohanWork committed on Jan 22

jupyter lab fixes (#1139) [skip ci]

eaaeefc
unverified

winglian committed on Jan 22

Qwen2 (#1166)

f5a828a
unverified

winglian committed on Jan 22

Multipack simplify for Mixtral (#1142)

6910e6a
unverified

winglian committed on Jan 18

swap the data collator for evals if not using sample packing (#1076)

ead34c5
unverified

winglian committed on Jan 10

paired kto support (#1069)

d7057cc
unverified

winglian committed on Jan 9

Add: mlflow for experiment tracking (#1059) [skip ci]

090c24d
unverified

Johan Hansson

winglian committed on Jan 9

Cosine learning rate schedule - minimum learning rate (#1062)

04b978b
unverified

ricdomolm

winglian committed on Jan 9

Efficiently get the length of the tokenized docs (#1063)

81d3845
unverified

ricdomolm

winglian committed on Jan 8

Phi2 rewrite (#1058)

732851f
unverified

winglian committed on Jan 8

streaming multipack for pretraining dataset (#959)

553c80f
unverified

jinwonkim93 jinwonkim93@github.com

winglian committed on Jan 6

feat: always push checkpoint to hub if set (#1049) [skip ci]

cbdbf9e
unverified

Nanobit committed on Jan 5

RL/DPO (#935)

f243c21

winglian committed on Jan 4

use recommended setting for use_reentrant w gradient checkpointing (#1021)

4d2e842
unverified

winglian committed on Jan 2

remove landmark attn and xpos rope implementations (#1010)

70b46ca
unverified

winglian committed on Dec 28, 2023

FEAT: add tagging support to axolotl (#1004)

db9094d
unverified

Younes Belkada

winglian committed on Dec 27, 2023

fix: add lr scheduler kwargs to Trainer (#972)

13e9381
unverified

Nanobit committed on Dec 17, 2023

fix: switch to using the HuggingFace Transformers NEFT implementation (#941)

ef24342
unverified

dg-kalle committed on Dec 13, 2023

support for mamba (#915)

40a6362
unverified

winglian committed on Dec 9, 2023

Feat(wandb): Refactor to be more flexible (#767)

a1da39c
unverified

Nanobit committed on Dec 4, 2023

feature: loss watchdog for terminating training runs that are failing (#899)

58ec8b1
unverified

user735 Karl-Johan Alm committed on Dec 4, 2023

Feat: Add warmup_ratio (#893)

fb12895
unverified

Nanobit committed on Nov 25, 2023

don't train if eval split is too small (#873)

797f3dd
unverified

winglian committed on Nov 16, 2023

various bugfixes (#856)

1470650
unverified

winglian committed on Nov 15, 2023

cleanup the old multipack dataloader (#841)

1a6309c
unverified

winglian committed on Nov 12, 2023

multipack w batch sampler (#795)

641e6f7
unverified

winglian committed on Nov 8, 2023

Threaded MultipackDistributedDataloader with prefetched samples (#759)

05bd6f1
unverified

casperhansen committed on Oct 26, 2023

refactor setup trainer so we can add more hooks (#773)

6c81c61
unverified

winglian committed on Oct 23, 2023
