Commits · Dovakiins/qwerrwe

Upload Dockerfile

65c38b0
verified

Dovakiins committed on Jun 20

Delete Dockerfile

fa92474
verified

Dovakiins committed on Jun 20

Rename Dockerfile-cloud to Dockerfile

8f9180d
verified

Dovakiins committed on Jun 20

Upload Dockerfile-cloud

58a9a38
verified

Dovakiins committed on Jun 20

Update README.md

9ea8e9a
verified

Dovakiins committed on Jun 20

Update README.md

d304116
verified

Dovakiins committed on Jun 20

drop length column for issues with eval without packing (#1711)

3f1f5e3
unverified

winglian committed on Jun 19

download model weights on preprocess step (#1693)

5783839
unverified

winglian committed on Jun 10

verbose failure message (#1694)

cbbf039
unverified

winglian committed on Jun 10

bump deepspeed for fix for grad norm compute putting tensors on different devices (#1699)

851ccb1
unverified

winglian committed on Jun 9

fix for when sample_packing and eval_sample_packing are different (#1695)

18cabc0
unverified

winglian committed on Jun 8

add back packing efficiency estimate so epochs and multi-gpu works properly (#1697)

ed8ef65
unverified

winglian committed on Jun 8

add qwen2-72b fsdp example (#1696)

00ac302
unverified

winglian committed on Jun 7

ensure explicit eval_sample_packing to avoid mismatch issues (#1692)

9c1af1a
unverified

winglian committed on Jun 7

Create phi3-ft-fsdp.yml (#1580)

a82a711
unverified

aaditya committed on Jun 4

Phi-3 conversation format, example training script and perplexity metric (#1582)

cf64284
unverified

roborovski

winglian committed on Jun 4

add support for rpo_alpha (#1681)

c996881
unverified

winglian committed on Jun 4

re-enable DPO for tests in modal ci (#1374)

1f151c0
unverified

winglian committed on Jun 3

Fix the broken link in README (#1678) [skip ci]

5cde065
unverified

saeedesmaili committed on Jun 3

need to add back drop_last for sampler (#1676)

05b0bd0
unverified

winglian committed on May 31

cleanup the deepspeed proxy model at the end of training (#1675)

d4f6c65
unverified

winglian committed on May 30

load explicit splits on datasets (#1652)

a944f7b
unverified

winglian committed on May 30

set chat_template in datasets config automatically (#1664)

9d4225a
unverified

winglian committed on May 30

use mixins for orpo and kto configs so they work with axolotl customizations (#1674)

f7332ac
unverified

winglian committed on May 30

re-enable phi for tests in modal ci (#1373)

16d46b7
unverified

winglian committed on May 29

revert multipack batch sampler changes (#1672)

a6b37bd
unverified

winglian committed on May 29

handle the system role too for chat templates (#1671)

b752080
unverified

winglian committed on May 29

make sure the CI fails when pytest script fails (#1669)

fe650dd
unverified

winglian committed on May 29

Fix README quick start example usage model dirs (#1668)

49b967b
unverified

Abe Voelker committed on May 28

Correct name of MixtralBlockSparseTop2MLP (L -> l) (#1667)

65db903
unverified

seungduk committed on May 28

Fix: ensure correct handling of `val_set_size` as `float` or `int` (#1655)

6a5a725
unverified

Davide Caroselli

winglian committed on May 28

fix lint issue that snuck through (#1665)

f5febc7
unverified

winglian committed on May 28

Fix Lora config error for Llama3 (#1659)

230e0ac
unverified

oaishi committed on May 28

Generalizing the chat_template prompt strategy (#1660) [skip ci]

cc11c6b
unverified

fozziethebeat committed on May 28

Fix Google Colab notebook 2024-05 (#1662) [skip ci]

5f91064
unverified

Maciek committed on May 28

update deps (#1663) [skip ci]

ef22351
unverified

winglian committed on May 28

document how to use `share_strategy="no"` (#1653) [skip ci]

8a20a7b
unverified

charlesfrye committed on May 24

Switch to parallel FFD bin packing algorithm. (#1619)

367b2e8
unverified

winglian

daaave committed on May 23

support for custom messages field in sharegpt (#1651)

bbfed31
unverified

winglian committed on May 23

Update tiny-llama qlora.yml addressing eval packing error (#1638)

84bb806
unverified

Jaydeep Thik committed on May 22

enable loraplus setting for dpo trainer (#1646)

a27d5e1
unverified

thepowerfuldeez committed on May 22

allow report_to for multiple providers (#1647)

6299eb5
unverified

winglian committed on May 22

Fix llama3 chat_template (extra <|eot_id|> on last turn) (#1635)

7c2bf30
unverified

leonardlin

winglian committed on May 21

Add KTO support (#1640)

22ae21a
unverified

benredmond

winglian committed on May 20

fixes to save on fractional save_steps (#1643)

ba45531
unverified

winglian committed on May 20

Unsloth optims for Llama (#1609)

8a1572a
unverified

winglian committed on May 20

add save_only_model option (#1634)

702a669
unverified

emozilla committed on May 17

fix ray install (#1630)

891ae8a
unverified

winglian committed on May 16

more fixes to work with runpod + skypilot (#1629)

0c49ecc
unverified

winglian committed on May 16

cloud image w/o tmux (#1628)

6011343
unverified

winglian committed on May 16
