Commits · Dovakiins/qwerrwe

Deprecate max packed sequence len (#1141)

2ce5c0d
unverified

winglian committed on Jan 20

Multipack simplify for Mixtral (#1142)

6910e6a
unverified

winglian committed on Jan 18

Add shifted sparse attention (#973) [skip-ci]

1d70f24
unverified

jrc joecummings

winglian committed on Jan 18

Add `layers_to_transform` for `lora_config` (#1118)

8487b97
unverified

xzuyn committed on Jan 16

Enable or disable bf16 support based on availability (#1116)

0865613
unverified

Simon Hällqvist committed on Jan 14

keep gate in fp32 for 16 bit loras (#1105)

da97285
unverified

winglian committed on Jan 12

add gptneox embeddings, fix phi2 inputs, also fix the casting (#1083)

78c5b19
unverified

winglian committed on Jan 11

update sharegpt conversations when chatml chat template is set (#1075) [skip ci]

0ce1a65
unverified

winglian committed on Jan 10

fix: `train_on_inputs: true` ignored for sharegpt (#1045) [skip ci]

043c386
unverified

Nanobit

winglian committed on Jan 10

be more robust about checking embedding modules for lora finetunes (#1074) [skip ci]

0f10080
unverified

winglian committed on Jan 10

attempt to also run e2e tests that needs gpus (#1070)

788649f
unverified

winglian committed on Jan 10

fix double eos token for chatml (#1054) [skip ci]

651b7a3
unverified

winglian committed on Jan 9

Phi2 rewrite (#1058)

732851f
unverified

winglian committed on Jan 8

streaming multipack for pretraining dataset (#959)

553c80f
unverified

jinwonkim93 jinwonkim93@github.com

winglian committed on Jan 6

RL/DPO (#935)

f243c21

winglian committed on Jan 4

bump transformers and update attention class map name (#1023)

bcc78d8
unverified

winglian committed on Jan 3

Feat: Warns to add to modules_to_save when adding tokens or switching special_tokens (#787)

1ffa386
unverified

Nanobit committed on Dec 22, 2023

fix mistral prompt assembly (#982)

7bbaac9
unverified

hamel committed on Dec 21, 2023

Fix prompt assembly for llama (#952)

5ada140
unverified

hamel

tokestermw committed on Dec 14, 2023

Respect sequence_len in config for `type: llama2_chat` (#926)

f1de29d
unverified

hamel committed on Dec 12, 2023

support for mamba (#915)

40a6362
unverified

winglian committed on Dec 9, 2023

Feat(wandb): Refactor to be more flexible (#767)

a1da39c
unverified

Nanobit committed on Dec 4, 2023

Feat: Add warmup_ratio (#893)

fb12895
unverified

Nanobit committed on Nov 25, 2023

Phi update 202311 (#876)

9bf854e
unverified

winglian committed on Nov 17, 2023

add e2e tests for checking functionality of resume from checkpoint (#865)

b3a61e8
unverified

winglian committed on Nov 16, 2023

use temp_dir kwarg instead

6dc68a6

winglian committed on Nov 6, 2023

missing dunder-init

7de6a56

winglian committed on Nov 6, 2023

chore: lint

c74f045

winglian committed on Nov 6, 2023

make sure to cleanup tmp output_dir for e2e tests

0402d19

winglian committed on Nov 5, 2023

simplify by removing duplicate base_model_config (#772)

2d8def6
unverified

winglian committed on Oct 23, 2023

Fix: Warn when fullfinetune without adapter (#770)

44c9d01
unverified

Nanobit committed on Oct 22, 2023

convert exponential notation lr to floats (#771)

ca84cca
unverified

winglian committed on Oct 22, 2023

Fix: eval table conflict with eval_sample_packing (#769)

9923b72
unverified

Nanobit committed on Oct 22, 2023

remove lora fused packing test (#758)

21cf09b
unverified

winglian committed on Oct 22, 2023

Implement fused modules (#747)

15d3a65
unverified

casperhansen

winglian committed on Oct 21, 2023

misc sharegpt fixes (#723)

f30afe4
unverified

winglian committed on Oct 13, 2023

Feat: Allow usage of native Mistral FA when no sample_packing (#669)

697c50d
unverified

Nanobit committed on Oct 4, 2023

add mistral e2e tests (#649)

5b0bc48
unverified

winglian committed on Sep 29, 2023

Fix(cfg): Add validation for save_strategy and eval_strategy (#633)

383f88d
unverified

Nanobit committed on Sep 28, 2023

use fastchat conversations template (#578)

e7d3e2d
unverified

winglian committed on Sep 27, 2023

Fix: Fail bf16 check when running on cpu during merge (#631)

cfbce02
unverified

Nanobit committed on Sep 25, 2023

better handling and logging of empty sharegpt turns (#603)

a363604
unverified

winglian committed on Sep 22, 2023

misc fixes to add gptq tests (#621)

03e5907
unverified

winglian committed on Sep 22, 2023

Support Sample packing for phi arch (#586)

12a2dbb
unverified

winglian committed on Sep 15, 2023

E2e device cuda (#575)

2414673
unverified

winglian committed on Sep 15, 2023

e2e testing (#574)

9218ebe
unverified

winglian committed on Sep 15, 2023

Fix pretraining with iterable/streaming Dataset (#556)

2f586d1
unverified

Jan Philipp Harries committed on Sep 13, 2023

workaround for md5 variations (#533)

0b4cf5b
unverified

winglian committed on Sep 8, 2023

recommend padding when using sample packing (#531)

3437149
unverified

winglian committed on Sep 6, 2023

fix test fixture b/c hf trainer tokenization changed (#464)

d5dcf9c
unverified

winglian committed on Aug 23, 2023
