Commits · Dovakiins/qwerrwe

Phi2 rewrite (#1058)

732851f
unverified

winglian committed on Jan 8

streaming multipack for pretraining dataset (#959)

553c80f
unverified

jinwonkim93 jinwonkim93@github.com

winglian committed on Jan 6

RL/DPO (#935)

f243c21

winglian committed on Jan 4

bump transformers and update attention class map name (#1023)

bcc78d8
unverified

winglian committed on Jan 3

Feat: Warns to add to modules_to_save when adding tokens or switching special_tokens (#787)

1ffa386
unverified

Nanobit committed on Dec 22, 2023

fix mistral prompt assembly (#982)

7bbaac9
unverified

hamel committed on Dec 21, 2023

Fix prompt assembly for llama (#952)

5ada140
unverified

hamel

tokestermw committed on Dec 14, 2023

Respect sequence_len in config for `type: llama2_chat` (#926)

f1de29d
unverified

hamel committed on Dec 12, 2023

support for mamba (#915)

40a6362
unverified

winglian committed on Dec 9, 2023

Feat(wandb): Refactor to be more flexible (#767)

a1da39c
unverified

Nanobit committed on Dec 4, 2023

Feat: Add warmup_ratio (#893)

fb12895
unverified

Nanobit committed on Nov 25, 2023

Phi update 202311 (#876)

9bf854e
unverified

winglian committed on Nov 17, 2023

add e2e tests for checking functionality of resume from checkpoint (#865)

b3a61e8
unverified

winglian committed on Nov 16, 2023

use temp_dir kwarg instead

6dc68a6

winglian committed on Nov 6, 2023

missing dunder-init

7de6a56

winglian committed on Nov 6, 2023

chore: lint

c74f045

winglian committed on Nov 6, 2023

make sure to cleanup tmp output_dir for e2e tests

0402d19

winglian committed on Nov 5, 2023

simplify by removing duplicate base_model_config (#772)

2d8def6
unverified

winglian committed on Oct 23, 2023

Fix: Warn when fullfinetune without adapter (#770)

44c9d01
unverified

Nanobit committed on Oct 22, 2023

convert exponential notation lr to floats (#771)

ca84cca
unverified

winglian committed on Oct 22, 2023

Fix: eval table conflict with eval_sample_packing (#769)

9923b72
unverified

Nanobit committed on Oct 22, 2023

remove lora fused packing test (#758)

21cf09b
unverified

winglian committed on Oct 22, 2023

Implement fused modules (#747)

15d3a65
unverified

casperhansen

winglian committed on Oct 21, 2023

misc sharegpt fixes (#723)

f30afe4
unverified

winglian committed on Oct 13, 2023

Feat: Allow usage of native Mistral FA when no sample_packing (#669)

697c50d
unverified

Nanobit committed on Oct 4, 2023

add mistral e2e tests (#649)

5b0bc48
unverified

winglian committed on Sep 29, 2023

Fix(cfg): Add validation for save_strategy and eval_strategy (#633)

383f88d
unverified

Nanobit committed on Sep 28, 2023

use fastchat conversations template (#578)

e7d3e2d
unverified

winglian committed on Sep 27, 2023

Fix: Fail bf16 check when running on cpu during merge (#631)

cfbce02
unverified

Nanobit committed on Sep 25, 2023

better handling and logging of empty sharegpt turns (#603)

a363604
unverified

winglian committed on Sep 22, 2023

misc fixes to add gptq tests (#621)

03e5907
unverified

winglian committed on Sep 22, 2023

Support Sample packing for phi arch (#586)

12a2dbb
unverified

winglian committed on Sep 15, 2023

E2e device cuda (#575)

2414673
unverified

winglian committed on Sep 15, 2023

e2e testing (#574)

9218ebe
unverified

winglian committed on Sep 15, 2023

Fix pretraining with iterable/streaming Dataset (#556)

2f586d1
unverified

Jan Philipp Harries committed on Sep 13, 2023

workaround for md5 variations (#533)

0b4cf5b
unverified

winglian committed on Sep 8, 2023

recommend padding when using sample packing (#531)

3437149
unverified

winglian committed on Sep 6, 2023

fix test fixture b/c hf trainer tokenization changed (#464)

d5dcf9c
unverified

winglian committed on Aug 23, 2023

fix fixture for new tokenizer handling in transformers (#428)

8cace80
unverified

winglian committed on Aug 17, 2023

simplify `load_tokenizer`

efb3b2c

tmm1 committed on Aug 13, 2023

extract module for working with cfg

8cec513

tmm1 committed on Aug 13, 2023

fix DefaultDict.__or__

a13e45d

tmm1 committed on Aug 10, 2023

Attention mask and position id fixes for packing (#285)

2bb0b78
unverified

winglian committed on Aug 12, 2023

experimental llama 2 chat support (#296)

3392270
unverified

Jan Philipp Harries committed on Aug 6, 2023

update prompts for open orca to match the paper (#317)

3d4984b
unverified

winglian committed on Jul 22, 2023

Fixed pre-commit problems, fixed small bug in logging_config to handle LOG_LEVEL env var

b1f4f7a

theobjectivedad committed on Jul 15, 2023

Adding logging enhancement

553a86b

theobjectivedad committed on Jul 14, 2023

params are adam_*, not adamw_*

19cf0bd

winglian committed on Jul 8, 2023

add tests and support for loader for sys prompt data

3a38271

winglian committed on Jun 18, 2023

initial wip to get sys prompt from dataset

8d20e0a

winglian committed on Jun 17, 2023
