swap batch size for gradient accumulation steps to decouple from num gpu c2a0792 winglian committed on May 31, 2023
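The commit above swaps a fixed batch size for an explicit gradient accumulation step count, so per-GPU work no longer has to be derived from the GPU count. A minimal sketch of the arithmetic (variable names are illustrative, not axolotl's exact config keys):

```python
# Illustrative only: how gradient accumulation relates to effective batch size.
# Fixing the per-device micro batch and accumulation steps directly means the
# per-GPU workload stays constant however many GPUs participate.
micro_batch_size = 2              # samples per GPU per forward pass (assumed)
gradient_accumulation_steps = 4   # backward passes before each optimizer step
num_gpus = 2

effective_batch = micro_batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch)  # 16
```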
Update wandb_log_model on vicuna_13B_4bit_reflect.yml e0ccacc unverified Viktorius Suwandi committed on May 29, 2023
Update wandb_log_model on cerebras_1_3B_alpaca.yml b6a539b unverified Viktorius Suwandi committed on May 29, 2023
Update wandb_log_model on pythia_1_2B_alpaca.yml abddcf4 unverified Viktorius Suwandi committed on May 29, 2023
Update wandb_log_model on llama_7B_jeopardy.yml 15aabd2 unverified Viktorius Suwandi committed on May 29, 2023
Update wandb_log_model on llama_65B_alpaca.yml 232b931 unverified Viktorius Suwandi committed on May 29, 2023
Update wandb_log_model on llama_13B_alpaca.yml 0736f4f unverified Viktorius Suwandi committed on May 29, 2023
Update wandb_log_model on llama_7B_alpaca.yml d77d736 unverified Viktorius Suwandi committed on May 29, 2023
Update wandb_log_model on galactica_1_3B.yml 2aacf75 unverified Viktorius Suwandi committed on May 29, 2023
Update wandb_log_model on llama_7B_4bit.yml 7187134 unverified Viktorius Suwandi committed on May 29, 2023
Update wandb_log_model on stability_3b.yml 0d14e95 unverified Viktorius Suwandi committed on May 29, 2023
Update wandb_log_model on gpt_neox_20b.yml 84fc217 unverified Viktorius Suwandi committed on May 29, 2023
Update wandb_log_model on quickstart.yml f317296 unverified Viktorius Suwandi committed on May 29, 2023
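The twelve commits above each touch the `wandb_log_model` key in an example config. A hypothetical fragment showing that key in context (values are illustrative, not the exact contents of these commits; check the individual .yml files for the settings they chose):

```yaml
# Illustrative W&B settings in an example config such as quickstart.yml.
wandb_project: my-project       # assumed project name
wandb_log_model: "checkpoint"   # log model checkpoints to W&B; "" disables
```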
tweaks to data loading, 8 bit adam, accelerate and deepspeed 097d367 winglian committed on Apr 22, 2023
fix sharegpt handling from hf, don't worry about loading llama if using earlier transformers release 8d43785 winglian committed on Apr 20, 2023
quickstart instructions for starting from runpod (#5) 0a472e1 unverified winglian committed on Apr 18, 2023
WIP large refactor to make finetune script a little more manageable (#3) 6045345 unverified winglian committed on Apr 18, 2023
fix lora target module, require explicit flash attention, fix min logging steps, don't use adam8bit for int4, hash prepared datasets, support hf hub datasets 87e073d winglian committed on Apr 17, 2023
deepspeed doesn't work with flash-attn, and the gpu savings w flash attn are better than the deepspeed headaches d1aed4c winglian committed on Apr 16, 2023
add llama 7b config and fix lora_fan_in_fan_out for llama (copy pasta bug) d060c80 winglian committed on Apr 15, 2023
config chooser, update readme instructions, device config, llama flash attention, debug out the labels, fix config key checks, other bugfixes f2a2029 winglian committed on Apr 14, 2023