swap batch size for gradient accumulation steps to decouple from num gpu c2a0792 winglian committed on May 31, 2023
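The commit above swaps a fixed batch size for an explicit gradient accumulation step count, so per-GPU work no longer has to be derived from the GPU count. A minimal sketch of the arithmetic (variable names are illustrative, not axolotl's exact config keys):

```python
# Illustrative only: how gradient accumulation relates to effective batch size.
# Fixing the per-device micro batch and accumulation steps directly means the
# per-GPU workload stays constant however many GPUs participate.
micro_batch_size = 2              # samples per GPU per forward pass (assumed)
gradient_accumulation_steps = 4   # backward passes before each optimizer step
num_gpus = 2

effective_batch = micro_batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch)  # 16
```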
Update wandb_log_model on vicuna_13B_4bit_reflect.yml e0ccacc unverified Viktorius Suwandi committed on May 29, 2023
Update wandb_log_model on cerebras_1_3B_alpaca.yml b6a539b unverified Viktorius Suwandi committed on May 29, 2023
Update wandb_log_model on pythia_1_2B_alpaca.yml abddcf4 unverified Viktorius Suwandi committed on May 29, 2023
Update wandb_log_model on llama_7B_jeopardy.yml 15aabd2 unverified Viktorius Suwandi committed on May 29, 2023
Update wandb_log_model on llama_65B_alpaca.yml 232b931 unverified Viktorius Suwandi committed on May 29, 2023
Update wandb_log_model on llama_13B_alpaca.yml 0736f4f unverified Viktorius Suwandi committed on May 29, 2023
Update wandb_log_model on llama_7B_alpaca.yml d77d736 unverified Viktorius Suwandi committed on May 29, 2023
Update wandb_log_model on galactica_1_3B.yml 2aacf75 unverified Viktorius Suwandi committed on May 29, 2023
Update wandb_log_model on llama_7B_4bit.yml 7187134 unverified Viktorius Suwandi committed on May 29, 2023
Update wandb_log_model on stability_3b.yml 0d14e95 unverified Viktorius Suwandi committed on May 29, 2023
Update wandb_log_model on gpt_neox_20b.yml 84fc217 unverified Viktorius Suwandi committed on May 29, 2023
Update wandb_log_model on quickstart.yml f317296 unverified Viktorius Suwandi committed on May 29, 2023
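The twelve commits above each touch the `wandb_log_model` key in an example config. A hypothetical fragment showing that key in context (values are illustrative, not the exact contents of these commits; check the individual .yml files for the settings they chose):

```yaml
# Illustrative W&B settings in an example config such as quickstart.yml.
wandb_project: my-project       # assumed project name
wandb_log_model: "checkpoint"   # log model checkpoints to W&B; "" disables
```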
tweaks to data loading, 8 bit adam, accelerate and deepspeed 097d367 winglian committed on Apr 22, 2023
fix sharegpt handling from hf, don't worry about loading llama if using earlier transformers release 8d43785 winglian committed on Apr 20, 2023
quickstart instructions for starting from runpod (#5) 0a472e1 unverified winglian committed on Apr 18, 2023
WIP large refactor to make finetune script a little more manageable (#3) 6045345 unverified winglian committed on Apr 18, 2023
fix lora target module, require explicit flash attention, fix min logging steps, don't use adam8bit for int4, hash prepared datasets, support hf hub datasets 87e073d winglian committed on Apr 17, 2023
deepspeed doesn't work with flash-attn, and the gpu savings w flash attn are better than the deepspeed headaches d1aed4c winglian committed on Apr 16, 2023
add llama 7b config and fix lora_fan_in_fan_out for llama (copy pasta bug) d060c80 winglian committed on Apr 15, 2023
config chooser, update readme instructions, device config, llama flash attention, debug out the labels, fix config key checks, other bugfixes f2a2029 winglian committed on Apr 14, 2023