Merge pull request #12 from NanoCode012/feat/eval_config a15d823 unverified winglian committed on May 7, 2023
support for multi line inference input, log sweep over learning rates 9105935 winglian committed on May 3, 2023
fix adam bnb optimizer grouped parameters, fix peft model 8bit conversion logic, black formatting 7748f3d winglian committed on May 1, 2023
don't load models in 8bit unless they are using an adapter, also fix tokenizer load in exceptional case 6dfdd2d winglian committed on Apr 30, 2023
tweaks to data loading, 8 bit adam, accelerate and deepspeed 097d367 winglian committed on Apr 22, 2023
fix sharegpt handling from hf, don't worry about loading llama if using earlier transformers release 8d43785 winglian committed on Apr 20, 2023
quickstart instructions for starting from runpod (#5) 0a472e1 unverified winglian committed on Apr 18, 2023
WIP large refactor to make finetune script a little more manageable (#3) 6045345 unverified winglian committed on Apr 18, 2023
support for alpaca-like instruction datasets without inputs e107643 winglian committed on Apr 18, 2023
casts the prepared data to int16 (doesn't help with training memory) 2db9436 winglian committed on Apr 18, 2023
fix lora target module, require explicit flash attention, fix min logging steps, don't use adam8bit for int4, hash prepared datasets, support hf hub datasets 87e073d winglian committed on Apr 17, 2023
deepspeed doesn't work with flash-attn, and the gpu savings w flash attn are better than the deepspeed headaches d1aed4c winglian committed on Apr 16, 2023