pygmalion dataset prompts format, cached tokenized datasets should be hashed on the tokenizer too 2809f3f winglian committed on May 21, 2023
move filter to before saving so it doesn't happen every time, update runpod manual script 0d28df0 winglian committed on May 14, 2023
optimize dataloading to use cache, fix model token embedding sizes aa3c3f9 winglian committed on May 12, 2023
tweaks to data loading, 8 bit adam, accelerate and deepspeed 097d367 winglian committed on Apr 22, 2023
fix sharegpt handling from hf, don't worry about loading llama if using earlier transformers release 8d43785 winglian committed on Apr 20, 2023
WIP large refactor to make finetune script a little more manageable (#3) 6045345 winglian committed on Apr 18, 2023