Fine-Tuning Mistral-7B for Real-World Chatbot Applications Using Axolotl (LoRA) (#1155) cc25039, Tilemachos Chatzipapas, twenty8th, winglian, committed on Jan 23, 2024
set fp16 to false if bf16 is enabled; update bf16: auto in example YAMLs (#1122) [skip ci] 782b6a4, winglian, Nanobit, committed on Jan 22, 2024
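The precision change above amounts to a config fragment like the following (a sketch of the example-YAML pattern; surrounding keys vary per example):

```yaml
# let Axolotl enable bf16 automatically when the GPU supports it
bf16: auto
# fp16 must be false whenever bf16 is in effect; the two are mutually exclusive
fp16: false
```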
Add shifted sparse attention (#973) [skip-ci] 1d70f24, jrc, joecummings, winglian, committed on Jan 18, 2024
streaming multipack for pretraining datasets (#959) 553c80f, jinwonkim93, winglian, committed on Jan 6, 2024
Set eval_sample_packing to false in mistral config.yaml (#1003) 384b817, Kevin Sydney, committed on Dec 28, 2023
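The mistral example change corresponds to a fragment along these lines (sketch; training batches stay packed, evaluation batches do not):

```yaml
# pack multiple short training examples into one sequence for throughput
sample_packing: true
# but leave evaluation batches unpacked
eval_sample_packing: false
```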
Add an example config for fine-tuning a 34B model on a 24GB GPU (#1000) 6ef46f8, Evan Griffiths, committed on Dec 25, 2023
set output_router_logits for mixtral config (#995) 628b754, winglian, committed on Dec 22, 2023
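For Mixtral, exposing the router logits lets the MoE auxiliary load-balancing loss flow into training. A sketch of the config fragment (nesting under `model_config` is how Axolotl passes model-level overrides):

```yaml
model_config:
  # include the MoE router logits so the auxiliary
  # load-balancing loss is added to the training loss
  output_router_logits: true
```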
add evals_per_epoch and saves_per_epoch to make scheduling cleaner (#944) 5f79b82, winglian, committed on Dec 12, 2023
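The new per-epoch keys replace raw step counts for eval and checkpoint scheduling; a sketch with illustrative values:

```yaml
# evaluate 4 times per epoch instead of specifying eval_steps
evals_per_epoch: 4
# write one checkpoint per epoch instead of specifying save_steps
saves_per_epoch: 1
```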
update to latest transformers for mixtral support (#929) 35f9b0f, winglian, committed on Dec 10, 2023
feature: loss watchdog for terminating training runs that are failing (#899) 58ec8b1, user735, Karl-Johan Alm, committed on Dec 4, 2023
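Assuming the watchdog keys introduced in this PR, the feature is configured roughly like this (sketch; key names and values are illustrative):

```yaml
# abort the run early if training loss stays above the threshold
# for several consecutive logged steps (key names per this PR)
loss_watchdog_threshold: 5.0
loss_watchdog_patience: 3
```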
don't compile deepspeed or bitsandbytes from source (#837) f544ab2, winglian, committed on Nov 9, 2023
disable eval table with sample packing in examples (#778) 9b43e7e, winglian, committed on Oct 23, 2023
simplify by removing duplicate base_model_config (#772) 2d8def6, winglian, committed on Oct 23, 2023
Get qlora mistral-7b fine-tuning working on a single 4090 (#708) 295b266, lukemarsden, committed on Oct 10, 2023
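A QLoRA setup of this kind boils down to a config fragment like the following (sketch with illustrative values, not the exact #708 example):

```yaml
base_model: mistralai/Mistral-7B-v0.1
load_in_4bit: true        # 4-bit quantization via bitsandbytes
adapter: qlora            # train low-rank adapters on the quantized base
lora_r: 32
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true  # attach adapters to all linear layers
```

Quantizing the frozen base to 4 bits is what brings the 7B model's memory footprint within a single 24GB 4090.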
Fix: higher VRAM usage with mistral and sample_packing (#691) 669f1d0, Nanobit, committed on Oct 6, 2023
prepared dataset caching, other misc fixes (#665) e50a64e, winglian, committed on Oct 3, 2023
eval_table isn't quite stable enough to be in default llama configs (#637) d887ad8, winglian, committed on Sep 26, 2023
more sane defaults for openllama 3b used for quickstarts (#602) 674c576, winglian, committed on Sep 19, 2023
btlm and falcon monkey patches for flash attention (#566) 6b9b229, winglian, committed on Sep 17, 2023