Fine-Tuning Mistral-7B for Real-World Chatbot Applications Using Axolotl (LoRA) (#1155) cc25039 Tilemachos Chatzipapas, twenty8th, winglian committed on Jan 23, 2024
Set fp16 to false if bf16; update bf16: auto in example YAMLs (#1122) [skip ci] 782b6a4 winglian, Nanobit committed on Jan 22, 2024
Set eval_sample_packing to false in mistral config.yaml (#1003) 384b817 Kevin Sydney committed on Dec 28, 2023
Set output_router_logits for Mixtral config (#995) 628b754 winglian committed on Dec 22, 2023
Add evals_per_epoch and saves_per_epoch to make things cleaner (#944) 5f79b82 winglian committed on Dec 12, 2023
Update to latest transformers for Mixtral support (#929) 35f9b0f winglian committed on Dec 10, 2023
Feature: loss watchdog for terminating training runs that are failing (#899) 58ec8b1 user735, Karl-Johan Alm committed on Dec 4, 2023
Don't compile DeepSpeed or bitsandbytes from source (#837) f544ab2 winglian committed on Nov 9, 2023
Disable eval table with sample packing in examples (#778) 9b43e7e winglian committed on Oct 23, 2023
Simplify by removing duplicate base_model_config (#772) 2d8def6 winglian committed on Oct 23, 2023
Get QLoRA Mistral-7B fine-tuning working on a single 4090 (#708) 295b266 lukemarsden committed on Oct 10, 2023
Fix: higher VRAM usage for Mistral and sample_packing (#691) 669f1d0 Nanobit committed on Oct 6, 2023
Prepared dataset caching, other misc fixes (#665) e50a64e winglian committed on Oct 3, 2023
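Several of the entries above toggle keys in Axolotl's example YAML configs. A minimal sketch of how those keys might sit together in one config follows; the key names come from the commit messages above, while the model name and all values are illustrative assumptions, not a recommended setup:

```yaml
# Hypothetical base model for illustration only
base_model: mistralai/Mistral-7B-v0.1

# (#1122) let the framework pick bf16 where the hardware supports it;
# fp16 is set to false whenever bf16 is active
bf16: auto
fp16: false

# (#1003, #778) sample packing stays on for training, but is disabled
# for eval in the Mistral examples
sample_packing: true
eval_sample_packing: false

# (#995) Mixtral-specific: return router logits so the auxiliary
# load-balancing loss is applied (only relevant for Mixtral configs)
output_router_logits: true

# (#944) schedule evaluation and checkpointing per epoch instead of
# by raw step counts
evals_per_epoch: 4
saves_per_epoch: 1

# (#899) loss watchdog: terminate runs whose loss blows past a
# threshold for several consecutive steps (example values)
loss_watchdog_threshold: 5.0
loss_watchdog_patience: 3
```

This is a config fragment for orientation only; consult the Axolotl example YAMLs referenced in the commits for working per-model configurations.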