strip out hacky qlora-fsdp workarounds now that qlora-fsdp fixes are upstreamed (#1428) 2a1589f unverified winglian committed on Mar 21
Train parameters exclusively in specific ranges (#1390) 05bcc9e unverified seungduk committed on Mar 14
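This change lets a config freeze most of the model and train only selected parameter groups. A minimal sketch, assuming the `unfrozen_parameters` list of name patterns described in the axolotl docs (the patterns below are illustrative, not taken from this commit):

```yaml
# Freeze everything except parameters whose names match these patterns
# (hypothetical selection; adjust patterns to the layers you want trained)
unfrozen_parameters:
  - lm_head.*
  - model.embed_tokens.*
  - model.layers.3[0-9].*   # only the upper layers stay trainable
```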
Update tinyllama lora.yml to fix eval packing issue (#1362) 8984bf1 unverified rasbt committed on Mar 5
Add instructions for playing with qlora model to colab example (#1290) 6ab69ec unverified Jared Palmer Nanobit JohanWork committed on Feb 21
fix(examples): remove is_*_derived as it's parsed automatically (#1297) a7a9a14 unverified Nanobit committed on Feb 21
Update qlora.yml - remove `max_packed_sequence_len` (#1210) [skip ci] 5407ddd unverified 7flash committed on Jan 26
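`max_packed_sequence_len` was deprecated, so the example drops it; packing is instead governed by the options below. A sketch assuming the current axolotl option names rather than the exact diff:

```yaml
# Packing is now controlled by these keys instead of max_packed_sequence_len
sequence_len: 4096
sample_packing: true
pad_to_sequence_len: true
```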
Fine-Tuning Mistral-7b for Real-World Chatbot Applications Using Axolotl (Lora used) (#1155) cc25039 unverified Tilemachos Chatzipapas twenty8th winglian committed on Jan 23
set fp16 to false if bf16, update bf16: auto in example YAMLs (#1122) [skip ci] 782b6a4 unverified winglian Nanobit committed on Jan 22
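The precision block in the updated example YAMLs looks roughly like this (values assumed from the commit message, not copied from a specific file):

```yaml
# Use bfloat16 when the GPU supports it, and keep fp16 explicitly off
# so the two settings cannot conflict
bf16: auto
fp16: false
tf32: false
```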
Add shifted sparse attention (#973) [skip-ci] 1d70f24 unverified jrc joecummings winglian committed on Jan 18
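Shifted sparse attention (LongLoRA-style) is switched on with a single flag; the key name below is assumed from the axolotl README rather than verified against this commit:

```yaml
# Enable shifted sparse attention for long-context LoRA training (assumed key)
s2_attention: true
```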
streaming multipack for pretraining dataset (#959) 553c80f unverified jinwonkim93 winglian committed on Jan 6
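With this feature a large corpus can be streamed and packed on the fly instead of being tokenized to disk first. A sketch assuming the `pretraining_dataset` key; the dataset path is a placeholder:

```yaml
# Stream the corpus during training; packing happens on the fly
pretraining_dataset: allenai/c4
# Streaming datasets have no known length, so the step budget must be explicit
max_steps: 10000
```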
Set eval_sample_packing to false in mistral config.yaml (#1003) 384b817 unverified Kevin Sydney committed on Dec 28, 2023
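The fix keeps packing for training but disables it for evaluation, which works around the eval-time packing issue. Roughly, in the example config:

```yaml
# Pack training samples, but evaluate on unpacked sequences
sample_packing: true
eval_sample_packing: false
```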
Add an example config for finetuning a 34B model on a 24GB GPU (#1000) 6ef46f8 unverified Evan Griffiths committed on Dec 25, 2023
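The knobs that make a 34B QLoRA run fit in 24 GB are roughly the following; the values are illustrative and not the exact contents of the added example:

```yaml
# Memory-saving settings for a single 24 GB GPU (illustrative values)
adapter: qlora
load_in_4bit: true
sequence_len: 1024
micro_batch_size: 1
gradient_accumulation_steps: 4
gradient_checkpointing: true
flash_attention: true
```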
set output_router_logits for mixtral config (#995) 628b754 unverified winglian committed on Dec 22, 2023
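Returning the router logits is what allows the MoE load-balancing loss to be applied during Mixtral training. A sketch, assuming the override lives under axolotl's `model_config` section:

```yaml
# Expose router logits so the auxiliary load-balancing loss is computed
model_config:
  output_router_logits: true
```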
new evals_per_epoch and saves_per_epoch to make things cleaner (#944) 5f79b82 unverified winglian committed on Dec 12, 2023
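The new keys express evaluation and checkpoint frequency per epoch instead of in raw step counts; the values below are illustrative:

```yaml
# Evaluate four times per epoch and save one checkpoint per epoch
evals_per_epoch: 4
saves_per_epoch: 1
```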
update to latest transformers for mixtral support (#929) 35f9b0f unverified winglian committed on Dec 10, 2023
feature: loss watchdog for terminating training runs that are failing (#899) 58ec8b1 unverified user735 Karl-Johan Alm committed on Dec 4, 2023
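The loss watchdog aborts a run early when the loss stays above a threshold for several consecutive logged steps. A sketch using the key names from the axolotl docs, with illustrative values:

```yaml
# Kill the run if loss exceeds 5.0 for 3 consecutive logging steps
loss_watchdog_threshold: 5.0
loss_watchdog_patience: 3
```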