qwerrwe / src/axolotl/utils/trainer.py

Commit History

support for true batches with multipack (#1230)
00568c1 (unverified) · winglian committed

report min length of tokenized data (#1186) [skip ci]
d85d494 (unverified) · winglian committed

Phi2 multipack (#1173)
814aee6 (unverified) · winglian committed

Falcon embeddings (#1149) [skip docker]
e799e08 (unverified) · winglian committed

Vram fix attempt (#1164) [skip ci]
32580c1 (unverified) · winglian committed

Deprecate max packed sequence len (#1141)
2ce5c0d (unverified) · winglian committed

Multipack simplify for Mixtral (#1142)
6910e6a (unverified) · winglian committed

Preprocess dataset size fix (#1131)
7570446 (unverified) · winglian committed

additional logging to get maximum token length of a sequence in the dataset (#1066) [skip ci]
2f2582e (unverified) · winglian committed

Efficiently get the length of the tokenized docs (#1063)
81d3845 (unverified) · ricdomolm, winglian committed

streaming multipack for pretraining dataset (#959)
553c80f (unverified) · jinwonkim93, winglian committed

RL/DPO (#935)
f243c21 · winglian committed

Fix Deepspeed loading (#950)
5ea3aa3 (unverified) · winglian committed

support for mamba (#915)
40a6362 (unverified) · winglian committed

Determine FSDP/deepspeed settings on device select. (#883)
71b7ea3 (unverified) · user735, Karl-Johan Alm, winglian committed

don't train if eval split is too small (#873)
797f3dd (unverified) · winglian committed

various bugfixes (#856)
1470650 (unverified) · winglian committed

multipack w batch sampler (#795)
641e6f7 (unverified) · winglian committed

use accelerate logging for zero/main logging only
b2430ce · winglian committed

cleanup verbosity a bit
4c834bf · winglian committed

Threaded MultipackDistributedDataloader with prefetched samples (#759)
05bd6f1 (unverified) · casperhansen committed

refactor setup trainer so we can add more hooks (#773)
6c81c61 (unverified) · winglian committed

fixes for alpaca w chatml, and don't include attention_mask w mistral for flash attention (#728)
3553172 (unverified) · winglian committed

Save Axolotl config as WandB artifact (#716)
490923f (unverified) · Jan Philipp Harries committed

refactor to set eval_batch_size earlier if unset, so we can warn if mismatched (#662)
2642cae (unverified) · winglian committed

Make dataset_processes configurable (#651)
9ec2077 (unverified) · corbt committed

Fix(cfg): Add validation for save_strategy and eval_strategy (#633)
383f88d (unverified) · Nanobit committed

attention_mask not needed for training (#642)
e8cbf50 (unverified) · winglian committed

chore(callback): Remove old peft saving code (#510)
d5f8589 (unverified) · Nanobit committed

misc fixes to add gptq tests (#621)
03e5907 (unverified) · winglian committed

run eval on the first step to get a baseline (#617)
2844eb2 (unverified) · winglian committed

minor tweaks to simplify (#597)
31b9e0c (unverified) · winglian committed

gather/broadcast the max value of the packing efficiency automatically (#463)
b15b19e (unverified) · winglian committed

don't add position_ids for evals (#591)
ab534d7 (unverified) · winglian committed

optionally configure sample packing for evals (#589)
21ec195 (unverified) · winglian committed

fix save_steps so it doesn't get duplicated (#567)
3fbde76 (unverified) · winglian committed

let hf trainer handle torch compile (#516)
a4e1bb6 (unverified) · winglian, tmm1 committed

improve how we setup eval/save strategies and steps (#547)
36e53c7 (unverified) · winglian committed

add optimization for group-by-len (#563)
e5bb22a (unverified) · winglian committed

Add training callback to send predictions to WandB table (#521)
5b67ea9 (unverified) · Glavin001 committed

Early stopping metric (#537)
e30f1e3 (unverified) · winglian committed

misc fixes/improvements (#513)
a546ca2 (unverified) · winglian committed

Add support for GPTQ using native transformers/peft (#468)
3355706 (unverified) · winglian committed

log supervised token count (#448)
7710e81 (unverified) · winglian committed

Added advanced DDP args (#515)
396a7a7 (unverified) · Jan Philipp Harries committed

drop empty tokenized rows too (#509)
c56b450 (unverified) · winglian committed

add eval benchmark callback (#441)
7657632 (unverified) · winglian committed

use math.ceil instead of round /cc #498
fd55bc8 · tmm1 committed