src/axolotl/monkeypatch

Commit History

Unsloth gradient checkpointing offload (#1528)
6319da1 · winglian

qwen2_moe support w multipack (#1455)
6086be8 · winglian

fix some of the edge cases for Jamba (#1452)
05b398a · winglian

Remove seq_len arg in rotary_emb (#1443)
e07347b · wenbopan, winglian

beta support for multipack with gemmoe: (#1402)
8df7b88 · winglian

Update fastchat_conversation_turns.py (#1294) [skip ci]
2b9687f · eltociear

fix steps check for anneal on first cycle (#1316)
2c9c88b · winglian

make mlflow optional (#1317)
5894f0e · winglian

multipack for gemma (#1313)
2752d5f · winglian

allow the optimizer prune ratio for ReLoRA to be configurable (#1287)
4b997c3 · winglian

Add MPS support (#1264)
fac2d98 · Maxime, winglian

simplify handling for newer multipack patches so they can be added in a single place (#1270)
5698943 · winglian

relora: magnitude pruning of the optimizer (#1245)
8c2e05a · winglian

support for true batches with multipack (#1230)
00568c1 · winglian

Respect sliding_window=None (#1214)
62ca4a2 · DreamGenX

Mixtral fixes 20240124 (#1192) [skip ci]
54d2ac1 · winglian

Phi2 multipack (#1173)
814aee6 · winglian

Falcon embeddings (#1149) [skip docker]
e799e08 · winglian

Qwen2 (#1166)
f5a828a · winglian

Multipack simplify for Mixtral (#1142)
6910e6a · winglian

Add shifted sparse attention (#973) [skip-ci]
1d70f24 · jrc, joecummings, winglian

optimize calculation of cu_seqlens from position_ids (#1084) [skip ci]
90036eb · winglian

Added chatglm3 conversation type for training models like TinyLLama (#1036)
59b2d30 · xaviviro

bump transformers and update attention class map name (#1023)
bcc78d8 · winglian

remove landmark attn and xpos rope implementations (#1010)
70b46ca · winglian

fix mistral prompt assembly (#982)
7bbaac9 · hamel

Fix prompt assembly for llama (#952)
5ada140 · hamel, tokestermw

fix: switch to using the HuggingFace Transformers NEFT implementation (#941)
ef24342 · dg-kalle

Mixtral official (#942)
7fabc4d · winglian

adds llama and mistral dropout support (#858)
db8a8af · winglian

various bugfixes (#856)
1470650 · winglian

refactor neft patch to be more re-usable similar to trl's impl (#796)
827ec3d · winglian

Hotfix for not saving correctly (#762)
32eeeb5 · casperhansen

Mistral: Sliding Window Attention with Flash Attention and Sample Packing (#732)
a045db0 · casperhansen, winglian

add noisy embedding (#721)
3bd9528 · Maxime

flash_attention + sample packing for stablelm 3b (#671)
2d60ba3 · winglian

fix for flash attn w mistral w/o sample packing (#648)
b2edaae · winglian

Mistral flash attn packing (#646)
b6ab8aa · winglian

skip some flash attn patches unless explicitly enabled (#643)
895f0a0 · winglian

use fastchat conversations template (#578)
e7d3e2d · winglian

update for recent transformers updates (#636)
60c7c48 · winglian

Feat: Add support for upstream FA2 (#626)
19a600a · Nanobit

btlm and falcon monkey patches for flash attn (#566)
6b9b229 · winglian

Add training callback to send predictions to WandB table (#521)
5b67ea9 · Glavin001

reorg a bit
fc8766e · tmm1

use flash_attn rmsnorm when available (#526)
72a6fe1 · tmm1

use flash_attn xentropy when available (#525)
5fe30b1 · tmm1