Commit History
6319da1  Unsloth gradient checkpointing offload (#1528) (winglian)
6086be8  qwen2_moe support w multipack (#1455) (winglian)
05b398a  fix some of the edge cases for Jamba (#1452) (winglian)
8df7b88  beta support for multipack with gemmoe: (#1402) (winglian)
2b9687f  Update fastchat_conversation_turns.py (#1294) [skip ci] (eltociear)
2c9c88b  fix steps check for anneal on first cycle (#1316) (winglian)
5894f0e  make mlflow optional (#1317) (winglian)
2752d5f  multipack for gemma (#1313) (winglian)
4b997c3  allow the optimizer prune ratio for ReLoRA to be configurable (#1287) (winglian)
fac2d98  Add MPS support (#1264)
5698943  simplify handling for newer multipack patches so they can be added in a single place (#1270) (winglian)
8c2e05a  relora: magnitude pruning of the optimizer (#1245) (winglian)
00568c1  support for true batches with multipack (#1230) (winglian)
62ca4a2  Respect sliding_window=None (#1214) (DreamGenX)
54d2ac1  Mixtral fixes 20240124 (#1192) [skip ci] (winglian)
814aee6  Phi2 multipack (#1173) (winglian)
e799e08  Falcon embeddings (#1149) [skip docker] (winglian)
f5a828a  Qwen2 (#1166) (winglian)
6910e6a  Multipack simplify for Mixtral (#1142) (winglian)
90036eb  optimize calculation of cu_seqlens from position_ids (#1084) [skip ci] (winglian)
59b2d30  Added chatglm3 conversation type for training models like TinyLLama (#1036) (xaviviro)
bcc78d8  bump transformers and update attention class map name (#1023) (winglian)
70b46ca  remove landmark attn and xpos rope implementations (#1010) (winglian)
7bbaac9  fix mistral prompt assembly (#982) (hamel)
5ada140  Fix prompt assembly for llama (#952)
ef24342  fix: switch to using the HuggingFace Transformers NEFT implementation (#941) (dg-kalle)
7fabc4d  Mixtral official (#942) (winglian)
db8a8af  adds llama and mistral dropout support (#858) (winglian)
1470650  various bugfixes (#856) (winglian)
827ec3d  refactor neft patch to be more re-usable similar to trl's impl (#796) (winglian)
32eeeb5  Hotfix for not saving correctly (#762) (casperhansen)
15d3a65  Implement fused modules (#747)
a045db0  Mistral: Sliding Window Attention with Flash Attention and Sample Packing (#732)
3bd9528  add noisy embedding (#721) (Maxime)
2d60ba3  flash_attention + sample packing for stablelm 3b (#671) (winglian)
b2edaae  fix for flash attn w mistral w/o sample packing (#648) (winglian)
b6ab8aa  Mistral flash attn packing (#646) (winglian)
895f0a0  skip some flash attn patches unless explicitly enabled (#643) (winglian)
e7d3e2d  use fastchat conversations template (#578) (winglian)
60c7c48  update for recent transformers updates (#636) (winglian)
19a600a  Feat: Add support for upstream FA2 (#626) (Nanobit)
6b9b229  btlm and falcon monkey patches for flash attn (#566) (winglian)
5b67ea9  Add training callback to send predictions to WandB table (#521) (Glavin001)
fc8766e  reorg a bit (tmm1)
72a6fe1  use flash_attn rmsnorm when available (#526) (tmm1)
5fe30b1  use flash_attn xentropy when available (#525) (tmm1)