Commit History
d485a08  chore(script): remove redundant setting (#1411)
05bcc9e  Train parameters exclusively in specific ranges (#1390)
ea00dd0  don't use load and push together (#1284)
00568c1  support for true batches with multipack (#1230)
c67fb71  Peft deepspeed resume (#1227)
ba944e6  workaround for transformers bug requiring do_sample for saving pretrained (#1206)
54d2ac1  Mixtral fixes 20240124 (#1192) [skip ci]
da97285  keep gate in fp32 for 16 bit loras (#1105)
b432889  feat: enable trl's autounwrap (#1060)
31d2350  fix model card upload for PEFT models (#1043)
f243c21  RL/DPO (#935)
85dd4d5  add config to model card (#1005)
ef24342  fix: switch to using the HuggingFace Transformers NEFT implementation (#941)
committed by kallewoof
5ea3aa3  Fix Deepspeed loading (#950)
40a6362  support for mamba (#915)
b2430ce  use accelerate logging for zero/main logging only
4c834bf  cleanup verbosity a bit
827ec3d  refactor neft patch to be more re-usable similar to trl's impl (#796)
15d3a65  Implement fused modules (#747)
e4d1585  Fix DeepSpeed Zero 3 Saving (#709)
501958b  create a model card with axolotl badge (#624)
be75668  set fsdp state dict (#584)
committed by Jan Philipp Harries