Commit History

rename var and reformat
f319b0b

tmm1 committed on

Update src/axolotl/utils/models.py
7fd662d
unverified

Maxime tmm1 committed on

Update src/axolotl/utils/models.py
9e69968
unverified

Maxime tmm1 committed on

ignore: address pr review
d03887f
unverified

Maxime committed on

ignore: linter
a184549
unverified

Maxime committed on

fix: finetune model inference needs the dtype fix to work with flash-attn
f311df9
unverified

Maxime committed on

fix types w lora (#478)
0b7ba57
unverified

winglian committed on

Fix(tokenizer): Fix condition to add pad token (#477)
71bd062
unverified

Nanobit committed on

improve llama pad token handling (#475)
cb9797e
unverified

winglian committed on

recast loralayer, norm, lmhead + embed token weights per original qlora (#393)
96deb6b
unverified

winglian committed on

fix evals (#447)
ee26281
unverified

winglian committed on

standardize attn hijack patches (#381)
06edf17
unverified

tmm1 winglian committed on

don't use mask expansion for inference (#392)
1687be6
unverified

winglian committed on

don't pass rope_scaling kwarg if it's None (#383)
919246f
unverified

winglian committed on

try to detect accelerate and only use device_map=None in that case (#373)
094fc2c
unverified

tmm1 committed on

remove unnecessary local variable
0c96727

tmm1 committed on

simplify `load_tokenizer`
efb3b2c

tmm1 committed on

improve GPU logging to break out pytorch cache and system mem
7b55fe6

tmm1 committed on

quiet noise from llama tokenizer by setting pad token earlier
e029ab3

tmm1 committed on

Attention mask and position id fixes for packing (#285)
2bb0b78
unverified

winglian committed on

Feat: Add rope scaling (#343)
b521206
unverified

Nanobit committed on

Merge pull request #356 from tmm1/load_model-args
11ddccb
unverified

tmm1 committed on

simplify load_model signature
7181022

tmm1 committed on

log GPU memory usage
e303d64

tmm1 committed on

ensure enable_input_require_grads is called on model before getting the peft model (#345)
176b888
unverified

winglian committed on

fix typo
2eda9e0

tmm1 committed on

scope flash-attn+qlora fix correctly, scope to llama, add comment
78b9efb

tmm1 committed on

move flash-attn monkey patch alongside the others
312a9fa

tmm1 committed on

ensure flash-attn fixes happen in both adapter/lora modes, and use torch_dtype
248bf90

tmm1 committed on

qlora w flash attention fixes (#333)
77085ea
unverified

winglian committed on

add peft install back since it doesn't get installed by setup.py (#331)
db2a358
unverified

winglian committed on

don't resize embeddings to multiples of 32x by default
1066751

winglian committed on

support for loading a model by git revision
69a2350

winglian committed on

skip explicit model type too if using trust_remote_code
d69da99

winglian committed on

don't use llama if trust_remote_code is set since that needs to use AutoModel path
66afb76

winglian committed on

optionally define whether to use_fast tokenizer
47d601f

winglian committed on

add float16 docs and tweak typehints
88e17ff

winglian committed on

style correction
136522f

maciej.karasek committed on

issue #205 bugfix
556fe40

maciej.karasek committed on

Merge branch 'main' into flash-optimum
fd2c981
unverified

winglian committed on

Merge pull request #187 from OpenAccess-AI-Collective/strip-peft-device-map
93dacba
unverified

winglian committed on

Merge pull request #177 from NanoCode012/fix/landmark-patch
8002ffb
unverified

winglian committed on

Merge branch 'main' into strip-peft-device-map
5e616d9
unverified

winglian committed on

Merge pull request #159 from AngainorDev/patch-1
8e568bb
unverified

Nanobit committed on

add check for attr
c9a149f

winglian committed on

Fix strict and Lint
b565ecf

Angainor committed on

match up gradient checkpointing when using lora w config
fe0b768

winglian commited on

Fix undefined LlamaForCausalLM and del try except
563b6d8

Nanobit commited on

peft no longer needs device_map
cd0a6f6

winglian committed on