Commit History

fix eval steps and strategy (#403)
da10af0
unverified

winglian committed on

add utils.data.prepare_dataset
2e22404

tmm1 committed on

use context manager to run things on rank0 before others (#397)
fc2d6be
unverified

winglian committed on

don't use mask expansion for inference (#392)
1687be6
unverified

winglian committed on

Feat(config): add max steps (#387)
3c2ad00
unverified

ittailup committed on

Added "epoch" evaluation_strategy (#388)
5d48a10
unverified

flotos committed on

Feat(config): Add hub_strategy (#386)
73a0b6e
unverified

Nanobit committed on

don't pass rope_scaling kwarg if it's None (#383)
919246f
unverified

winglian committed on

Fix crash when running without CUDA
15f6e57

chargoddard committed on

try to detect accelerate and only use device_map=None in that case (#373)
094fc2c
unverified

tmm1 committed on

remove unnecessary local variable
0c96727

tmm1 committed on

simplify `load_tokenizer`
efb3b2c

tmm1 committed on

improve GPU logging to break out pytorch cache and system mem
7b55fe6

tmm1 committed on

quiet noise from llama tokenizer by setting pad token earlier
e029ab3

tmm1 committed on

extract module for working with cfg
8cec513

tmm1 committed on

fix DefaultDict.__or__
a13e45d

tmm1 committed on

Attention mask and position id fixes for packing (#285)
2bb0b78
unverified

winglian committed on

Add wandb_entity to wandb options, update example configs, update README (#361)
7019509
unverified

Morgan McGuire, winglian committed on

Fix(model loading): Warn when model revision is passed to gptq (#364)
96bd6ae
unverified

Nanobit committed on

Feat: Add rope scaling (#343)
b521206
unverified

Nanobit committed on

Merge pull request #356 from tmm1/load_model-args
11ddccb
unverified

tmm1 committed on

simplify load_model signature
7181022

tmm1 committed on

log GPU memory usage
e303d64

tmm1 committed on

ensure enable_input_require_grads is called on model before getting the peft model (#345)
176b888
unverified

winglian committed on

experimental llama 2 chat support (#296)
3392270
unverified

Jan Philipp Harries committed on

optimize the iteration when tokenizing large datasets (#332)
fe28543
unverified

winglian committed on

fix typo
2eda9e0

tmm1 committed on

scope flash-attn+qlora fix correctly, scope to llama, add comment
78b9efb

tmm1 committed on

move flash-attn monkey patch alongside the others
312a9fa

tmm1 committed on

ensure flash-attn fixes happen in both adapter/lora modes, and use torch_dtype
248bf90

tmm1 committed on

qlora w flash attention fixes (#333)
77085ea
unverified

winglian committed on

add peft install back since it doesn't get installed by setup.py (#331)
db2a358
unverified

winglian committed on

don't resize embeddings to multiples of 32x by default
1066751

winglian committed on

fix axolotl training args dataclass annotation
ebaec3c

winglian committed on

Merge pull request #276 from theobjectivedad/logging_enhancement
6f16c45
unverified

winglian committed on

Fixed pre-commit problems, fixed small bug in logging_config to handle LOG_LEVEL env var
b1f4f7a

theobjectivedad committed on

Merge branch 'OpenAccess-AI-Collective:main' into logging_enhancement
83237b8
unverified

The Objective Dad committed on

Add ability to pass 'name' argument to load_dataset
88089e8

chargoddard committed on

Merge pull request #274 from OpenAccess-AI-Collective/NanoCode012-patch-2
168a7a0
unverified

Nanobit committed on

Feat: Add save_safetensors
5491278

Nanobit committed on

Set push to hub as private by default
1514739
unverified

Nanobit committed on

support for loading a model by git revision
69a2350

winglian committed on

Merge branch 'main' into quadratic-warmup
c4cf567
unverified

winglian committed on

better configuration for quadratic warmup
c49729d

winglian committed on

params are adam_*, not adamw_*
19cf0bd

winglian committed on

skip explicit model type too if using trust_remote_code
d69da99

winglian committed on

don't use llama if trust_remote_code is set since that needs to use AutoModel path
66afb76

winglian committed on

Merge pull request #221 from utensil/local_dataset
b9b7d4c
unverified

winglian committed on

Fix future deprecation push_to_hub_model_id
e79c8e6

Nanobit committed on