Commit History
adds color (#425)
0a22847
unverified
fix orca prompts (#422)
1b7e860
unverified
winglian
committed on
Fix(config): Update handling of deepspeed config (#404)
c01015f
unverified
Nanobit
committed on
fix eval steps and strategy (#403)
da10af0
unverified
winglian
committed on
better handling of empty input ids when tokenizing (#395)
85cf4f8
unverified
winglian
committed on
add utils.data.prepare_dataset
2e22404
tmm1
committed on
use context manager to run things on rank0 before others (#397)
fc2d6be
unverified
winglian
committed on
don't use mask expansion for inference (#392)
1687be6
unverified
winglian
committed on
Feat(config): add max steps (#387)
3c2ad00
unverified
ittailup
committed on
Added "epoch" evaluation_strategy (#388)
5d48a10
unverified
flotos
committed on
Feat(config): Add hub_strategy (#386)
73a0b6e
unverified
Nanobit
committed on
Error msg for sharegpt if conv has less than 2 msg (#379)
63fdb5a
unverified
flotos
committed on
don't pass rope_scaling kwarg if it's None (#383)
919246f
unverified
winglian
committed on
Fix crash when running without CUDA
15f6e57
chargoddard
committed on
try to detect accelerate and only use device_map=None in that case (#373)
094fc2c
unverified
tmm1
committed on
fix check for flash attn branching (#377)
343ac84
unverified
winglian
committed on
remove unnecessary local variable
0c96727
tmm1
committed on
simplify `load_tokenizer`
efb3b2c
tmm1
committed on
improve GPU logging to break out pytorch cache and system mem
7b55fe6
tmm1
committed on
quiet noise from llama tokenizer by setting pad token earlier
e029ab3
tmm1
committed on
extract module for working with cfg
8cec513
tmm1
committed on
fix DefaultDict.__or__
a13e45d
tmm1
committed on
Attention mask and position id fixes for packing (#285)
2bb0b78
unverified
winglian
committed on
Add wandb_entity to wandb options, update example configs, update README (#361)
7019509
unverified
Fix(model loading): Warn when model revision is passed to gptq (#364)
96bd6ae
unverified
Nanobit
committed on
Fix(message): Improve error message for bad format (#365)
e37d935
unverified
Nanobit
committed on
Feat: Add rope scaling (#343)
b521206
unverified
Nanobit
committed on
Merge pull request #356 from tmm1/load_model-args
11ddccb
unverified
tmm1
committed on
simplify load_model signature
7181022
tmm1
committed on
log GPU memory usage
e303d64
tmm1
committed on
ensure enable_input_require_grads is called on model before getting the peft model (#345)
176b888
unverified
winglian
committed on
experimental llama 2 chat support (#296)
3392270
unverified
Jan Philipp Harries
committed on
Update XFormers Attention Monkeypatch to handle Llama-2 70B (GQA) (#339)
10405b9
unverified
ssmi153
committed on
Added Orca Mini prompt strategy (#263)
c93655c
unverified
Jan Philipp Harries
committed on
optimize the iteration when tokenizing large datasets (#332)
fe28543
unverified
winglian
committed on
fix typo
2eda9e0
tmm1
committed on
scope flash-attn+qlora fix correctly, scope to llama, add comment
78b9efb
tmm1
committed on
move flash-attn monkey patch alongside the others
312a9fa
tmm1
committed on
ensure flash-attn fixes happen in both adapter/lora modes, and use torch_dtype
248bf90
tmm1
committed on
qlora w flash attention fixes (#333)
77085ea
unverified
winglian
committed on
add peft install back since it doesn't get installed by setup.py (#331)
db2a358
unverified
winglian
committed on
update prompts for open orca to match the paper (#317)
3d4984b
unverified
winglian
committed on
Merge pull request #307 from OpenAccess-AI-Collective/xgen-user-sharegpt-tokens
40a53ff
unverified
winglian
committed on
Merge pull request #313 from OpenAccess-AI-Collective/tokenizer-llama2-embeddings
3ffb018
unverified
winglian
committed on
don't resize embeddings to multiples of 32x by default
1066751
winglian
committed on
better handling since xgen tokenizer breaks with convert_tokens_to_ids
2a428e8
winglian
committed on
flash attention 2
9b790d3
winglian
committed on
fix sdp attention to use the flash/mem-efficient context manager
a032c9f
winglian
committed on
feat: use multi-core
45ac7c4
Nanobit
committed on