Commits · Dovakiins/qwerrwe

flash attn pip install (#426)

cf66547
unverified

mhenrichsen Ubuntu Mads Henrichsen

winglian committed on Aug 18, 2023

Fix(docs): Remove gptq+lora and fix xformer compat list (#423)

3d1f203
unverified

Nanobit committed on Aug 16, 2023

hopefully improve the README (#419)

2495909
unverified

winglian committed on Aug 15, 2023

Merge pull request #413 from mhenrichsen/chore/update-deepseed-config

f806e86
unverified

mhenrichsen committed on Aug 15, 2023

Feat(doc): Add lr_quadratic_warmup to readme (#412)

2b990eb
unverified

Nanobit committed on Aug 15, 2023

update path to align with fsdp example

bd8cab4

mhenrichsen committed on Aug 15, 2023

Fix(config): Update handling of deepspeed config (#404)

c01015f
unverified

Nanobit committed on Aug 15, 2023

Fix(docs): Update flash attn requirements (#409)

72fe3f8
unverified

Nanobit committed on Aug 15, 2023

update docs for tokenizer_legacy (#401)

47961fd
unverified

winglian committed on Aug 15, 2023

add templates, CoC and contributing guide (#126)

31db0ec
unverified

lightningRalf

winglian

Nanobit committed on Aug 15, 2023

Feat(doc): Add how to save by epochs (#396)

be294fd
unverified

Nanobit committed on Aug 15, 2023

Feat(doc): Add max_steps to readme (#389)

41ecb45
unverified

Nanobit committed on Aug 14, 2023

Feat(config): Add hub_strategy (#386)

73a0b6e
unverified

Nanobit committed on Aug 14, 2023

Feat(doc): Improve sharegpt doc (#378)

729c299
unverified

Nanobit committed on Aug 13, 2023

Attention mask and position id fixes for packing (#285)

2bb0b78
unverified

winglian committed on Aug 12, 2023

Add wandb_entity to wandb options, update example configs, update README (#361)

7019509
unverified

Morgan McGuire

winglian committed on Aug 12, 2023

Feat: Add rope scaling (#343)

b521206
unverified

Nanobit committed on Aug 12, 2023

Update README.md on pretraining_dataset (#360)

fae6ed8
unverified

Nanobit committed on Aug 11, 2023

Clarify pre-tokenize before multigpu (#359)

94d03c8
unverified

Nanobit committed on Aug 11, 2023

note pattern when using groups

b4d1d22

tmm1 committed on Aug 7, 2023

update comment for group_by_length

9f99104

tmm1 committed on Aug 7, 2023

python 3.10 and 3.11 both work fine, as does pytorch 2.1.0.dev

58d6659

tmm1 committed on Aug 3, 2023

there is no configs folder

cc7e800

tmm1 committed on Aug 3, 2023

update README for updated docker images (#328)

41a4d15
unverified

winglian committed on Jul 28, 2023

Merge pull request #306 from ethanhs/xgen

dcdec44
unverified

winglian committed on Jul 22, 2023

don't resize embeddings to multiples of 32x by default

1066751

winglian committed on Jul 22, 2023

Add XGen info to README and example config

3881143

ethanhs committed on Jul 21, 2023

Fix(readme): Improve wording for push model

165907f
unverified

Nanobit committed on Jul 21, 2023

fix(readme): remove accelerate config

b64f411
unverified

Nanobit committed on Jul 17, 2023

Merge pull request #279 from NanoCode012/feat/multi-gpu-readme

469c08c
unverified

winglian committed on Jul 16, 2023

Add dataset name to all yaml options in README

3cdd8e4

chargoddard committed on Jul 15, 2023

Feat(readme): improve docs on multi-gpu

cf5ae6b

Nanobit committed on Jul 15, 2023

Fix formatting mistake

46032a1

chargoddard committed on Jul 15, 2023

Add example of dataset with configuration name to README

8bba642

chargoddard committed on Jul 15, 2023

Merge pull request #275 from NanoCode012/feat/safetensors

231031a
unverified

Nanobit committed on Jul 14, 2023

Feat: Add save_safetensors

5491278

Nanobit committed on Jul 14, 2023

Feat(docs): Add model_revision arg

896c1ae
unverified

Nanobit committed on Jul 14, 2023

Fix for linter

41da98b
unverified

Nanobit committed on Jul 6, 2023

Fix local path loading and custom strategy type

9e64f42
unverified

Nanobit committed on Jul 6, 2023

Fix future deprecation push_to_hub_model_id

e79c8e6

Nanobit committed on Jul 3, 2023

open orca support

78a1e1f

winglian committed on Jul 1, 2023

Update README.md

c146880
unverified

Nanobit committed on Jun 30, 2023

optionally define whether to use_fast tokenizer

47d601f

winglian committed on Jun 25, 2023

add docs

c969f0a

winglian committed on Jun 15, 2023

hint to what AMP means

d7635b7

winglian committed on Jun 15, 2023

add float16 docs and tweak typehints

88e17ff

winglian committed on Jun 15, 2023

Merge pull request #92 from OpenAccess-AI-Collective/flash-optimum

16bb627
unverified

winglian committed on Jun 14, 2023

Fix sharegpt type

3513885
unverified

Nanobit committed on Jun 13, 2023

Update README.md to include a community showcase

5ff547d
unverified

PocketDoc committed on Jun 13, 2023

fix inference

34ae699

mhenrichsen committed on Jun 12, 2023
