Commits · Dovakiins/qwerrwe

set default for merge (#1044)

63fb3eb
unverified

hamel committed on Jan 5

[Docs] delete unused cfg value `lora_out_dir` (#1029)

a3e8783
unverified

hamel

Nanobit committed on Jan 3

chore(readme): update instruction to set config to load from cache (#1030)

b31038a
unverified

Nanobit committed on Jan 3

use recommended setting for use_reentrant w gradient checkpointing (#1021)

4d2e842
unverified

winglian committed on Jan 2

Adds chat templates (#1022)

f8ae59b
unverified

mhenrichsen committed on Dec 29, 2023

feat: expose bnb kwargs (#1018)

41353d2
unverified

Nanobit

hamel committed on Dec 29, 2023

feat: remove need to add load_in* during merge (#1017)

f6ecf14
unverified

Nanobit committed on Dec 29, 2023

[Docs] Nit: Remind people to auth to wandb if they are going to use it (#1013)

dec66d7
unverified

hamel committed on Dec 29, 2023

Update README.md (#1012)

76357dc
unverified

hamel committed on Dec 29, 2023

remove landmark attn and xpos rope implementations (#1010)

70b46ca
unverified

winglian committed on Dec 28, 2023

Update README.md (#966)

d25c34c
unverified

eltociear committed on Dec 17, 2023

Add docs (#947)

712fd27
unverified

hamel

winglian committed on Dec 13, 2023

fix: switch to using the HuggingFace Transformers NEFT implementation (#941)

ef24342
unverified

dg-kalle committed on Dec 13, 2023

More hints on what to do with CUDA Out of memory errors (#925)

b0cf397
unverified

Juraj Bednar committed on Dec 13, 2023

new evals_per_epoch and saves_per_epoch to make things cleaner (#944)

5f79b82
unverified

winglian committed on Dec 12, 2023

Mixtral multipack (#928)

68b227a
unverified

winglian committed on Dec 10, 2023

chore: clarify Readme on sharegpt system role

d339beb
unverified

Nanobit committed on Dec 8, 2023

Support device_map=sequential & max_memory config parameters (#903)

992e742
unverified

Bryan Thornbury

winglian committed on Dec 4, 2023

Feat(wandb): Refactor to be more flexible (#767)

a1da39c
unverified

Nanobit committed on Dec 4, 2023

feature: loss watchdog for terminating training runs that are failing (#899)

58ec8b1
unverified

user735 Karl-Johan Alm committed on Dec 4, 2023

Feat: Add Qwen (#894)

1115c50
unverified

Nanobit committed on Nov 25, 2023

Feat: Add warmup_ratio (#893)

fb12895
unverified

Nanobit committed on Nov 25, 2023

chore(doc): Add info on changing role in sharegpt (#886)

9fc29e0
unverified

Nanobit committed on Nov 22, 2023

Install from git url (#874)

ddf8150
unverified

marksaroufim committed on Nov 17, 2023

try #2: pin hf transformers and accelerate to latest release, don't reinstall pytorch (#867)

0de1457
unverified

winglian committed on Nov 16, 2023

Feat: Add dataset loading from S3, GCS (#765)

3cc67d2
unverified

Nanobit committed on Nov 16, 2023

allow overriding of model_config parameters from the YML (#853)

1bc1186
unverified

winglian committed on Nov 16, 2023

make docker command more robust (#861)

8a8d1c4
unverified

winglian committed on Nov 16, 2023

lint fix that didn't get caught by linter (#866)

332984d
unverified

winglian committed on Nov 15, 2023

Docs: add instructions to 1-click launching on public clouds (#862)

b33c1d5
unverified

zongheng committed on Nov 15, 2023

chore(doc): Separate section on runpod (#860)

501b4d1
unverified

Nanobit committed on Nov 15, 2023

feat(doc): add more info on train_on_split (#855)

306fe19
unverified

Nanobit committed on Nov 15, 2023

Feat: Added Gradio support (#812)

738a057
unverified

stillerman committed on Nov 5, 2023

update table for rwkv4 support, fix process count for dataset (#822)

cdc71f7
unverified

winglian committed on Nov 5, 2023

fix eval_steps to be a sane default (#797)

8b79ff0
unverified

winglian committed on Oct 28, 2023

Add docker advanced instruction to README (#792)

2e71ff0
unverified

gordicaleksa committed on Oct 27, 2023

Create preprocess CLI (#785)

e50ab07
unverified

casperhansen committed on Oct 26, 2023

chore(readme): Improve documentation on conversation field (#782)

20aa4b5
unverified

Nanobit committed on Oct 24, 2023

Fix: Cannot tokenize with bf16 and on cpu (#766)

afedc47
unverified

Nanobit committed on Oct 22, 2023

Implement fused modules (#747)

15d3a65
unverified

casperhansen

winglian committed on Oct 21, 2023

add to docs (#703)

a21935f
unverified

winglian committed on Oct 20, 2023

Clarify custom format example (#729)

e1b214c
unverified

casperhansen committed on Oct 14, 2023

add noisy embedding (#721)

3bd9528
unverified

Maxime Maxime committed on Oct 13, 2023

fix(doc): update default doc according to arg (#714)

5855dde
unverified

Nanobit committed on Oct 10, 2023

fix(doc): Add note on inference w sample packing (#712)

11c48c5
unverified

Nanobit committed on Oct 10, 2023

Update README with some explanations (#700)

77c84e0
unverified

seungduk committed on Oct 8, 2023

refactor to set eval_batch_size earlier if unset, so we can warn if mismatched (#662)

2642cae
unverified

winglian committed on Oct 3, 2023

Make dataset_processes configurable (#651)

9ec2077
unverified

corbt committed on Sep 29, 2023

Fix bug when using pretokenized datasets (#652)

590d603
unverified

ich committed on Sep 29, 2023

add support for defined train split (#654)

409ca0f
unverified

winglian committed on Sep 29, 2023
