Commits · Dovakiins/qwerrwe

Save Axolotl config as WandB artifact (#716)

490923f
unverified

Jan Philipp Harries committed on Oct 11, 2023

refactor to set eval_batch_size earlier if unset, so we can warn if mismatched (#662)

2642cae
unverified

winglian committed on Oct 3, 2023

Make dataset_processes configurable (#651)

9ec2077
unverified

corbt committed on Sep 29, 2023

Fix(cfg): Add validation for save_strategy and eval_strategy (#633)

383f88d
unverified

Nanobit committed on Sep 28, 2023

attention_mask not needed for training (#642)

e8cbf50
unverified

winglian committed on Sep 27, 2023

chore(callback): Remove old peft saving code (#510)

d5f8589
unverified

Nanobit committed on Sep 22, 2023

misc fixes to add gptq tests (#621)

03e5907
unverified

winglian committed on Sep 22, 2023

run eval on the first step to get a baseline (#617)

2844eb2
unverified

winglian committed on Sep 22, 2023

minor tweaks to simplify (#597)

31b9e0c
unverified

winglian committed on Sep 18, 2023

gather/broadcast the max value of the packing efficiency automatically (#463)

b15b19e
unverified

winglian committed on Sep 17, 2023

don't add position_ids for evals (#591)

ab534d7
unverified

winglian committed on Sep 16, 2023

optionally configure sample packing for evals (#589)

21ec195
unverified

winglian committed on Sep 16, 2023

fix save_steps so it doesn't get duplicated (#567)

3fbde76
unverified

winglian committed on Sep 14, 2023

let hf trainer handle torch compile (#516)

a4e1bb6
unverified

winglian

tmm1 committed on Sep 13, 2023

improve how we setup eval/save strategies and steps (#547)

36e53c7
unverified

winglian committed on Sep 13, 2023

add optimization for group-by-len (#563)

e5bb22a
unverified

winglian committed on Sep 13, 2023

Add training callback to send predictions to WandB table (#521)

5b67ea9
unverified

Glavin001 committed on Sep 13, 2023

Early stopping metric (#537)

e30f1e3
unverified

winglian committed on Sep 8, 2023

misc fixes/improvements (#513)

a546ca2
unverified

winglian committed on Sep 5, 2023

Add support for GPTQ using native transformers/peft (#468)

3355706
unverified

winglian committed on Sep 5, 2023

log supervised token count (#448)

7710e81
unverified

winglian committed on Aug 31, 2023

Added advanced DDP args (#515)

396a7a7
unverified

Jan Philipp Harries committed on Aug 31, 2023

drop empty tokenized rows too (#509)

c56b450
unverified

winglian committed on Aug 30, 2023

add eval benchmark callback (#441)

7657632
unverified

winglian committed on Aug 29, 2023

use math.ceil instead of round /cc #498

fd55bc8

tmm1 committed on Aug 29, 2023

pad_to_worst_case_seq_len boolean, for testing memory limits (#498)

8e197f6
unverified

Birch-san

tmm1 committed on Aug 28, 2023

let transformers handle adamw_bnb_8bit

868530c

tmm1 committed on Aug 26, 2023

ReLoRA implementation (with quantization) (#322)

bde3c5a
unverified

chargoddard

winglian committed on Aug 24, 2023

always drop samples that are too long (#452)

50682a3
unverified

winglian committed on Aug 21, 2023

set env var for FSDP layer to wrap (#453)

5a1985b
unverified

winglian committed on Aug 21, 2023

add missing positional arg (#450)

58cf7e7
unverified

winglian committed on Aug 21, 2023

fix evals (#447)

ee26281
unverified

winglian committed on Aug 21, 2023

disable eval using multipack for now (#437)

f733d0f
unverified

winglian committed on Aug 19, 2023

fix comma, not a tuple (#436)

008505c
unverified

winglian committed on Aug 19, 2023

use save_strategy from config if available (#434)

b3f5e00
unverified

winglian committed on Aug 19, 2023

set env for FSDP offload params (#433)

5247c50
unverified

winglian committed on Aug 19, 2023

Fix(config): Update handling of deepspeed config (#404)

c01015f
unverified

Nanobit committed on Aug 15, 2023

fix eval steps and strategy (#403)

da10af0
unverified

winglian committed on Aug 15, 2023

Feat(config): add max steps (#387)

3c2ad00
unverified

ittailup committed on Aug 14, 2023

Added "epoch" evaluation_strategy (#388)

5d48a10
unverified

flotos committed on Aug 14, 2023

Feat(config): Add hub_strategy (#386)

73a0b6e
unverified

Nanobit committed on Aug 14, 2023

improve GPU logging to break out pytorch cache and system mem

7b55fe6

tmm1 committed on Aug 13, 2023

Attention mask and position id fixes for packing (#285)

2bb0b78
unverified

winglian committed on Aug 12, 2023

log GPU memory usage

e303d64

tmm1 committed on Aug 9, 2023

fix axolotl training args dataclass annotation

ebaec3c

winglian committed on Jul 17, 2023

Merge branch 'OpenAccess-AI-Collective:main' into logging_enhancement

83237b8
unverified

The Objective Dad committed on Jul 15, 2023

Merge pull request #274 from OpenAccess-AI-Collective/NanoCode012-patch-2

168a7a0
unverified

Nanobit committed on Jul 14, 2023

Adding logging enhancement

553a86b

theobjectivedad committed on Jul 14, 2023

Feat: Add save_safetensors

5491278

Nanobit committed on Jul 14, 2023

Set push to hub as private by default

1514739
unverified

Nanobit committed on Jul 14, 2023
