qwerrwe / tests

Commit History

E2e device cuda (#575)
2414673
unverified

winglian committed on

e2e testing (#574)
9218ebe
unverified

winglian committed on

Fix pretraining with iterable/streaming Dataset (#556)
2f586d1
unverified

Jan Philipp Harries committed on

workaround for md5 variations (#533)
0b4cf5b
unverified

winglian committed on

recommend padding when using sample packing (#531)
3437149
unverified

winglian committed on

fix test fixture b/c hf trainer tokenization changed (#464)
d5dcf9c
unverified

winglian committed on

fix fixture for new tokenizer handling in transformers (#428)
8cace80
unverified

winglian committed on

simplify `load_tokenizer`
efb3b2c

tmm1 committed on

extract module for working with cfg
8cec513

tmm1 committed on

fix DefaultDict.__or__
a13e45d

tmm1 committed on

Attention mask and position id fixes for packing (#285)
2bb0b78
unverified

winglian committed on

experimental llama 2 chat support (#296)
3392270
unverified

Jan Philipp Harries committed on

update prompts for open orca to match the paper (#317)
3d4984b
unverified

winglian committed on

Fixed pre-commit problems, fixed small bug in logging_config to handle LOG_LEVEL env var
b1f4f7a

theobjectivedad committed on

params are adam_*, not adamw_*
19cf0bd

winglian committed on

add tests and support for loader for sys prompt data
3a38271

winglian committed on

initial wip to get sys prompt from dataset
8d20e0a

winglian committed on

optionally define whether to use_fast tokenizer
47d601f

winglian committed on

Additional test case per pr
ad5ca4f

winglian committed on

add validation and tests for adamw hyperparam
cb9d3af

winglian committed on

Merge pull request #214 from OpenAccess-AI-Collective/fix-tokenizing-labels
1925eaf
unverified

winglian committed on

fix test name
1ab3bf3

winglian committed on

ignore duplicate code in tests
baed440

winglian committed on

bugfix for potential off by one
7925ddc

winglian committed on

Merge branch 'main' into flash-optimum
fd2c981
unverified

winglian committed on

new validation for mpt w grad checkpoints
14668fa

winglian committed on

add streaming dataset support for pretraining datasets
eea2731

winglian committed on

Validate falcon with fsdp
babf0fd

Nanobit committed on

Update doc for grad_accu and add validation tests for batch size
3c71c8d

Nanobit committed on

don't worry about duplicate code here
0136f51

winglian committed on

fix packing so that concatenated sequences reset the attention
9b8585d

winglian committed on

black formatting
6fa40bf

winglian committed on

add support for gradient accumulation steps
3aad5f3

winglian committed on

Fix pre-commit for rebased files
b81c97f

Nanobit committed on

fix relative path for fixtures
cfcc549

winglian committed on

Apply isort then black
37293dc

Nanobit committed on

Ignore unsupported-binary-operation
0dd35c7

Nanobit committed on

Black formatting
b832a0a

Nanobit committed on

Lint validation
1f3c3f5

Nanobit committed on

Lint test_dict
0e95288

Nanobit committed on

Lint test_prompters
7eb33a7

Nanobit committed on

Lint and format
392dfd9

Nanobit committed on

fix relative path for fixtures
e65aeed

winglian committed on

add unit test for sharegpt tokenization
e6fdeb0

winglian committed on

update for pr feedback
fd5f965

winglian committed on

new hf_use_auth_token setting so login to hf isn't required
1c33eb8

winglian committed on

Feat: Update validate_config and add tests
52dd92a

Nanobit committed on

Fix incorrect syntax in test
f87bd20

Nanobit committed on

Add test for DictDefault
923151f

Nanobit committed on