Commit History
support user defined prompters, pretokenized datasets in config, local parquet, local arrow files (#348)
d2e7f27
unverified
winglian
committed on
fix orca prompts (#422)
1b7e860
unverified
winglian
committed on
Attention mask and position id fixes for packing (#285)
2bb0b78
unverified
winglian
committed on
Fix(message): Improve error message for bad format (#365)
e37d935
unverified
Nanobit
committed on
experimental llama 2 chat support (#296)
3392270
unverified
Jan Philipp Harries
committed on
Added Orca Mini prompt strategy (#263)
c93655c
unverified
Jan Philipp Harries
committed on
update prompts for open orca to match the paper (#317)
3d4984b
unverified
winglian
committed on
Fixed pre-commit problems, fixed small bug in logging_config to handle LOG_LEVEL env var
b1f4f7a
theobjectivedad
committed on
Adding logging enhancement
553a86b
theobjectivedad
committed on
Merge pull request #255 from OpenAccess-AI-Collective/open-orca-prompts
1e5014a
unverified
winglian
committed on
open orca support
78a1e1f
winglian
committed on
add option for instruct w sys prompts
924bbfd
winglian
committed on
Merge pull request #224 from OpenAccess-AI-Collective/system-prompt-data
f150c02
unverified
winglian
committed on
push intermediate model checkpoints to hub
612aabd
winglian
committed on
skip the system prompt
05ab909
winglian
committed on
pylint for duplicated code for system prompts
7b57ed7
winglian
committed on
add tests and support for loader for sys prompt data
3a38271
winglian
committed on
initial wip to get sys prompt from dataset
8d20e0a
winglian
committed on
bugfix for potential off by one
7925ddc
winglian
committed on
update alpaca_chat prompts for instructions to explain the conversation
4b43a66
winglian
committed on
add new sharegpt, refactor prompt so it can be customized later, add exception if no data is processed
aac4b76
winglian
committed on
fix camel ai, add guanaco/oasst mapping for sharegpt
59bb219
winglian
committed on
new prompters, misc fixes for output dir missing using fsdp, and changing max seq len
4ac9e25
winglian
committed on
Apply isort then black
37293dc
Nanobit
committed on
Fix mypy typing
e9650d3
Nanobit
committed on
Refactor duplicate code between Prompter and Pygmalion
8e46c0f
Nanobit
committed on
Lint pygmalion
01c8a33
Nanobit
committed on
Lint creative_acr
1645a4d
Nanobit
committed on
Lint alpaca_instruct
145b060
Nanobit
committed on
Lint alpaca_chat
8cc0aad
Nanobit
committed on
Fix lint
903ea30
Nanobit
committed on
apply black formatting
ce34d64
winglian
committed on
fix enum pass as value
fb100a9
winglian
committed on
Add qa style data for alpaca instructions, fix one_cycle scheduler
3a50377
winglian
committed on
fix new dataset prompt tokenizers
0f74464
winglian
committed on
add missing __init__
e0602a9
winglian
committed on
pygmalion dataset prompts format, cached tokenized datasets should be hashed on the tokenizer too
2809f3f
winglian
committed on
tokenization fixes
4ea9a66
winglian
committed on