qwerrwe / src / axolotl / prompt_tokenizers.py

Commit History

use fastchat conversations template (#578)
e7d3e2d

winglian committed on

better handling and logging of empty sharegpt turns (#603)
a363604

winglian committed on

split completion text to sequence_len (#616)
97d3776

winglian committed on

improve handling for empty text on the tokenization step (#502)
1eebbd0

winglian committed on

support custom field for completion from yml (#580)
f7a2263

winglian committed on

improve llama pad token handling (#475)
cb9797e

winglian committed on

gracefully handle empty input (#442)
9d629d8

winglian committed on

better handling of empty input ids when tokenizing (#395)
85cf4f8

winglian committed on

better handling since xgen tokenizer breaks with convert_tokens_to_ids
2a428e8

winglian committed on

Fix typing list
77bdb7d

Nanobit committed on

initial wip to get sys prompt from dataset
8d20e0a

winglian committed on

bugfix for potential off by one
7925ddc

winglian committed on

Fix sharegpt prompt
25eeeeb

Nanobit committed on

Fix security issue or ignore false positives
a1f9850

Nanobit committed on

Apply isort then black
37293dc

Nanobit committed on

Fix mypy typing
e9650d3

Nanobit committed on

Fix unsupported operand type(s) for |
be22551

Nanobit committed on

Refactor duplicate code between Prompter and Pygmalion
8e46c0f

Nanobit committed on

Lint prompt_tokenizers
5d86137

Nanobit committed on

refactor conversation plucking in sharegpt
21c8e2d

winglian committed on

apply black formatting
ce34d64

winglian committed on

tokenization fixes
4ea9a66

winglian committed on

optionally be able to specify alpaca or chat style prompts
1d5ab84

winglian committed on

concise multiple choice and tldr summarize
1365073

winglian committed on

add alpaca multiple choice instruct dataset support
b46bc02

winglian committed on

fix prompters, especially the sharegpt prompter
5e37144

winglian committed on

black formatting
2bc1a5b

winglian committed on

Rename variable to use same convention
174b74d

Nanobit committed on

Add CompletionPrompt type
cf68153

Nanobit committed on

Jeopardy bot! (#17)
a12fb0a

winglian committed on

WIP large refactor to make finetune script a little more manageable (#3)
6045345

winglian committed on

add support for alpaca reflect training (#2)
81de0ef

winglian committed on

Tokenization open assistant (#1)
87d7825

winglian committed on

support for alpaca-like instruction datasets without inputs
e107643

winglian committed on

config chooser, update readme instructions, device config, llama flash attention, debug out the labels, fix config key checks, other bugfixes
f2a2029

winglian committed on

black formatting
a6028d3

winglian committed on

make it work with pythia in the cloud
8d959a7

winglian committed on

WIP for axolotl trainer
ce24f5e

winglian committed on