qwerrwe / src / axolotl / utils / tokenization.py

Commit History

prepared dataset caching, other misc fixes (#665)
e50a64e

winglian committed on

use fastchat conversations template (#578)
e7d3e2d

winglian committed on

minor tweaks to simplify (#597)
31b9e0c

winglian committed on

Debug tokenization output: Add ability to output text only (no tokens), and/or specify num samples to see (#511)
48434be

Tom Jobbins committed on

add tests and support for loader for sys prompt data
3a38271

winglian committed on

Apply isort then black
37293dc

Nanobit committed on

Lint tokenization
e6b57de

Nanobit committed on

black formatting
2bc1a5b

winglian committed on

fix sharegpt tokenization, refactor tokenization debugging
5159d00

winglian committed on