bigscience-catalogue-data-dev/byte-level-bpe-tokenizer-no-norm-250k-whitespace-and-eos-regex-alpha-v3-dedup-lines-articles
2 contributors
History: 2 commits
TimeRobber: Add tokenizer (cec6759), over 2 years ago
File                     Scan  Size           Last commit    Updated
.gitattributes           Safe  1.23 kB        Add tokenizer  over 2 years ago
special_tokens_map.json  Safe  85 Bytes       Add tokenizer  over 2 years ago
tokenizer.json           Safe  14.5 MB (LFS)  Add tokenizer  over 2 years ago
tokenizer_config.json    Safe  131 Bytes      Add tokenizer  over 2 years ago