Commit History
27e62e8 update README
e938b17 more spacy
b1d540a add spacy trainer
e8868b3 update trainer script
d6c3250 add 30k vocab size tokenizer
57c829b update tokenizer script
4e1f514 update model
24469e0 add tokenization process
1380cef add balochi tokenizer
b3db2b8 process and clean files
470e696 Get all text paths
c9036ae add data folder to gitignore
d1671f2 add python gitignore
851cdf4 Initial commit
Committed by Alex Strick van Linschoten