BERTuit-base / README.md
The BERTuit model, as presented in the article [BERTuit: Understanding Spanish language in Twitter through a native transformer](https://arxiv.org/abs/2204.03465).
Before tokenization, replace user tags and URLs with "<usr>" and "<url>", respectively.
Tokenize text with the base RobertaTokenizer class (the RoBERTa tokenizer from the transformers library).
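
The two steps above can be sketched as follows. This is a minimal illustration and not the authors' preprocessing code: the regular expressions for user tags and URLs, and the helper names `preprocess` and `load_tokenizer`, are assumptions introduced here. The model identifier is left as a parameter because this README does not state it.

```python
import re

# Assumed patterns: the README does not specify how user tags and URLs
# were matched, so these are reasonable guesses, not the original rules.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
USER_PATTERN = re.compile(r"(?<!\w)@\w+")  # lookbehind skips e-mail addresses


def preprocess(text: str) -> str:
    """Replace URLs with "<url>" and user tags with "<usr>"."""
    # URLs are replaced first so that an "@" inside a URL is not
    # mistaken for a user tag.
    text = URL_PATTERN.sub("<url>", text)
    return USER_PATTERN.sub("<usr>", text)


def load_tokenizer(model_id: str):
    """Load the tokenizer with the base RobertaTokenizer class.

    `model_id` is the Hub identifier or local path of the BERTuit
    checkpoint; it is not given in this README, so supply your own.
    """
    # Imported lazily so that `preprocess` works without transformers.
    from transformers import RobertaTokenizer

    return RobertaTokenizer.from_pretrained(model_id)


if __name__ == "__main__":
    print(preprocess("@juan mira esto https://t.co/abc123 :)"))
```

With a tokenizer loaded through `load_tokenizer`, the cleaned string can then be encoded in the usual way, for example `tokenizer(preprocess(tweet), return_tensors="pt")`.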