This is the BERTuit model, as presented in the article [BERTuit: Understanding Spanish language in Twitter through a native transformer](https://arxiv.org/abs/2204.03465).

Before tokenization, replace user tags and URLs with `<usr>` and `<url>`, respectively.
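
The replacement step can be sketched in Python as below. The regular expressions and the `normalize_tweet` helper are assumptions made for illustration; this card does not state the exact rules used to detect user tags and URLs, so adjust the patterns if your data needs something different.

```python
import re

# Assumed patterns (not specified by the model card):
# a URL starts with http://, https:// or www. and runs to the next whitespace.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+")
# a user tag is "@" followed by word characters, not preceded by a word
# character (so e-mail addresses such as a@b.com are left untouched).
USER_PATTERN = re.compile(r"(?<!\w)@\w+")


def normalize_tweet(text: str) -> str:
    """Replace URLs with "<url>" and user tags with "<usr>"."""
    # URLs are handled first so an "@" inside a URL is not read as a user tag.
    text = URL_PATTERN.sub("<url>", text)
    text = USER_PATTERN.sub("<usr>", text)
    return text


if __name__ == "__main__":
    print(normalize_tweet("Hola @maria mira esto https://t.co/abc123"))
```

The normalized string is what should then be passed to the tokenizer.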

Tokenize the text with the base `RobertaTokenizer` class.