# T-GBERT
This is a GBERT-base model with continued pretraining on roughly 33 million deduplicated German X/Twitter posts from 2020. The pretraining follows the task-adaptive pretraining setup suggested by Gururangan et al. (2020); in total, the model was trained for 10 epochs. I am sharing this model as it might be useful to some of you, and initial results suggest (some) improvements compared to GBERT-base (which is a common choice for supervised fine-tuning).
## Performance
| | GermEval-2017 (subtask B, synchronic test set) | SB10k |
|---|---|---|
| GBERT-base | 79.77% | 82.29% |
| T-GBERT | 81.50% | 82.88% |
Each cell reports accuracy (micro F1-score) on the test set of the respective dataset, averaged over five runs with different seeds for data shuffling and parameter initialization.
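Accuracy and micro F1 coincide here because, in single-label classification, every misclassified example counts as exactly one false positive (for the predicted class) and one false negative (for the true class), so pooled precision and recall both equal accuracy. A minimal sketch illustrating this equivalence (labels and predictions are made up for the example):

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all classes before computing F1."""
    tp = sum(t == p for t, p in zip(y_true, y_pred))
    # In single-label classification, each wrong prediction is one FP
    # (for the predicted class) and one FN (for the true class).
    fp = sum(t != p for t, p in zip(y_true, y_pred))
    fn = fp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy sentiment labels, purely illustrative
y_true = ["pos", "neg", "neu", "pos", "neg"]
y_pred = ["pos", "neg", "pos", "pos", "neu"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(micro_f1(y_true, y_pred), accuracy)  # identical values
```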
## Preprocessing
Weblinks in posts were replaced by 'https' and user mentions were replaced by '@user'.
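To match this preprocessing at inference time, inputs should be normalized the same way. A minimal sketch using regular expressions; the exact patterns used during pretraining are not documented here, so these regexes are an assumption:

```python
import re

def preprocess(post: str) -> str:
    """Replace weblinks with 'https' and user mentions with '@user'.
    The patterns below are an assumption; the original preprocessing
    may have matched links and mentions slightly differently."""
    post = re.sub(r"https?://\S+", "https", post)  # weblinks -> 'https'
    post = re.sub(r"@\w+", "@user", post)          # mentions -> '@user'
    return post

print(preprocess("Danke @maxmueller! Mehr dazu: https://example.com/post"))
# -> Danke @user! Mehr dazu: https
```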