---
license: mit
language:
- de
metrics:
- accuracy
tags:
- twitter
---
# T-GBERT
This is a [GBERT-base](https://huggingface.co/deepset/gbert-base) with continued pretraining
on roughly 33 million deduplicated German X/Twitter posts from 2020. The pretraining follows the task-adaptive pretraining setup suggested by
[Gururangan et al. (2020)](https://aclanthology.org/2020.acl-main.740). In total, the model was trained for 10 epochs. I am sharing this model as
it might be useful to some of you, and initial results suggest (some) improvements compared to [GBERT-base](https://huggingface.co/deepset/gbert-base)
(which is a common choice for supervised fine-tuning).
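For reference, below is a minimal sketch of this kind of continued (task-adaptive) masked-language-model pretraining with 🤗 Transformers. Only the base model and the 10 training epochs come from this card; the corpus, batch size, and learning rate are placeholders.

```python
from datasets import Dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Placeholder in-domain corpus of preprocessed posts (see "Preprocessing" below).
tweets = ["Tolles Spiel heute, @user! https", "Danke @user für den Hinweis."]
dataset = Dataset.from_dict({"text": tweets})

tokenizer = AutoTokenizer.from_pretrained("deepset/gbert-base")
model = AutoModelForMaskedLM.from_pretrained("deepset/gbert-base")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Standard masked-language-modeling objective with dynamic masking.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="t-gbert",
    num_train_epochs=10,             # the model card reports 10 epochs
    per_device_train_batch_size=32,  # assumption; not stated in the card
    learning_rate=5e-5,              # assumption; not stated in the card
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```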
## Performance
| | [GermEval-2017](https://sites.google.com/view/germeval2017-absa/home) (subtask B, synchronic test set) | [SB10k](https://aclanthology.org/W17-1106/) |
|:----------:|:-------------:|:-----:|
| GBERT-base | 79.77% | 82.29% |
| T-GBERT | 81.50% | 82.88% |
*Results report the accuracy (micro F1-score) on the test set of the respective dataset, averaged over five runs
with different seeds for data shuffling and parameter initialization.*
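A sketch of how such a five-seed evaluation can be reproduced with 🤗 Transformers is given below. The model id, dataset objects, label count, and hyperparameters are assumptions, not the exact settings behind the numbers above.

```python
import numpy as np
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
    set_seed,
)

MODEL_ID = "T-GBERT"  # placeholder: replace with the hub id of this model

def run_once(seed: int, train_ds, test_ds, num_labels: int) -> float:
    """One fine-tuning run; train_ds/test_ds are assumed to have 'text' and 'label' columns."""
    set_seed(seed)  # controls data shuffling and classifier-head initialization
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID, num_labels=num_labels)

    def tok(batch):
        return tokenizer(batch["text"], truncation=True, max_length=128)

    train, test = train_ds.map(tok, batched=True), test_ds.map(tok, batched=True)
    args = TrainingArguments(output_dir=f"run-{seed}", num_train_epochs=3, seed=seed)  # epochs assumed
    trainer = Trainer(model=model, args=args, train_dataset=train)
    trainer.train()

    preds = trainer.predict(test)  # accuracy equals micro F1 for single-label classification
    return float((np.argmax(preds.predictions, axis=-1) == preds.label_ids).mean())

# Average over five seeds, as reported in the table:
# print(np.mean([run_once(s, train_ds, test_ds, num_labels=3) for s in range(5)]))
```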
## Preprocessing
URLs in posts were replaced with 'https', and user mentions were replaced with '@user'.
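A minimal sketch of this replacement is shown below; the exact regular expressions used for T-GBERT are not published, so the patterns here are assumptions.

```python
import re

def preprocess(post: str) -> str:
    """Replace URLs with 'https' and user mentions with '@user' (assumed patterns)."""
    post = re.sub(r"https?://\S+", "https", post)  # URLs -> 'https'
    post = re.sub(r"@\w+", "@user", post)          # mentions -> '@user'
    return post

print(preprocess("Danke @maxmuster! Mehr dazu: https://t.co/abc123"))
# -> "Danke @user! Mehr dazu: https"
```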