
T-GBERT

This is a GBERT-base model with continued pretraining on roughly 33 million deduplicated German X/Twitter posts from 2020. The pretraining follows the task-adaptive pretraining setup suggested by Gururangan et al. (2020). In total, the model was trained for 10 epochs. I am sharing this model as it might be useful to some of you, and initial results suggest (some) improvements over GBERT-base (which is a common choice for supervised fine-tuning).
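
The model can be used as a drop-in replacement for GBERT-base in the usual transformers workflow. A minimal sketch (the Hub repository ID and the number of labels are assumptions; substitute the actual repository name of this model and your task setup):

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical repository ID; replace with the actual Hub name of this model.
MODEL_ID = "T-GBERT"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
# Load with a fresh classification head for supervised fine-tuning,
# e.g. three-class sentiment as in GermEval-2017 / SB10k.
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID, num_labels=3)

inputs = tokenizer("Das ist ein Beispiel-Tweet von @user https", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # torch.Size([1, 3])
```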

Performance

| Model      | GermEval-2017 (subtask B, synchronic test set) | SB10k  |
|------------|------------------------------------------------|--------|
| GBERT-base | 79.77%                                         | 82.29% |
| T-GBERT    | 81.50%                                         | 82.88% |

Results report accuracy (micro F1-score) on the test set of the respective dataset, averaged over five runs with different seeds for data shuffling and parameter initialization.
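
As a side note, micro-averaged F1 coincides with accuracy for single-label classification, so either metric reproduces the numbers above. A small illustration with made-up labels:

```python
from sklearn.metrics import accuracy_score, f1_score

# Illustrative gold labels and predictions; for single-label classification
# the micro-averaged F1-score is identical to accuracy.
y_true = [0, 1, 2, 1, 0, 2]
y_pred = [0, 1, 1, 1, 0, 2]

print(accuracy_score(y_true, y_pred))             # 0.8333...
print(f1_score(y_true, y_pred, average="micro"))  # 0.8333...
```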

Preprocessing

Web links in posts were replaced with 'https' and user mentions with '@user'.
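
The exact normalization code used for pretraining is not included here, but applying the same substitutions before fine-tuning or inference is straightforward. A minimal sketch with assumed (illustrative) regexes:

```python
import re

def preprocess(post: str) -> str:
    """Replace web links with 'https' and user mentions with '@user' (illustrative regexes)."""
    post = re.sub(r"https?://\S+", "https", post)  # replace web links
    post = re.sub(r"@\w+", "@user", post)          # replace user mentions
    return post

print(preprocess("Danke @maxmustermann! Mehr dazu: https://example.com/artikel"))
# -> "Danke @user! Mehr dazu: https"
```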
