Model description

Dataset

Trained on fictional and non-fictional German texts written between 1840 and 1920:

Hardware used

1 Tesla P4 GPU

Hyperparameters

Parameter Value
Epochs 3
Gradient_accumulation_steps 1
Train_batch_size 32
Learning_rate 0.00003
Max_seq_len 128

Evaluation results: Automatic tagging of four forms of speech/thought/writing representation in historical fictional and non-fictional German texts

The language model was used in the task to tag direct, indirect, reported and free indirect speech/thought/writing representation in fictional and non-fictional German texts. The tagger is available and described in detail at https://github.com/redewiedergabe/tagger.

The tagging model was trained using the SequenceTagger Class of the Flair framework (Akbik et al., 2019) which implements a BiLSTM-CRF architecture on top of a language embedding (as proposed by Huang et al. (2015)).

Hyperparameters

Parameter Value
Hidden_size 256
Learning_rate 0.1
Mini_batch_size 8
Max_epochs 150

Results are reported below in comparison to a custom trained flair embedding, which was stacked onto a custom trained fastText-model. Both models were trained on the same dataset.

BERT FastText+Flair Test data
F1 Precision Recall F1 Precision Recall
Direct 0.80 0.86 0.74 0.84 0.90 0.79 historical German, fictional & non-fictional
Indirect 0.76 0.79 0.73 0.73 0.78 0.68 historical German, fictional & non-fictional
Reported 0.58 0.69 0.51 0.56 0.68 0.48 historical German, fictional & non-fictional
Free indirect 0.57 0.80 0.44 0.47 0.78 0.34 modern German, fictional

Intended use:

Historical German Texts (1840 to 1920)

(Showed good performance with modern German fictional texts as well)

Downloads last month
4
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.