julien-c (HF staff) committed
Commit ac8e84b · 1 Parent(s): 3c5d41d

Migrate model card from transformers-repo

Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755
Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/redewiedergabe/bert-base-historical-german-rw-cased/README.md

Files changed (1)
  1. README.md +57 -0
README.md ADDED
@@ -0,0 +1,57 @@
+ ---
+ language: de
+ ---
+
+ # Model description
+ ## Dataset
+ Trained on fictional and non-fictional German texts written between 1840 and 1920:
+ * Narrative texts from Digitale Bibliothek (https://textgrid.de/digitale-bibliothek)
+ * Fairy tales and sagas from Grimm Korpus (https://www1.ids-mannheim.de/kl/projekte/korpora/archiv/gri.html)
+ * Newspaper and magazine articles from Mannheimer Korpus Historischer Zeitungen und Zeitschriften (https://repos.ids-mannheim.de/mkhz-beschreibung.html)
+ * Magazine articles from the journal „Die Grenzboten“ (http://www.deutschestextarchiv.de/doku/textquellen#grenzboten)
+ * Fictional and non-fictional texts from Projekt Gutenberg (https://www.projekt-gutenberg.org)
+
+ ## Hardware used
+ 1 Tesla P4 GPU
+
+ ## Hyperparameters
+
+ | Parameter | Value |
+ |-------------------------------|----------|
+ | Epochs | 3 |
+ | Gradient_accumulation_steps | 1 |
+ | Train_batch_size | 32 |
+ | Learning_rate | 0.00003 |
+ | Max_seq_len | 128 |
+
+ ## Evaluation results: Automatic tagging of four forms of speech/thought/writing representation in historical fictional and non-fictional German texts
+
+ The language model was used to tag direct, indirect, reported, and free indirect speech/thought/writing representation in fictional and non-fictional German texts. The tagger is available and described in detail at https://github.com/redewiedergabe/tagger.
+
+ The tagging model was trained using the `SequenceTagger` class of the Flair framework ([Akbik et al., 2019](https://www.aclweb.org/anthology/N19-4010)), which implements a BiLSTM-CRF architecture on top of a language embedding (as proposed by [Huang et al. (2015)](https://arxiv.org/abs/1508.01991)).
+
+ Hyperparameters
+
+ | Parameter | Value |
+ |-------------------------------|------------|
+ | Hidden_size | 256 |
+ | Learning_rate | 0.1 |
+ | Mini_batch_size | 8 |
+ | Max_epochs | 150 |
+
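The tagger setup described above can be sketched roughly as follows. This is only an illustration under several assumptions: it uses the current Flair API (`TransformerWordEmbeddings`, `make_label_dictionary(label_type=...)`), which may differ from the Flair version the authors used; the model id is inferred from the model card path; and the data folder, column layout, and label name `direct` are hypothetical. The actual training code lives in the tagger repository linked above.

```python
# Tagger hyperparameters copied from the table above.
TAGGER_PARAMS = {
    "hidden_size": 256,
    "learning_rate": 0.1,
    "mini_batch_size": 8,
    "max_epochs": 150,
}

# Assumption: model id inferred from the model card path.
MODEL_ID = "redewiedergabe/bert-base-historical-german-rw-cased"


def train_tagger(data_folder: str, output_dir: str, label_type: str = "direct"):
    """Train a BiLSTM-CRF sequence tagger on top of the BERT embeddings (sketch)."""
    # Imported lazily so this module can be loaded without Flair installed.
    from flair.datasets import ColumnCorpus
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    # Hypothetical column layout: token in column 0, label in column 1.
    corpus = ColumnCorpus(data_folder, {0: "text", 1: label_type})
    label_dict = corpus.make_label_dictionary(label_type=label_type)

    embeddings = TransformerWordEmbeddings(MODEL_ID)
    tagger = SequenceTagger(
        hidden_size=TAGGER_PARAMS["hidden_size"],
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type=label_type,
        use_crf=True,  # CRF layer on top of the BiLSTM
    )

    trainer = ModelTrainer(tagger, corpus)
    trainer.train(
        output_dir,
        learning_rate=TAGGER_PARAMS["learning_rate"],
        mini_batch_size=TAGGER_PARAMS["mini_batch_size"],
        max_epochs=TAGGER_PARAMS["max_epochs"],
    )
    return tagger
```

Calling `train_tagger("data/direct", "out/direct")` would expect Flair's default train/dev/test file naming inside the data folder; both paths here are placeholders.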
+ The results below compare the BERT model with a custom-trained Flair embedding stacked onto a custom-trained fastText model. Both models were trained on the same dataset.
+
+ | Type | BERT F1 | BERT Precision | BERT Recall | FastText+Flair F1 | FastText+Flair Precision | FastText+Flair Recall | Test data |
+ |----------------|----------|-----------|----------|------|-----------|--------|--------|
+ | Direct | 0.80 | 0.86 | 0.74 | 0.84 | 0.90 | 0.79 | historical German, fictional & non-fictional |
+ | Indirect | **0.76** | **0.79** | **0.73** | 0.73 | 0.78 | 0.68 | historical German, fictional & non-fictional |
+ | Reported | **0.58** | **0.69** | **0.51** | 0.56 | 0.68 | 0.48 | historical German, fictional & non-fictional |
+ | Free indirect | **0.57** | **0.80** | **0.44** | 0.47 | 0.78 | 0.34 | modern German, fictional |
+
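Since F1 is the harmonic mean of precision and recall, the F1 columns in the table above can be cross-checked against the other two columns. The following is a small sketch for that check; the helper names `f1_score` and `max_deviation` and the dictionary layout are illustrative and not part of the tagger code.

```python
# Rows copied from the results table: label -> (precision, recall, reported F1).
BERT = {
    "Direct": (0.86, 0.74, 0.80),
    "Indirect": (0.79, 0.73, 0.76),
    "Reported": (0.69, 0.51, 0.58),
    "Free indirect": (0.80, 0.44, 0.57),
}
FASTTEXT_FLAIR = {
    "Direct": (0.90, 0.79, 0.84),
    "Indirect": (0.78, 0.68, 0.73),
    "Reported": (0.68, 0.48, 0.56),
    "Free indirect": (0.78, 0.34, 0.47),
}


def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def max_deviation(rows: dict) -> float:
    """Largest absolute gap between recomputed and reported F1 in a set of rows."""
    return max(abs(f1_score(p, r) - f1) for p, r, f1 in rows.values())


if __name__ == "__main__":
    for name, rows in (("BERT", BERT), ("FastText+Flair", FASTTEXT_FLAIR)):
        for label, (p, r, reported) in rows.items():
            print(f"{name:15s} {label:14s} recomputed={f1_score(p, r):.3f} reported={reported:.2f}")
```

The recomputed values agree with the reported ones to within 0.01. Small gaps, such as roughly 0.59 recomputed against 0.58 reported for the BERT "Reported" row, are plausible when precision and recall were themselves rounded to two decimals.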
+ ## Intended use:
+ Historical German texts (1840 to 1920)
+
+ The model also showed good performance on modern German fictional texts.
+
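For the intended use described in the model card, loading the model with the `transformers` library might look like the sketch below. The model id is an assumption inferred from the model card path, and the truncation length simply mirrors the `Max_seq_len` value of 128 from the hyperparameter table; neither is confirmed by the card as a usage recommendation.

```python
# Assumption: model id inferred from the model card path.
MODEL_ID = "redewiedergabe/bert-base-historical-german-rw-cased"
# Mirrors Max_seq_len from the training hyperparameters.
MAX_SEQ_LEN = 128


def embed(text: str):
    """Return contextual token embeddings for one text (sketch)."""
    # Imported lazily so this module can be loaded without the libraries installed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModel.from_pretrained(MODEL_ID)
    model.eval()

    # Truncate to the sequence length used during training.
    inputs = tokenizer(
        text, return_tensors="pt", truncation=True, max_length=MAX_SEQ_LEN
    )
    with torch.no_grad():
        outputs = model(**inputs)
    # Shape: (1, number_of_tokens, hidden_size)
    return outputs.last_hidden_state
```

For example, `embed("Er sagte, er sei müde.")` would download the model on first use and return one embedding vector per word piece.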