aehrm committed on
Commit
1d0e380
·
1 Parent(s): 1b5503e

Update README.md

Files changed (1)
  1. README.md +52 -0
README.md CHANGED
@@ -6,3 +6,55 @@ tags:
6
  language: de
7
  ---
8
  # Tagger for literary character mentions (DROC corpus)
9
+
10
+ This is the character recognizer model used in [LLpro](https://github.com/cophi-wue/LLpro). It detects character mentions in literary fiction: (a) proper nouns ("Alice", "Effi"), and (b) nominal phrases ("Gärtner", "Mutter", "Graf", "Idiot", "Schöne", ...). The model was trained on the [DROC dataset](https://gitlab2.informatik.uni-wuerzburg.de/kallimachos/DROC-Release) by fine-tuning the domain-adapted [lkonle/fiction-gbert-large](https://huggingface.co/lkonle/fiction-gbert-large). ([Training code](https://github.com/cophi-wue/LLpro/blob/main/contrib/train_character_recognizer.py))
11
+
12
+ F1 score: **91.85** (micro-averaged over the B-PER and I-PER labels, measured on a held-out data split)
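To make the reported metric concrete, here is a minimal sketch of a micro-averaged F1 over the `B-PER` and `I-PER` labels. It assumes a token-level reading of the metric, which the README does not spell out; the function name `micro_f1` and the toy label sequences are hypothetical and are not taken from the LLpro training code.

```python
from typing import Sequence


def micro_f1(gold: Sequence[str], pred: Sequence[str],
             labels: Sequence[str] = ("B-PER", "I-PER")) -> float:
    """Token-level micro-averaged F1 restricted to the given labels.

    Assumption: counts are pooled over all labels before computing
    precision and recall (the usual definition of micro averaging).
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        for label in labels:
            if p == label and g == label:
                tp += 1          # label predicted and correct
            elif p == label:
                fp += 1          # label predicted but gold differs
            elif g == label:
                fn += 1          # gold label missed by the prediction
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: one missed I-PER token and one spurious B-PER token.
gold = ["B-PER", "O", "B-PER", "I-PER", "O", "O", "O"]
pred = ["B-PER", "O", "B-PER", "O", "O", "B-PER", "O"]
score = micro_f1(gold, pred)
print(f"{score:.4f}")
```

With two true positives, one false positive, and one false negative, precision and recall are both 2/3, so the pooled F1 is 2/3 as well. The `O` label is deliberately excluded from the counts.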
13
+
14
+
15
+ ---
16
+
17
+ **Demo Usage:**
18
+
19
+ ```
+ from flair.data import Sentence
+ from flair.models import SequenceTagger
+
+ # load tagger
+ tagger = SequenceTagger.load("aehrm/droc-character-recognizer")
+
+ # make example sentence
+ sentence = Sentence("Effi folgte Graf Instetten nach Kessin.")
+
+ # predict NER tags
+ tagger.predict(sentence)
+
+ # print sentence
+ print(sentence)
+ # >>> Sentence[7]: "Effi folgte Graf Instetten nach Kessin." → ["Effi"/PER, "Graf Instetten"/PER]
+
+ # print predicted NER spans
+ print('The following NER tags are found:')
+ # iterate over entities and print
+ for entity in sentence.get_spans('character'):
+     print(entity)
+ # >>> Span[0:1]: "Effi" → PER (1.0)
+ # >>> Span[2:4]: "Graf Instetten" → PER (1.0)
+ ```
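The spans in the demo output group consecutive tokens into one mention. A minimal sketch of how such grouping could work is given below, assuming a BIO scheme with a single `PER` type (consistent with the `B-PER` and `I-PER` labels named above). The helper `bio_to_spans` and the hand-written tag sequence are hypothetical illustrations; this is not Flair's internal decoding code.

```python
from typing import List, Sequence, Tuple


def bio_to_spans(tokens: Sequence[str],
                 tags: Sequence[str]) -> List[Tuple[int, int, str]]:
    """Group BIO tags into (start, end, text) spans; end is exclusive.

    Assumption: only one entity type occurs, so the type suffix is ignored.
    A stray I- tag without a preceding B- tag opens a new span.
    """
    spans: List[Tuple[int, int, str]] = []
    start = None
    for i, tag in enumerate(tags):
        opens_span = tag.startswith("B-") or (tag.startswith("I-") and start is None)
        if opens_span:
            if start is not None:
                # close the span that was still open
                spans.append((start, i, " ".join(tokens[start:i])))
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i, " ".join(tokens[start:i])))
            start = None
    if start is not None:
        # close a span that runs to the end of the sentence
        spans.append((start, len(tokens), " ".join(tokens[start:])))
    return spans


# Tags written by hand to mirror the demo sentence; not actual model output.
tokens = ["Effi", "folgte", "Graf", "Instetten", "nach", "Kessin", "."]
tags = ["B-PER", "O", "B-PER", "I-PER", "O", "O", "O"]
spans = bio_to_spans(tokens, tags)
print(spans)
```

The token offsets follow the same half-open convention as `Span[0:1]` and `Span[2:4]` in the demo output.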
44
+
45
+ **Cite**:
46
+
47
+ Please cite the following paper when using this model.
48
+
49
+ ```
+ @inproceedings{ehrmanntraut_llpro_2023,
+   location = {Ingolstadt, Germany},
+   title = {{LLpro}: A Literary Language Processing Pipeline for {German} Narrative Text},
+   booktitle = {Proceedings of the 19th Conference on Natural Language Processing ({KONVENS} 2023)},
+   publisher = {{KONVENS} 2023 Organizers},
+   author = {Ehrmanntraut, Anton and Konle, Leonard and Jannidis, Fotis},
+   date = {2023},
+ }
+ ```