aehrm/droc-character-recognizer

aehrm committed on Aug 16, 2023

Commit 1d0e380 · 1 Parent(s): 1b5503e

Update README.md

Files changed (1)

README.md +52 -0

@@ -6,3 +6,55 @@ tags:
 language: de
 ---
 # Tagger for literary character mentions (DROC corpus)
+This is the character recognizer model used in [LLpro](https://github.com/cophi-wue/LLpro). It detects character mentions in literary fiction: (a) proper nouns ("Alice", "Effi"), and (b) nominal phrases ("Gärtner", "Mutter", "Graf", "Idiot", "Schöne", ...). The model is trained on the [DROC dataset](https://gitlab2.informatik.uni-wuerzburg.de/kallimachos/DROC-Release) by fine-tuning the domain-adapted [lkonle/fiction-gbert-large](https://huggingface.co/lkonle/fiction-gbert-large). ([Training code](https://github.com/cophi-wue/LLpro/blob/main/contrib/train_character_recognizer.py))
+F1-Score: **91.85** (on a held-out data split; micro average on B-PER and I-PER labels)
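To make "micro average on B-PER and I-PER labels" concrete, here is a minimal sketch of one way such a score can be computed. It assumes per-token counting, which the README does not state; the linked training code may evaluate differently. The function name `micro_f1` and the toy tag sequences are hypothetical and serve only as illustration.

```python
def micro_f1(gold, pred, labels=("B-PER", "I-PER")):
    """Micro-averaged F1 over `labels`, counted per token (an assumption)."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        for label in labels:
            if p == label and g == label:
                tp += 1  # label predicted and correct
            elif p == label:
                fp += 1  # label predicted but gold differs
            elif g == label:
                fn += 1  # gold label missed by the prediction
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: one missed I-PER and one spurious B-PER.
gold = ["B-PER", "O", "B-PER", "I-PER", "O", "O", "O"]
pred = ["B-PER", "O", "B-PER", "O", "O", "B-PER", "O"]
score = micro_f1(gold, pred)
print(f"{100 * score:.2f}")
```

Pooling the counts of both labels before computing precision and recall is what makes this a micro average; a macro average would instead compute F1 per label and take the mean.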
+---
+**Demo Usage:**
+```python
+from flair.data import Sentence
+from flair.models import SequenceTagger
+# load tagger
+tagger = SequenceTagger.load("aehrm/droc-character-recognizer")
+# make example sentence
+sentence = Sentence("Effi folgte Graf Instetten nach Kessin.")
+# predict NER tags
+tagger.predict(sentence)
+# print sentence
+print(sentence)
+# >>> Sentence[7]: "Effi folgte Graf Instetten nach Kessin." → ["Effi"/PER, "Graf Instetten"/PER]
+# print predicted NER spans
+print('The following NER tags are found:')
+# iterate over entities and print
+for entity in sentence.get_spans('character'):
+    print(entity)
+# >>> Span[0:1]: "Effi" → PER (1.0)
+# >>> Span[2:4]: "Graf Instetten" → PER (1.0)
+```
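The spans printed in the demo correspond to runs of B-PER/I-PER token labels. Flair does this grouping itself, so the following is only an illustrative sketch of the idea in plain Python. The helper `bio_to_spans` is hypothetical, and the tag sequence is an assumed labelling chosen to be consistent with the demo output, not actual model output.

```python
def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into (start, end, text) spans; `end` is exclusive."""
    spans = []
    start = None
    for i, tag in enumerate(tags):
        # An open span ends at "O" or at the start of a new "B-PER".
        if start is not None and tag != "I-PER":
            spans.append((start, i, " ".join(tokens[start:i])))
            start = None
        # Open a span at "B-PER"; tolerate a stray "I-PER" without a preceding "B-PER".
        if tag == "B-PER" or (tag == "I-PER" and start is None):
            start = i
    # Close a span that runs to the end of the sentence.
    if start is not None:
        spans.append((start, len(tokens), " ".join(tokens[start:])))
    return spans


tokens = ["Effi", "folgte", "Graf", "Instetten", "nach", "Kessin", "."]
tags = ["B-PER", "O", "B-PER", "I-PER", "O", "O", "O"]  # assumed labels
spans = bio_to_spans(tokens, tags)
for start, end, text in spans:
    print(f"Span[{start}:{end}]: {text}")
```

The token indices follow the same half-open convention as the `Span[0:1]` and `Span[2:4]` lines in the demo output.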
+**Cite**:
+Please cite the following paper when using this model.
+```
+@inproceedings{ehrmanntraut_llpro_2023,
+	location = {Ingolstadt, Germany},
+	title = {{LLpro}: A Literary Language Processing Pipeline for {German} Narrative Text},
+	booktitle = {Proceedings of the 19th Conference on Natural Language Processing ({KONVENS} 2023)},
+	publisher = {{KONVENS} 2023 Organizers},
+	author = {Ehrmanntraut, Anton and Konle, Leonard and Jannidis, Fotis},
+	date = {2023},
+}
+```