---
tags:
- flair
- token-classification
- sequence-tagger-model
language: de
---
# Tagger for literary character mentions (DROC corpus)
This is the character recognizer model used in [LLpro](https://github.com/cophi-wue/LLpro). It detects character mentions in literary fiction, covering both (a) proper nouns ("Alice", "Effi") and (b) nominal phrases ("Gärtner", "Mutter", "Graf", "Idiot", "Schöne", ...). The model was trained on the [DROC dataset](https://gitlab2.informatik.uni-wuerzburg.de/kallimachos/DROC-Release) by fine-tuning the domain-adapted [lkonle/fiction-gbert-large](https://huggingface.co/lkonle/fiction-gbert-large). ([Training code](https://github.com/cophi-wue/LLpro/blob/main/contrib/train_character_recognizer.py))
F1-Score: **91.85** (on a held-out data split; micro average on B-PER and I-PER labels)
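To make the metric concrete, here is a minimal sketch of a micro-averaged F1 restricted to the B-PER and I-PER labels. It assumes token-level scoring, which the description above suggests but does not spell out; the function name `micro_f1` and the toy tag sequences are hypothetical and are not taken from the training code.

```python
from typing import Sequence


def micro_f1(gold: Sequence[str], pred: Sequence[str],
             labels: Sequence[str] = ("B-PER", "I-PER")) -> float:
    """Micro-averaged F1 over the given labels (assumption: token-level scoring)."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        for label in labels:
            if p == label and g == label:
                tp += 1          # label predicted and correct
            elif p == label:
                fp += 1          # label predicted, but gold differs
            elif g == label:
                fn += 1          # label in gold, but not predicted
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example (made-up tags, not model output)
gold = ["B-PER", "O", "B-PER", "I-PER", "O", "O", "O"]
pred = ["B-PER", "O", "O", "B-PER", "O", "O", "O"]
score = micro_f1(gold, pred)
print(f"{score:.2f}")
```

Micro averaging pools the true positives, false positives, and false negatives of both labels before computing precision and recall, so the more frequent label weighs more heavily in the result.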
---
**Demo Usage:**
```python
from flair.data import Sentence
from flair.models import SequenceTagger
# load tagger
tagger = SequenceTagger.load("aehrm/droc-character-recognizer")
# make example sentence
sentence = Sentence("Effi folgte Graf Instetten nach Kessin.")
# predict NER tags
tagger.predict(sentence)
# print sentence
print(sentence)
# >>> Sentence[7]: "Effi folgte Graf Instetten nach Kessin." → ["Effi"/PER, "Graf Instetten"/PER]
# print predicted NER spans
print('The following NER tags are found:')
# iterate over entities and print
for entity in sentence.get_spans('character'):
    print(entity)
# >>> Span[0:1]: "Effi" → PER (1.0)
# >>> Span[2:4]: "Graf Instetten" → PER (1.0)
```
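The spans printed in the demo correspond to runs of B-PER and I-PER tags. As an illustration only, the following sketch shows how such a tag sequence can be grouped into mention spans in plain Python. The helper `bio_to_spans` is hypothetical and is not how Flair decodes spans internally; the tags are written by hand to mirror the demo sentence.

```python
from typing import List, Sequence, Tuple


def bio_to_spans(tokens: Sequence[str],
                 tags: Sequence[str]) -> List[Tuple[int, int, str]]:
    """Group B-PER/I-PER tags into (start, end, text) spans, end exclusive."""
    spans = []
    start = None
    # A trailing "O" sentinel closes a span that runs to the end of the sentence.
    for i, tag in enumerate(list(tags) + ["O"]):
        if start is not None and tag != "I-PER":
            spans.append((start, i, " ".join(tokens[start:i])))
            start = None
        if tag == "B-PER":
            start = i
        elif tag == "I-PER" and start is None:
            start = i  # lenient choice: a stray I-PER opens a new span
    return spans


tokens = ["Effi", "folgte", "Graf", "Instetten", "nach", "Kessin", "."]
tags = ["B-PER", "O", "B-PER", "I-PER", "O", "O", "O"]  # hand-written, not model output
spans = bio_to_spans(tokens, tags)
print(spans)
```

The token offsets follow the same convention as the `Span[0:1]` and `Span[2:4]` lines in the demo output, with an exclusive end index.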
**Cite**:
Please cite the following paper when using this model.
```
@inproceedings{ehrmanntraut-et-al-llpro-2023,
address = {Ingolstadt, Germany},
title = {{LLpro}: A Literary Language Processing Pipeline for {German} Narrative Text},
booktitle = {Proceedings of the 19th Conference on Natural Language Processing ({KONVENS} 2023)},
publisher = {{KONVENS} 2023 Organizers},
author = {Ehrmanntraut, Anton and Konle, Leonard and Jannidis, Fotis},
year = {2023},
}
```