	---
	tags:
	- flair
	- token-classification
	- sequence-tagger-model
	language: de
	---
	# Tagger for literary character mentions (DROC corpus)

	This is the character recognizer model used in [LLpro](https://github.com/cophi-wue/LLpro). It detects character mentions in German literary fiction: (a) proper nouns ("Alice", "Effi"), and (b) nominal phrases ("Gärtner", "Mutter", "Graf", "Idiot", "Schöne", ...). The model was trained on the [DROC dataset](https://gitlab2.informatik.uni-wuerzburg.de/kallimachos/DROC-Release) by fine-tuning the domain-adapted [lkonle/fiction-gbert-large](https://huggingface.co/lkonle/fiction-gbert-large). ([Training code](https://github.com/cophi-wue/LLpro/blob/main/contrib/train_character_recognizer.py))

	F1-Score: 91.85 (on a held-out data split; micro average on B-PER and I-PER labels)
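
As a rough illustration of what a micro average over the B-PER and I-PER labels means, the sketch below pools true positives, false positives, and false negatives across both labels before computing precision and recall. It assumes token-level scoring on toy data; the helper `micro_f1` and the example tags are hypothetical and do not reproduce the evaluation in the training code or the score reported above.

```python
def micro_f1(gold, pred, labels=("B-PER", "I-PER")):
    """Micro-averaged F1: pool TP/FP/FN over all given labels, then score."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        for label in labels:
            if p == label and g == label:
                tp += 1  # label predicted and correct
            elif p == label and g != label:
                fp += 1  # label predicted but gold differs
            elif g == label and p != label:
                fn += 1  # gold label missed by the prediction
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy tags (hypothetical) for: Effi folgte Graf Instetten nach Kessin .
gold = ["B-PER", "O", "B-PER", "I-PER", "O", "O", "O"]
pred = ["B-PER", "O", "B-PER", "O", "O", "B-PER", "O"]

score = micro_f1(gold, pred)
print(f"micro F1 = {score:.4f}")
```

Here the prediction misses one I-PER token and adds one spurious B-PER token, so precision and recall are both 2/3.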


	---

	Demo Usage:

	```python
	from flair.data import Sentence
	from flair.models import SequenceTagger

	# load tagger
	tagger = SequenceTagger.load("aehrm/droc-character-recognizer")

	# make example sentence
	sentence = Sentence("Effi folgte Graf Instetten nach Kessin.")

	# predict NER tags
	tagger.predict(sentence)

	# print sentence
	print(sentence)
	# >>> Sentence[7]: "Effi folgte Graf Instetten nach Kessin." → ["Effi"/PER, "Graf Instetten"/PER]

	# print predicted NER spans
	print('The following NER tags are found:')
	# iterate over entities and print
	for entity in sentence.get_spans('character'):
	    print(entity)
	# >>> Span[0:1]: "Effi" → PER (1.0)
	# >>> Span[2:4]: "Graf Instetten" → PER (1.0)
	```
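
The spans printed in the demo correspond to token-level B-PER and I-PER tags. The following minimal sketch shows how such tags can be grouped into mention spans with token offsets. It is plain Python for illustration only, not Flair's internal decoding; the helper `bio_to_spans` and the hand-written tag sequence are assumptions.

```python
def bio_to_spans(tokens, tags):
    """Group B-PER/I-PER tags into (start, end, text) spans, end exclusive."""
    spans = []
    start = None  # index of the first token of the currently open span
    for i, tag in enumerate(tags):
        # A stray I-PER with no open span is treated as the start of a span.
        begins = tag == "B-PER" or (tag == "I-PER" and start is None)
        # Close the open span when it ends (O) or a new one begins (B-PER).
        if start is not None and tag in ("O", "B-PER"):
            spans.append((start, i, " ".join(tokens[start:i])))
            start = None
        if begins:
            start = i
    if start is not None:  # span running up to the end of the sentence
        spans.append((start, len(tokens), " ".join(tokens[start:])))
    return spans


tokens = ["Effi", "folgte", "Graf", "Instetten", "nach", "Kessin", "."]
tags = ["B-PER", "O", "B-PER", "I-PER", "O", "O", "O"]  # hand-written example

spans = bio_to_spans(tokens, tags)
for start, end, text in spans:
    print(f"[{start}:{end}] {text}")
```

The offsets follow the same half-open convention as the `Span[0:1]` and `Span[2:4]` output in the demo.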

	Cite:

	Please cite the following paper when using this model.

	```bibtex
	@inproceedings{ehrmanntraut-et-al-llpro-2023,
	  address = {Ingolstadt, Germany},
	  title = {{LLpro}: A Literary Language Processing Pipeline for {German} Narrative Text},
	  booktitle = {Proceedings of the 19th Conference on Natural Language Processing ({KONVENS} 2023)},
	  publisher = {{KONVENS} 2023 Organizers},
	  author = {Ehrmanntraut, Anton and Konle, Leonard and Jannidis, Fotis},
	  year = {2023}
	}
	```