---
language:
- en
library_name: flair
pipeline_tag: token-classification
base_model: FacebookAI/xlm-roberta-large
widget:
- text: According to the BBC George Washington went to Washington.
tags:
- flair
- token-classification
- sequence-tagger-model
- hetzner
- hetzner-gex44
- hetzner-gpu
---
|
|
|
# Flair NER Model trained on CleanCoNLL Dataset |
|
|
|
This (unofficial) Flair NER model was trained on the awesome [CleanCoNLL](https://aclanthology.org/2023.emnlp-main.533/) dataset. |
|
|
|
The CleanCoNLL dataset was proposed by Susanna Rücker and Alan Akbik and introduces a corrected version of the classic CoNLL-03 dataset, with updated and more consistent NER labels. |
|
|
|
[](https://arxiv.org/abs/2310.16225) |
|
|
|
## Fine-Tuning |
|
|
|
We use XLM-RoBERTa Large as the backbone language model and the following hyper-parameters for fine-tuning:
|
|
|
| Hyper-Parameter | Value   |
|:----------------|:--------|
| Batch Size      | `4`     |
| Learning Rate   | `5e-06` |
| Max. Epochs     | `10`    |
|
|
|
Additionally, the [FLERT](https://arxiv.org/abs/2011.06993) approach (adding document-level context features) is used for fine-tuning the model. [Training logs](training.log) and [TensorBoard logs](../../tensorboard) are also available for each model.
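
The following sketch shows how such a fine-tuning run could look with Flair's trainer API. The corpus directory, file names, and column map are placeholder assumptions (CleanCoNLL is not shipped with Flair and has to be reconstructed locally from the original CoNLL-03 data); the hyper-parameters are taken from the table above:

```python
from flair.datasets import ColumnCorpus
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# load CleanCoNLL from local CoNLL-style column files
# (directory, file names and column map are placeholder assumptions)
corpus = ColumnCorpus(
    "resources/cleanconll",
    column_format={0: "text", 1: "ner"},
    train_file="cleanconll.train",
    dev_file="cleanconll.dev",
    test_file="cleanconll.test",
)

label_dictionary = corpus.make_label_dictionary(label_type="ner")

# FLERT-style setup: fine-tunable transformer embeddings with document context
embeddings = TransformerWordEmbeddings(
    model="xlm-roberta-large",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
    use_context=True,
)

# plain linear tagging head (no CRF, no RNN), as in the FLERT fine-tuning recipe
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dictionary,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

trainer = ModelTrainer(tagger, corpus)

# hyper-parameters from the table above
trainer.fine_tune(
    "resources/taggers/cleanconll-flert",
    learning_rate=5e-06,
    mini_batch_size=4,
    max_epochs=10,
)
```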
|
|
|
## Results |
|
|
|
We report the micro F1-score on the development set (in brackets) and the test set for five runs with different seeds:
|
|
|
| [Seed 1][1]     | [Seed 2][2]     | [Seed 3][3]     | [Seed 4][4]     | [Seed 5][5]     | Avg.            |
|:----------------|:----------------|:----------------|:----------------|:----------------|:----------------|
| (97.34) / 97.00 | (97.26) / 96.90 | (97.66) / 97.02 | (97.42) / 96.96 | (97.46) / 96.99 | (97.43) / 96.97 |
|
|
|
Rücker and Akbik report an average of 96.98 over three different runs, so our results are very close to their reported performance!
|
|
|
[1]: https://huggingface.co/stefan-it/flair-clean-conll-1 |
|
[2]: https://huggingface.co/stefan-it/flair-clean-conll-2 |
|
[3]: https://huggingface.co/stefan-it/flair-clean-conll-3 |
|
[4]: https://huggingface.co/stefan-it/flair-clean-conll-4 |
|
[5]: https://huggingface.co/stefan-it/flair-clean-conll-5 |
|
|
|
# Flair Demo |
|
|
|
The following snippet shows how to use the CleanCoNLL NER models with Flair: |
|
|
|
```python
from flair.data import Sentence
from flair.models import SequenceTagger

# load the tagger from the Hugging Face Model Hub
tagger = SequenceTagger.load("stefan-it/flair-clean-conll-5")

# create an example sentence
sentence = Sentence("According to the BBC George Washington went to Washington.")

# predict NER tags
tagger.predict(sentence)

# print the sentence with predicted tags
print(sentence)

# print predicted NER spans
print('The following NER tags are found:')
for entity in sentence.get_spans('ner'):
    print(entity)
```
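
Each detected span prints with its label and confidence score. For the sentence above, you would expect output along the lines of `Span[3:4]: "BBC" → ORG`, `Span[4:6]: "George Washington" → PER` and `Span[8:9]: "Washington" → LOC` (exact scores may vary between the five seed models).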