---
language:
- en
library_name: flair
pipeline_tag: token-classification
base_model: FacebookAI/xlm-roberta-large
widget:
  - text: According to the BBC George Washington went to Washington.
tags:
- flair
- token-classification
- sequence-tagger-model
- hetzner
- hetzner-gex44
- hetzner-gpu
---

# Flair NER Model trained on CleanCoNLL Dataset

This (unofficial) Flair NER model was trained on the awesome [CleanCoNLL](https://aclanthology.org/2023.emnlp-main.533/) dataset.

The CleanCoNLL dataset was proposed by Susanna Rücker and Alan Akbik and introduces a corrected version of the classic CoNLL-03 dataset, with updated and more consistent NER labels.

[CleanCoNLL paper on arXiv](https://arxiv.org/abs/2310.16225)

## Fine-Tuning

We use XLM-RoBERTa Large as the backbone language model and the following hyper-parameters for fine-tuning:

| Hyper-Parameter | Value  |
|:--------------- |:-------|
| Batch Size      | `4`    |
| Learning Rate   | `5e-06` |
| Max. Epochs     | `10`   |

Additionally, the [FLERT](https://arxiv.org/abs/2011.06993) approach is used for fine-tuning the model. [Training logs](training.log) and [TensorBoard](../../tensorboard) are also available for each model.
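The exact training script is not included in this repository; the following is a minimal sketch of the standard Flair FLERT recipe with the hyper-parameters from the table above. The `CLEANCONLL` corpus loader and the output path `resources/taggers/clean-conll` are assumptions; in FLERT, document-level context is enabled via `use_context=True` on the transformer embeddings.

```python
from flair.datasets import CLEANCONLL
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# load the CleanCoNLL corpus and build the NER label dictionary
corpus = CLEANCONLL()
label_dict = corpus.make_label_dictionary(label_type="ner")

# FLERT-style embeddings: fine-tuned transformer with document-level context
embeddings = TransformerWordEmbeddings(
    model="xlm-roberta-large",
    layers="-1",
    subtoken_pooling="first",
    fine_tune=True,
    use_context=True,  # the FLERT context trick
)

# linear tagger head (no CRF, no RNN), as is usual for FLERT fine-tuning
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=label_dict,
    tag_type="ner",
    use_crf=False,
    use_rnn=False,
    reproject_embeddings=False,
)

# fine-tune with the hyper-parameters from the table above
trainer = ModelTrainer(tagger, corpus)
trainer.fine_tune(
    "resources/taggers/clean-conll",  # hypothetical output path
    learning_rate=5e-6,
    mini_batch_size=4,
    max_epochs=10,
)
```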

## Results

We report micro F1-score on the development set (in parentheses) and the test set for five runs with different seeds:

| [Seed 1][1]     | [Seed 2][2]     | [Seed 3][3]     | [Seed 4][4]     | [Seed 5][5]     | Avg.            |
|:--------------- |:--------------- |:--------------- |:--------------- |:--------------- |:--------------- |
| (97.34) / 97.00 | (97.26) / 96.90 | (97.66) / 97.02 | (97.42) / 96.96 | (97.46) / 96.99 | (97.43) / 96.97 |

Rücker and Akbik report an average of 96.98 over three runs, so our results are very close to their reported performance!
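As a sanity check, the averages in the last column of the table can be reproduced with a few lines of Python (scores copied from the table above):

```python
# Dev / test micro F1-scores for the five seeds, copied from the table above
dev_scores = [97.34, 97.26, 97.66, 97.42, 97.46]
test_scores = [97.00, 96.90, 97.02, 96.96, 96.99]

avg_dev = round(sum(dev_scores) / len(dev_scores), 2)
avg_test = round(sum(test_scores) / len(test_scores), 2)

print(avg_dev, avg_test)  # 97.43 96.97
```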

[1]: https://huggingface.co/stefan-it/flair-clean-conll-1
[2]: https://huggingface.co/stefan-it/flair-clean-conll-2
[3]: https://huggingface.co/stefan-it/flair-clean-conll-3
[4]: https://huggingface.co/stefan-it/flair-clean-conll-4
[5]: https://huggingface.co/stefan-it/flair-clean-conll-5

# Flair Demo

The following snippet shows how to use the CleanCoNLL NER models with Flair:

```python
from flair.data import Sentence
from flair.models import SequenceTagger

# load tagger
tagger = SequenceTagger.load("stefan-it/flair-clean-conll-5")

# make example sentence
sentence = Sentence("According to the BBC George Washington went to Washington.")

# predict NER tags
tagger.predict(sentence)

# print sentence
print(sentence)

# print predicted NER spans
print('The following NER tags are found:')
# iterate over entities and print
for entity in sentence.get_spans('ner'):
    print(entity)
```