---
language:
- el
license: cc-by-nc-2.0
tags:
- flair
- token-classification
- sequence-tagger-model
base_model:
- nlpaueb/bert-base-greek-uncased-v1
---

# Greek Named Entity Recognition Model Fine-Tuned on the elNER Dataset

This Greek named entity recognition (NER) model was fine-tuned by researchers at the [Institute for Language and Speech Processing/Athena RC](https://www.ilsp.gr). It was trained on the [elNER-18 dataset](https://dl.acm.org/doi/10.1145/3411408.3411437), with [nlpaueb/bert-base-greek-uncased-v1](https://huggingface.co/nlpaueb/bert-base-greek-uncased-v1) as the backbone language model.

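As a rough illustration of how a Flair sequence tagger such as this one is typically applied, the sketch below wraps loading and prediction in a small helper. The helper name `tag_text` and its `model_id` argument are hypothetical: this card does not state the Hub repository ID, so pass whichever ID or local model path applies.

```python
from typing import List, Tuple


def tag_text(text: str, model_id: str) -> List[Tuple[str, str, float]]:
    """Return (entity text, label, confidence) triples predicted for `text`."""
    # Imported lazily so that defining this helper does not require Flair.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    # `model_id` is a placeholder: a Hub repository ID or a local path to the
    # fine-tuned model file. The actual ID is not stated in this card.
    tagger = SequenceTagger.load(model_id)

    sentence = Sentence(text)
    tagger.predict(sentence)  # annotates the sentence in place

    # Read the spans under the label type the tagger was trained with.
    label_type = tagger.label_type
    return [
        (span.text, span.get_label(label_type).value, span.get_label(label_type).score)
        for span in sentence.get_spans(label_type)
    ]
```

A call would look like `tag_text("Η Εθνική Τράπεζα της Ελλάδος εδρεύει στην Αθήνα.", "<model-id>")`, where `<model-id>` stands in for the real repository ID or path.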
## Dataset

The [elNER-18 dataset](https://dl.acm.org/doi/10.1145/3411408.3411437) consists of 21K sentences, 623K tokens, and 94K annotated named entities across 18 NE classes.

The train partition contains the following number of annotated entities per class:

|Class|Count|
|:---|:---|
|ORG|10944|
|PERSON|8774|
|CARDINAL|7343|
|GPE|6781|
|DATE|6338|
|ORDINAL|1438|
|PERCENT|1437|
|LOC|1404|
|NORP|1396|
|MONEY|1012|
|TIME|1011|
|EVENT|962|
|PRODUCT|668|
|WORK_OF_ART|608|
|FAC|567|
|QUANTITY|565|
|LAW|235|
|LANGUAGE|55|

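To get a feel for how unevenly the classes are represented, the per-class counts above can be turned into shares of the train partition. The snippet is only a convenience sketch over the numbers in the table; the variable names are illustrative.

```python
# Train-partition entity counts, copied from the table above.
TRAIN_COUNTS = {
    "ORG": 10944, "PERSON": 8774, "CARDINAL": 7343, "GPE": 6781,
    "DATE": 6338, "ORDINAL": 1438, "PERCENT": 1437, "LOC": 1404,
    "NORP": 1396, "MONEY": 1012, "TIME": 1011, "EVENT": 962,
    "PRODUCT": 668, "WORK_OF_ART": 608, "FAC": 567, "QUANTITY": 565,
    "LAW": 235, "LANGUAGE": 55,
}

total = sum(TRAIN_COUNTS.values())
shares = {label: count / total for label, count in TRAIN_COUNTS.items()}

# Print classes from most to least frequent with their share of all entities.
for label, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label:<12}{TRAIN_COUNTS[label]:>7}{share:>8.1%}")
print(f"{'TOTAL':<12}{total:>7}")
```

The counts in the table add up to 51,538 annotated entities in the train partition.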
## Fine-Tuning

[Flair version 0.14](https://github.com/flairNLP/flair/releases/tag/v0.14.0) was used for fine-tuning.

<!-- A hyper-parameter search is to be performed. Right now we have results with the following parameters. -->
The model was trained with the following hyper-parameters:

* Batch Size: `8`
* Learning Rate: `5e-05`

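For orientation, here is a minimal sketch of what a Flair fine-tuning run with these two hyper-parameters might look like. Only the base model, batch size, and learning rate come from this card. Everything else is an assumption borrowed from the standard Flair fine-tuning recipe and may differ from the actual setup: the two-column CoNLL-style data layout, the absence of a CRF or RNN layer, the `first` subtoken pooling, and the hypothetical `data_folder` and `output_dir` arguments. The number of epochs is not reported, so Flair's default is left in place.

```python
# Values reported in this card.
LEARNING_RATE = 5e-5
MINI_BATCH_SIZE = 8
BASE_MODEL = "nlpaueb/bert-base-greek-uncased-v1"


def fine_tune(data_folder: str, output_dir: str) -> None:
    """Fine-tune a Flair sequence tagger on CoNLL-style column files."""
    # Imported lazily so that defining this function does not require Flair.
    from flair.datasets import ColumnCorpus
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    # Assumption: train/dev/test files with one token and its NER tag per line.
    corpus = ColumnCorpus(data_folder, {0: "text", 1: "ner"})
    label_dict = corpus.make_label_dictionary(label_type="ner", add_unk=False)

    # Assumption: last transformer layer, first-subtoken pooling, no context.
    embeddings = TransformerWordEmbeddings(
        model=BASE_MODEL,
        layers="-1",
        subtoken_pooling="first",
        fine_tune=True,
        use_context=False,
    )

    # Assumption: plain linear tagging head (no CRF, no RNN).
    tagger = SequenceTagger(
        hidden_size=256,
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_crf=False,
        use_rnn=False,
        reproject_embeddings=False,
    )

    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        output_dir,
        learning_rate=LEARNING_RATE,
        mini_batch_size=MINI_BATCH_SIZE,
    )
```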
## Results

- F-score (micro): 0.9173
- F-score (macro): 0.8778
- Accuracy: 0.8651

|Class|Precision|Recall|F1-score|Support|
|:---|:---|:---|:---|:---|
|ORG|0.8931|0.8847|0.8889|1388|
|PERSON|0.9516|0.9724|0.9619|1051|
|CARDINAL|0.9330|0.9627|0.9476|911|
|DATE|0.9403|0.9403|0.9403|838|
|GPE|0.9282|0.9552|0.9415|826|
|PERCENT|0.9807|0.9854|0.9831|206|
|LOC|0.8011|0.7921|0.7966|178|
|ORDINAL|0.9477|0.9477|0.9477|172|
|NORP|0.8690|0.8936|0.8811|141|
|TIME|0.8951|0.9343|0.9143|137|
|EVENT|0.6395|0.7231|0.6787|130|
|MONEY|0.9818|0.9730|0.9774|111|
|PRODUCT|0.7882|0.8072|0.7976|83|
|WORK_OF_ART|0.8313|0.8214|0.8263|84|
|FAC|0.6933|0.6753|0.6842|77|
|QUANTITY|0.8636|0.8769|0.8702|65|
|LAW|0.8214|0.8214|0.8214|28|
|LANGUAGE|1.0000|0.8889|0.9412|9|
| | | | | |
|micro avg|0.9112|0.9235|0.9173|6435|
|macro avg|0.8755|0.8809|0.8778|6435|
|weighted avg|0.9116|0.9235|0.9174|6435|

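The three summary rows differ only in how the per-class scores are combined. The sketch below recomputes the F1 averages from the per-class rows: macro is the unweighted mean of per-class F1, weighted is the support-weighted mean, and micro F1 is the harmonic mean of the micro-averaged precision and recall. The results match the reported values up to rounding. Variable names are illustrative.

```python
# (class, F1-score, support) copied from the per-class rows above.
PER_CLASS = [
    ("ORG", 0.8889, 1388), ("PERSON", 0.9619, 1051), ("CARDINAL", 0.9476, 911),
    ("DATE", 0.9403, 838), ("GPE", 0.9415, 826), ("PERCENT", 0.9831, 206),
    ("LOC", 0.7966, 178), ("ORDINAL", 0.9477, 172), ("NORP", 0.8811, 141),
    ("TIME", 0.9143, 137), ("EVENT", 0.6787, 130), ("MONEY", 0.9774, 111),
    ("PRODUCT", 0.7976, 83), ("WORK_OF_ART", 0.8263, 84), ("FAC", 0.6842, 77),
    ("QUANTITY", 0.8702, 65), ("LAW", 0.8214, 28), ("LANGUAGE", 0.9412, 9),
]

# Micro-averaged precision and recall from the "micro avg" row.
MICRO_PRECISION, MICRO_RECALL = 0.9112, 0.9235

total_support = sum(support for _, _, support in PER_CLASS)

# Macro: every class counts equally, regardless of how often it occurs.
macro_f1 = sum(f1 for _, f1, _ in PER_CLASS) / len(PER_CLASS)

# Weighted: each class contributes in proportion to its support.
weighted_f1 = sum(f1 * support for _, f1, support in PER_CLASS) / total_support

# Micro: harmonic mean of the pooled precision and recall.
micro_f1 = (
    2 * MICRO_PRECISION * MICRO_RECALL / (MICRO_PRECISION + MICRO_RECALL)
)

print(f"support={total_support}")
print(f"micro F1={micro_f1:.4f}  macro F1={macro_f1:.4f}  weighted F1={weighted_f1:.4f}")
```

Because macro averaging gives rare classes such as `EVENT` and `FAC` the same weight as frequent ones, their lower scores pull the macro F1 below the micro F1.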
## Files

The Flair [training log](training.log) has also been uploaded to the model hub.