---
language:
- el
license: cc-by-nc-2.0
tags:
- flair
- token-classification
- sequence-tagger-model
base_model:
- nlpaueb/bert-base-greek-uncased-v1
---
# Greek Named Entity Recognition Model Fine-Tuned on the elNER Dataset
This Greek NER model was fine-tuned by researchers at the [Institute for Language and Speech Processing/Athena RC](https://www.ilsp.gr) on the [elNER-18 dataset](https://dl.acm.org/doi/10.1145/3411408.3411437), using [nlpaueb/bert-base-greek-uncased-v1](https://huggingface.co/nlpaueb/bert-base-greek-uncased-v1) as the backbone language model.
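A minimal usage sketch with Flair follows. It assumes Flair is installed (`pip install flair`); the helper name `tag_entities` and the `model_id` argument are illustrative only, and `model_id` has to be replaced with this repository's ID on the Hugging Face Hub or with a local path to the downloaded model file.

```python
from typing import List, Tuple


def tag_entities(model_id: str, text: str) -> List[Tuple[str, str, float]]:
    """Return (entity text, predicted class, confidence) for each entity found in `text`.

    `model_id` is a placeholder supplied by the caller: a Hub repository ID
    or a local path to the Flair model file. It is not defined in this card.
    """
    # Flair is imported inside the function so that this module can be
    # imported even where Flair is not installed.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load(model_id)
    sentence = Sentence(text)
    tagger.predict(sentence)

    # Read the label type from the tagger instead of hard-coding it.
    spans = sentence.get_spans(tagger.label_type)
    return [(span.text, span.tag, span.score) for span in spans]
```

Calling `tag_entities(model_id, "Η Αθήνα είναι η πρωτεύουσα της Ελλάδας.")` returns one tuple per detected entity, each labelled with one of the 18 classes listed in the Dataset section.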
## Dataset
The [elNER-18 dataset](https://dl.acm.org/doi/10.1145/3411408.3411437) consists of 21K sentences, 623K tokens and 94K annotated named entities across 18 entity classes.
The train partition contains the following number of annotated entities per class:
|Class|Count|
|:---|:---|
|ORG|10944|
|PERSON|8774|
|CARDINAL|7343|
|GPE|6781|
|DATE|6338|
|ORDINAL|1438|
|PERCENT|1437|
|LOC|1404|
|NORP|1396|
|MONEY|1012|
|TIME|1011|
|EVENT|962|
|PRODUCT|668|
|WORK_OF_ART|608|
|FAC|567|
|QUANTITY|565|
|LAW|235|
|LANGUAGE|55|
## Fine-Tuning
[Flair version 0.14](https://github.com/flairNLP/flair/releases/tag/v0.14.0) was used for fine-tuning.
<!-- A hyper-parameter search is to be performed. Right now we have results with the following parameters. -->
The model was trained with the following hyper-parameters:
* Batch size: `8`
* Learning rate: `5e-05`
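The sketch below shows how these two hyper-parameters could be passed to Flair's `ModelTrainer.fine_tune`. Only the batch size, the learning rate and the backbone model come from this card. Everything else is an assumption made for illustration: the CoNLL-style column layout, the `data_dir` and `output_dir` paths, the number of epochs, and the tagger configuration (a linear layer on top of the transformer, without CRF or RNN). The actual training setup is recorded in the training log.

```python
BASE_MODEL = "nlpaueb/bert-base-greek-uncased-v1"  # backbone named in this card
BATCH_SIZE = 8         # from this card
LEARNING_RATE = 5e-05  # from this card


def fine_tune_ner(data_dir: str, output_dir: str, max_epochs: int = 10):
    """Fine-tune a Flair sequence tagger on a column-formatted NER corpus.

    `data_dir`, `output_dir` and `max_epochs` are hypothetical; the card
    does not state them.
    """
    # Flair is imported inside the function so that this module can be
    # imported even where Flair is not installed.
    from flair.datasets import ColumnCorpus
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    # Assumption: one token per line, token in column 0 and NER tag in column 1,
    # with train/dev/test files inside `data_dir`.
    corpus = ColumnCorpus(data_dir, column_format={0: "text", 1: "ner"})
    label_dict = corpus.make_label_dictionary(label_type="ner")

    embeddings = TransformerWordEmbeddings(
        model=BASE_MODEL,
        layers="-1",
        subtoken_pooling="first",
        fine_tune=True,
    )

    # Assumption: plain linear classification head (no CRF, no RNN).
    tagger = SequenceTagger(
        hidden_size=256,
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_crf=False,
        use_rnn=False,
        reproject_embeddings=False,
    )

    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(
        output_dir,
        learning_rate=LEARNING_RATE,
        mini_batch_size=BATCH_SIZE,
        max_epochs=max_epochs,
    )
    return tagger
```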
## Results
- F-score (micro): 0.9173
- F-score (macro): 0.8778
- Accuracy: 0.8651
|Class|precision|recall|f1-score|support|
|:---|:---|:---|:---|:---|
|ORG|0.8931|0.8847|0.8889|1388|
|PERSON|0.9516|0.9724|0.9619|1051|
|CARDINAL|0.9330|0.9627|0.9476|911|
|DATE|0.9403|0.9403|0.9403|838|
|GPE|0.9282|0.9552|0.9415|826|
|PERCENT|0.9807|0.9854|0.9831|206|
|LOC|0.8011|0.7921|0.7966|178|
|ORDINAL|0.9477|0.9477|0.9477|172|
|NORP|0.8690|0.8936|0.8811|141|
|TIME|0.8951|0.9343|0.9143|137|
|EVENT|0.6395|0.7231|0.6787|130|
|MONEY|0.9818|0.9730|0.9774|111|
|PRODUCT|0.7882|0.8072|0.7976|83|
|WORK_OF_ART|0.8313|0.8214|0.8263|84|
|FAC|0.6933|0.6753|0.6842|77|
|QUANTITY|0.8636|0.8769|0.8702|65|
|LAW|0.8214|0.8214|0.8214|28|
|LANGUAGE|1.0000|0.8889|0.9412|9|
| ||||
|micro avg|0.9112|0.9235|0.9173|6435|
|macro avg|0.8755|0.8809|0.8778|6435|
|weighted avg|0.9116|0.9235|0.9174|6435|
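As a sanity check on how the averaged rows relate to the per-class rows, the following sketch recomputes the aggregate F-scores from the table. It assumes the usual classification-report conventions: macro F1 is the unweighted mean of the per-class F1 values, weighted F1 is their support-weighted mean, and micro F1 is the harmonic mean of micro precision and micro recall. All numbers are copied from the table, so the results agree only up to rounding.

```python
# Per-class (f1-score, support), copied from the results table.
PER_CLASS = {
    "ORG": (0.8889, 1388), "PERSON": (0.9619, 1051), "CARDINAL": (0.9476, 911),
    "DATE": (0.9403, 838), "GPE": (0.9415, 826), "PERCENT": (0.9831, 206),
    "LOC": (0.7966, 178), "ORDINAL": (0.9477, 172), "NORP": (0.8811, 141),
    "TIME": (0.9143, 137), "EVENT": (0.6787, 130), "MONEY": (0.9774, 111),
    "PRODUCT": (0.7976, 83), "WORK_OF_ART": (0.8263, 84), "FAC": (0.6842, 77),
    "QUANTITY": (0.8702, 65), "LAW": (0.8214, 28), "LANGUAGE": (0.9412, 9),
}
MICRO_PRECISION = 0.9112  # from the "micro avg" row
MICRO_RECALL = 0.9235     # from the "micro avg" row


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


total_support = sum(support for _, support in PER_CLASS.values())

# Macro: every class counts equally, regardless of how frequent it is.
macro_f1 = sum(score for score, _ in PER_CLASS.values()) / len(PER_CLASS)

# Weighted: each class contributes in proportion to its support.
weighted_f1 = sum(score * support for score, support in PER_CLASS.values()) / total_support

# Micro: computed from the pooled precision and recall.
micro_f1 = f1(MICRO_PRECISION, MICRO_RECALL)

print(f"support={total_support} micro={micro_f1:.4f} "
      f"macro={macro_f1:.4f} weighted={weighted_f1:.4f}")
```

The gap between the micro and macro scores reflects class imbalance: the low-scoring classes such as EVENT and FAC have little support, so they pull the macro average down more than the micro average.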
## Files
The Flair [training log](training.log) has also been uploaded to the model hub.