language:
  - el
license: cc-by-nc-2.0
tags:
  - flair
  - token-classification
  - sequence-tagger-model
base_model:
  - nlpaueb/bert-base-greek-uncased-v1

Greek Named Entity Model Fine-Tuned on the elNER Dataset

This Greek NER model was fine-tuned by researchers at the Institute for Language and Speech Processing/Athena RC. It was trained on the elNER-18 dataset, with nlpaueb/bert-base-greek-uncased-v1 as the backbone language model.
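
A minimal usage sketch with Flair follows. The model card does not show how to load the model, so `MODEL_ID` is a hypothetical placeholder that has to be replaced with this repository's actual hub identifier, and `tag_entities` is an illustrative helper name, not part of Flair.

```python
# ASSUMPTION: placeholder only; replace with this repository's real hub identifier.
MODEL_ID = "your-namespace/your-model-name"


def tag_entities(text, model_id=MODEL_ID):
    """Return (entity text, label, confidence) triples for one piece of text."""
    # Imported lazily so the module can be inspected without Flair installed.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load(model_id)  # downloads the model on first use
    sentence = Sentence(text)
    tagger.predict(sentence)  # annotates the sentence in place

    triples = []
    for span in sentence.get_spans("ner"):
        label = span.get_label("ner")
        triples.append((span.text, label.value, label.score))
    return triples
```

Calling `tag_entities("Ο Γιώργος Σεφέρης γεννήθηκε στη Σμύρνη το 1900.")` should return one triple per detected entity, with labels drawn from the 18 classes listed below.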

Dataset

The elNER-18 dataset consists of 21K sentences, 623K tokens and 94K annotated named entities for 18 NE classes.

The train partition contains the following number of annotated entities for each of the 18 classes:

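| Class | Count |
|---|---:|
| ORG | 10944 |
| PERSON | 8774 |
| CARDINAL | 7343 |
| GPE | 6781 |
| DATE | 6338 |
| ORDINAL | 1438 |
| PERCENT | 1437 |
| LOC | 1404 |
| NORP | 1396 |
| MONEY | 1012 |
| TIME | 1011 |
| EVENT | 962 |
| PRODUCT | 668 |
| WORK_OF_ART | 608 |
| FAC | 567 |
| QUANTITY | 565 |
| LAW | 235 |
| LANGUAGE | 55 |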
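
The class counts above are heavily skewed, which helps when reading the per-class results. The sketch below only reuses the counts from this table to compute the train total and each class's share; the variable names are illustrative.

```python
# Train-partition entity counts, copied from the table above.
TRAIN_COUNTS = {
    "ORG": 10944, "PERSON": 8774, "CARDINAL": 7343, "GPE": 6781,
    "DATE": 6338, "ORDINAL": 1438, "PERCENT": 1437, "LOC": 1404,
    "NORP": 1396, "MONEY": 1012, "TIME": 1011, "EVENT": 962,
    "PRODUCT": 668, "WORK_OF_ART": 608, "FAC": 567, "QUANTITY": 565,
    "LAW": 235, "LANGUAGE": 55,
}

total = sum(TRAIN_COUNTS.values())  # 51538 annotated entities in train

# Print each class with its share of all train entities, largest first.
for label, count in sorted(TRAIN_COUNTS.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label:<12} {count:>6} {count / total:6.1%}")

most_frequent = max(TRAIN_COUNTS, key=TRAIN_COUNTS.get)
least_frequent = min(TRAIN_COUNTS, key=TRAIN_COUNTS.get)
imbalance_ratio = TRAIN_COUNTS[most_frequent] / TRAIN_COUNTS[least_frequent]
print(f"{most_frequent} is about {imbalance_ratio:.0f}x as frequent as {least_frequent}")
```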

Fine-Tuning

Flair version 0.14 was used for fine-tuning.

The model was trained with the following hyper-parameters:

  • Batch Size: 8
  • Learning Rate: 5e-05
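
As a rough sketch of how these two hyper-parameters map onto a Flair fine-tuning run: only the batch size, learning rate, and backbone come from this model card. The corpus layout, file names, tagger architecture (no CRF, no RNN, last transformer layer), and all remaining settings are assumptions borrowed from Flair's standard transformer fine-tuning recipe, not the authors' confirmed configuration.

```python
# Values reported in this model card.
HYPERPARAMS = {"learning_rate": 5e-05, "mini_batch_size": 8}
BACKBONE = "nlpaueb/bert-base-greek-uncased-v1"


def fine_tune(data_folder, output_dir):
    """Fine-tune a Flair sequence tagger on column-formatted NER data."""
    # Imported lazily so the constants above can be inspected without Flair installed.
    from flair.datasets import ColumnCorpus
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    # ASSUMPTION: CoNLL-style files with token and tag columns; file names are hypothetical.
    corpus = ColumnCorpus(
        data_folder,
        {0: "text", 1: "ner"},
        train_file="train.txt",
        dev_file="dev.txt",
        test_file="test.txt",
    )
    label_dict = corpus.make_label_dictionary(label_type="ner", add_unk=False)

    # ASSUMPTION: architecture choices follow Flair's usual fine-tuning setup.
    embeddings = TransformerWordEmbeddings(
        model=BACKBONE,
        layers="-1",
        subtoken_pooling="first",
        fine_tune=True,
    )
    tagger = SequenceTagger(
        hidden_size=256,
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_crf=False,
        use_rnn=False,
        reproject_embeddings=False,
    )

    # All other training settings are left at Flair's defaults.
    trainer = ModelTrainer(tagger, corpus)
    trainer.fine_tune(output_dir, **HYPERPARAMS)
```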

Results

  • F-score (micro) 0.9173
  • F-score (macro) 0.8778
  • Accuracy 0.8651

| Class | Precision | Recall | F1-score | Support |
|---|---:|---:|---:|---:|
| ORG | 0.8931 | 0.8847 | 0.8889 | 1388 |
| PERSON | 0.9516 | 0.9724 | 0.9619 | 1051 |
| CARDINAL | 0.9330 | 0.9627 | 0.9476 | 911 |
| DATE | 0.9403 | 0.9403 | 0.9403 | 838 |
| GPE | 0.9282 | 0.9552 | 0.9415 | 826 |
| PERCENT | 0.9807 | 0.9854 | 0.9831 | 206 |
| LOC | 0.8011 | 0.7921 | 0.7966 | 178 |
| ORDINAL | 0.9477 | 0.9477 | 0.9477 | 172 |
| NORP | 0.8690 | 0.8936 | 0.8811 | 141 |
| TIME | 0.8951 | 0.9343 | 0.9143 | 137 |
| EVENT | 0.6395 | 0.7231 | 0.6787 | 130 |
| MONEY | 0.9818 | 0.9730 | 0.9774 | 111 |
| PRODUCT | 0.7882 | 0.8072 | 0.7976 | 83 |
| WORK_OF_ART | 0.8313 | 0.8214 | 0.8263 | 84 |
| FAC | 0.6933 | 0.6753 | 0.6842 | 77 |
| QUANTITY | 0.8636 | 0.8769 | 0.8702 | 65 |
| LAW | 0.8214 | 0.8214 | 0.8214 | 28 |
| LANGUAGE | 1.0000 | 0.8889 | 0.9412 | 9 |
| micro avg | 0.9112 | 0.9235 | 0.9173 | 6435 |
| macro avg | 0.8755 | 0.8809 | 0.8778 | 6435 |
| weighted avg | 0.9116 | 0.9235 | 0.9174 | 6435 |
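
To make the relation between the per-class rows and the summary scores concrete, the sketch below recomputes the macro and micro F-scores from the reported figures. It assumes the usual definitions: macro F1 is the unweighted mean of the per-class F1 values, and micro F1 is the harmonic mean of micro precision and micro recall. Because the inputs are already rounded to four decimals, the results agree with the reported scores only up to rounding.

```python
# Per-class F1 scores, copied from the results table.
F1_BY_CLASS = {
    "ORG": 0.8889, "PERSON": 0.9619, "CARDINAL": 0.9476, "DATE": 0.9403,
    "GPE": 0.9415, "PERCENT": 0.9831, "LOC": 0.7966, "ORDINAL": 0.9477,
    "NORP": 0.8811, "TIME": 0.9143, "EVENT": 0.6787, "MONEY": 0.9774,
    "PRODUCT": 0.7976, "WORK_OF_ART": 0.8263, "FAC": 0.6842,
    "QUANTITY": 0.8702, "LAW": 0.8214, "LANGUAGE": 0.9412,
}

# Micro-averaged precision and recall, copied from the "micro avg" row.
MICRO_PRECISION = 0.9112
MICRO_RECALL = 0.9235

# Macro F1: every class counts equally, regardless of its support.
macro_f1 = sum(F1_BY_CLASS.values()) / len(F1_BY_CLASS)

# Micro F1: harmonic mean of the pooled precision and recall.
micro_f1 = 2 * MICRO_PRECISION * MICRO_RECALL / (MICRO_PRECISION + MICRO_RECALL)

print(f"macro F1 = {macro_f1:.4f}")
print(f"micro F1 = {micro_f1:.4f}")
```

The gap between the two scores reflects the weaker results on low-support classes such as EVENT and FAC, which pull the macro average down more than the micro average.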

Files

The Flair training log has also been uploaded to the model hub.