pprokopidis committed
Commit • 3f44aca · 1 Parent(s): c3bb900
add README.md

README.md CHANGED
@@ -1,3 +1,89 @@
---
language:
- el
license: cc-by-nc-2.0
tags:
- flair
- token-classification
- sequence-tagger-model
base_model:
- nlpaueb/bert-base-greek-uncased-v1
---

# Greek Named Entity Model Fine-Tuned on the elNER Dataset

This Greek named entity recognition (NER) model was fine-tuned by researchers at the [Institute for Language and Speech Processing/Athena RC](https://www.ilsp.gr). It was fine-tuned on the [elNER-18 dataset](https://dl.acm.org/doi/10.1145/3411408.3411437), using [nlpaueb/bert-base-greek-uncased-v1](https://huggingface.co/nlpaueb/bert-base-greek-uncased-v1) as the backbone language model.

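A minimal usage sketch with Flair, assuming the model can be loaded through `SequenceTagger.load`. The helper name `tag_text`, the `MODEL_ID` placeholder, and the example sentence are illustrative assumptions, not part of the original card; the label type is read from the loaded tagger rather than hard-coded.

```python
def tag_text(model_id: str, text: str) -> list[tuple[str, str, float]]:
    """Return (entity text, label, confidence) triples predicted for `text`.

    `model_id` is a Hugging Face hub id or a local path to the Flair model;
    the caller must supply it (this sketch does not assume a repository name).
    """
    # Imported lazily so this helper can be defined without Flair installed.
    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load(model_id)
    sentence = Sentence(text)
    tagger.predict(sentence)
    # Use the label type stored in the model instead of assuming "ner".
    spans = sentence.get_spans(tagger.label_type)
    return [(span.text, span.tag, span.score) for span in spans]


# Example call (MODEL_ID is a placeholder for the hub id or local path):
# for text, label, score in tag_text(MODEL_ID, "Ο Γιώργος Σεφέρης γεννήθηκε στη Σμύρνη το 1900."):
#     print(text, label, f"{score:.2f}")
```

The helper reloads the model on every call to stay short; for repeated use, load the tagger once and reuse it.
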
## Dataset

The [elNER-18 dataset](https://dl.acm.org/doi/10.1145/3411408.3411437) consists of 21K sentences, 623K tokens, and 94K annotated named entities across 18 NE classes.

The following 18 named entity classes are annotated:

|Class|Count|
|:---|:---|
|ORG|10944|
|PERSON|8774|
|CARDINAL|7343|
|GPE|6781|
|DATE|6338|
|ORDINAL|1438|
|PERCENT|1437|
|LOC|1404|
|NORP|1396|
|MONEY|1012|
|TIME|1011|
|EVENT|962|
|PRODUCT|668|
|WORK_OF_ART|608|
|FAC|567|
|QUANTITY|565|
|LAW|235|
|LANGUAGE|55|

## Fine-Tuning

[Flair version 0.14](https://github.com/flairNLP/flair/releases/tag/v0.14.0) was used for fine-tuning.

<!-- A hyper-parameter search is to be performed. Right now we have results with the following parameters. -->
The model was trained with the following hyper-parameters:

* Batch Size: `8`
* Learning Rate: `5e-05`

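The settings above could be plugged into Flair's standard transformer fine-tuning recipe roughly as follows. This is a sketch under stated assumptions, not the authors' training script: only the base model, batch size, and learning rate come from this card. The function name `fine_tune_ner`, the two-column CoNLL-style data layout, the `"ner"` label type, the linear (no CRF, no RNN) tagging head, and the library-default number of epochs are all assumptions.

```python
BASE_MODEL = "nlpaueb/bert-base-greek-uncased-v1"  # from this model card
LEARNING_RATE = 5e-05                              # from this model card
MINI_BATCH_SIZE = 8                                # from this model card


def fine_tune_ner(data_folder: str, output_dir: str) -> dict:
    """Fine-tune a Flair sequence tagger on a column-formatted NER corpus.

    Assumes `data_folder` holds train/dev/test files with one token per line
    and the NER tag in the second column (a hypothetical layout).
    """
    # Imported lazily so the constants above can be read without Flair installed.
    from flair.datasets import ColumnCorpus
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    corpus = ColumnCorpus(data_folder, {0: "text", 1: "ner"})
    label_dict = corpus.make_label_dictionary(label_type="ner", add_unk=False)

    # Fine-tunable transformer embeddings built on the backbone model.
    embeddings = TransformerWordEmbeddings(
        model=BASE_MODEL,
        layers="-1",
        subtoken_pooling="first",
        fine_tune=True,
        use_context=False,  # assumption: no document-level context
    )

    # Assumption: a plain linear head on top of the transformer.
    tagger = SequenceTagger(
        hidden_size=256,
        embeddings=embeddings,
        tag_dictionary=label_dict,
        tag_type="ner",
        use_crf=False,
        use_rnn=False,
        reproject_embeddings=False,
    )

    trainer = ModelTrainer(tagger, corpus)
    # The number of epochs is left at the library default (not stated in the card).
    return trainer.fine_tune(
        output_dir,
        learning_rate=LEARNING_RATE,
        mini_batch_size=MINI_BATCH_SIZE,
    )
```
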
## Results

- F-score (micro): 0.9169
- F-score (macro): 0.8735
- Accuracy: 0.8634

|Class|precision|recall|f1-score|support|
|:---|:---|:---|:---|:---|
|ORG|0.8928|0.8761|0.8844|1388|
|PERSON|0.9578|0.9724|0.9651|1051|
|CARDINAL|0.9395|0.9550|0.9472|911|
|GPE|0.9292|0.9528|0.9408|826|
|DATE|0.9436|0.9391|0.9414|838|
|PERCENT|0.9903|0.9951|0.9927|206|
|LOC|0.8011|0.7921|0.7966|178|
|ORDINAL|0.9529|0.9419|0.9474|172|
|NORP|0.8944|0.9007|0.8975|141|
|TIME|0.9000|0.9197|0.9097|137|
|EVENT|0.6912|0.7231|0.7068|130|
|MONEY|0.9818|0.9730|0.9774|111|
|PRODUCT|0.7191|0.7711|0.7442|83|
|WORK_OF_ART|0.8272|0.7976|0.8121|84|
|FAC|0.6757|0.6494|0.6623|77|
|QUANTITY|0.8507|0.8769|0.8636|65|
|LAW|0.8400|0.7500|0.7925|28|
|LANGUAGE|1.0000|0.8889|0.9412|9|
| | | | | |
|micro avg|0.9150|0.9187|0.9169|6435|
|macro avg|0.8771|0.8708|0.8735|6435|
|weighted avg|0.9150|0.9187|0.9167|6435|

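To make the difference between the macro and weighted averages concrete, the sketch below recomputes both F-scores from the per-class rows of the table above. The function names are illustrative; the numbers are copied from the table, so the results can differ from the reported averages in the last decimal place because the inputs are already rounded.

```python
# (f1-score, support) per class, copied from the results table.
PER_CLASS = {
    "ORG": (0.8844, 1388),
    "PERSON": (0.9651, 1051),
    "CARDINAL": (0.9472, 911),
    "GPE": (0.9408, 826),
    "DATE": (0.9414, 838),
    "PERCENT": (0.9927, 206),
    "LOC": (0.7966, 178),
    "ORDINAL": (0.9474, 172),
    "NORP": (0.8975, 141),
    "TIME": (0.9097, 137),
    "EVENT": (0.7068, 130),
    "MONEY": (0.9774, 111),
    "PRODUCT": (0.7442, 83),
    "WORK_OF_ART": (0.8121, 84),
    "FAC": (0.6623, 77),
    "QUANTITY": (0.8636, 65),
    "LAW": (0.7925, 28),
    "LANGUAGE": (0.9412, 9),
}


def total_support(rows: dict[str, tuple[float, int]]) -> int:
    """Number of gold entities across all classes."""
    return sum(support for _, support in rows.values())


def macro_f1(rows: dict[str, tuple[float, int]]) -> float:
    """Unweighted mean of per-class F1: every class counts equally."""
    return sum(f1 for f1, _ in rows.values()) / len(rows)


def weighted_f1(rows: dict[str, tuple[float, int]]) -> float:
    """Mean of per-class F1 weighted by each class's support."""
    return sum(f1 * support for f1, support in rows.values()) / total_support(rows)


print(f"total support: {total_support(PER_CLASS)}")
print(f"macro F1:      {macro_f1(PER_CLASS):.4f}")
print(f"weighted F1:   {weighted_f1(PER_CLASS):.4f}")
```

The macro average sits below the weighted one because low-support classes with weaker scores, such as FAC and EVENT, count as much as ORG or PERSON in the unweighted mean.
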
## Files

The Flair [training log](training.log) has also been uploaded to the model hub.