## Training and evaluation data

- **Training Dataset**: CoNLL-2003

- **Training Evaluation Metrics**:

                    precision    recall  f1-score   support

             B-PER       0.98      0.98      0.98     11273
             I-PER       0.98      0.99      0.99      9323
               ...
      weighted avg       0.90      0.90      0.89     53190

- **Validation Evaluation Metrics**:

                    precision    recall  f1-score   support

             B-PER       0.97      0.98      0.97      3018
             I-PER       0.98      0.98      0.98      2741
               ...
      weighted avg       0.90      0.89      0.88     13235

- **Test Evaluation Metrics**:

                    precision    recall  f1-score   support

             B-PER       0.96      0.95      0.96      2714
             I-PER       0.98      0.99      0.98      2487
               ...
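The reports above follow the layout of scikit-learn's `classification_report`. As a minimal, hedged sketch of how such a per-tag report can be produced (with toy tag sequences, not this model's actual evaluation pipeline):

```python
from sklearn.metrics import classification_report

# Toy, illustrative gold and predicted tags (one tag per token); a real
# evaluation would use the model's predictions over the CoNLL-2003 splits.
y_true = ["B-PER", "I-PER", "O", "B-LOC", "O", "B-ORG"]
y_pred = ["B-PER", "I-PER", "O", "B-LOC", "O", "O"]

print(classification_report(y_true, y_pred, zero_division=0))
```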
The model and tokenizer can be loaded with the Transformers auto classes:

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("huseyincenik/conll_ner_with_bert")
model = AutoModelForTokenClassification.from_pretrained("huseyincenik/conll_ner_with_bert")
```
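As a hedged usage sketch (the sample sentence is an arbitrary example, not from the model card), the checkpoint can also be driven through the `pipeline` API:

```python
from transformers import pipeline

# "simple" aggregation merges B-/I- word pieces into whole entity spans.
ner = pipeline(
    "token-classification",
    model="huseyincenik/conll_ner_with_bert",
    aggregation_strategy="simple",
)

print(ner("George Washington lived in Mount Vernon, Virginia."))
```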

Abbreviation|Description
-|-
O|Outside of a named entity
B-MISC |Beginning of a miscellaneous entity right after another miscellaneous entity
I-MISC |Miscellaneous entity
B-PER |Beginning of a person's name right after another person's name
I-PER |Person's name
B-ORG |Beginning of an organization right after another organization
I-ORG |Organization
B-LOC |Beginning of a location right after another location
I-LOC |Location
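To see which of these abbreviations the model assigns to each token, a minimal sketch (the input sentence is an assumed example) that reads the mapping from `model.config.id2label`:

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("huseyincenik/conll_ner_with_bert")
model = AutoModelForTokenClassification.from_pretrained("huseyincenik/conll_ner_with_bert")

inputs = tokenizer("Angela Merkel visited Paris.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# model.config.id2label maps each class index to one of the tags above.
for token, pred_id in zip(
    tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]),
    logits.argmax(dim=-1)[0],
):
    print(token, model.config.id2label[pred_id.item()])
```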

### CoNLL-2003 English Dataset Statistics

This dataset was derived from the Reuters corpus, which consists of Reuters news stories. You can read more about how this dataset was created in the CoNLL-2003 paper.

#### # of training examples per entity type

Dataset|LOC|MISC|ORG|PER
-|-|-|-|-
Train|7140|3438|6321|6600
Dev|1837|922|1341|1842
Test|1668|702|1661|1617

#### # of articles/sentences/tokens per dataset

Dataset |Articles |Sentences |Tokens
-|-|-|-
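The counts above come from the CoNLL-2003 paper. As a rough, hedged sketch, similar statistics can be recomputed from the Hugging Face `conll2003` dataset (assuming its default splits and IOB tag encoding; exact numbers may differ slightly depending on preprocessing):

```python
from collections import Counter

from datasets import load_dataset

dataset = load_dataset("conll2003")
tag_names = dataset["train"].features["ner_tags"].feature.names

for split_name, split in dataset.items():
    entity_counts = Counter()
    n_tokens = 0
    for example in split:
        n_tokens += len(example["tokens"])
        prev = "O"
        for tag_id in example["ner_tags"]:
            tag = tag_names[tag_id]
            # A span starts on any B- tag, or on an I- tag whose entity
            # type differs from the previous token's (handles IOB1 data).
            if tag != "O" and (tag.startswith("B-") or tag[2:] != prev[2:]):
                entity_counts[tag[2:]] += 1
            prev = tag
    print(split_name, len(split), "sentences,", n_tokens, "tokens,", dict(entity_counts))
```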