huseyincenik committed
Commit 8171496 · verified · 1 Parent(s): 5488480

Update README.md

Files changed (1):
  1. README.md +126 -11
README.md CHANGED
 
# huseyincenik/conll_ner_with_bert

This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the CoNLL-2003 dataset for Named Entity Recognition (NER).

## Model description

The model is based on the BERT architecture and has been fine-tuned for Named Entity Recognition (NER) on CoNLL-2003, a standard benchmark dataset for the task.
## Intended uses & limitations

### Intended Uses

- **Named Entity Recognition**: The model identifies and classifies named entities in text into four categories: location (LOC), organization (ORG), person (PER), and miscellaneous (MISC).

### Limitations

- **Domain Specificity**: The model was fine-tuned on CoNLL-2003, which consists of news articles, so it may not generalize well to other domains or types of text not represented in the training data.
- **Subword Tokens**: The model may occasionally tag subword tokens as entities, requiring post-processing to handle these cases; one mitigation is sketched right after this list.
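
One way to reduce stray subword tags is the pipeline's built-in aggregation; this is a generic transformers option shown as a sketch, not a step prescribed by the card:

```python
from transformers import pipeline

# aggregation_strategy="simple" merges subword pieces into whole-word
# entity spans, avoiding isolated subword predictions in the output
ner = pipeline(
    "token-classification",
    model="huseyincenik/conll_ner_with_bert",
    aggregation_strategy="simple",
)

print(ner("Hugging Face is based in New York City."))
```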
 
## Training and evaluation data

- **Training Dataset**: CoNLL-2003

**Training Evaluation Metrics**

| Tag          | Precision | Recall | F1-score | Support |
|:-------------|----------:|-------:|---------:|--------:|
| B-PER        | 0.98 | 0.98 | 0.98 | 11273 |
| I-PER        | 0.98 | 0.99 | 0.99 |  9323 |
| B-ORG        | 0.88 | 0.92 | 0.90 | 10447 |
| I-ORG        | 0.81 | 0.92 | 0.86 |  5137 |
| B-LOC        | 0.86 | 0.94 | 0.90 |  9621 |
| I-LOC        | 1.00 | 0.08 | 0.14 |  1267 |
| B-MISC       | 0.81 | 0.73 | 0.77 |  4793 |
| I-MISC       | 0.83 | 0.36 | 0.50 |  1329 |
| micro avg    | 0.90 | 0.90 | 0.90 | 53190 |
| macro avg    | 0.89 | 0.74 | 0.75 | 53190 |
| weighted avg | 0.90 | 0.90 | 0.89 | 53190 |

**Validation Evaluation Metrics**

| Tag          | Precision | Recall | F1-score | Support |
|:-------------|----------:|-------:|---------:|--------:|
| B-PER        | 0.97 | 0.98 | 0.97 |  3018 |
| I-PER        | 0.98 | 0.98 | 0.98 |  2741 |
| B-ORG        | 0.86 | 0.91 | 0.88 |  2056 |
| I-ORG        | 0.77 | 0.81 | 0.79 |   900 |
| B-LOC        | 0.86 | 0.94 | 0.90 |  2618 |
| I-LOC        | 1.00 | 0.10 | 0.18 |   281 |
| B-MISC       | 0.77 | 0.74 | 0.76 |  1231 |
| I-MISC       | 0.77 | 0.34 | 0.48 |   390 |
| micro avg    | 0.90 | 0.89 | 0.89 | 13235 |
| macro avg    | 0.87 | 0.73 | 0.74 | 13235 |
| weighted avg | 0.90 | 0.89 | 0.88 | 13235 |

**Test Evaluation Metrics**

| Tag          | Precision | Recall | F1-score | Support |
|:-------------|----------:|-------:|---------:|--------:|
| B-PER        | 0.96 | 0.95 | 0.96 |  2714 |
| I-PER        | 0.98 | 0.99 | 0.98 |  2487 |
| B-ORG        | 0.81 | 0.87 | 0.84 |  2588 |
| I-ORG        | 0.74 | 0.87 | 0.80 |  1050 |
| B-LOC        | 0.81 | 0.90 | 0.85 |  2121 |
| I-LOC        | 0.89 | 0.12 | 0.22 |   276 |
| B-MISC       | 0.75 | 0.67 | 0.71 |   996 |
| I-MISC       | 0.85 | 0.49 | 0.62 |   241 |
| micro avg    | 0.87 | 0.88 | 0.87 | 12473 |
| macro avg    | 0.85 | 0.73 | 0.75 | 12473 |
| weighted avg | 0.87 | 0.88 | 0.86 | 12473 |
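
Reports in this format can be reproduced at the token level with scikit-learn; the sketch below is an assumption about the evaluation setup (the variable contents are hypothetical), not code from the card:

```python
from sklearn.metrics import classification_report

# Flattened token-level tag sequences; in practice these come from
# aligning model predictions with the word-level gold labels
y_true = ["B-PER", "I-PER", "O", "B-LOC"]
y_pred = ["B-PER", "I-PER", "O", "B-LOC"]

# Restricting the report to entity tags keeps "O" out of the averages,
# which also makes sklearn print micro/macro/weighted average rows
entity_tags = ["B-PER", "I-PER", "B-ORG", "I-ORG",
               "B-LOC", "I-LOC", "B-MISC", "I-MISC"]

print(classification_report(y_true, y_pred, labels=entity_tags, digits=2))
```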
## Training procedure

### Training Hyperparameters

- **Optimizer**: AdamWeightDecay
  - Learning Rate: 2e-05
  - Decay Schedule: PolynomialDecay (power 1.0, decaying to an end learning rate of 0.0)
  - Warmup Steps: 0.1
  - Beta 1: 0.9, Beta 2: 0.999, Epsilon: 1e-08
  - Weight Decay Rate: 0.01
- **Training Precision**: float32
### Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 0.1016     | 0.0254          | 0     |
| 0.0228     | 0.0180          | 1     |
### Optimizer Details

```python
from transformers import create_optimizer

batch_size = 32
num_train_epochs = 2
# Total number of optimizer updates over the full training run
num_train_steps = (len(tokenized_conll["train"]) // batch_size) * num_train_epochs

# AdamWeightDecay with the polynomial decay schedule described above;
# num_warmup_steps is kept at 0.1 as configured in the original run
optimizer, lr_schedule = create_optimizer(
    init_lr=2e-5,
    num_train_steps=num_train_steps,
    weight_decay_rate=0.01,
    num_warmup_steps=0.1,
)
```
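
In the TensorFlow/Keras workflow this optimizer is passed to the model at compile time. The following is an illustrative sketch rather than a step from the card, reusing `optimizer` from the snippet above; the label count of 9 is an assumption based on the standard CoNLL-2003 IOB2 tag set:

```python
from transformers import TFAutoModelForTokenClassification

# Assumed label count: O plus B-/I- tags for PER, ORG, LOC, MISC = 9
model = TFAutoModelForTokenClassification.from_pretrained(
    "bert-base-uncased", num_labels=9
)

# Transformers TF models compute the loss internally when labels are
# provided, so compile only needs the optimizer
model.compile(optimizer=optimizer)
```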
## How to Use

### Using a Pipeline

```python
from transformers import pipeline

# Quickest route: a ready-made token-classification pipeline
pipe = pipeline("token-classification", model="huseyincenik/conll_ner_with_bert")

# Alternatively, load the tokenizer and model directly
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("huseyincenik/conll_ner_with_bert")
model = AutoModelForTokenClassification.from_pretrained("huseyincenik/conll_ner_with_bert")
```
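
As a quick usage sketch (the example sentence is ours, not from the card), the pipeline returns one dict per tagged token with fields such as `word`, `entity`, and `score`:

```python
for item in pipe("George Washington lived in Mount Vernon, Virginia."):
    print(item["word"], item["entity"], round(item["score"], 3))
```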
The model predicts IOB2 tags with the following meanings:

| Abbreviation | Description |
|:-------------|:------------|
| O      | Outside of a named entity |
| B-MISC | Beginning of a miscellaneous entity right after another miscellaneous entity |
| I-MISC | Miscellaneous entity |
| B-PER  | Beginning of a person’s name right after another person’s name |
| I-PER  | Person’s name |
| B-ORG  | Beginning of an organization right after another organization |
| I-ORG  | Organization |
| B-LOC  | Beginning of a location right after another location |
| I-LOC  | Location |
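
The same tag set is stored in the model configuration; assuming `model` from the snippet above, the mapping can be inspected directly (it should match the table if the config was saved with named labels):

```python
# id2label maps class indices to the IOB2 tag names
print(model.config.id2label)
```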
### CoNLL-2003 English Dataset Statistics

This dataset was derived from the Reuters corpus, which consists of Reuters news stories. You can read more about how the dataset was created in the CoNLL-2003 paper.

#### Number of training examples per entity type

| Dataset | LOC  | MISC | ORG  | PER  |
|:--------|-----:|-----:|-----:|-----:|
| Train   | 7140 | 3438 | 6321 | 6600 |
| Dev     | 1837 |  922 | 1341 | 1842 |
| Test    | 1668 |  702 | 1661 | 1617 |

#### Number of articles/sentences/tokens per dataset

| Dataset | Articles | Sentences | Tokens  |
|:--------|---------:|----------:|--------:|
| Train   |      946 |    14,987 | 203,621 |
| Dev     |      216 |     3,466 |  51,362 |
| Test    |      231 |     3,684 |  46,435 |
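
These statistics can be sanity-checked against the dataset itself; a minimal sketch using the Hugging Face datasets library (not part of the original card):

```python
from datasets import load_dataset

conll = load_dataset("conll2003")

# Each example is one sentence; tokens are stored in the "tokens" field
for split, ds in conll.items():
    n_tokens = sum(len(example["tokens"]) for example in ds)
    print(f"{split}: {len(ds)} sentences, {n_tokens} tokens")
```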
### Framework versions