huseyincenik committed: Update README.md
# huseyincenik/conll_ner_with_bert

This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the CoNLL-2003 dataset for Named Entity Recognition (NER).

## Model description

The model is based on the BERT architecture (bert-base-uncased) and was fine-tuned for Named Entity Recognition on CoNLL-2003, a standard benchmark dataset for NER.

## Intended uses & limitations

### Intended Uses

- **Named Entity Recognition**: This model is designed to identify and classify named entities in text into categories such as location (LOC), organization (ORG), person (PER), and miscellaneous (MISC).

### Limitations

- **Domain Specificity**: The model was fine-tuned on the CoNLL-2003 dataset, which consists of news articles. It may not generalize well to other domains or types of text not represented in the training data.
- **Subword Tokens**: The model may occasionally tag subword tokens as entities, requiring post-processing to handle these cases (see the sketch below).

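
As a rough illustration of that post-processing, the token-classification pipeline can merge subword pieces back into word-level entities via `aggregation_strategy`. This is a minimal sketch, not part of the original training code, and the example sentence is made up:

```python
from transformers import pipeline

# "simple" groups subword pieces into word-level entities and averages their
# scores; "first", "max" and "average" are alternative grouping strategies.
ner = pipeline(
    "token-classification",
    model="huseyincenik/conll_ner_with_bert",
    aggregation_strategy="simple",
)

print(ner("Angela Merkel visited the European Parliament in Brussels."))
```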
## Training and evaluation data

- **Training Dataset**: CoNLL-2003
- **Training Evaluation Metrics**:

| Tag | Precision | Recall | F1-Score | Support |
|:---|---:|---:|---:|---:|
| B-PER | 0.98 | 0.98 | 0.98 | 11273 |
| I-PER | 0.98 | 0.99 | 0.99 | 9323 |
| B-ORG | 0.88 | 0.92 | 0.90 | 10447 |
| I-ORG | 0.81 | 0.92 | 0.86 | 5137 |
| B-LOC | 0.86 | 0.94 | 0.90 | 9621 |
| I-LOC | 1.00 | 0.08 | 0.14 | 1267 |
| B-MISC | 0.81 | 0.73 | 0.77 | 4793 |
| I-MISC | 0.83 | 0.36 | 0.50 | 1329 |
| micro avg | 0.90 | 0.90 | 0.90 | 53190 |
| macro avg | 0.89 | 0.74 | 0.75 | 53190 |
| weighted avg | 0.90 | 0.90 | 0.89 | 53190 |

- **Validation Evaluation Metrics**:

| Tag | Precision | Recall | F1-Score | Support |
|:---|---:|---:|---:|---:|
| B-PER | 0.97 | 0.98 | 0.97 | 3018 |
| I-PER | 0.98 | 0.98 | 0.98 | 2741 |
| B-ORG | 0.86 | 0.91 | 0.88 | 2056 |
| I-ORG | 0.77 | 0.81 | 0.79 | 900 |
| B-LOC | 0.86 | 0.94 | 0.90 | 2618 |
| I-LOC | 1.00 | 0.10 | 0.18 | 281 |
| B-MISC | 0.77 | 0.74 | 0.76 | 1231 |
| I-MISC | 0.77 | 0.34 | 0.48 | 390 |
| micro avg | 0.90 | 0.89 | 0.89 | 13235 |
| macro avg | 0.87 | 0.73 | 0.74 | 13235 |
| weighted avg | 0.90 | 0.89 | 0.88 | 13235 |

- **Test Evaluation Metrics**:

| Tag | Precision | Recall | F1-Score | Support |
|:---|---:|---:|---:|---:|
| B-PER | 0.96 | 0.95 | 0.96 | 2714 |
| I-PER | 0.98 | 0.99 | 0.98 | 2487 |
| B-ORG | 0.81 | 0.87 | 0.84 | 2588 |
| I-ORG | 0.74 | 0.87 | 0.80 | 1050 |
| B-LOC | 0.81 | 0.90 | 0.85 | 2121 |
| I-LOC | 0.89 | 0.12 | 0.22 | 276 |
| B-MISC | 0.75 | 0.67 | 0.71 | 996 |
| I-MISC | 0.85 | 0.49 | 0.62 | 241 |
| micro avg | 0.87 | 0.88 | 0.87 | 12473 |
| macro avg | 0.85 | 0.73 | 0.75 | 12473 |
| weighted avg | 0.87 | 0.88 | 0.86 | 12473 |

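
The card does not include the evaluation code itself. The layout above (per-tag rows plus micro/macro/weighted averages) matches scikit-learn's `classification_report` computed over flattened token-level tags with `O` excluded from the label list; a minimal sketch under that assumption, with toy tag sequences standing in for the real predictions:

```python
from sklearn.metrics import classification_report

# Entity tags only; leaving "O" out of `labels` is what produces the
# micro/macro/weighted average rows seen in the reports above.
labels = ["B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "B-MISC", "I-MISC"]

# Toy stand-ins for the flattened gold and predicted tags of every token.
y_true = ["B-PER", "I-PER", "O", "B-LOC", "O", "B-ORG"]
y_pred = ["B-PER", "I-PER", "O", "B-LOC", "O", "B-MISC"]

print(classification_report(y_true, y_pred, labels=labels))
```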
## Training procedure

### Training Hyperparameters

- **Optimizer**: AdamWeightDecay
  - Learning Rate: 2e-05
  - Decay Schedule: PolynomialDecay
  - Warmup Steps: 0.1
  - Weight Decay Rate: 0.01
- **Training Precision**: float32

### Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 0.1016     | 0.0254          | 0     |
| 0.0228     | 0.0180          | 1     |

### Optimizer Details

```python
from transformers import create_optimizer

# tokenized_conll is the tokenized CoNLL-2003 DatasetDict used for fine-tuning.
batch_size = 32
num_train_epochs = 2
num_train_steps = (len(tokenized_conll["train"]) // batch_size) * num_train_epochs

optimizer, lr_schedule = create_optimizer(
    init_lr=2e-5,
    num_train_steps=num_train_steps,
    weight_decay_rate=0.01,
    num_warmup_steps=0.1
)
```
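
The card stops at creating the optimizer. Purely as an illustration of where it would be used, here is a sketch of compiling and fitting the TensorFlow model; the `TFAutoModelForTokenClassification` call, `num_labels=9`, and the `train_set` dataset are assumptions, not taken from the card:

```python
from transformers import TFAutoModelForTokenClassification

# 9 labels: O plus B-/I- tags for PER, ORG, LOC and MISC.
model = TFAutoModelForTokenClassification.from_pretrained(
    "bert-base-uncased", num_labels=9
)

# Recent TF models in transformers compute their own loss if none is given.
model.compile(optimizer=optimizer)

# train_set: an assumed batched tf.data.Dataset built from tokenized_conll["train"].
model.fit(train_set, epochs=num_train_epochs)
```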

## How to Use

### Using a Pipeline

```python
from transformers import pipeline

pipe = pipeline("token-classification", model="huseyincenik/conll_ner_with_bert")

# Or load the tokenizer and model directly:
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("huseyincenik/conll_ner_with_bert")
model = AutoModelForTokenClassification.from_pretrained("huseyincenik/conll_ner_with_bert")
```
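
Continuing from the snippet above, an illustrative call (the input sentence is made up):

```python
# Each prediction is a dict with "entity", "score", "index", "word", "start" and "end".
for prediction in pipe("Hugging Face is based in New York City."):
    print(prediction["word"], prediction["entity"], f"{prediction['score']:.3f}")
```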

The label abbreviations follow the standard CoNLL-2003 IOB scheme:

Abbreviation | Description
-|-
O | Outside of a named entity
B-MISC | Beginning of a miscellaneous entity right after another miscellaneous entity
I-MISC | Miscellaneous entity
B-PER | Beginning of a person’s name right after another person’s name
I-PER | Person’s name
B-ORG | Beginning of an organization right after another organization
I-ORG | Organization
B-LOC | Beginning of a location right after another location
I-LOC | Location

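
To check the exact tag order used by the checkpoint, the label mapping stored in its config can be printed (assuming `id2label` was saved with the model, as is usual for token-classification checkpoints):

```python
from transformers import AutoConfig

# id2label maps class indices to the tag strings listed in the table above.
config = AutoConfig.from_pretrained("huseyincenik/conll_ner_with_bert")
print(config.id2label)
```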

### CoNLL-2003 English Dataset Statistics

This dataset was derived from the Reuters corpus, which consists of Reuters news stories. You can read more about how this dataset was created in the CoNLL-2003 paper.

#### Number of training examples per entity type

Dataset|LOC|MISC|ORG|PER
-|-|-|-|-
Train|7140|3438|6321|6600
Dev|1837|922|1341|1842
Test|1668|702|1661|1617

#### Number of articles/sentences/tokens per dataset

Dataset|Articles|Sentences|Tokens
-|-|-|-
Train|946|14,987|203,621
Dev|216|3,466|51,362
Test|231|3,684|46,435

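
For reference, sentence and token counts per split can be recomputed from the `conll2003` dataset on the Hugging Face Hub (a sketch assuming that dataset id; the figures above come from the original CoNLL-2003 release):

```python
from datasets import load_dataset

conll = load_dataset("conll2003")

# Sentence count is the number of rows; token count sums the "tokens" column.
for split in ("train", "validation", "test"):
    n_sentences = len(conll[split])
    n_tokens = sum(len(tokens) for tokens in conll[split]["tokens"])
    print(f"{split}: {n_sentences} sentences, {n_tokens} tokens")
```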
### Framework versions