---
datasets:
- eriktks/conll2003
language:
- en
metrics:
- accuracy
- precision
- recall
- f1
base_model:
- google-bert/bert-base-uncased
pipeline_tag: token-classification
---
# Fine-Tuned BERT Model for Named Entity Recognition (NER) with Accelerate Library

This repository contains a fine-tuned BERT model for Named Entity Recognition (NER), trained on the [CoNLL 2003 dataset](https://huggingface.co/datasets/eriktks/conll2003) using the Hugging Face Accelerate library.

The dataset includes the following labels:
- `O`, `B-PER`, `I-PER`, `B-ORG`, `I-ORG`, `B-LOC`, `I-LOC`, `B-MISC`, `I-MISC`

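A minimal sketch for loading the dataset and inspecting this label set (illustrative only, not part of this repository's training code) could look like this:

```python
from datasets import load_dataset

# Load the same CoNLL 2003 dataset referenced above.
raw_datasets = load_dataset("eriktks/conll2003")

# Each example stores word-level tokens plus integer NER tags;
# the ClassLabel feature maps those integers back to label strings.
label_names = raw_datasets["train"].features["ner_tags"].feature.names
print(label_names)
# ['O', 'B-PER', 'I-PER', 'B-ORG', 'I-ORG', 'B-LOC', 'I-LOC', 'B-MISC', 'I-MISC']
```
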
## Model Training Details

### Training Arguments
- **Library**: Hugging Face Accelerate
- **Model Architecture**: `bert-base-cased` for token classification
- **Learning Rate**: `2e-5`
- **Number of Epochs**: `20`
- **Weight Decay**: `0.01`
- **Batch Size**: `8`
- **Evaluation Strategy**: `epoch`
- **Save Strategy**: `epoch`

*Additional default parameters from the Accelerate and Transformers libraries were used.*

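The full training script is not reproduced in this card. The sketch below only illustrates how the settings above can map onto a custom Accelerate training loop; `tokenized_datasets` is an assumed object whose examples already contain `input_ids`, `attention_mask`, and word labels aligned to subword tokens, and the linear scheduler with zero warmup is an illustrative default rather than a confirmed setting.

```python
from torch.optim import AdamW
from torch.utils.data import DataLoader
from accelerate import Accelerator
from transformers import (
    AutoModelForTokenClassification,
    AutoTokenizer,
    DataCollatorForTokenClassification,
    get_scheduler,
)

label_names = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "B-MISC", "I-MISC"]
id2label = {i: name for i, name in enumerate(label_names)}
label2id = {name: i for i, name in enumerate(label_names)}

base_checkpoint = "bert-base-cased"
tokenizer = AutoTokenizer.from_pretrained(base_checkpoint)
model = AutoModelForTokenClassification.from_pretrained(
    base_checkpoint,
    id2label=id2label,
    label2id=label2id,
)

# Dynamic padding for token-classification batches (labels padded with -100).
data_collator = DataCollatorForTokenClassification(tokenizer=tokenizer)
train_dataloader = DataLoader(
    tokenized_datasets["train"],  # assumed: tokenized dataset with aligned labels
    shuffle=True,
    collate_fn=data_collator,
    batch_size=8,
)

optimizer = AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)

accelerator = Accelerator()
model, optimizer, train_dataloader = accelerator.prepare(
    model, optimizer, train_dataloader
)

num_epochs = 20
num_training_steps = num_epochs * len(train_dataloader)
lr_scheduler = get_scheduler(
    "linear",
    optimizer=optimizer,
    num_warmup_steps=0,
    num_training_steps=num_training_steps,
)

for epoch in range(num_epochs):
    model.train()
    for batch in train_dataloader:
        loss = model(**batch).loss
        accelerator.backward(loss)  # Accelerate handles device placement / scaling
        optimizer.step()
        lr_scheduler.step()
        optimizer.zero_grad()
    # Evaluation and checkpoint saving run once per epoch ("epoch" strategies above).
```
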
---

## Evaluation Results

### Validation Set Performance
- **Overall Metrics**:
  - Precision: 95.17%
  - Recall: 93.87%
  - F1 Score: 94.52%
  - Accuracy: 98.62%

#### Per-Label Performance
| Entity Type | Precision | Recall | F1 Score |
|-------------|-----------|--------|----------|
| LOC | 96.46% | 96.51% | 96.49% |
| MISC | 90.78% | 89.14% | 89.95% |
| ORG | 92.61% | 90.26% | 91.42% |
| PER | 97.94% | 96.32% | 97.12% |

### Test Set Performance
- **Overall Metrics**:
  - Precision: 91.82%
  - Recall: 89.68%
  - F1 Score: 90.74%
  - Accuracy: 97.23%

#### Per-Label Performance
| Entity Type | Precision | Recall | F1 Score |
|-------------|-----------|--------|----------|
| LOC | 92.99% | 92.10% | 92.54% |
| MISC | 82.05% | 75.00% | 78.37% |
| ORG | 90.67% | 88.28% | 89.46% |
| PER | 96.04% | 95.57% | 95.81% |

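Overall and per-entity precision/recall/F1 of this kind are commonly computed with the `seqeval` metric. A small, self-contained sketch using the `evaluate` library (the label sequences below are made-up placeholders, not outputs of this model) could look like this:

```python
import evaluate

metric = evaluate.load("seqeval")

# Hypothetical label sequences in string form, one list per sentence.
predictions = [["B-PER", "I-PER", "O", "O", "B-ORG", "O"]]
references = [["B-PER", "I-PER", "O", "O", "B-LOC", "O"]]

results = metric.compute(predictions=predictions, references=references)

# Overall scores correspond to the "Overall Metrics" above ...
print(results["overall_precision"], results["overall_recall"],
      results["overall_f1"], results["overall_accuracy"])

# ... and per-entity dicts correspond to the per-label tables.
print(results["PER"])  # {'precision': ..., 'recall': ..., 'f1': ..., 'number': ...}
```
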
---

## How to Use the Model

You can load the model directly from the Hugging Face Model Hub with the `pipeline` API:

```python
from transformers import pipeline

# This model's checkpoint on the Hugging Face Hub
model_checkpoint = "Prikshit7766/bert-finetuned-ner-accelerate"
token_classifier = pipeline(
    "token-classification",
    model=model_checkpoint,
    aggregation_strategy="simple",  # merge sub-word tokens into entity spans
)

# Example usage
result = token_classifier("My name is Sylvain and I work at Hugging Face in Brooklyn.")
print(result)
```

### Example Output
```python
[
    {
        "entity_group": "PER",
        "score": 0.9999658,
        "word": "Sylvain",
        "start": 11,
        "end": 18
    },
    {
        "entity_group": "ORG",
        "score": 0.99996203,
        "word": "Hugging Face",
        "start": 33,
        "end": 45
    },
    {
        "entity_group": "LOC",
        "score": 0.9999542,
        "word": "Brooklyn",
        "start": 49,
        "end": 57
    }
]
```

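If you prefer working with the raw model outputs instead of the pipeline (for example, to see per-token labels before any span aggregation), a minimal sketch could look like this; it assumes the checkpoint's config carries the `id2label` mapping:

```python
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

model_checkpoint = "Prikshit7766/bert-finetuned-ner-accelerate"
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = AutoModelForTokenClassification.from_pretrained(model_checkpoint)

text = "My name is Sylvain and I work at Hugging Face in Brooklyn."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Map each sub-word token to its highest-scoring label.
predicted_ids = logits.argmax(dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, label_id in zip(tokens, predicted_ids):
    print(token, model.config.id2label[label_id.item()])
```
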
---