Sadat07 committed
Commit 3d6e990
Parent(s): 69d17c3

Update README.md

Files changed (1)
  1. README.md +87 -98

README.md CHANGED
@@ -17,10 +17,10 @@ bert_squad is a transformer-based model trained for context-based question answering
 
  The model was trained using free computational resources, demonstrating its accessibility for educational and small-scale research purposes.
 
- - **Developed by: SADAT PARVEJ, RAFIFA BINTE JAHIR
- - **Shared by [optional]: SADAT PARVEJ
- - **Language(s) (NLP): ENGLISH
- - **Finetuned from model [optional]:https://huggingface.co/google-bert/bert-base-uncased
 
  ### Model Sources [optional]
 
@@ -30,167 +30,156 @@ The model was trained using free computational resources, demonstrating its accessibility for educational and small-scale research purposes.
  - **Paper [optional]:** [More Information Needed]
  - **Demo [optional]:** [More Information Needed]
 
- ## Uses
 
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 
- ### Direct Use
 
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
 
- [More Information Needed]
 
- ### Downstream Use [optional]
 
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
 
- [More Information Needed]
 
- ### Out-of-Scope Use
 
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
 
- [More Information Needed]
-
- ## Bias, Risks, and Limitations
-
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-
- [More Information Needed]
-
- ### Recommendations
-
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
 
  ## How to Get Started with the Model
 
  Use the code below to get started with the model.
 
- [More Information Needed]
 
- ## Training Details
 
- ### Training Data
 
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
 
- [More Information Needed]
 
- ### Training Procedure
 
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
 
- #### Preprocessing [optional]
 
- [More Information Needed]
 
- #### Training Hyperparameters
 
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 
- #### Speeds, Sizes, Times [optional]
 
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
 
- [More Information Needed]
 
- ## Evaluation
 
- <!-- This section describes the evaluation protocols and provides the results. -->
 
- ### Testing Data, Factors & Metrics
 
- #### Testing Data
 
- <!-- This should link to a Dataset Card if possible. -->
 
- [More Information Needed]
 
- #### Factors
 
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
 
- [More Information Needed]
 
- #### Metrics
 
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
 
- [More Information Needed]
 
  ### Results
 
- [More Information Needed]
-
- #### Summary
-
-
 
- ## Model Examination [optional]
 
- <!-- Relevant interpretability work for the model goes here -->
-
- [More Information Needed]
-
- ## Environmental Impact
 
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
 
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
 
- - **Hardware Type:** [More Information Needed]
- - **Hours used:** [More Information Needed]
- - **Cloud Provider:** [More Information Needed]
- - **Compute Region:** [More Information Needed]
- - **Carbon Emitted:** [More Information Needed]
 
- ## Technical Specifications [optional]
 
  ### Model Architecture and Objective
 
- [More Information Needed]
 
  ### Compute Infrastructure
 
- [More Information Needed]
-
  #### Hardware
 
- [More Information Needed]
 
  #### Software
 
- [More Information Needed]
-
- ## Citation [optional]
 
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
 
  **BibTeX:**
 
- [More Information Needed]
 
- **APA:**
 
- [More Information Needed]
 
  ## Glossary [optional]
 
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-
- [More Information Needed]
-
- ## More Information [optional]
-
- [More Information Needed]
-
- ## Model Card Authors [optional]
-
- [More Information Needed]
-
- ## Model Card Contact
-
- [More Information Needed]
 
  The model was trained using free computational resources, demonstrating its accessibility for educational and small-scale research purposes.
 
+ - **Developed by:** SADAT PARVEJ, RAFIFA BINTE JAHIR
+ - **Shared by:** SADAT PARVEJ
+ - **Language(s) (NLP):** English
+ - **Finetuned from model:** https://huggingface.co/google-bert/bert-base-uncased
 
  ### Model Sources [optional]
 
  - **Paper [optional]:** [More Information Needed]
  - **Demo [optional]:** [More Information Needed]
 
+ ## Training Objective
 
+ The model predicts the most relevant span of text in a given passage that answers a specific question. It fine-tunes BERT's ability to encode context using supervised data from SQuAD.
 
+ ### Performance Benchmarks
 
+ At the final training step (2000):
+ 
+ - Training Loss: 0.477800
+ - Validation Loss: 0.465936
+ - Exact Match (EM): 87.568590%
 
+ ## Intended Uses & Limitations
 
+ This model is designed for tasks such as:
+ 
+ - Extractive question answering
+ - Reading-comprehension applications
+ 
+ Known limitations:
+ 
+ - BERT is pretrained as a masked language model (MLM), which limits its suitability for generative tasks or for queries outside the SQuAD-style question-answering setup.
+ - The model's predictions may be biased toward, or overly reliant on, its training data, as SQuAD comprises structured, fact-based question-answer pairs.
 
  ## How to Get Started with the Model
 
  Use the code below to get started with the model.
 
+ ```python
+ from transformers import pipeline
+ 
+ # Load the fine-tuned extractive question-answering model
+ qa_pipeline = pipeline('question-answering', model='bert_squad')
+ 
+ context = "BERT is a transformers model for natural language processing."
+ question = "What is BERT used for?"
+ result = qa_pipeline(question=question, context=context)
+ print(result)  # {'score': ..., 'start': ..., 'end': ..., 'answer': ...}
+ ```
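
To see what the pipeline does under the hood, here is a minimal sketch of direct inference (assuming, as in the snippet above, that `bert_squad` resolves to this fine-tuned checkpoint): the model scores every token as a potential answer start and end, and the decoded span between the two argmax positions is the answer.

```python
import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

# Assumed checkpoint name, matching the pipeline example above
tokenizer = AutoTokenizer.from_pretrained('bert_squad')
model = AutoModelForQuestionAnswering.from_pretrained('bert_squad')

context = "BERT is a transformers model for natural language processing."
question = "What is BERT used for?"

# BERT sees the pair as: [CLS] question [SEP] context [SEP]
inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# The most likely start and end token positions define the answer span
start = int(outputs.start_logits.argmax())
end = int(outputs.end_logits.argmax())
print(tokenizer.decode(inputs["input_ids"][0][start:end + 1]))
```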
 
 
+ ## Training Details
 
+ Metrics were logged every 100 optimization steps during fine-tuning:
+ 
+ | Step | Training Loss | Validation Loss | Exact Match | SQuAD F1  | Start Accuracy | End Accuracy |
+ |------|---------------|-----------------|-------------|-----------|----------------|--------------|
+ | 100  | 0.632200      | 0.811809        | 84.749290   | 84.749290 | 0.847493       | 0.899243     |
+ | 200  | 0.751500      | 0.627198        | 84.768212   | 84.768212 | 0.847682       | 0.899243     |
+ | 300  | 0.662600      | 0.557515        | 86.244087   | 86.244087 | 0.862441       | 0.899243     |
+ | 400  | 0.600400      | 0.567693        | 86.177862   | 86.177862 | 0.861779       | 0.899243     |
+ | 500  | 0.613200      | 0.523546        | 86.499527   | 86.499527 | 0.864995       | 0.899243     |
+ | 600  | 0.495200      | 0.539225        | 86.565752   | 86.565752 | 0.865658       | 0.899243     |
+ | 700  | 0.645300      | 0.552358        | 85.354778   | 85.354778 | 0.853548       | 0.899243     |
+ | 800  | 0.499100      | 0.562317        | 86.338694   | 86.338694 | 0.863387       | 0.899243     |
+ | 900  | 0.482800      | 0.499747        | 86.811731   | 86.811731 | 0.868117       | 0.899243     |
+ | 1000 | 0.372800      | 0.543513        | 86.972564   | 86.972564 | 0.869726       | 0.900000     |
+ | 1100 | 0.554000      | 0.502747        | 85.969726   | 85.969726 | 0.859697       | 0.894797     |
+ | 1200 | 0.459800      | 0.484941        | 87.019868   | 87.019868 | 0.870199       | 0.900662     |
+ | 1300 | 0.463600      | 0.477527        | 87.407758   | 87.407758 | 0.874078       | 0.899905     |
+ | 1400 | 0.356800      | 0.499119        | 87.549669   | 87.549669 | 0.875497       | 0.901608     |
+ | 1500 | 0.494200      | 0.485287        | 87.549669   | 87.549669 | 0.875497       | 0.901703     |
+ | 1600 | 0.521100      | 0.466062        | 87.284768   | 87.284768 | 0.872848       | 0.899243     |
+ | 1700 | 0.461200      | 0.462704        | 87.540208   | 87.540208 | 0.875402       | 0.901419     |
+ | 1800 | 0.415700      | 0.474295        | 87.691580   | 87.691580 | 0.876916       | 0.901892     |
+ | 1900 | 0.622900      | 0.462900        | 87.417219   | 87.417219 | 0.874172       | 0.901987     |
+ | 2000 | 0.477800      | 0.465936        | 87.568590   | 87.568590 | 0.875686       | 0.901892     |
 
+ ### Training Data
 
+ The model was trained on the [SQuAD](https://huggingface.co/datasets/squad) dataset, a widely used benchmark for context-based question answering. It consists of passages from Wikipedia and corresponding questions, with human-annotated answers.
 
+ During training, the dataset was processed to extract contexts, questions, and answers, ensuring compatibility with the BERT architecture for QA. The training utilized free computational resources to minimize costs.
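
The card does not include the exact preprocessing code. The sketch below follows the standard Hugging Face SQuAD recipe (the truncation length is an assumption) to show how each answer is mapped to token-level start/end positions, which is what "ensuring compatibility with the BERT architecture" amounts to in practice.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

squad = load_dataset("squad")
tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")

def preprocess(examples):
    # Tokenize (question, context) pairs; only the context is truncated
    enc = tokenizer(
        examples["question"],
        examples["context"],
        truncation="only_second",
        max_length=384,          # assumed value
        padding="max_length",
        return_offsets_mapping=True,
    )
    starts, ends = [], []
    for i, offsets in enumerate(enc["offset_mapping"]):
        answer = examples["answers"][i]
        start_char = answer["answer_start"][0]
        end_char = start_char + len(answer["text"][0])
        seq_ids = enc.sequence_ids(i)
        start_tok = end_tok = 0  # falls back to [CLS] if the answer is truncated away
        for idx, (s, e) in enumerate(offsets):
            if seq_ids[idx] != 1:        # skip question and special tokens
                continue
            if s <= start_char < e:
                start_tok = idx
            if s < end_char <= e:
                end_tok = idx
        starts.append(start_tok)
        ends.append(end_tok)
    enc["start_positions"] = starts
    enc["end_positions"] = ends
    return enc

tokenized = squad.map(preprocess, batched=True,
                      remove_columns=squad["train"].column_names)
```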
 
+ ### Training Procedure
 
+ **Training Objective**
+ The model was trained to perform context-based question answering on the SQuAD dataset. Fine-tuning adapts the MLM-pretrained BERT encoder to QA by adding a span-prediction head, leveraging the encoder's ability to capture contextual relationships between the passage, the question, and the answer.
 
+ **Optimization**
+ Training used the AdamW optimizer with a linear learning-rate scheduler and warm-up steps to keep weight updates stable and reduce overfitting. It ran for 2000 steps, with early stopping based on the validation loss and exact match score.
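
The card does not record exact hyperparameters. A hedged sketch of the described setup using the `Trainer` API (whose default optimizer and schedule are AdamW with linear decay and warm-up) is shown below; the learning rate, batch size, and warm-up length are assumptions, while the 2000-step budget, 100-step evaluation cadence, and early stopping on validation metrics come from the card.

```python
from transformers import (AutoModelForQuestionAnswering, EarlyStoppingCallback,
                          Trainer, TrainingArguments)

model = AutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-uncased")

args = TrainingArguments(
    output_dir="bert_squad",
    max_steps=2000,                  # from the card
    evaluation_strategy="steps",
    eval_steps=100,                  # matches the logging table above
    save_strategy="steps",
    save_steps=100,
    learning_rate=3e-5,              # assumed
    per_device_train_batch_size=16,  # assumed
    warmup_steps=100,                # assumed
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],      # from the preprocessing sketch
    eval_dataset=tokenized["validation"],
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
```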
 
+ **Hardware and Resources**
+ Training was conducted on free GPU resources such as Google Colab or equivalent platforms. While this limited the scale of training, the batch size and learning rate were tuned so that training remained efficient within these constraints.
 
+ **Unique Features**
+ The fine-tuning procedure emphasizes efficient learning, leveraging BERT's pre-trained knowledge while adapting it specifically to QA tasks in a resource-constrained environment.
 
+ #### Metrics
 
+ Performance was evaluated using the following metrics:
+ - **Exact Match (EM)**: Measures the percentage of predictions that match the ground-truth answers exactly.
+ - **F1 Score**: Assesses the overlap between the predicted and true answers at a token level, balancing precision and recall.
+ - **Start and End Accuracy**: Tracks the model’s ability to correctly identify the start and end indices of answers within the context.
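
These are the standard SQuAD metrics; scores like those above can be reproduced with the `evaluate` library, as in the minimal sketch below (the toy prediction/reference pair is illustrative only).

```python
import evaluate

squad_metric = evaluate.load("squad")

# SQuAD format: the "id" field ties each prediction to its reference
predictions = [{"id": "1", "prediction_text": "natural language processing"}]
references = [{
    "id": "1",
    "answers": {"text": ["natural language processing"], "answer_start": [38]},
}]

print(squad_metric.compute(predictions=predictions, references=references))
# -> {'exact_match': 100.0, 'f1': 100.0}
```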
 
  ### Results
 
+ The model trained on the SQuAD dataset achieved the following key performance metrics:
 
+ - Exact Match (EM): up to 87.69%
+ - F1 Score: up to 87.69%
+ - Validation Loss: reduced to 0.46
+ - Start Accuracy: peaked at 87.69%
+ - End Accuracy: peaked at 90.19%
 
+ #### Summary
 
+ The model, **bert_squad**, was fine-tuned for context-based question answering using the SQuAD dataset from Hugging Face. Key metrics include an Exact Match (EM) and F1 score of up to **87.69%**, demonstrating strong accuracy. Loss and accuracy improved consistently over 2000 steps, with validation loss reaching as low as **0.46**.
 
+ The training utilized free resources and leveraged BERT’s robust pretraining, although BERT’s origin as a Masked Language Model (MLM) remains a consideration for anything beyond extractive QA. This work highlights the potential for building effective question-answering systems on pre-existing datasets and infrastructure.
 
  ### Model Architecture and Objective
 
+ The model uses BERT, a pre-trained Transformer encoder, fine-tuned for context-based question answering. Given a question and a context passage, it predicts the start and end positions of the answer span within the context.
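
For concreteness, the fine-tuning objective is the mean of two cross-entropy losses, one over start positions and one over end positions. The sketch below mirrors, in simplified form, what `BertForQuestionAnswering` does internally; it is illustrative, not the card's own code.

```python
import torch.nn as nn

class SpanHead(nn.Module):
    """Simplified QA head: a single linear layer maps each token's hidden
    state to a (start, end) logit pair, as in BertForQuestionAnswering."""

    def __init__(self, hidden_size: int = 768):
        super().__init__()
        self.qa_outputs = nn.Linear(hidden_size, 2)

    def forward(self, hidden_states, start_positions=None, end_positions=None):
        logits = self.qa_outputs(hidden_states)           # (batch, seq_len, 2)
        start_logits, end_logits = logits.unbind(dim=-1)  # (batch, seq_len) each
        if start_positions is None:
            return start_logits, end_logits               # inference path
        loss_fn = nn.CrossEntropyLoss()
        # Training objective: average of start- and end-position losses
        return (loss_fn(start_logits, start_positions)
                + loss_fn(end_logits, end_positions)) / 2
```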
 
  ### Compute Infrastructure
 
  #### Hardware
 
+ - GPU: Tesla P100, NVIDIA T4
 
  #### Software
 
+ - Framework: Hugging Face Transformers
+ - Dataset: SQuAD (from Hugging Face)
+ - Other tools: Python, PyTorch
 
  **BibTeX:**
 
+ ```bibtex
+ @misc{bert_squad_finetune,
+   title  = {BERT Fine-tuned for SQuAD},
+   author = {Your Name or Team Name},
+   year   = {2024},
+   url    = {https://huggingface.co/your-model-repository}
+ }
+ ```
 
  ## Glossary [optional]
 
+ - **Exact Match (EM):** A metric measuring the percentage of predictions that match the ground truth exactly.
+ - **F1 Score:** The harmonic mean of precision and recall, used to evaluate the quality of predictions.
+ - **Masked Language Model (MLM):** BERT's pre-training objective, predicting masked words in input sentences.
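
To make the F1 definition concrete, here is a tiny illustrative implementation of token-level F1 (simplified: the official SQuAD script additionally lowercases and strips punctuation and articles before comparing).

```python
from collections import Counter

def token_f1(prediction: str, truth: str) -> float:
    """Token-level F1: harmonic mean of precision and recall over
    the multiset of shared tokens."""
    pred_tokens = prediction.split()
    truth_tokens = truth.split()
    common = Counter(pred_tokens) & Counter(truth_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)

print(token_f1("a transformers model", "the transformers model"))  # ~0.667
```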