---
language:
- en
license: apache-2.0
base_model: google-bert/bert-base-uncased
tags:
- generated_from_trainer
- question-answering
- squad-v2
- bert
datasets:
- squad_v2
model-index:
- name: bert-base-uncased-finetuned-squadv2
  results:
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad_v2
      type: squad_v2
      config: squad_v2
    metrics:
    - name: HasAns_exact
      type: exact_match
      value: 71.25
    - name: HasAns_f1
      type: f1
      value: 78.77
    - name: NoAns_exact
      type: exact_match
      value: 73.42
    - name: NoAns_f1
      type: f1
      value: 73.42
    - name: best_exact
      type: exact_match
      value: 72.34
    - name: best_f1
      type: f1
      value: 76.09
---

# bert-base-uncased-finetuned-squadv2

This model is a fine-tuned version of [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased) on the SQuAD v2 dataset. It has been trained to perform extractive question answering with the ability to detect unanswerable questions.

## Model description

This model is based on the BERT base uncased architecture and has been fine-tuned on SQuAD v2, which extends the original SQuAD dataset with questions that cannot be answered from the provided context. The model learns either to extract the answer span from the context or to indicate that the question is unanswerable (see the pipeline sketch below).

Key features:
- Architecture: BERT base uncased (12 layers, 768 hidden size, 12 attention heads)
- Task: Extractive question answering with no-answer detection
- Language: English
- Training data: SQuAD v2.0
- Input: Question and context pairs
- Output: Answer span, or an indication that the question is unanswerable
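
The no-answer behavior described above can be exercised through the `question-answering` pipeline. A minimal sketch, reusing the placeholder repo id from the usage section below; `handle_impossible_answer=True` lets the pipeline return an empty answer when the no-answer option scores highest:

```python
from transformers import pipeline

# "your-username/..." is a placeholder; substitute the actual model repo id.
qa = pipeline("question-answering", model="your-username/bert-base-uncased-finetuned-squadv2")

context = "The Apollo program was designed to land humans on the Moon and bring them safely back to Earth."

# Answerable question: the pipeline returns the extracted span.
print(qa(question="What was the goal of the Apollo program?", context=context))

# Likely unanswerable question: with handle_impossible_answer=True the
# pipeline may return an empty answer string instead of a spurious span.
print(qa(question="When did the Apollo program end?", context=context, handle_impossible_answer=True))
```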

## Training procedure

### Training hyperparameters

The model was trained with the following hyperparameters (see the sketch after this list):
- Learning rate: 3e-05
- Train batch size: 12
- Eval batch size: 8
- Optimizer: AdamW with betas=(0.9, 0.999) and epsilon=1e-08
- LR scheduler: linear
- Number of epochs: 5
- Seed: 42
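
A minimal sketch reconstructing these values as `TrainingArguments`, assuming the HF Trainer was used (the `generated_from_trainer` tag suggests this, but the exact training script is not given in the card):

```python
from transformers import TrainingArguments

# Reconstruction of the reported hyperparameters; output_dir is illustrative.
training_args = TrainingArguments(
    output_dir="bert-base-uncased-finetuned-squadv2",
    learning_rate=3e-5,               # reported learning rate
    per_device_train_batch_size=12,   # reported train batch size
    per_device_eval_batch_size=8,     # reported eval batch size
    num_train_epochs=5,               # reported number of epochs
    lr_scheduler_type="linear",       # reported LR scheduler
    adam_beta1=0.9,                   # reported AdamW betas
    adam_beta2=0.999,
    adam_epsilon=1e-8,                # reported AdamW epsilon
    seed=42,                          # reported seed
)
```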
|
### Training results

The model achieved the following performance metrics (see the metric sketch after these lists):
- HasAns Exact Match: 71.26%
- HasAns F1: 78.78%
- NoAns Exact Match: 73.42%
- NoAns F1: 73.42%
- Best Exact Match: 72.34%
- Best F1: 76.10%

Additional training statistics:
- Training samples: 131,754
- Evaluation samples: 12,134
- Training time: 31m 58s
- Evaluation time: 42.89s
- Training loss: 0.0711
- Training samples per second: 343.32
- Training steps per second: 28.61
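
These HasAns/NoAns/best figures are the standard SQuAD v2 metrics. A minimal sketch of computing metrics of this kind with the `evaluate` library (an assumption for illustration; the card does not state which tool produced the numbers above, and the ids, texts, and probabilities below are made up):

```python
import evaluate

# The SQuAD v2 metric scores answerable and unanswerable questions
# separately, and reports best_exact / best_f1 over a no-answer
# threshold sweep.
squad_v2_metric = evaluate.load("squad_v2")

# Hypothetical predictions: each carries the predicted text and the
# model's probability that the question has no answer.
predictions = [
    {"id": "q1", "prediction_text": "land humans on the Moon", "no_answer_probability": 0.1},
    {"id": "q2", "prediction_text": "", "no_answer_probability": 0.9},
]
# Hypothetical references: q2 is unanswerable (empty answer lists).
references = [
    {"id": "q1", "answers": {"text": ["land humans on the Moon"], "answer_start": [28]}},
    {"id": "q2", "answers": {"text": [], "answer_start": []}},
]

results = squad_v2_metric.compute(predictions=predictions, references=references)
# When both answerable and unanswerable examples are present, the result
# includes HasAns_*, NoAns_*, best_exact, and best_f1 entries.
print(results)
```
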
### Framework versions

- Transformers: 4.47.0.dev0
- PyTorch: 2.5.1+cu124
- Datasets: 3.1.0
- Tokenizers: 0.20.3

## Intended uses & limitations

This model is intended for:
- Extractive question answering on English text
- Detecting unanswerable questions
- General-domain questions and contexts
- Research and educational purposes

Limitations:
- Performance may vary on domain-specific content
- May struggle with complex reasoning questions
- Limited to extractive QA (cannot generate free-form answers)
- Works only with English-language content

## How to use

```python
import torch
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

# Load model & tokenizer ("your-username" is a placeholder for the actual repo id)
model_name = "your-username/bert-base-uncased-finetuned-squadv2"
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Example question and context
context = "The Apollo program was designed to land humans on the Moon and bring them safely back to Earth."
question = "What was the goal of the Apollo program?"

# Tokenize the question-context pair
inputs = tokenizer(
    question,
    context,
    add_special_tokens=True,
    return_tensors="pt"
)

# Get model predictions
with torch.no_grad():
    outputs = model(**inputs)

# Decode the most likely answer span; if both indices point at position 0
# (the [CLS] token), the model is predicting that the question is unanswerable.
start_idx = torch.argmax(outputs.start_logits)
end_idx = torch.argmax(outputs.end_logits)
answer_tokens = inputs["input_ids"][0][start_idx : end_idx + 1]
answer = tokenizer.decode(answer_tokens, skip_special_tokens=True)
print(answer)
```
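
For quick use without manual span decoding, the `question-answering` pipeline shown in the model description section handles tokenization, answer extraction, and no-answer handling in a single call.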
|