Netta1994 committed
Commit a932207
1 Parent(s): b3e96cd

Add SetFit model
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
  "word_embedding_dimension": 768,
  "pooling_mode_cls_token": true,
  "pooling_mode_mean_tokens": false,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false,
  "include_prompt": true
}
README.md ADDED
@@ -0,0 +1,305 @@
---
base_model: BAAI/bge-base-en-v1.5
library_name: setfit
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget:
- text: 'Reasoning for Good:

    1. **Context Grounding**: The answer is supported by the document, which clearly
    indicates that Forbes began reporting on Beyoncé''s earnings in 2008.

    2. **Relevance**: The answer specifically addresses the question asked about who
    began reporting Beyoncé''s annual earnings starting in 2008.

    3. **Conciseness**: The answer is brief and directly to the point, without including
    extraneous information.


    Reasoning for Bad:

    1. **Context Grounding**: While the statement is accurate, it introduces the aspect
    of a "widespread misconception" about Times Magazine, which is not mentioned in
    the provided document.

    2. **Relevance**: The mention of Times Magazine might be seen as deviating slightly
    from the question, which just asked about the first entity to begin reporting
    Beyoncé''s earnings.

    3. **Conciseness**: The answer could have been more concise by focusing solely
    on Forbes without mentioning the misconception about Times Magazine.


    Final result: Bad'
- text: 'The answer provided is:


    "The average student at Notre Dame travels more than 750 miles to study there."


    Reasoning:


    **Good points:**

    1. **Context Grounding**: The answer is supported by information present in the
    document, which states, "the average student traveled more than 750 miles to Notre
    Dame".

    2. **Relevance**: The answer directly addresses the specific question asking about
    the number of miles the average student travels to study at Notre Dame.

    3. **Conciseness**: The answer is clear and to the point without any unnecessary
    information.


    **Bad points:**

    - There are no bad points in this case as the answer aligns perfectly with all
    the evaluation criteria.


    Final Result: **Good**'
- text: 'Reasoning why the answer may be good:

    - The answer correctly identifies Mick LaSalle as the writer for the San Francisco
    Chronicle.

    - The answer states that Mick LaSalle awarded "Spectre" a perfect score, which
    is supported by the document.


    Reasoning why the answer may be bad:

    - The answer is concise and to the point, fulfilling the criteria for conciseness
    and relevance.

    - The document provided confirms that Mick LaSalle gave "Spectre" a perfect score
    of 100.

    - There is no deviation into unrelated topics, maintaining focus on the question
    asked.


    Final result: Good'
- text: "Reasoning: \n\nWhy the answer may be good:\n- The answer directly addresses\
    \ the specific question asked, \"What New York borough contains the highest population\
    \ of Asian-Americans?\" \n- It is well-supported by the given document, which\
    \ states, \"The New York City borough of Queens is home to the state's largest\
    \ Asian American population.\"\n- The answer is clear and concise without unnecessary\
    \ information.\n\nWhy the answer may be bad:\n- There are no significant reasons\
    \ to consider the answer bad based on the criteria provided. \n\nFinal Result:\
    \ \n\nGood"
- text: "The answer may be good:\n- The information provided in the answer is supported\
    \ by the document. \n\nThe answer may be bad:\n- The answer does not address the\
    \ specific question asked which pertains to the year that Doctorate degrees were\
    \ first granted at Notre Dame.\n- It deviates into unrelated information about\
    \ the opening of a theology library, which is irrelevant to the question.\n\n\
    Final result: Bad"
inference: true
model-index:
- name: SetFit with BAAI/bge-base-en-v1.5
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Unknown
      type: unknown
      split: test
    metrics:
    - type: accuracy
      value: 0.8360655737704918
      name: Accuracy
---

# SetFit with BAAI/bge-base-en-v1.5

This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
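As a rough illustration of these two steps (a minimal sketch using the SetFit 1.x trainer API, not this model's actual training script; the tiny `train_ds` dataset and its labels are invented for the example):

```python
from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Hypothetical few-shot dataset: "text"/"label" columns with a handful of examples per class.
train_ds = Dataset.from_dict({
    "text": ["...Final result: Good", "...Final result: Bad"],
    "label": [1, 0],
})

# Start from the same embedding body this model uses.
model = SetFitModel.from_pretrained("BAAI/bge-base-en-v1.5")

args = TrainingArguments(batch_size=16, num_epochs=5)

trainer = Trainer(model=model, args=args, train_dataset=train_ds)
trainer.train()  # contrastive fine-tuning (step 1), then classification-head fitting (step 2)
```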

## Model Details

### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 512 tokens
- **Number of Classes:** 2 classes
<!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

### Model Labels
| Label | Examples |
|:------|:---------|
| 1 | <ul><li>'Reasoning why the answer may be good:\n1. **Context Grounding**: The answer is directly supported by the document, which explicitly states that "With almost every line of his epic Punica Silius references Virgil."\n2. **Relevance**: The answer specifically addresses the question asked by identifying the title of Silius Italicus\' epic where Virgil is frequently referenced.\n3. **Conciseness**: The answer is short, clear, and to the point, providing just the necessary information without any extraneous details.\n\nReasoning why the answer may be bad:\n- There is no evidence of deviation or lack of support from the provided document, the relevance is clearly maintained, and the answer concisely addresses the question.\n\nFinal Result: Good'</li><li>'Good'</li><li>'Reasoning:\n\nWhy the answer may be good:\n1. Context Grounding: The answer mentions "3,000 police," which correlates with the figure provided in the document regarding the number of French police that protected the Olympic torch relay.\n2. Relevance: The answer directly addresses the question, which asks about the number of police protecting the torch in France.\n3. Conciseness: The answer is brief and to the point without adding any unnecessary information.\n\nWhy the answer may be bad:\nThere is no evident issue with context grounding, relevance, or conciseness in the answer provided.\n\nFinal result: Good'</li></ul> |
| 0 | <ul><li>"**Reasoning Why the Answer May Be Good:**\n- The answer correctly identifies a person associated with vice-presidential and presidential roles at Notre Dame, although it attributes the wrong timeframe for the vice-presidency.\n\n**Reasoning Why the Answer May Be Bad:**\n- The document specifically mentions that John Francis O'Hara became vice-president in 1933, not James Edward O'Hara, indicating the answer is not well-supported by the provided document.\n- The answer provides incorrect and irrelevant information that does not address the specific question asked.\n- The question asked for the vice-president elected in 1933, and the answer incorrectly identifies the year 1934.\n\n**Final Result:**\nBad"</li><li>"Reasoning:\n1. **Context Grounding**: The document does provide the necessary information about the gross earnings of Beyoncé's second world tour. Therefore, the answer is well-supported by the document.\n2. **Relevance**: The answer directly responds to the specific question asked about the gross earnings of Beyoncé during her second world tour in 2009.\n3. **Conciseness**: The answer is concise and sticks to the point, providing the exact figure and relevant context about the record without additional unnecessary information.\n\nFinal Result: Good"</li><li>"Reasoning:\n\nWhy the answer may be good:\n- The answer specifies a borough of New York, which is relevant to the question.\n- It provides a specific claim about the population distribution of Asian-Americans within New York City boroughs.\n\nWhy the answer may be bad:\n- The provided document explicitly states that Queens is home to the state's largest Asian-American population, not Manhattan.\n- The answer does not align with the key information from the document, thus failing the test of context grounding.\n\nFinal Result: Bad"</li></ul> |

## Evaluation

### Metrics
| Label   | Accuracy |
|:--------|:---------|
| **all** | 0.8361   |

## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference:

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_squad_gpt-4o_improved-cot-instructions_two_reasoning_only_reasoning_17267")
# Run inference (triple quotes so the multi-line input is a valid Python string)
preds = model("""The answer may be good:
- The information provided in the answer is supported by the document.

The answer may be bad:
- The answer does not address the specific question asked which pertains to the year that Doctorate degrees were first granted at Notre Dame.
- It deviates into unrelated information about the opening of a theology library, which is irrelevant to the question.

Final result: Bad""")
```
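The same loaded model can also score several texts at once; a short hedged sketch using the SetFit 1.x prediction methods, assuming `model` is the `SetFitModel` loaded above:

```python
# Batch prediction: one label per input text.
texts = [
    "Final result: Good",
    "Final result: Bad",
]
preds = model.predict(texts)         # predicted labels, one per text
probas = model.predict_proba(texts)  # per-class probabilities from the LogisticRegression head
print(preds, probas)
```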

<!--
### Downstream Use

*List how someone could finetune this model on their own dataset.*
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Set Metrics
| Training set | Min | Median  | Max |
|:-------------|:----|:--------|:----|
| Word count   | 1   | 91.8596 | 275 |

| Label | Training Sample Count |
|:------|:----------------------|
| 0     | 27                    |
| 1     | 30                    |

### Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (5, 5)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 20
- body_learning_rate: (2e-05, 2e-05)
- head_learning_rate: 2e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
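These values correspond to fields on SetFit's `TrainingArguments`; a hedged sketch of how the same configuration could be expressed in code (tuples set the embedding-body and classifier phases separately; `loss` and `distance_metric` are left at their defaults, which match the values listed):

```python
from setfit import TrainingArguments

# Assumed mapping of the hyperparameters above onto setfit.TrainingArguments.
args = TrainingArguments(
    batch_size=(16, 16),
    num_epochs=(5, 5),
    max_steps=-1,
    sampling_strategy="oversampling",
    num_iterations=20,
    body_learning_rate=(2e-05, 2e-05),
    head_learning_rate=2e-05,
    margin=0.25,
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=False,
)
```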

### Training Results
| Epoch  | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.0070 | 1    | 0.1646        | -               |
| 0.3497 | 50   | 0.2544        | -               |
| 0.6993 | 100  | 0.1157        | -               |
| 1.0490 | 150  | 0.0294        | -               |
| 1.3986 | 200  | 0.0037        | -               |
| 1.7483 | 250  | 0.0025        | -               |
| 2.0979 | 300  | 0.0023        | -               |
| 2.4476 | 350  | 0.0020        | -               |
| 2.7972 | 400  | 0.0018        | -               |
| 3.1469 | 450  | 0.0017        | -               |
| 3.4965 | 500  | 0.0016        | -               |
| 3.8462 | 550  | 0.0017        | -               |
| 4.1958 | 600  | 0.0016        | -               |
| 4.5455 | 650  | 0.0015        | -               |
| 4.8951 | 700  | 0.0016        | -               |

### Framework Versions
- Python: 3.10.14
- SetFit: 1.1.0
- Sentence Transformers: 3.1.0
- Transformers: 4.44.0
- PyTorch: 2.4.1+cu121
- Datasets: 2.19.2
- Tokenizers: 0.19.1

## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
  doi = {10.48550/ARXIV.2209.11055},
  url = {https://arxiv.org/abs/2209.11055},
  author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
  keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
  title = {Efficient Few-Shot Learning Without Prompts},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,32 @@
{
  "_name_or_path": "BAAI/bge-base-en-v1.5",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
    "LABEL_0": 0
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.44.0",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
{
  "__version__": {
    "sentence_transformers": "3.1.0",
    "transformers": "4.44.0",
    "pytorch": "2.4.1+cu121"
  },
  "prompts": {},
  "default_prompt_name": null,
  "similarity_fn_name": null
}
config_setfit.json ADDED
@@ -0,0 +1,4 @@
{
  "labels": null,
  "normalize_embeddings": false
}
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5cfc7851d7528eb096f1de3d22673a89f066049c9384a5575f0bd84ae3ecd875
size 437951328
model_head.pkl ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:196a38b2ca537665bab690eba83534ef3cc57d37598d373c339df8e953ae133b
size 7007
modules.json ADDED
@@ -0,0 +1,20 @@
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  },
  {
    "idx": 2,
    "name": "2",
    "path": "2_Normalize",
    "type": "sentence_transformers.models.Normalize"
  }
]
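This modules.json wires the embedding body as a three-stage Sentence Transformers pipeline: Transformer, then CLS pooling (per 1_Pooling/config.json above), then L2 normalization. As a hedged illustration, the same repository can be loaded as a plain Sentence Transformer, which reconstructs this pipeline:

```python
from sentence_transformers import SentenceTransformer

# Loading the repo directly rebuilds the Transformer -> Pooling -> Normalize stack from modules.json.
encoder = SentenceTransformer("Netta1994/setfit_baai_squad_gpt-4o_improved-cot-instructions_two_reasoning_only_reasoning_17267")
embeddings = encoder.encode(["Final result: Good"])
print(embeddings.shape)  # expected (1, 768), matching word_embedding_dimension
```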
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
{
  "max_seq_length": 512,
  "do_lower_case": true
}
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
{
  "cls_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "[MASK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "[PAD]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "never_split": null,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "unk_token": "[UNK]"
}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff