nickprock committed
Commit b6ae1c4 · verified · 1 Parent(s): e80e6c9

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
    "word_embedding_dimension": 768,
    "pooling_mode_cls_token": false,
    "pooling_mode_mean_tokens": true,
    "pooling_mode_max_tokens": false,
    "pooling_mode_mean_sqrt_len_tokens": false,
    "pooling_mode_weightedmean_tokens": false,
    "pooling_mode_lasttoken": false,
    "include_prompt": true
}
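The configuration above enables only mean pooling (`pooling_mode_mean_tokens`). As an illustration (not part of the committed files), mean pooling averages the token embeddings of each sentence while excluding padding via the attention mask; a minimal NumPy sketch with toy values:

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    # token_embeddings: (seq_len, dim); attention_mask: (seq_len,), 1 for real tokens
    mask = attention_mask[:, None].astype(float)          # (seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=0)        # sum over non-padding tokens
    count = np.clip(mask.sum(), a_min=1e-9, a_max=None)   # avoid division by zero
    return summed / count

# Toy example: 3 tokens (the last is padding), embedding dim 2
emb = np.array([[1.0, 2.0], [3.0, 4.0], [99.0, 99.0]])
pooled = mean_pool(emb, np.array([1, 1, 0]))
# pooled == [2.0, 3.0]: the padding token is excluded from the average
```

In the actual model this runs over ModernBERT's 768-dimensional token embeddings, producing one 768-dimensional vector per sentence.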
README.md ADDED
@@ -0,0 +1,814 @@
---
language:
- en
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:65749
- loss:MultipleNegativesRankingLoss
- loss:SoftmaxLoss
- loss:CoSENTLoss
base_model: answerdotai/ModernBERT-base
widget:
- source_sentence: Can a US President destroy a city with actions?
  sentences:
  - What are best kids educational games?
  - Can a US president destroy a city through actions?
  - Why do people ask questions on Quora that are just as, if not more than easier
    to, look up with a search engine?
- source_sentence: How would you handle stress people?
  sentences:
  - How do I handle stress with a parent?
  - Why do some people on QUORA ask questions that they can easily findout on Google?
  - How do I make a quick right decision?
- source_sentence: Two women playing field hockey on AstroTurf.
  sentences:
  - Women playing a game of field hockey.
  - The children are outside.
  - Women re-sod a field hockey field.
- source_sentence: A dog reaches to catch a ball with its mouth.
  sentences:
  - The dog is playing with a rope.
  - The dog is playing with a ball.
  - Someone holding their baby is smiling while sitting down.
- source_sentence: There is a very full description of the various types of hormone
    rooting compound here.
  sentences:
  - The least that can be said is that we must be born with the ability and 'knowledge'
    to learn.
  - It is meant to stimulate root growth - in particular to stimulate the creation
    of roots.
  - A person folds a piece of paper.
datasets:
- sentence-transformers/all-nli
- sentence-transformers/stsb
- sentence-transformers/quora-duplicates
- sentence-transformers/natural-questions
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

# SentenceTransformer based on answerdotai/ModernBERT-base

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on the [all-nli-pair](https://huggingface.co/datasets/sentence-transformers/all-nli), [all-nli-pair-class](https://huggingface.co/datasets/sentence-transformers/all-nli), [all-nli-pair-score](https://huggingface.co/datasets/sentence-transformers/all-nli), [all-nli-triplet](https://huggingface.co/datasets/sentence-transformers/all-nli), [stsb](https://huggingface.co/datasets/sentence-transformers/stsb), [quora](https://huggingface.co/datasets/sentence-transformers/quora-duplicates) and [natural-questions](https://huggingface.co/datasets/sentence-transformers/natural-questions) datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) <!-- at revision 8949b909ec900327062f0ebf497f51aef5e6f0c8 -->
- **Maximum Sequence Length:** 8192 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Datasets:**
  - [all-nli-pair](https://huggingface.co/datasets/sentence-transformers/all-nli)
  - [all-nli-pair-class](https://huggingface.co/datasets/sentence-transformers/all-nli)
  - [all-nli-pair-score](https://huggingface.co/datasets/sentence-transformers/all-nli)
  - [all-nli-triplet](https://huggingface.co/datasets/sentence-transformers/all-nli)
  - [stsb](https://huggingface.co/datasets/sentence-transformers/stsb)
  - [quora](https://huggingface.co/datasets/sentence-transformers/quora-duplicates)
  - [natural-questions](https://huggingface.co/datasets/sentence-transformers/natural-questions)
- **Language:** en
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("nickprock/modernbert-base-all-nli-stsb-quora-nq")
# Run inference
sentences = [
    'There is a very full description of the various types of hormone rooting compound here.',
    'It is meant to stimulate root growth - in particular to stimulate the creation of roots.',
    "The least that can be said is that we must be born with the ability and 'knowledge' to learn.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
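`model.similarity` defaults to cosine similarity between the embedding rows. As an illustration of what that call computes (toy vectors standing in for real embeddings, independent of the model itself), the same score matrix can be reproduced with plain NumPy:

```python
import numpy as np

def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    # Normalize each row to unit length, then take all pairwise dot products
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return normed @ normed.T

# Toy "embeddings" standing in for model.encode(...) output (dim 2 instead of 768)
emb = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
sims = cosine_similarity_matrix(emb)
# sims[0, 1] == 1.0 (same direction), sims[0, 2] == 0.0 (orthogonal)
```

Ranking the off-diagonal entries of this matrix is the basis for semantic search and paraphrase mining with this model.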

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Datasets
<details><summary>all-nli-pair</summary>

#### all-nli-pair

* Dataset: [all-nli-pair](https://huggingface.co/datasets/sentence-transformers/all-nli) at [d482672](https://huggingface.co/datasets/sentence-transformers/all-nli/tree/d482672c8e74ce18da116f430137434ba2e52fab)
* Size: 10,000 training samples
* Columns: <code>anchor</code> and <code>positive</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor | positive |
  |:--------|:-------|:---------|
  | type    | string | string   |
  | details | <ul><li>min: 5 tokens</li><li>mean: 17.29 tokens</li><li>max: 64 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 9.7 tokens</li><li>max: 31 tokens</li></ul> |
* Samples:
  | anchor | positive |
  |:-------|:---------|
  | <code>A person on a horse jumps over a broken down airplane.</code> | <code>A person is outdoors, on a horse.</code> |
  | <code>Children smiling and waving at camera</code> | <code>There are children present</code> |
  | <code>A boy is jumping on skateboard in the middle of a red bridge.</code> | <code>The boy does a skateboarding trick.</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```
</details>
<details><summary>all-nli-pair-class</summary>

#### all-nli-pair-class

* Dataset: [all-nli-pair-class](https://huggingface.co/datasets/sentence-transformers/all-nli) at [d482672](https://huggingface.co/datasets/sentence-transformers/all-nli/tree/d482672c8e74ce18da116f430137434ba2e52fab)
* Size: 10,000 training samples
* Columns: <code>premise</code>, <code>hypothesis</code>, and <code>label</code>
* Approximate statistics based on the first 1000 samples:
  |         | premise | hypothesis | label |
  |:--------|:--------|:-----------|:------|
  | type    | string  | string     | int   |
  | details | <ul><li>min: 6 tokens</li><li>mean: 17.6 tokens</li><li>max: 51 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 10.8 tokens</li><li>max: 33 tokens</li></ul> | <ul><li>0: ~33.40%</li><li>1: ~33.30%</li><li>2: ~33.30%</li></ul> |
* Samples:
  | premise | hypothesis | label |
  |:--------|:-----------|:------|
  | <code>A person on a horse jumps over a broken down airplane.</code> | <code>A person is training his horse for a competition.</code> | <code>1</code> |
  | <code>A person on a horse jumps over a broken down airplane.</code> | <code>A person is at a diner, ordering an omelette.</code> | <code>2</code> |
  | <code>A person on a horse jumps over a broken down airplane.</code> | <code>A person is outdoors, on a horse.</code> | <code>0</code> |
* Loss: [<code>SoftmaxLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#softmaxloss)
</details>
<details><summary>all-nli-pair-score</summary>

#### all-nli-pair-score

* Dataset: [all-nli-pair-score](https://huggingface.co/datasets/sentence-transformers/all-nli) at [d482672](https://huggingface.co/datasets/sentence-transformers/all-nli/tree/d482672c8e74ce18da116f430137434ba2e52fab)
* Size: 10,000 training samples
* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
* Approximate statistics based on the first 1000 samples:
  |         | sentence1 | sentence2 | score |
  |:--------|:----------|:----------|:------|
  | type    | string    | string    | float |
  | details | <ul><li>min: 6 tokens</li><li>mean: 17.6 tokens</li><li>max: 51 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 10.8 tokens</li><li>max: 33 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.5</li><li>max: 1.0</li></ul> |
* Samples:
  | sentence1 | sentence2 | score |
  |:----------|:----------|:------|
  | <code>A person on a horse jumps over a broken down airplane.</code> | <code>A person is training his horse for a competition.</code> | <code>0.5</code> |
  | <code>A person on a horse jumps over a broken down airplane.</code> | <code>A person is at a diner, ordering an omelette.</code> | <code>0.0</code> |
  | <code>A person on a horse jumps over a broken down airplane.</code> | <code>A person is outdoors, on a horse.</code> | <code>1.0</code> |
* Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "pairwise_cos_sim"
  }
  ```
</details>
<details><summary>all-nli-triplet</summary>

#### all-nli-triplet

* Dataset: [all-nli-triplet](https://huggingface.co/datasets/sentence-transformers/all-nli) at [d482672](https://huggingface.co/datasets/sentence-transformers/all-nli/tree/d482672c8e74ce18da116f430137434ba2e52fab)
* Size: 10,000 training samples
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor | positive | negative |
  |:--------|:-------|:---------|:---------|
  | type    | string | string   | string   |
  | details | <ul><li>min: 7 tokens</li><li>mean: 10.46 tokens</li><li>max: 46 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 12.91 tokens</li><li>max: 40 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 13.49 tokens</li><li>max: 51 tokens</li></ul> |
* Samples:
  | anchor | positive | negative |
  |:-------|:---------|:---------|
  | <code>A person on a horse jumps over a broken down airplane.</code> | <code>A person is outdoors, on a horse.</code> | <code>A person is at a diner, ordering an omelette.</code> |
  | <code>Children smiling and waving at camera</code> | <code>There are children present</code> | <code>The kids are frowning</code> |
  | <code>A boy is jumping on skateboard in the middle of a red bridge.</code> | <code>The boy does a skateboarding trick.</code> | <code>The boy skates down the sidewalk.</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```
</details>
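MultipleNegativesRankingLoss, used for the pair and triplet datasets above, treats every other positive in a batch as an in-batch negative: each anchor should score its own positive highest, via cross-entropy over scaled cosine similarities. A minimal NumPy sketch of that objective (an illustration of the idea, not the library's implementation):

```python
import numpy as np

def mnr_loss(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    # Cosine similarity between every anchor and every positive in the batch
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)                                        # (batch, batch)
    # Cross-entropy with the diagonal (each anchor's own positive) as the target
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return float(-np.diag(log_probs).mean())

# Anchors aligned with their own positives -> near-zero loss
anchors = np.array([[1.0, 0.0], [0.0, 1.0]])
positives = np.array([[1.0, 0.0], [0.0, 1.0]])
good = mnr_loss(anchors, positives)
# Positives swapped between anchors -> large loss
bad = mnr_loss(anchors, positives[::-1])
```

The `scale` of 20.0 in the parameter blocks above corresponds to the temperature applied to the cosine scores before the softmax.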
<details><summary>stsb</summary>

#### stsb

* Dataset: [stsb](https://huggingface.co/datasets/sentence-transformers/stsb) at [ab7a5ac](https://huggingface.co/datasets/sentence-transformers/stsb/tree/ab7a5ac0e35aa22088bdcf23e7fd99b220e53308)
* Size: 5,749 training samples
* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
* Approximate statistics based on the first 1000 samples:
  |         | sentence1 | sentence2 | score |
  |:--------|:----------|:----------|:------|
  | type    | string    | string    | float |
  | details | <ul><li>min: 6 tokens</li><li>mean: 10.16 tokens</li><li>max: 28 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 10.12 tokens</li><li>max: 25 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.45</li><li>max: 1.0</li></ul> |
* Samples:
  | sentence1 | sentence2 | score |
  |:----------|:----------|:------|
  | <code>A plane is taking off.</code> | <code>An air plane is taking off.</code> | <code>1.0</code> |
  | <code>A man is playing a large flute.</code> | <code>A man is playing a flute.</code> | <code>0.76</code> |
  | <code>A man is spreading shreded cheese on a pizza.</code> | <code>A man is spreading shredded cheese on an uncooked pizza.</code> | <code>0.76</code> |
* Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "pairwise_cos_sim"
  }
  ```
</details>
<details><summary>quora</summary>

#### quora

* Dataset: [quora](https://huggingface.co/datasets/sentence-transformers/quora-duplicates) at [451a485](https://huggingface.co/datasets/sentence-transformers/quora-duplicates/tree/451a4850bd141edb44ade1b5828c259abd762cdb)
* Size: 10,000 training samples
* Columns: <code>anchor</code> and <code>positive</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor | positive |
  |:--------|:-------|:---------|
  | type    | string | string   |
  | details | <ul><li>min: 6 tokens</li><li>mean: 13.91 tokens</li><li>max: 45 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 14.09 tokens</li><li>max: 44 tokens</li></ul> |
* Samples:
  | anchor | positive |
  |:-------|:---------|
  | <code>Astrology: I am a Capricorn Sun Cap moon and cap rising...what does that say about me?</code> | <code>I'm a triple Capricorn (Sun, Moon and ascendant in Capricorn) What does this say about me?</code> |
  | <code>How can I be a good geologist?</code> | <code>What should I do to be a great geologist?</code> |
  | <code>How do I read and find my YouTube comments?</code> | <code>How can I see all my Youtube comments?</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```
</details>
<details><summary>natural-questions</summary>

#### natural-questions

* Dataset: [natural-questions](https://huggingface.co/datasets/sentence-transformers/natural-questions) at [f9e894e](https://huggingface.co/datasets/sentence-transformers/natural-questions/tree/f9e894e1081e206e577b4eaa9ee6de2b06ae6f17)
* Size: 10,000 training samples
* Columns: <code>query</code> and <code>answer</code>
* Approximate statistics based on the first 1000 samples:
  |         | query  | answer |
  |:--------|:-------|:-------|
  | type    | string | string |
  | details | <ul><li>min: 10 tokens</li><li>mean: 12.47 tokens</li><li>max: 23 tokens</li></ul> | <ul><li>min: 17 tokens</li><li>mean: 138.32 tokens</li><li>max: 556 tokens</li></ul> |
* Samples:
  | query | answer |
  |:------|:-------|
  | <code>when did richmond last play in a preliminary final</code> | <code>Richmond Football Club Richmond began 2017 with 5 straight wins, a feat it had not achieved since 1995. A series of close losses hampered the Tigers throughout the middle of the season, including a 5-point loss to the Western Bulldogs, 2-point loss to Fremantle, and a 3-point loss to the Giants. Richmond ended the season strongly with convincing victories over Fremantle and St Kilda in the final two rounds, elevating the club to 3rd on the ladder. Richmond's first final of the season against the Cats at the MCG attracted a record qualifying final crowd of 95,028; the Tigers won by 51 points. Having advanced to the first preliminary finals for the first time since 2001, Richmond defeated Greater Western Sydney by 36 points in front of a crowd of 94,258 to progress to the Grand Final against Adelaide, their first Grand Final appearance since 1982. The attendance was 100,021, the largest crowd to a grand final since 1986. The Crows led at quarter time and led by as many as 13, but the Tig...</code> |
  | <code>who sang what in the world's come over you</code> | <code>Jack Scott (singer) At the beginning of 1960, Scott again changed record labels, this time to Top Rank Records.[1] He then recorded four Billboard Hot 100 hits – "What in the World's Come Over You" (#5), "Burning Bridges" (#3) b/w "Oh Little One" (#34), and "It Only Happened Yesterday" (#38).[1] "What in the World's Come Over You" was Scott's second gold disc winner.[6] Scott continued to record and perform during the 1960s and 1970s.[1] His song "You're Just Gettin' Better" reached the country charts in 1974.[1] In May 1977, Scott recorded a Peel session for BBC Radio 1 disc jockey, John Peel.</code> |
  | <code>who produces the most wool in the world</code> | <code>Wool Global wool production is about 2 million tonnes per year, of which 60% goes into apparel. Wool comprises ca 3% of the global textile market, but its value is higher owing to dying and other modifications of the material.[1] Australia is a leading producer of wool which is mostly from Merino sheep but has been eclipsed by China in terms of total weight.[30] New Zealand (2016) is the third-largest producer of wool, and the largest producer of crossbred wool. Breeds such as Lincoln, Romney, Drysdale, and Elliotdale produce coarser fibers, and wool from these sheep is usually used for making carpets.</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```
</details>

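CoSENTLoss, used for the score-labeled stsb and all-nli-pair-score data above, penalizes every pair of training examples whose predicted cosine similarities are ordered differently from their gold scores. A rough NumPy sketch of the objective (illustrative only, not the library's implementation):

```python
import numpy as np

def cosent_loss(cos_sims: np.ndarray, gold_scores: np.ndarray, scale: float = 20.0) -> float:
    # Collect scale * (cos_j - cos_i) for every pair where gold says i should rank above j;
    # such a term is large (and thus penalized) when the model ranks j above i instead.
    terms = []
    for i in range(len(gold_scores)):
        for j in range(len(gold_scores)):
            if gold_scores[i] > gold_scores[j]:
                terms.append(scale * (cos_sims[j] - cos_sims[i]))
    return float(np.log1p(np.exp(terms).sum())) if terms else 0.0

# Predicted similarities ordered like the gold scores -> small loss
good = cosent_loss(np.array([0.9, 0.5, 0.1]), np.array([1.0, 0.5, 0.0]))
# Reversed ordering -> large loss
bad = cosent_loss(np.array([0.1, 0.5, 0.9]), np.array([1.0, 0.5, 0.0]))
```

Because only the relative ordering of pairs matters, the loss works with any consistent gold-score range, such as the 0.0–1.0 scores used here.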
### Evaluation Datasets
<details><summary>all-nli-triplet</summary>

#### all-nli-triplet

* Dataset: [all-nli-triplet](https://huggingface.co/datasets/sentence-transformers/all-nli) at [d482672](https://huggingface.co/datasets/sentence-transformers/all-nli/tree/d482672c8e74ce18da116f430137434ba2e52fab)
* Size: 6,584 evaluation samples
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor | positive | negative |
  |:--------|:-------|:---------|:---------|
  | type    | string | string   | string   |
  | details | <ul><li>min: 6 tokens</li><li>mean: 18.25 tokens</li><li>max: 69 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 9.88 tokens</li><li>max: 30 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 10.48 tokens</li><li>max: 29 tokens</li></ul> |
* Samples:
  | anchor | positive | negative |
  |:-------|:---------|:---------|
  | <code>Two women are embracing while holding to go packages.</code> | <code>Two woman are holding packages.</code> | <code>The men are fighting outside a deli.</code> |
  | <code>Two young children in blue jerseys, one with the number 9 and one with the number 2 are standing on wooden steps in a bathroom and washing their hands in a sink.</code> | <code>Two kids in numbered jerseys wash their hands.</code> | <code>Two kids in jackets walk to school.</code> |
  | <code>A man selling donuts to a customer during a world exhibition event held in the city of Angeles</code> | <code>A man selling donuts to a customer.</code> | <code>A woman drinks her coffee in a small cafe.</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```
</details>
<details><summary>stsb</summary>

#### stsb

* Dataset: [stsb](https://huggingface.co/datasets/sentence-transformers/stsb) at [ab7a5ac](https://huggingface.co/datasets/sentence-transformers/stsb/tree/ab7a5ac0e35aa22088bdcf23e7fd99b220e53308)
* Size: 1,500 evaluation samples
* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
* Approximate statistics based on the first 1000 samples:
  |         | sentence1 | sentence2 | score |
  |:--------|:----------|:----------|:------|
  | type    | string    | string    | float |
  | details | <ul><li>min: 5 tokens</li><li>mean: 15.11 tokens</li><li>max: 44 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 15.1 tokens</li><li>max: 50 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.42</li><li>max: 1.0</li></ul> |
* Samples:
  | sentence1 | sentence2 | score |
  |:----------|:----------|:------|
  | <code>A man with a hard hat is dancing.</code> | <code>A man wearing a hard hat is dancing.</code> | <code>1.0</code> |
  | <code>A young child is riding a horse.</code> | <code>A child is riding a horse.</code> | <code>0.95</code> |
  | <code>A man is feeding a mouse to a snake.</code> | <code>The man is feeding a mouse to the snake.</code> | <code>1.0</code> |
* Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "pairwise_cos_sim"
  }
  ```
</details>
<details><summary>quora</summary>

#### quora

* Dataset: [quora](https://huggingface.co/datasets/sentence-transformers/quora-duplicates) at [451a485](https://huggingface.co/datasets/sentence-transformers/quora-duplicates/tree/451a4850bd141edb44ade1b5828c259abd762cdb)
* Size: 1,000 evaluation samples
* Columns: <code>anchor</code> and <code>positive</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor | positive |
  |:--------|:-------|:---------|
  | type    | string | string   |
  | details | <ul><li>min: 6 tokens</li><li>mean: 14.01 tokens</li><li>max: 63 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 14.04 tokens</li><li>max: 46 tokens</li></ul> |
* Samples:
  | anchor | positive |
  |:-------|:---------|
  | <code>What is your New Year resolution?</code> | <code>What can be my new year resolution for 2017?</code> |
  | <code>Should I buy the IPhone 6s or Samsung Galaxy s7?</code> | <code>Which is better: the iPhone 6S Plus or the Samsung Galaxy S7 Edge?</code> |
  | <code>What are the differences between transgression and regression?</code> | <code>What is the difference between transgression and regression?</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```
</details>
418
+ <details><summary>natural-questions</summary>
419
+
420
+ #### natural-questions
421
+
422
+ * Dataset: [natural-questions](https://huggingface.co/datasets/sentence-transformers/natural-questions) at [f9e894e](https://huggingface.co/datasets/sentence-transformers/natural-questions/tree/f9e894e1081e206e577b4eaa9ee6de2b06ae6f17)
423
+ * Size: 1,000 evaluation samples
424
+ * Columns: <code>query</code> and <code>answer</code>
425
+ * Approximate statistics based on the first 1000 samples:
426
+ | | query | answer |
427
+ |:--------|:----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
428
+ | type | string | string |
429
+ | details | <ul><li>min: 9 tokens</li><li>mean: 12.51 tokens</li><li>max: 26 tokens</li></ul> | <ul><li>min: 19 tokens</li><li>mean: 140.84 tokens</li><li>max: 585 tokens</li></ul> |
430
+ * Samples:
431
+ | query | answer |
432
+ |:--------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
433
+ | <code>where does the waikato river begin and end</code> | <code>Waikato River The Waikato River is the longest river in New Zealand, running for 425 kilometres (264 mi) through the North Island. It rises in the eastern slopes of Mount Ruapehu, joining the Tongariro River system and flowing through Lake Taupo, New Zealand's largest lake. It then drains Taupo at the lake's northeastern edge, creates the Huka Falls, and flows northwest through the Waikato Plains. It empties into the Tasman Sea south of Auckland, at Port Waikato. It gives its name to the Waikato Region that surrounds the Waikato Plains. The present course of the river was largely formed about 17,000 years ago. Contributing factors were climate warming, forest being reestablished in the river headwaters and the deepening, rather than widening, of the existing river channel. The channel was gradually eroded as far up river as Piarere, leaving the old Hinuera channel high and dry.[2] The remains of the old river path can be clearly seen at Hinuera where the cliffs mark the ancient river ...</code> |
434
+ | <code>what type of gas is produced during fermentation</code> | <code>Fermentation Fermentation reacts NADH with an endogenous, organic electron acceptor.[1] Usually this is pyruvate formed from sugar through glycolysis. The reaction produces NAD+ and an organic product, typical examples being ethanol, lactic acid, carbon dioxide, and hydrogen gas (H2). However, more exotic compounds can be produced by fermentation, such as butyric acid and acetone. Fermentation products contain chemical energy (they are not fully oxidized), but are considered waste products, since they cannot be metabolized further without the use of oxygen.</code> |
435
+ | <code>why was star wars episode iv released first</code> | <code>Star Wars (film) Star Wars (later retitled Star Wars: Episode IV – A New Hope) is a 1977 American epic space opera film written and directed by George Lucas. It is the first film in the original Star Wars trilogy and the beginning of the Star Wars franchise. Starring Mark Hamill, Harrison Ford, Carrie Fisher, Peter Cushing, Alec Guinness, David Prowse, James Earl Jones, Anthony Daniels, Kenny Baker, and Peter Mayhew, the film's plot focuses on the Rebel Alliance, led by Princess Leia (Fisher), and its attempt to destroy the Galactic Empire's space station, the Death Star. This conflict disrupts the isolated life of farmhand Luke Skywalker (Hamill), who inadvertently acquires two droids that possess stolen architectural plans for the Death Star. When the Empire begins a destructive search for the missing droids, Skywalker accompanies Jedi Master Obi-Wan Kenobi (Guinness) on a mission to return the plans to the Rebel Alliance and rescue Leia from her imprisonment by the Empire.</code> |
436
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
437
+ ```json
438
+ {
439
+ "scale": 20.0,
440
+ "similarity_fct": "cos_sim"
441
+ }
442
+ ```
443
+ </details>
444
+
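Both the quora and natural-questions sections above train with MultipleNegativesRankingLoss (scale 20.0, cos_sim). A hedged pure-Python sketch of the idea — each anchor's paired positive is the target, and every other positive in the batch serves as an in-batch negative under a softmax over scaled cosine similarities (illustration only, not the library code):

```python
import math

def cos(a, b):
    # plain cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def mnrl(anchors, positives, scale=20.0):
    # For anchor i, positives[i] is the "correct class" and all other
    # positives in the batch act as negatives: softmax cross-entropy
    # over the scaled cosine similarities.
    loss = 0.0
    for i, a in enumerate(anchors):
        logits = [scale * cos(a, p) for p in positives]
        log_z = math.log(sum(math.exp(l) for l in logits))
        loss += log_z - logits[i]
    return loss / len(anchors)

# matched pairs -> low loss; swapped pairs -> high loss
print(mnrl([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]))
print(mnrl([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]))
```

This is why the loss pairs naturally with (anchor, positive) columns: no explicit negatives are needed, and larger batches supply more in-batch negatives.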
445
+ ### Training Hyperparameters
446
+ #### Non-Default Hyperparameters
447
+
448
+ - `eval_strategy`: steps
449
+ - `per_device_train_batch_size`: 16
450
+ - `per_device_eval_batch_size`: 16
451
+ - `learning_rate`: 2e-05
452
+ - `num_train_epochs`: 4
453
+ - `warmup_ratio`: 0.1
454
+ - `fp16`: True
455
+
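The non-default values above correspond one-to-one with the standard trainer arguments; a configuration sketch, assuming the `SentenceTransformerTrainingArguments` API from Sentence Transformers 3.x (the `output_dir` value is a placeholder, not stated in this card):

```python
from sentence_transformers.training_args import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="output",              # placeholder; not stated in the card
    eval_strategy="steps",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    learning_rate=2e-5,
    num_train_epochs=4,
    warmup_ratio=0.1,                 # 10% of steps for linear warmup
    fp16=True,                        # mixed-precision training
)
```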
456
+ #### All Hyperparameters
457
+ <details><summary>Click to expand</summary>
458
+
459
+ - `overwrite_output_dir`: False
460
+ - `do_predict`: False
461
+ - `eval_strategy`: steps
462
+ - `prediction_loss_only`: True
463
+ - `per_device_train_batch_size`: 16
464
+ - `per_device_eval_batch_size`: 16
465
+ - `per_gpu_train_batch_size`: None
466
+ - `per_gpu_eval_batch_size`: None
467
+ - `gradient_accumulation_steps`: 1
468
+ - `eval_accumulation_steps`: None
469
+ - `torch_empty_cache_steps`: None
470
+ - `learning_rate`: 2e-05
471
+ - `weight_decay`: 0.0
472
+ - `adam_beta1`: 0.9
473
+ - `adam_beta2`: 0.999
474
+ - `adam_epsilon`: 1e-08
475
+ - `max_grad_norm`: 1.0
476
+ - `num_train_epochs`: 4
477
+ - `max_steps`: -1
478
+ - `lr_scheduler_type`: linear
479
+ - `lr_scheduler_kwargs`: {}
480
+ - `warmup_ratio`: 0.1
481
+ - `warmup_steps`: 0
482
+ - `log_level`: passive
483
+ - `log_level_replica`: warning
484
+ - `log_on_each_node`: True
485
+ - `logging_nan_inf_filter`: True
486
+ - `save_safetensors`: True
487
+ - `save_on_each_node`: False
488
+ - `save_only_model`: False
489
+ - `restore_callback_states_from_checkpoint`: False
490
+ - `no_cuda`: False
491
+ - `use_cpu`: False
492
+ - `use_mps_device`: False
493
+ - `seed`: 42
494
+ - `data_seed`: None
495
+ - `jit_mode_eval`: False
496
+ - `use_ipex`: False
497
+ - `bf16`: False
498
+ - `fp16`: True
499
+ - `fp16_opt_level`: O1
500
+ - `half_precision_backend`: auto
501
+ - `bf16_full_eval`: False
502
+ - `fp16_full_eval`: False
503
+ - `tf32`: None
504
+ - `local_rank`: 0
505
+ - `ddp_backend`: None
506
+ - `tpu_num_cores`: None
507
+ - `tpu_metrics_debug`: False
508
+ - `debug`: []
509
+ - `dataloader_drop_last`: False
510
+ - `dataloader_num_workers`: 0
511
+ - `dataloader_prefetch_factor`: None
512
+ - `past_index`: -1
513
+ - `disable_tqdm`: False
514
+ - `remove_unused_columns`: True
515
+ - `label_names`: None
516
+ - `load_best_model_at_end`: False
517
+ - `ignore_data_skip`: False
518
+ - `fsdp`: []
519
+ - `fsdp_min_num_params`: 0
520
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
521
+ - `fsdp_transformer_layer_cls_to_wrap`: None
522
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
523
+ - `deepspeed`: None
524
+ - `label_smoothing_factor`: 0.0
525
+ - `optim`: adamw_torch
526
+ - `optim_args`: None
527
+ - `adafactor`: False
528
+ - `group_by_length`: False
529
+ - `length_column_name`: length
530
+ - `ddp_find_unused_parameters`: None
531
+ - `ddp_bucket_cap_mb`: None
532
+ - `ddp_broadcast_buffers`: False
533
+ - `dataloader_pin_memory`: True
534
+ - `dataloader_persistent_workers`: False
535
+ - `skip_memory_metrics`: True
536
+ - `use_legacy_prediction_loop`: False
537
+ - `push_to_hub`: False
538
+ - `resume_from_checkpoint`: None
539
+ - `hub_model_id`: None
540
+ - `hub_strategy`: every_save
541
+ - `hub_private_repo`: None
542
+ - `hub_always_push`: False
543
+ - `gradient_checkpointing`: False
544
+ - `gradient_checkpointing_kwargs`: None
545
+ - `include_inputs_for_metrics`: False
546
+ - `include_for_metrics`: []
547
+ - `eval_do_concat_batches`: True
548
+ - `fp16_backend`: auto
549
+ - `push_to_hub_model_id`: None
550
+ - `push_to_hub_organization`: None
551
+ - `mp_parameters`:
552
+ - `auto_find_batch_size`: False
553
+ - `full_determinism`: False
554
+ - `torchdynamo`: None
555
+ - `ray_scope`: last
556
+ - `ddp_timeout`: 1800
557
+ - `torch_compile`: False
558
+ - `torch_compile_backend`: None
559
+ - `torch_compile_mode`: None
560
+ - `dispatch_batches`: None
561
+ - `split_batches`: None
562
+ - `include_tokens_per_second`: False
563
+ - `include_num_input_tokens_seen`: False
564
+ - `neftune_noise_alpha`: None
565
+ - `optim_target_modules`: None
566
+ - `batch_eval_metrics`: False
567
+ - `eval_on_start`: False
568
+ - `use_liger_kernel`: False
569
+ - `eval_use_gather_object`: False
570
+ - `average_tokens_across_devices`: False
571
+ - `prompts`: None
572
+ - `batch_sampler`: batch_sampler
573
+ - `multi_dataset_batch_sampler`: proportional
574
+
575
+ </details>
576
+
577
+ ### Training Logs
578
+ <details><summary>Click to expand</summary>
579
+
580
+ | Epoch | Step | Training Loss | all-nli-triplet loss | stsb loss | quora loss | natural-questions loss |
581
+ |:------:|:-----:|:-------------:|:--------------------:|:---------:|:----------:|:----------------------:|
582
+ | 0.0243 | 100 | 2.8163 | 2.6011 | 4.6235 | 1.6762 | 2.2254 |
583
+ | 0.0487 | 200 | 2.6522 | 2.0674 | 4.5288 | 1.0381 | 1.7565 |
584
+ | 0.0730 | 300 | 2.5478 | 1.1872 | 5.1274 | 0.0883 | 0.8453 |
585
+ | 0.0973 | 400 | 2.3013 | 0.9126 | 5.3516 | 0.0443 | 0.6953 |
586
+ | 0.1217 | 500 | 1.9177 | 0.8462 | 5.6431 | 0.0343 | 0.5612 |
587
+ | 0.1460 | 600 | 1.7186 | 0.7144 | 5.8698 | 0.0264 | 0.3991 |
588
+ | 0.1703 | 700 | 2.0748 | 0.7219 | 5.2972 | 0.0255 | 0.2856 |
589
+ | 0.1946 | 800 | 1.9132 | 0.6691 | 5.3757 | 0.0196 | 0.2245 |
590
+ | 0.2190 | 900 | 1.8559 | 0.6198 | 5.5028 | 0.0185 | 0.1659 |
591
+ | 0.2433 | 1000 | 2.1453 | 0.5851 | 5.8587 | 0.0177 | 0.1280 |
592
+ | 0.2676 | 1100 | 2.0303 | 0.6331 | 5.1522 | 0.0222 | 0.1381 |
593
+ | 0.2920 | 1200 | 1.8612 | 0.5579 | 5.7026 | 0.0156 | 0.1016 |
594
+ | 0.3163 | 1300 | 1.8465 | 0.6045 | 5.0309 | 0.0187 | 0.1062 |
595
+ | 0.3406 | 1400 | 1.7208 | 0.5491 | 5.5651 | 0.0174 | 0.0864 |
596
+ | 0.3650 | 1500 | 1.5479 | 0.5337 | 5.9317 | 0.0170 | 0.0809 |
597
+ | 0.3893 | 1600 | 1.5605 | 0.5604 | 5.4574 | 0.0210 | 0.0765 |
598
+ | 0.4136 | 1700 | 1.7457 | 0.5528 | 5.2572 | 0.0188 | 0.0750 |
599
+ | 0.4380 | 1800 | 1.6724 | 0.4923 | 5.6488 | 0.0169 | 0.0790 |
600
+ | 0.4623 | 1900 | 1.4122 | 0.4718 | 5.3825 | 0.0163 | 0.0647 |
601
+ | 0.4866 | 2000 | 1.848 | 0.4594 | 5.6606 | 0.0189 | 0.0658 |
602
+ | 0.5109 | 2100 | 2.0782 | 0.5167 | 4.9055 | 0.0210 | 0.0712 |
603
+ | 0.5353 | 2200 | 1.5413 | 0.4396 | 5.3588 | 0.0210 | 0.0580 |
604
+ | 0.5596 | 2300 | 1.6705 | 0.4588 | 5.5433 | 0.0192 | 0.0550 |
605
+ | 0.5839 | 2400 | 1.5674 | 0.4351 | 5.3304 | 0.0180 | 0.0582 |
606
+ | 0.6083 | 2500 | 1.5238 | 0.4812 | 5.2534 | 0.0163 | 0.0530 |
607
+ | 0.6326 | 2600 | 1.4025 | 0.4470 | 5.4626 | 0.0156 | 0.0513 |
608
+ | 0.6569 | 2700 | 1.5916 | 0.4489 | 5.5590 | 0.0159 | 0.0513 |
609
+ | 0.6813 | 2800 | 1.6206 | 0.4611 | 5.1904 | 0.0156 | 0.0536 |
610
+ | 0.7056 | 2900 | 1.7873 | 0.4742 | 5.1292 | 0.0153 | 0.0472 |
611
+ | 0.7299 | 3000 | 1.9452 | 0.4752 | 4.9931 | 0.0163 | 0.0542 |
612
+ | 0.7543 | 3100 | 1.563 | 0.4722 | 5.3862 | 0.0175 | 0.0513 |
613
+ | 0.7786 | 3200 | 1.3493 | 0.4525 | 5.4255 | 0.0163 | 0.0423 |
614
+ | 0.8029 | 3300 | 1.606 | 0.4657 | 5.3005 | 0.0179 | 0.0431 |
615
+ | 0.8273 | 3400 | 1.6305 | 0.4466 | 5.5017 | 0.0163 | 0.0432 |
616
+ | 0.8516 | 3500 | 1.3496 | 0.4144 | 5.3454 | 0.0170 | 0.0440 |
617
+ | 0.8759 | 3600 | 1.5866 | 0.4014 | 5.8260 | 0.0167 | 0.0481 |
618
+ | 0.9002 | 3700 | 1.495 | 0.4094 | 5.5550 | 0.0173 | 0.0454 |
619
+ | 0.9246 | 3800 | 1.2604 | 0.4125 | 5.9704 | 0.0179 | 0.0376 |
620
+ | 0.9489 | 3900 | 1.6432 | 0.4223 | 5.1097 | 0.0176 | 0.0450 |
621
+ | 0.9732 | 4000 | 1.6194 | 0.4322 | 5.1807 | 0.0166 | 0.0400 |
622
+ | 0.9976 | 4100 | 1.3006 | 0.4209 | 5.3493 | 0.0176 | 0.0412 |
623
+ | 1.0219 | 4200 | 1.3557 | 0.4080 | 5.5556 | 0.0167 | 0.0395 |
624
+ | 1.0462 | 4300 | 1.2346 | 0.3944 | 5.6652 | 0.0164 | 0.0395 |
625
+ | 1.0706 | 4400 | 1.6212 | 0.4036 | 5.6948 | 0.0157 | 0.0407 |
626
+ | 1.0949 | 4500 | 1.7511 | 0.3909 | 5.5846 | 0.0159 | 0.0410 |
627
+ | 1.1192 | 4600 | 1.1087 | 0.3827 | 5.7067 | 0.0175 | 0.0384 |
628
+ | 1.1436 | 4700 | 1.1356 | 0.3947 | 6.0833 | 0.0181 | 0.0412 |
629
+ | 1.1679 | 4800 | 1.4649 | 0.3816 | 5.6926 | 0.0187 | 0.0407 |
630
+ | 1.1922 | 4900 | 1.2354 | 0.4000 | 5.8187 | 0.0181 | 0.0401 |
631
+ | 1.2165 | 5000 | 1.2099 | 0.3967 | 5.8184 | 0.0183 | 0.0428 |
632
+ | 1.2409 | 5100 | 1.279 | 0.3784 | 5.8931 | 0.0176 | 0.0418 |
633
+ | 1.2652 | 5200 | 1.0431 | 0.3845 | 5.8284 | 0.0167 | 0.0395 |
634
+ | 1.2895 | 5300 | 1.2217 | 0.3883 | 5.6984 | 0.0195 | 0.0380 |
635
+ | 1.3139 | 5400 | 1.6192 | 0.3858 | 5.7183 | 0.0192 | 0.0381 |
636
+ | 1.3382 | 5500 | 1.5792 | 0.3704 | 5.8270 | 0.0196 | 0.0437 |
637
+ | 1.3625 | 5600 | 1.4467 | 0.3885 | 5.7460 | 0.0179 | 0.0411 |
638
+ | 1.3869 | 5700 | 1.217 | 0.3778 | 5.6724 | 0.0185 | 0.0407 |
639
+ | 1.4112 | 5800 | 1.3599 | 0.3824 | 5.8521 | 0.0155 | 0.0392 |
640
+ | 1.4355 | 5900 | 1.3571 | 0.3674 | 6.0293 | 0.0158 | 0.0379 |
641
+ | 1.4599 | 6000 | 1.4408 | 0.3667 | 5.9265 | 0.0140 | 0.0379 |
642
+ | 1.4842 | 6100 | 1.1629 | 0.3612 | 5.6663 | 0.0151 | 0.0367 |
643
+ | 1.5085 | 6200 | 1.21 | 0.3765 | 5.7513 | 0.0176 | 0.0407 |
644
+ | 1.5328 | 6300 | 1.4469 | 0.3722 | 5.8795 | 0.0162 | 0.0431 |
645
+ | 1.5572 | 6400 | 1.8419 | 0.3687 | 5.6081 | 0.0145 | 0.0382 |
646
+ | 1.5815 | 6500 | 1.4978 | 0.3739 | 5.6302 | 0.0156 | 0.0372 |
647
+ | 1.6058 | 6600 | 1.3954 | 0.3658 | 5.9182 | 0.0160 | 0.0405 |
648
+ | 1.6302 | 6700 | 1.262 | 0.3702 | 5.6119 | 0.0158 | 0.0370 |
649
+ | 1.6545 | 6800 | 0.9204 | 0.3723 | 5.7449 | 0.0147 | 0.0378 |
650
+ | 1.6788 | 6900 | 1.0658 | 0.3738 | 5.7127 | 0.0132 | 0.0410 |
651
+ | 1.7032 | 7000 | 1.286 | 0.3740 | 5.7997 | 0.0143 | 0.0405 |
652
+ | 1.7275 | 7100 | 1.3771 | 0.3650 | 5.7853 | 0.0142 | 0.0411 |
653
+ | 1.7518 | 7200 | 1.205 | 0.3728 | 5.8454 | 0.0149 | 0.0423 |
654
+ | 1.7762 | 7300 | 0.9881 | 0.3691 | 5.7261 | 0.0147 | 0.0461 |
655
+ | 1.8005 | 7400 | 1.3962 | 0.3751 | 5.6620 | 0.0135 | 0.0427 |
656
+ | 1.8248 | 7500 | 1.1804 | 0.3812 | 5.6814 | 0.0136 | 0.0396 |
657
+ | 1.8491 | 7600 | 1.4312 | 0.3722 | 5.7919 | 0.0141 | 0.0368 |
658
+ | 1.8735 | 7700 | 1.1161 | 0.3700 | 5.7718 | 0.0140 | 0.0397 |
659
+ | 1.8978 | 7800 | 1.389 | 0.3815 | 5.8770 | 0.0127 | 0.0415 |
660
+ | 1.9221 | 7900 | 1.5896 | 0.3726 | 5.6467 | 0.0132 | 0.0382 |
661
+ | 1.9465 | 8000 | 1.6873 | 0.3706 | 5.5875 | 0.0132 | 0.0380 |
662
+ | 1.9708 | 8100 | 1.513 | 0.3658 | 5.6106 | 0.0130 | 0.0371 |
663
+ | 1.9951 | 8200 | 0.9243 | 0.3611 | 5.7932 | 0.0135 | 0.0378 |
664
+ | 2.0195 | 8300 | 1.1086 | 0.3510 | 5.8341 | 0.0133 | 0.0386 |
665
+ | 2.0438 | 8400 | 0.7918 | 0.3715 | 6.0229 | 0.0138 | 0.0382 |
666
+ | 2.0681 | 8500 | 1.1291 | 0.3708 | 6.0243 | 0.0146 | 0.0397 |
667
+ | 2.0925 | 8600 | 0.9846 | 0.3775 | 6.0437 | 0.0139 | 0.0380 |
668
+ | 2.1168 | 8700 | 0.7928 | 0.3732 | 6.1154 | 0.0145 | 0.0408 |
669
+ | 2.1411 | 8800 | 1.0726 | 0.3786 | 5.9249 | 0.0151 | 0.0387 |
670
+ | 2.1655 | 8900 | 1.3123 | 0.3720 | 6.0072 | 0.0146 | 0.0395 |
671
+ | 2.1898 | 9000 | 0.752 | 0.3741 | 6.1952 | 0.0148 | 0.0411 |
672
+ | 2.2141 | 9100 | 1.1021 | 0.3708 | 6.0910 | 0.0140 | 0.0391 |
673
+ | 2.2384 | 9200 | 0.8425 | 0.3646 | 6.1572 | 0.0150 | 0.0398 |
674
+ | 2.2628 | 9300 | 1.0123 | 0.3582 | 6.2371 | 0.0146 | 0.0399 |
675
+ | 2.2871 | 9400 | 1.0528 | 0.3742 | 6.2364 | 0.0142 | 0.0412 |
676
+ | 2.3114 | 9500 | 0.7329 | 0.3674 | 6.1969 | 0.0141 | 0.0439 |
677
+ | 2.3358 | 9600 | 1.2522 | 0.3667 | 6.2403 | 0.0140 | 0.0431 |
678
+ | 2.3601 | 9700 | 1.1872 | 0.3634 | 6.0391 | 0.0143 | 0.0430 |
679
+ | 2.3844 | 9800 | 1.0789 | 0.3698 | 6.0625 | 0.0132 | 0.0404 |
680
+ | 2.4088 | 9900 | 0.9211 | 0.3623 | 6.1184 | 0.0133 | 0.0421 |
681
+ | 2.4331 | 10000 | 0.957 | 0.3704 | 6.0958 | 0.0136 | 0.0412 |
682
+ | 2.4574 | 10100 | 1.0247 | 0.3665 | 6.0707 | 0.0131 | 0.0465 |
683
+ | 2.4818 | 10200 | 0.868 | 0.3684 | 6.0532 | 0.0130 | 0.0466 |
684
+ | 2.5061 | 10300 | 1.0651 | 0.3752 | 6.1146 | 0.0134 | 0.0463 |
685
+ | 2.5304 | 10400 | 0.8479 | 0.3751 | 6.1622 | 0.0132 | 0.0449 |
686
+ | 2.5547 | 10500 | 1.3458 | 0.3629 | 6.0291 | 0.0141 | 0.0449 |
687
+ | 2.5791 | 10600 | 1.0735 | 0.3683 | 5.9601 | 0.0139 | 0.0446 |
688
+ | 2.6034 | 10700 | 1.0609 | 0.3547 | 5.9667 | 0.0143 | 0.0410 |
689
+ | 2.6277 | 10800 | 0.8736 | 0.3676 | 6.0968 | 0.0137 | 0.0411 |
690
+ | 2.6521 | 10900 | 0.8848 | 0.3702 | 6.1259 | 0.0139 | 0.0406 |
691
+ | 2.6764 | 11000 | 0.8544 | 0.3751 | 6.1025 | 0.0142 | 0.0399 |
692
+ | 2.7007 | 11100 | 0.8619 | 0.3733 | 6.1460 | 0.0146 | 0.0388 |
693
+ | 2.7251 | 11200 | 0.8889 | 0.3770 | 6.1766 | 0.0148 | 0.0395 |
694
+ | 2.7494 | 11300 | 1.0385 | 0.3781 | 6.1172 | 0.0140 | 0.0405 |
695
+ | 2.7737 | 11400 | 0.811 | 0.3918 | 6.2225 | 0.0138 | 0.0389 |
696
+ | 2.7981 | 11500 | 0.9761 | 0.3834 | 6.1362 | 0.0142 | 0.0372 |
697
+ | 2.8224 | 11600 | 0.994 | 0.3791 | 6.2333 | 0.0139 | 0.0398 |
698
+ | 2.8467 | 11700 | 0.9336 | 0.3634 | 6.1495 | 0.0142 | 0.0397 |
699
+ | 2.8710 | 11800 | 0.9836 | 0.3719 | 6.1206 | 0.0141 | 0.0399 |
700
+ | 2.8954 | 11900 | 0.9395 | 0.3702 | 6.1925 | 0.0140 | 0.0413 |
701
+ | 2.9197 | 12000 | 1.0279 | 0.3718 | 6.1865 | 0.0138 | 0.0412 |
702
+ | 2.9440 | 12100 | 0.9084 | 0.3683 | 6.1300 | 0.0139 | 0.0423 |
703
+ | 2.9684 | 12200 | 0.7663 | 0.3692 | 6.2223 | 0.0140 | 0.0400 |
704
+ | 2.9927 | 12300 | 1.0803 | 0.3629 | 6.1623 | 0.0147 | 0.0413 |
705
+ | 3.0170 | 12400 | 0.6931 | 0.3709 | 6.2628 | 0.0151 | 0.0436 |
706
+ | 3.0414 | 12500 | 0.7655 | 0.3712 | 6.3208 | 0.0150 | 0.0428 |
707
+ | 3.0657 | 12600 | 0.7602 | 0.3779 | 6.4310 | 0.0139 | 0.0438 |
708
+ | 3.0900 | 12700 | 0.6897 | 0.3703 | 6.2320 | 0.0147 | 0.0427 |
709
+ | 3.1144 | 12800 | 0.7364 | 0.3815 | 6.3647 | 0.0147 | 0.0429 |
710
+ | 3.1387 | 12900 | 0.9105 | 0.3859 | 6.4185 | 0.0147 | 0.0429 |
711
+ | 3.1630 | 13000 | 0.5886 | 0.3845 | 6.3379 | 0.0149 | 0.0441 |
712
+ | 3.1873 | 13100 | 0.7225 | 0.3848 | 6.4305 | 0.0150 | 0.0455 |
713
+ | 3.2117 | 13200 | 0.771 | 0.3772 | 6.4205 | 0.0150 | 0.0452 |
714
+ | 3.2360 | 13300 | 0.7322 | 0.3790 | 6.3979 | 0.0148 | 0.0442 |
715
+ | 3.2603 | 13400 | 0.753 | 0.3744 | 6.4105 | 0.0152 | 0.0441 |
716
+ | 3.2847 | 13500 | 0.5427 | 0.3771 | 6.4288 | 0.0150 | 0.0459 |
717
+ | 3.3090 | 13600 | 0.7725 | 0.3727 | 6.3567 | 0.0152 | 0.0454 |
718
+ | 3.3333 | 13700 | 0.8041 | 0.3755 | 6.3754 | 0.0147 | 0.0456 |
719
+ | 3.3577 | 13800 | 0.6132 | 0.3804 | 6.4203 | 0.0151 | 0.0458 |
720
+ | 3.3820 | 13900 | 0.8572 | 0.3812 | 6.4300 | 0.0149 | 0.0461 |
721
+ | 3.4063 | 14000 | 0.5685 | 0.3845 | 6.4947 | 0.0147 | 0.0459 |
722
+ | 3.4307 | 14100 | 0.7893 | 0.3812 | 6.4488 | 0.0151 | 0.0468 |
723
+ | 3.4550 | 14200 | 0.6362 | 0.3857 | 6.4628 | 0.0153 | 0.0456 |
724
+ | 3.4793 | 14300 | 0.7303 | 0.3845 | 6.4720 | 0.0150 | 0.0462 |
725
+ | 3.5036 | 14400 | 0.5845 | 0.3881 | 6.4713 | 0.0149 | 0.0464 |
726
+ | 3.5280 | 14500 | 0.6069 | 0.3877 | 6.5055 | 0.0151 | 0.0454 |
727
+ | 3.5523 | 14600 | 0.6865 | 0.3816 | 6.4564 | 0.0149 | 0.0452 |
728
+ | 3.5766 | 14700 | 0.7699 | 0.3833 | 6.4560 | 0.0156 | 0.0462 |
729
+ | 3.6010 | 14800 | 0.923 | 0.3822 | 6.4682 | 0.0157 | 0.0464 |
730
+ | 3.6253 | 14900 | 0.737 | 0.3806 | 6.4656 | 0.0154 | 0.0462 |
731
+ | 3.6496 | 15000 | 0.7309 | 0.3853 | 6.4923 | 0.0152 | 0.0456 |
732
+ | 3.6740 | 15100 | 0.6811 | 0.3837 | 6.5052 | 0.0153 | 0.0458 |
733
+ | 3.6983 | 15200 | 0.5556 | 0.3848 | 6.5081 | 0.0151 | 0.0456 |
734
+ | 3.7226 | 15300 | 0.6696 | 0.3860 | 6.5200 | 0.0152 | 0.0459 |
735
+ | 3.7470 | 15400 | 0.6366 | 0.3864 | 6.5324 | 0.0150 | 0.0448 |
736
+ | 3.7713 | 15500 | 0.7848 | 0.3879 | 6.5547 | 0.0150 | 0.0448 |
737
+ | 3.7956 | 15600 | 0.8423 | 0.3861 | 6.5463 | 0.0151 | 0.0450 |
738
+ | 3.8200 | 15700 | 0.6599 | 0.3849 | 6.5421 | 0.0150 | 0.0451 |
739
+ | 3.8443 | 15800 | 0.5292 | 0.3851 | 6.5450 | 0.0150 | 0.0452 |
740
+ | 3.8686 | 15900 | 0.5983 | 0.3841 | 6.5396 | 0.0149 | 0.0450 |
741
+ | 3.8929 | 16000 | 0.5917 | 0.3823 | 6.5236 | 0.0149 | 0.0449 |
742
+ | 3.9173 | 16100 | 0.762 | 0.3825 | 6.5278 | 0.0150 | 0.0451 |
743
+ | 3.9416 | 16200 | 0.7396 | 0.3832 | 6.5380 | 0.0150 | 0.0453 |
744
+ | 3.9659 | 16300 | 0.574 | 0.3835 | 6.5399 | 0.0151 | 0.0452 |
745
+ | 3.9903 | 16400 | 0.5849 | 0.3835 | 6.5374 | 0.0151 | 0.0452 |
746
+
747
+ </details>
748
+
749
+ ### Framework Versions
750
+ - Python: 3.10.10
751
+ - Sentence Transformers: 3.4.0.dev0
752
+ - Transformers: 4.49.0.dev0
753
+ - PyTorch: 2.2.1+cu121
754
+ - Accelerate: 1.3.0
755
+ - Datasets: 3.2.0
756
+ - Tokenizers: 0.21.0
757
+
758
+ ## Citation
759
+
760
+ ### BibTeX
761
+
762
+ #### Sentence Transformers and SoftmaxLoss
763
+ ```bibtex
764
+ @inproceedings{reimers-2019-sentence-bert,
765
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
766
+ author = "Reimers, Nils and Gurevych, Iryna",
767
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
768
+ month = "11",
769
+ year = "2019",
770
+ publisher = "Association for Computational Linguistics",
771
+ url = "https://arxiv.org/abs/1908.10084",
772
+ }
773
+ ```
774
+
775
+ #### MultipleNegativesRankingLoss
776
+ ```bibtex
777
+ @misc{henderson2017efficient,
778
+ title={Efficient Natural Language Response Suggestion for Smart Reply},
779
+ author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
780
+ year={2017},
781
+ eprint={1705.00652},
782
+ archivePrefix={arXiv},
783
+ primaryClass={cs.CL}
784
+ }
785
+ ```
786
+
787
+ #### CoSENTLoss
788
+ ```bibtex
789
+ @online{kexuefm-8847,
790
+ title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
791
+ author={Su Jianlin},
792
+ year={2022},
793
+ month={Jan},
794
+ url={https://kexue.fm/archives/8847},
795
+ }
796
+ ```
797
+
798
+ <!--
799
+ ## Glossary
800
+
801
+ *Clearly define terms in order to be accessible across audiences.*
802
+ -->
803
+
804
+ <!--
805
+ ## Model Card Authors
806
+
807
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
808
+ -->
809
+
810
+ <!--
811
+ ## Model Card Contact
812
+
813
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
814
+ -->
config.json ADDED
@@ -0,0 +1,47 @@

1
+ {
2
+ "_name_or_path": "answerdotai/ModernBERT-base",
3
+ "architectures": [
4
+ "ModernBertModel"
5
+ ],
6
+ "attention_bias": false,
7
+ "attention_dropout": 0.0,
8
+ "bos_token_id": 50281,
9
+ "classifier_activation": "gelu",
10
+ "classifier_bias": false,
11
+ "classifier_dropout": 0.0,
12
+ "classifier_pooling": "mean",
13
+ "cls_token_id": 50281,
14
+ "decoder_bias": true,
15
+ "deterministic_flash_attn": false,
16
+ "embedding_dropout": 0.0,
17
+ "eos_token_id": 50282,
18
+ "global_attn_every_n_layers": 3,
19
+ "global_rope_theta": 160000.0,
20
+ "gradient_checkpointing": false,
21
+ "hidden_activation": "gelu",
22
+ "hidden_size": 768,
23
+ "initializer_cutoff_factor": 2.0,
24
+ "initializer_range": 0.02,
25
+ "intermediate_size": 1152,
26
+ "layer_norm_eps": 1e-05,
27
+ "local_attention": 128,
28
+ "local_rope_theta": 10000.0,
29
+ "max_position_embeddings": 8192,
30
+ "mlp_bias": false,
31
+ "mlp_dropout": 0.0,
32
+ "model_type": "modernbert",
33
+ "norm_bias": false,
34
+ "norm_eps": 1e-05,
35
+ "num_attention_heads": 12,
36
+ "num_hidden_layers": 22,
37
+ "pad_token_id": 50283,
38
+ "position_embedding_type": "absolute",
39
+ "reference_compile": true,
40
+ "repad_logits_with_grad": false,
41
+ "sep_token_id": 50282,
42
+ "sparse_pred_ignore_index": -100,
43
+ "sparse_prediction": false,
44
+ "torch_dtype": "float32",
45
+ "transformers_version": "4.49.0.dev0",
46
+ "vocab_size": 50368
47
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "3.4.0.dev0",
4
+ "transformers": "4.49.0.dev0",
5
+ "pytorch": "2.2.1+cu121"
6
+ },
7
+ "prompts": {},
8
+ "default_prompt_name": null,
9
+ "similarity_fn_name": "cosine"
10
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:efc56b192c6eb9f6d29145228681afbc5087a7aeeb219d0204bfbef343e549f4
3
+ size 596070136
modules.json ADDED
@@ -0,0 +1,14 @@
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ }
14
+ ]
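The two modules listed here — a Transformer followed by a Pooling layer configured for mean pooling (see `1_Pooling/config.json`) — mean each sentence embedding is the mean of its non-padding token embeddings. A minimal pure-Python sketch of that pooling step (illustration only, not the library implementation):

```python
def mean_pool(token_embeddings, attention_mask):
    # Mean over non-padding tokens, as selected by
    # pooling_mode_mean_tokens: true in the pooling config.
    dim = len(token_embeddings[0])
    sums = [0.0] * dim
    count = 0
    for emb, mask in zip(token_embeddings, attention_mask):
        if mask:  # skip padding positions (mask == 0)
            count += 1
            for i, v in enumerate(emb):
                sums[i] += v
    return [s / count for s in sums]

tokens = [[1.0, 3.0], [3.0, 5.0], [9.0, 9.0]]  # last token is padding
mask = [1, 1, 0]
print(mean_pool(tokens, mask))  # [2.0, 4.0]
```

In the actual model the token embeddings are 768-dimensional (matching `word_embedding_dimension: 768`), so the pooled sentence embedding is also 768-dimensional.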
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "max_seq_length": 8192,
3
+ "do_lower_case": false
4
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": true,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,945 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "|||IP_ADDRESS|||",
5
+ "lstrip": false,
6
+ "normalized": true,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": false
10
+ },
11
+ "1": {
12
+ "content": "<|padding|>",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "50254": {
20
+ "content": " ",
21
+ "lstrip": false,
22
+ "normalized": true,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": false
26
+ },
27
+ "50255": {
28
+ "content": " ",
29
+ "lstrip": false,
30
+ "normalized": true,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": false
34
+ },
35
+ "50256": {
36
+ "content": " ",
37
+ "lstrip": false,
38
+ "normalized": true,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": false
42
+ },
43
+ "50257": {
44
+ "content": " ",
45
+ "lstrip": false,
46
+ "normalized": true,
47
+ "rstrip": false,
48
+ "single_word": false,
49
+ "special": false
50
+ },
51
+ "50258": {
52
+ "content": " ",
53
+ "lstrip": false,
54
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50259": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50260": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50261": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50262": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50263": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50264": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50265": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50266": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50267": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50268": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50269": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50270": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50271": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50272": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50273": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50274": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50275": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50276": {
+ "content": " ",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50277": {
+ "content": "|||EMAIL_ADDRESS|||",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50278": {
+ "content": "|||PHONE_NUMBER|||",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50279": {
+ "content": "<|endoftext|>",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50280": {
+ "content": "[UNK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50281": {
+ "content": "[CLS]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50282": {
+ "content": "[SEP]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50283": {
+ "content": "[PAD]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50284": {
+ "content": "[MASK]",
+ "lstrip": true,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50285": {
+ "content": "[unused0]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50286": {
+ "content": "[unused1]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50287": {
+ "content": "[unused2]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50288": {
+ "content": "[unused3]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50289": {
+ "content": "[unused4]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50290": {
+ "content": "[unused5]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50291": {
+ "content": "[unused6]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50292": {
+ "content": "[unused7]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50293": {
+ "content": "[unused8]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50294": {
+ "content": "[unused9]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50295": {
+ "content": "[unused10]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50296": {
+ "content": "[unused11]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50297": {
+ "content": "[unused12]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50298": {
+ "content": "[unused13]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50299": {
+ "content": "[unused14]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50300": {
+ "content": "[unused15]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50301": {
+ "content": "[unused16]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50302": {
+ "content": "[unused17]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50303": {
+ "content": "[unused18]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50304": {
+ "content": "[unused19]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50305": {
+ "content": "[unused20]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50306": {
+ "content": "[unused21]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50307": {
+ "content": "[unused22]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50308": {
+ "content": "[unused23]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50309": {
+ "content": "[unused24]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50310": {
+ "content": "[unused25]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50311": {
+ "content": "[unused26]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50312": {
+ "content": "[unused27]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50313": {
+ "content": "[unused28]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50314": {
+ "content": "[unused29]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50315": {
+ "content": "[unused30]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50316": {
+ "content": "[unused31]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50317": {
+ "content": "[unused32]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50318": {
+ "content": "[unused33]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50319": {
+ "content": "[unused34]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50320": {
+ "content": "[unused35]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50321": {
+ "content": "[unused36]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50322": {
+ "content": "[unused37]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50323": {
+ "content": "[unused38]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50324": {
+ "content": "[unused39]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50325": {
+ "content": "[unused40]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50326": {
+ "content": "[unused41]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50327": {
+ "content": "[unused42]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50328": {
+ "content": "[unused43]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50329": {
+ "content": "[unused44]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50330": {
+ "content": "[unused45]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50331": {
+ "content": "[unused46]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50332": {
+ "content": "[unused47]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50333": {
+ "content": "[unused48]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50334": {
+ "content": "[unused49]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50335": {
+ "content": "[unused50]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50336": {
+ "content": "[unused51]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50337": {
+ "content": "[unused52]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50338": {
+ "content": "[unused53]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50339": {
+ "content": "[unused54]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50340": {
+ "content": "[unused55]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50341": {
+ "content": "[unused56]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50342": {
+ "content": "[unused57]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50343": {
+ "content": "[unused58]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50344": {
+ "content": "[unused59]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50345": {
+ "content": "[unused60]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50346": {
+ "content": "[unused61]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50347": {
+ "content": "[unused62]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50348": {
+ "content": "[unused63]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50349": {
+ "content": "[unused64]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50350": {
+ "content": "[unused65]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50351": {
+ "content": "[unused66]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50352": {
+ "content": "[unused67]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50353": {
+ "content": "[unused68]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50354": {
+ "content": "[unused69]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50355": {
+ "content": "[unused70]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50356": {
+ "content": "[unused71]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50357": {
+ "content": "[unused72]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50358": {
+ "content": "[unused73]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50359": {
+ "content": "[unused74]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50360": {
+ "content": "[unused75]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50361": {
+ "content": "[unused76]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50362": {
+ "content": "[unused77]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50363": {
+ "content": "[unused78]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50364": {
+ "content": "[unused79]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50365": {
+ "content": "[unused80]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50366": {
+ "content": "[unused81]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ },
+ "50367": {
+ "content": "[unused82]",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": false
+ }
+ },
+ "clean_up_tokenization_spaces": true,
+ "cls_token": "[CLS]",
+ "extra_special_tokens": {},
+ "mask_token": "[MASK]",
+ "model_input_names": [
+ "input_ids",
+ "attention_mask"
+ ],
+ "model_max_length": 8192,
+ "pad_token": "[PAD]",
+ "sep_token": "[SEP]",
+ "tokenizer_class": "PreTrainedTokenizerFast",
+ "unk_token": "[UNK]"
+ }