antonkirk committed on
Commit
ddf9470
1 Parent(s): 95b365a

Upload folder using huggingface_hub

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
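The pooling configuration above enables only `pooling_mode_cls_token`, so the sentence embedding is taken from the first token position rather than averaged over all tokens. The following is a minimal pure-Python sketch of that idea, not the library's `Pooling` module; the helper names `cls_pool` and `mean_pool` and the toy 2-dimensional vectors are assumptions made for illustration (the real model uses 768 dimensions).

```python
from typing import List

Vector = List[float]


def cls_pool(token_embeddings: List[Vector]) -> Vector:
    """Return the embedding at the first token position (the CLS position)."""
    return list(token_embeddings[0])


def mean_pool(token_embeddings: List[Vector], attention_mask: List[int]) -> Vector:
    """Average the embeddings of non-padding tokens (shown only for contrast)."""
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    dim = len(token_embeddings[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy input: 3 token positions (the last one is padding), 2 dimensions.
tokens = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]
mask = [1, 1, 0]

cls_vector = cls_pool(tokens)          # what this config selects
mean_vector = mean_pool(tokens, mask)  # what "pooling_mode_mean_tokens" would give
print(cls_vector)
print(mean_vector)
```

With CLS pooling the attention mask does not affect the result, because only the first position is read.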
README.md CHANGED
@@ -1,3 +1,922 @@
- ---
- license: apache-2.0
- ---
1
+ ---
2
+ language: []
3
+ library_name: sentence-transformers
4
+ tags:
5
+ - sentence-transformers
6
+ - sentence-similarity
7
+ - feature-extraction
8
+ - generated_from_trainer
9
+ - dataset_size:98928
10
+ - loss:MultipleNegativesRankingLoss
11
+ base_model: sentence-transformers/multi-qa-mpnet-base-dot-v1
12
+ datasets: []
13
+ metrics:
14
+ - cosine_accuracy@1
15
+ - cosine_accuracy@3
16
+ - cosine_accuracy@5
17
+ - cosine_accuracy@10
18
+ - cosine_precision@1
19
+ - cosine_precision@3
20
+ - cosine_precision@5
21
+ - cosine_precision@10
22
+ - cosine_recall@1
23
+ - cosine_recall@3
24
+ - cosine_recall@5
25
+ - cosine_recall@10
26
+ - cosine_ndcg@10
27
+ - cosine_mrr@10
28
+ - cosine_map@100
29
+ - dot_accuracy@1
30
+ - dot_accuracy@3
31
+ - dot_accuracy@5
32
+ - dot_accuracy@10
33
+ - dot_precision@1
34
+ - dot_precision@3
35
+ - dot_precision@5
36
+ - dot_precision@10
37
+ - dot_recall@1
38
+ - dot_recall@3
39
+ - dot_recall@5
40
+ - dot_recall@10
41
+ - dot_ndcg@10
42
+ - dot_mrr@10
43
+ - dot_map@100
44
+ widget:
45
+ - source_sentence: Abnormal wakefulness, loss of self-awareness, involuntary trance
46
+ sentences:
47
+ - 'See also: Neonatal lupus erythematosus'
48
+ - 'Scalp-ear-nipple syndrome, as its name suggests, is a condition characterized
49
+ by abnormalities of the scalp, ears, and nipples. Less frequently, affected individuals
50
+ have problems affecting other parts of the body. The features of this disorder
51
+ can vary even within the same family.
52
+
53
+
54
+ Babies with scalp-ear-nipple syndrome are born with a condition called aplasia
55
+ cutis congenita, which involves patchy abnormal areas (lesions) on the scalp.
56
+ These lesions are firm, raised, hairless nodules that resemble open wounds or
57
+ ulcers at birth, but that heal during childhood.
58
+
59
+
60
+ The external ears of people with scalp-ear-nipple syndrome may be small, cup-shaped,
61
+ folded over, or otherwise mildly misshapen. Hearing is generally normal. Affected
62
+ individuals also have nipples that are underdeveloped (hypothelia) or absent (athelia).
63
+ In some cases the underlying breast tissue is absent as well (amastia).'
64
+ - "Abnormal state of wakefulness or altered state of consciousness\n\nFor other\
65
+ \ uses, see Trance (disambiguation).\n\nThis article needs additional citations\
66
+ \ for verification. Please help improve this article by adding citations to reliable\
67
+ \ sources. Unsourced material may be challenged and removed. \nFind sources:\
68
+ \ \"Trance\" – news · newspapers · books · scholar · JSTOR (August 2010) (Learn\
69
+ \ how and when to remove this template message) \n \nDissociative trance \n\
70
+ The Oracle at Delphi was famous for her divinatory trances throughout the ancient\
71
+ \ Mediterranean world. Oil painting, John Collier, 1891 \nSpecialtyPsychiatry\
72
+ \ \n \nTrance is an abnormal state of wakefulness in which a person is not self-aware\
73
+ \ and is either altogether unresponsive to external stimuli (but nevertheless\
74
+ \ capable of pursuing and realizing an aim) or is selectively responsive in following\
75
+ \ the directions of the person (if any) who has induced the trance. Trance states\
76
+ \ may occur involuntarily and unbidden."
77
+ - source_sentence: respiratory infections, recurrent infections, primary immunodeficiency
78
+ sentences:
79
+ - 'A number sign (#) is used with this entry because of evidence that autosomal
80
+ dominant common variable immunodeficiency-12 (CVID12) is caused by heterozygous
81
+ mutation in the NFKB1 gene (164011) on chromosome 4q24.
82
+
83
+
84
+ Description
85
+
86
+
87
+ Common variable immunodeficiency-12 is an autosomal dominant primary immunodeficiency
88
+ characterized by recurrent infections, mainly respiratory, associated with hypogammaglobulinemia.
89
+ The disorder shows a highly variable age at onset and highly variable disease
90
+ severity, even within the same family. Some patients have features of autoimmunity
91
+ (summary by Fliegauf et al., 2015).
92
+
93
+
94
+ For a general description and a discussion of genetic heterogeneity of common
95
+ variable immunodeficiency, see CVID1 (607594).
96
+
97
+
98
+ Clinical Features'
99
+ - Kyasanura forest disease (KFD), caused by the KFD virus, is an arbovirus characterized
100
+ by an initial fever, headache and myalgia that can progress to a hemorrhagic disease
101
+ and that in some cases is followed by a second phase characterized by neurological
102
+ manifestations.
103
+ - 'A number sign (#) is used with this entry because of evidence that X-linked syndromic
104
+ mental retardation-33 (MRXS33) is caused by mutation in the TAF1 gene (313650)
105
+ on chromosome Xq13.
106
+
107
+
108
+ Description
109
+
110
+
111
+ X-linked syndromic mental retardation-33 is an X-linked recessive neurodevelopmental
112
+ disorder characterized by delayed psychomotor development, intellectual disability,
113
+ and characteristic facial features (summary by O''Rawe et al., 2015).
114
+
115
+
116
+ Clinical Features'
117
+ - source_sentence: Common variable immunodeficiency, recurrent infections, impaired
118
+ antibody production
119
+ sentences:
120
+ - 'A number sign (#) is used with this entry because this form of common variable
121
+ immunodeficiency (CVID), referred to here as CVID5, is caused by homozygous mutation
122
+ in the CD20 gene (MS4A1; 112210) on chromosome 11q13.
123
+
124
+
125
+ For a general description and a discussion of genetic heterogeneity of common
126
+ variable immunodeficiency, see CVID1 (607594).
127
+
128
+
129
+ Clinical Features'
130
+ - 'A number sign (#) is used with this entry because of evidence that the Stanescu
131
+ type of spondyloepiphyseal dysplasia (SEDSTN) is caused by heterozygous mutation
132
+ in the COL2A1 gene (120140) on chromosome 12q13.
133
+
134
+
135
+ Description'
136
+ - '## Description
137
+
138
+
139
+ Macular dystrophies are inherited retinal dystrophies in which various forms of
140
+ deposits, pigmentary changes, and atrophic lesions are observed in the macula
141
+ lutea, the cone-rich region of the central retina. Vitelliform macular dystrophies
142
+ (VMDs) form a subset of macular dystrophies characterized by round yellow deposits,
143
+ usually at the center of the macula and containing lipofuscin, a chemically heterogeneous
144
+ pigment visualized by autofluorescence imaging of the fundus (summary by Manes
145
+ et al., 2013). In contrast to typical VMD (see 153700), patients with atypical
146
+ VMD may exhibit normal electrooculography, even when severe loss of vision is
147
+ present, and fluorescein angiography is thus the most reliable test for identifying
148
+ affected individuals (Hittner et al., 1984).
149
+
150
+
151
+ ### Genetic Heterogeneity of Vitelliform Macular Dystrophy'
152
+ - source_sentence: Growth retardation, hearing impairment, joint hypermobility, sacral
153
+ caudal remnant
154
+ sentences:
155
+ - 'A number sign (#) is used with this entry because Bruck syndrome-2 (BRKS2) is
156
+ caused by homozygous mutation in the PLOD2 gene (601865), which encodes telopeptide
157
+ lysyl hydroxylase, on chromosome 3q24.
158
+
159
+
160
+ For a phenotypic description and a discussion of genetic heterogeneity of Bruck
161
+ syndrome, see Bruck syndrome-1 (259450).
162
+
163
+
164
+ Clinical Features
165
+
166
+
167
+ Ha-Vinh et al. (2004) described a child with Bruck syndrome who was the offspring
168
+ of healthy nonconsanguineous Turkish parents. At birth, pterygia were present
169
+ at the left elbow and at both knees, and extension of these joints was limited.
170
+ Contractures were also present at the wrists, and there were bilateral clubfeet.
171
+ Bilateral inguinal hernias were present. A fracture of the left arm was recognized
172
+ immediately after birth, and the boy had 2 more fractures in the first 3 months
173
+ of life. His urine contained high levels of hydroxyproline but low levels of collagen
174
+ crosslinks degradation products.'
175
+ - '## Summary
176
+
177
+
178
+ ### Clinical characteristics.
179
+
180
+
181
+ Thrombocytopenia absent radius (TAR) syndrome is characterized by bilateral absence
182
+ of the radii with the presence of both thumbs and thrombocytopenia (<50 platelets/nL)
183
+ that is generally transient. Thrombocytopenia may be congenital or may develop
184
+ within the first few weeks to months of life; in general, thrombocytopenic episodes
185
+ decrease with age. Cow''s milk allergy is common and can be associated with exacerbation
186
+ of thrombocytopenia. Other anomalies of the skeleton (upper and lower limbs, ribs,
187
+ and vertebrae), heart, and genitourinary system (renal anomalies and agenesis
188
+ of uterus, cervix, and upper part of the vagina) can occur.
189
+
190
+
191
+ ### Diagnosis/testing.'
192
+ - A rare multiple congenital anomalies/dysmorphic syndrome characterized by global
193
+ developmental delay, intellectual disability, growth retardation, hearing impairment,
194
+ characteristic facial dysmorphology (including prominent supraorbital ridges,
195
+ downslanting palpebral fissures, deep-set eyes, long face, sagging cheeks, anteverted
196
+ nares, and pointed chin), generalized hypotonia, joint hypermobility, gluteal
197
+ crease with sacral caudal remnant and sacral dimple, and variable neurological
198
+ features. Various ophthalmic, cutaneous, musculoskeletal, gastrointestinal, and
199
+ cardiovascular anomalies have also been described.
200
+ - source_sentence: ear malformations, nipple abnormalities, dental anomalies
201
+ sentences:
202
+ - "This article is an orphan, as no other articles link to it. Please introduce\
203
+ \ links to this page from related articles; try the Find link tool for suggestions.\
204
+ \ (July 2016) \n \nInguinal lymphadenopathy \nInguinal lymphadenopathy \n\
205
+ \ \nInguinal lymphadenopathy causes swollen lymph nodes in the groin area. It\
206
+ \ can be a symptom of infective or neoplastic processes. Infective aetiologies\
207
+ \ include Tuberculosis, HIV, non-specific or reactive lymphadenopathy to recent\
208
+ \ lower limb infection or groin infections. Another notable infectious cause is\
209
+ \ Lymphogranuloma venereum, which is a sexually transmitted infection of the lymphatic\
210
+ \ system. Neoplastic aetiologies include lymphoma, leukaemia and metastatic disease\
211
+ \ from primary tumours in the lower limb, external genitalia or perianal region\
212
+ \ and melanoma.\n\n## References[edit]\n\n * Ferrer R (October 1998). \"Lymphadenopathy:\
213
+ \ differential diagnosis and evaluation\". Am Fam Physician. 58 (6): 1313–20.\
214
+ \ PMID 9803196.\n\n## Further reading[edit]"
215
+ - 'A number sign (#) is used with this entry because scalp-ear-nipple syndrome (SENS)
216
+ is caused by heterozygous mutation in the KCTD1 gene (613420) on chromosome 18q11.
217
+
218
+
219
+ Description
220
+
221
+
222
+ Scalp-ear-nipple syndrome is characterized by aplasia cutis congenita of the scalp,
223
+ breast anomalies that range from hypothelia or athelia to amastia, and minor anomalies
224
+ of the external ears. Less frequent clinical characteristics include nail dystrophy,
225
+ dental anomalies, cutaneous syndactyly of the digits, and renal malformations.
226
+ Penetrance appears to be high, although there is substantial variable expressivity
227
+ within families (Marneros et al., 2013).
228
+
229
+
230
+ Clinical Features'
231
+ - Familial multiple meningioma is a rare, benign neoplasm of the central nervous
232
+ system characterized by the development of multiple or, rarely, solitary meningiomas
233
+ in two or more blood relatives, without other apparent syndromic manifestations.
234
+ Depending on the localization, growth rate and size of the tumors, patients can
235
+ present with subtle, gradually worsening or abrupt and severe neurological compromise
236
+ or can be completely asymptomatic.
237
+ pipeline_tag: sentence-similarity
238
+ model-index:
239
+ - name: SentenceTransformer based on sentence-transformers/multi-qa-mpnet-base-dot-v1
240
+ results:
241
+ - task:
242
+ type: information-retrieval
243
+ name: Information Retrieval
244
+ dataset:
245
+ name: Unknown
246
+ type: unknown
247
+ metrics:
248
+ - type: cosine_accuracy@1
249
+ value: 0.18070477009024494
250
+ name: Cosine Accuracy@1
251
+ - type: cosine_accuracy@3
252
+ value: 0.5426514825956167
253
+ name: Cosine Accuracy@3
254
+ - type: cosine_accuracy@5
255
+ value: 0.7380747743876236
256
+ name: Cosine Accuracy@5
257
+ - type: cosine_accuracy@10
258
+ value: 0.8160721959604641
259
+ name: Cosine Accuracy@10
260
+ - type: cosine_precision@1
261
+ value: 0.18070477009024494
262
+ name: Cosine Precision@1
263
+ - type: cosine_precision@3
264
+ value: 0.1808838275318722
265
+ name: Cosine Precision@3
266
+ - type: cosine_precision@5
267
+ value: 0.1476149548775247
268
+ name: Cosine Precision@5
269
+ - type: cosine_precision@10
270
+ value: 0.08160721959604642
271
+ name: Cosine Precision@10
272
+ - type: cosine_recall@1
273
+ value: 0.18070477009024494
274
+ name: Cosine Recall@1
275
+ - type: cosine_recall@3
276
+ value: 0.5426514825956167
277
+ name: Cosine Recall@3
278
+ - type: cosine_recall@5
279
+ value: 0.7380747743876236
280
+ name: Cosine Recall@5
281
+ - type: cosine_recall@10
282
+ value: 0.8160721959604641
283
+ name: Cosine Recall@10
284
+ - type: cosine_ndcg@10
285
+ value: 0.49469594615283
286
+ name: Cosine Ndcg@10
287
+ - type: cosine_mrr@10
288
+ value: 0.39074511770043246
289
+ name: Cosine Mrr@10
290
+ - type: cosine_map@100
291
+ value: 0.3952600557331103
292
+ name: Cosine Map@100
293
+ - type: dot_accuracy@1
294
+ value: 0.18274602492479589
295
+ name: Dot Accuracy@1
296
+ - type: dot_accuracy@3
297
+ value: 0.5412548345509239
298
+ name: Dot Accuracy@3
299
+ - type: dot_accuracy@5
300
+ value: 0.7430167597765364
301
+ name: Dot Accuracy@5
302
+ - type: dot_accuracy@10
303
+ value: 0.8167168027503223
304
+ name: Dot Accuracy@10
305
+ - type: dot_precision@1
306
+ value: 0.18274602492479589
307
+ name: Dot Precision@1
308
+ - type: dot_precision@3
309
+ value: 0.18041827818364134
310
+ name: Dot Precision@3
311
+ - type: dot_precision@5
312
+ value: 0.1486033519553073
313
+ name: Dot Precision@5
314
+ - type: dot_precision@10
315
+ value: 0.08167168027503223
316
+ name: Dot Precision@10
317
+ - type: dot_recall@1
318
+ value: 0.18274602492479589
319
+ name: Dot Recall@1
320
+ - type: dot_recall@3
321
+ value: 0.5412548345509239
322
+ name: Dot Recall@3
323
+ - type: dot_recall@5
324
+ value: 0.7430167597765364
325
+ name: Dot Recall@5
326
+ - type: dot_recall@10
327
+ value: 0.8167168027503223
328
+ name: Dot Recall@10
329
+ - type: dot_ndcg@10
330
+ value: 0.4956715454485796
331
+ name: Dot Ndcg@10
332
+ - type: dot_mrr@10
333
+ value: 0.391808804169147
334
+ name: Dot Mrr@10
335
+ - type: dot_map@100
336
+ value: 0.39626188359327835
337
+ name: Dot Map@100
338
+ ---
339
+
340
+ # SentenceTransformer based on sentence-transformers/multi-qa-mpnet-base-dot-v1
341
+
342
+ This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [sentence-transformers/multi-qa-mpnet-base-dot-v1](https://huggingface.co/sentence-transformers/multi-qa-mpnet-base-dot-v1). It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
343
+
344
+ ## Model Details
345
+
346
+ ### Model Description
347
+ - **Model Type:** Sentence Transformer
348
+ - **Base model:** [sentence-transformers/multi-qa-mpnet-base-dot-v1](https://huggingface.co/sentence-transformers/multi-qa-mpnet-base-dot-v1) <!-- at revision 3af7c6da5b3e1bea796ef6c97fe237538cbe6e7f -->
349
+ - **Maximum Sequence Length:** 512 tokens
350
+ - **Output Dimensionality:** 768 dimensions
351
+ - **Similarity Function:** Dot Product
352
+ <!-- - **Training Dataset:** Unknown -->
353
+ <!-- - **Language:** Unknown -->
354
+ <!-- - **License:** Unknown -->
355
+
356
+ ### Model Sources
357
+
358
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
359
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
360
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
361
+
362
+ ### Full Model Architecture
363
+
364
+ ```
365
+ SentenceTransformer(
366
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel
367
+ (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
368
+ )
369
+ ```
370
+
371
+ ## Usage
372
+
373
+ ### Direct Usage (Sentence Transformers)
374
+
375
+ First install the Sentence Transformers library:
376
+
377
+ ```bash
378
+ pip install -U sentence-transformers
379
+ ```
380
+
381
+ Then you can load this model and run inference.
382
+ ```python
383
+ from sentence_transformers import SentenceTransformer
384
+
385
+ # Download from the 🤗 Hub
386
+ model = SentenceTransformer("sentence_transformers_model_id")
387
+ # Run inference
388
+ sentences = [
389
+ 'ear malformations, nipple abnormalities, dental anomalies',
390
+ 'A number sign (#) is used with this entry because scalp-ear-nipple syndrome (SENS) is caused by heterozygous mutation in the KCTD1 gene (613420) on chromosome 18q11.\n\nDescription\n\nScalp-ear-nipple syndrome is characterized by aplasia cutis congenita of the scalp, breast anomalies that range from hypothelia or athelia to amastia, and minor anomalies of the external ears. Less frequent clinical characteristics include nail dystrophy, dental anomalies, cutaneous syndactyly of the digits, and renal malformations. Penetrance appears to be high, although there is substantial variable expressivity within families (Marneros et al., 2013).\n\nClinical Features',
391
+ 'This article is an orphan, as no other articles link to it. Please introduce links to this page from related articles; try the Find link tool for suggestions. (July 2016) \n \nInguinal lymphadenopathy \nInguinal lymphadenopathy \n \nInguinal lymphadenopathy causes swollen lymph nodes in the groin area. It can be a symptom of infective or neoplastic processes. Infective aetiologies include Tuberculosis, HIV, non-specific or reactive lymphadenopathy to recent lower limb infection or groin infections. Another notable infectious cause is Lymphogranuloma venereum, which is a sexually transmitted infection of the lymphatic system. Neoplastic aetiologies include lymphoma, leukaemia and metastatic disease from primary tumours in the lower limb, external genitalia or perianal region and melanoma.\n\n## References[edit]\n\n * Ferrer R (October 1998). "Lymphadenopathy: differential diagnosis and evaluation". Am Fam Physician. 58 (6): 1313–20. PMID 9803196.\n\n## Further reading[edit]',
392
+ ]
393
+ embeddings = model.encode(sentences)
394
+ print(embeddings.shape)
395
+ # (3, 768)
396
+
397
+ # Get the similarity scores for the embeddings
398
+ similarities = model.similarity(embeddings, embeddings)
399
+ print(similarities.shape)
400
+ # torch.Size([3, 3])
401
+ ```
402
+
403
+ <!--
404
+ ### Direct Usage (Transformers)
405
+
406
+ <details><summary>Click to see the direct usage in Transformers</summary>
407
+
408
+ </details>
409
+ -->
410
+
411
+ <!--
412
+ ### Downstream Usage (Sentence Transformers)
413
+
414
+ You can finetune this model on your own dataset.
415
+
416
+ <details><summary>Click to expand</summary>
417
+
418
+ </details>
419
+ -->
420
+
421
+ <!--
422
+ ### Out-of-Scope Use
423
+
424
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
425
+ -->
426
+
427
+ ## Evaluation
428
+
429
+ ### Metrics
430
+
431
+ #### Information Retrieval
432
+
433
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
434
+
435
+ | Metric | Value |
436
+ |:--------------------|:-----------|
437
+ | cosine_accuracy@1 | 0.1807 |
438
+ | cosine_accuracy@3 | 0.5427 |
439
+ | cosine_accuracy@5 | 0.7381 |
440
+ | cosine_accuracy@10 | 0.8161 |
441
+ | cosine_precision@1 | 0.1807 |
442
+ | cosine_precision@3 | 0.1809 |
443
+ | cosine_precision@5 | 0.1476 |
444
+ | cosine_precision@10 | 0.0816 |
445
+ | cosine_recall@1 | 0.1807 |
446
+ | cosine_recall@3 | 0.5427 |
447
+ | cosine_recall@5 | 0.7381 |
448
+ | cosine_recall@10 | 0.8161 |
449
+ | cosine_ndcg@10 | 0.4947 |
450
+ | cosine_mrr@10 | 0.3907 |
451
+ | cosine_map@100 | 0.3953 |
452
+ | dot_accuracy@1 | 0.1827 |
453
+ | dot_accuracy@3 | 0.5413 |
454
+ | dot_accuracy@5 | 0.743 |
455
+ | dot_accuracy@10 | 0.8167 |
456
+ | dot_precision@1 | 0.1827 |
457
+ | dot_precision@3 | 0.1804 |
458
+ | dot_precision@5 | 0.1486 |
459
+ | dot_precision@10 | 0.0817 |
460
+ | dot_recall@1 | 0.1827 |
461
+ | dot_recall@3 | 0.5413 |
462
+ | dot_recall@5 | 0.743 |
463
+ | dot_recall@10 | 0.8167 |
464
+ | dot_ndcg@10 | 0.4957 |
465
+ | dot_mrr@10 | 0.3918 |
466
+ | **dot_map@100** | **0.3963** |
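In the table above, recall@k equals accuracy@k, and precision@k equals accuracy@k divided by k (for example, 0.8161 / 10 ≈ 0.0816). That pattern is what one would expect if each query has exactly one relevant chunk. The sketch below, written under that assumption, shows how such numbers can be computed. The rankings, query ids, and chunk ids are hypothetical, and this is not the `InformationRetrievalEvaluator` implementation.

```python
from typing import Dict, List


def accuracy_at_k(ranked: List[str], relevant: str, k: int) -> float:
    """1.0 if the relevant chunk appears in the top k results, else 0.0."""
    return 1.0 if relevant in ranked[:k] else 0.0


def precision_at_k(ranked: List[str], relevant: str, k: int) -> float:
    """Fraction of the top k results that are relevant (at most 1/k here)."""
    return sum(1 for doc_id in ranked[:k] if doc_id == relevant) / k


def reciprocal_rank_at_k(ranked: List[str], relevant: str, k: int) -> float:
    """1/rank of the relevant chunk if it is in the top k, else 0.0."""
    for rank, doc_id in enumerate(ranked[:k], start=1):
        if doc_id == relevant:
            return 1.0 / rank
    return 0.0


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


# Hypothetical ranked results per query (best match first) and gold labels.
rankings: Dict[str, List[str]] = {
    "q1": ["d1", "d7", "d3"],  # relevant chunk at rank 1
    "q2": ["d9", "d2", "d4"],  # relevant chunk at rank 2
    "q3": ["d5", "d6", "d8"],  # relevant chunk not retrieved
}
gold: Dict[str, str] = {"q1": "d1", "q2": "d2", "q3": "d0"}

acc1 = mean([accuracy_at_k(rankings[q], gold[q], 1) for q in rankings])
acc3 = mean([accuracy_at_k(rankings[q], gold[q], 3) for q in rankings])
prec3 = mean([precision_at_k(rankings[q], gold[q], 3) for q in rankings])
mrr3 = mean([reciprocal_rank_at_k(rankings[q], gold[q], 3) for q in rankings])
print(acc1, acc3, prec3, mrr3)
```

With one relevant chunk per query, `prec3` is always `acc3 / 3`, which mirrors the relationship between the accuracy and precision rows in the table.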
467
+
468
+ <!--
469
+ ## Bias, Risks and Limitations
470
+
471
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
472
+ -->
473
+
474
+ <!--
475
+ ### Recommendations
476
+
477
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
478
+ -->
479
+
480
+ ## Training Details
481
+
482
+ ### Training Dataset
483
+
484
+ #### Unnamed Dataset
485
+
486
+
487
+ * Size: 98,928 training samples
488
+ * Columns: <code>queries</code> and <code>chunks</code>
489
+ * Approximate statistics based on the first 1000 samples:
490
+ | | queries | chunks |
491
+ |:--------|:---------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
492
+ | type | string | string |
493
+ | details | <ul><li>min: 7 tokens</li><li>mean: 17.4 tokens</li><li>max: 76 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 159.93 tokens</li><li>max: 334 tokens</li></ul> |
494
+ * Samples:
495
+ | queries | chunks |
496
+ |:-------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
497
+ | <code>fever, malaise, headaches, lymphadenopathy</code> | <code>A rare, acquired, self-limiting, infectious disease due to the mite-borne bacteria Rickettsia akari characterized by an asymptomatic, 0.5 to 2 cm in diameter papulovesicle that typically ulcerates and forms an eschar, followed by a generalized papulovesicular rash associating variable constitutional symptoms, such as localized lymphadenopathy, fever, malaise, and headaches. Additonal symptoms may include diaphoresis, myalgia and, less frequently, rhinorrhea, pharyngitis, nausea, vomiting, splenomegaly, conjunctival hyperemia, and abdominal pain. Systemic symtoms resolve within 6-10 days.</code> |
498
+ | <code>rash, papulovesicular, generalized, constitutional symptoms</code> | <code>A rare, acquired, self-limiting, infectious disease due to the mite-borne bacteria Rickettsia akari characterized by an asymptomatic, 0.5 to 2 cm in diameter papulovesicle that typically ulcerates and forms an eschar, followed by a generalized papulovesicular rash associating variable constitutional symptoms, such as localized lymphadenopathy, fever, malaise, and headaches. Additonal symptoms may include diaphoresis, myalgia and, less frequently, rhinorrhea, pharyngitis, nausea, vomiting, splenomegaly, conjunctival hyperemia, and abdominal pain. Systemic symtoms resolve within 6-10 days.</code> |
499
+ | <code>myalgia, diaphoresis, nausea, vomiting</code> | <code>A rare, acquired, self-limiting, infectious disease due to the mite-borne bacteria Rickettsia akari characterized by an asymptomatic, 0.5 to 2 cm in diameter papulovesicle that typically ulcerates and forms an eschar, followed by a generalized papulovesicular rash associating variable constitutional symptoms, such as localized lymphadenopathy, fever, malaise, and headaches. Additonal symptoms may include diaphoresis, myalgia and, less frequently, rhinorrhea, pharyngitis, nausea, vomiting, splenomegaly, conjunctival hyperemia, and abdominal pain. Systemic symtoms resolve within 6-10 days.</code> |
500
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
501
+ ```json
502
+ {
503
+ "scale": 1,
504
+ "similarity_fct": "dot_score"
505
+ }
506
+ ```
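To make the two parameters above concrete: `MultipleNegativesRankingLoss` treats, for each query in a batch, its paired chunk as the positive and the other chunks in the batch as negatives, then applies cross-entropy over the scaled similarity scores. The following pure-Python sketch illustrates that idea with `scale` 1 and dot-product scoring. The function name `mnr_loss` and the toy 2-dimensional embeddings are assumptions for illustration; the library itself operates on batched tensors.

```python
import math
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot-product similarity, matching "similarity_fct": "dot_score"."""
    return sum(x * y for x, y in zip(a, b))


def mnr_loss(queries: List[Vector], chunks: List[Vector], scale: float = 1.0) -> float:
    """Mean cross-entropy where chunks[i] is the positive for queries[i]."""
    losses = []
    for i, query in enumerate(queries):
        scores = [scale * dot(query, chunk) for chunk in chunks]
        # Numerically stable log-sum-exp over all in-batch candidates.
        top = max(scores)
        log_sum_exp = top + math.log(sum(math.exp(s - top) for s in scores))
        losses.append(log_sum_exp - scores[i])
    return sum(losses) / len(losses)


# Toy batch of two (query, chunk) pairs with hypothetical embeddings.
queries = [[1.0, 0.0], [0.0, 1.0]]
chunks = [[2.0, 0.0], [0.0, 2.0]]

loss = mnr_loss(queries, chunks)
shuffled_loss = mnr_loss(queries, chunks[::-1])  # positives misaligned
print(loss, shuffled_loss)
```

The loss is low when each query scores its own chunk higher than the other chunks in the batch, and it grows when the pairing is broken, as the misaligned example shows.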
507
+
508
+ ### Evaluation Dataset
509
+
510
+ #### Unnamed Dataset
511
+
512
+
513
+ * Size: 9,308 evaluation samples
514
+ * Columns: <code>queries</code> and <code>chunks</code>
515
+ * Approximate statistics based on the first 1000 samples:
516
+ | | queries | chunks |
517
+ |:--------|:---------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
518
+ | type | string | string |
519
+ | details | <ul><li>min: 7 tokens</li><li>mean: 17.8 tokens</li><li>max: 48 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 166.19 tokens</li><li>max: 299 tokens</li></ul> |
520
+ * Samples:
521
+ | queries | chunks |
522
+ |:-------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
523
+ | <code>facial features, overgrowth, learning disabilities, delayed development</code> | <code>Sotos syndrome is a condition characterized mainly by distinctive facial features; overgrowth in childhood; and learning disabilities or delayed development. Facial features may include a long, narrow face; a high forehead; flushed (reddened) cheeks; a small, pointed chin; and down-slanting palpebral fissures. Affected infants and children tend to grow quickly; they are significantly taller than their siblings and peers and have a large head. Other signs and symptoms may include intellectual disability; behavioral problems; problems with speech and language; and/or weak muscle tone (hypotonia). Sotos syndrome is usually caused by a mutation in the NSD1 gene and is inherited in an autosomal dominant manner. About 95% of cases are due to a new mutation in the affected person and occur sporadically (are not inherited).</code> |
524
+ | <code>long face, high forehead, flushed cheeks, small chin, down-slanting palpebral fissures</code> | <code>Sotos syndrome is a condition characterized mainly by distinctive facial features; overgrowth in childhood; and learning disabilities or delayed development. Facial features may include a long, narrow face; a high forehead; flushed (reddened) cheeks; a small, pointed chin; and down-slanting palpebral fissures. Affected infants and children tend to grow quickly; they are significantly taller than their siblings and peers and have a large head. Other signs and symptoms may include intellectual disability; behavioral problems; problems with speech and language; and/or weak muscle tone (hypotonia). Sotos syndrome is usually caused by a mutation in the NSD1 gene and is inherited in an autosomal dominant manner. About 95% of cases are due to a new mutation in the affected person and occur sporadically (are not inherited).</code> |
525
+ | <code>intellectual disability, behavioral problems, speech and language difficulties, hypotonia</code> | <code>Sotos syndrome is a condition characterized mainly by distinctive facial features; overgrowth in childhood; and learning disabilities or delayed development. Facial features may include a long, narrow face; a high forehead; flushed (reddened) cheeks; a small, pointed chin; and down-slanting palpebral fissures. Affected infants and children tend to grow quickly; they are significantly taller than their siblings and peers and have a large head. Other signs and symptoms may include intellectual disability; behavioral problems; problems with speech and language; and/or weak muscle tone (hypotonia). Sotos syndrome is usually caused by a mutation in the NSD1 gene and is inherited in an autosomal dominant manner. About 95% of cases are due to a new mutation in the affected person and occur sporadically (are not inherited).</code> |
526
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
527
+ ```json
528
+ {
529
+ "scale": 1,
530
+ "similarity_fct": "dot_score"
531
+ }
532
+ ```
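The loss parameters above can be illustrated with a minimal, dependency-free sketch. It is not the library's implementation; it assumes the usual formulation of this loss, where each anchor's paired positive is the correct class and every other positive in the batch serves as a negative, with scores given by a dot product multiplied by `scale`. The function names and the toy 2-d vectors are hypothetical.

```python
import math

def dot_score(a, b):
    """Plain dot product between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def multiple_negatives_ranking_loss(anchors, positives, scale=1.0):
    """Mean cross-entropy where positives[i] is the match for anchors[i]
    and all other positives in the batch act as in-batch negatives."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * dot_score(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over every candidate in the batch.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Hypothetical 2-d embeddings for a batch of three (query, passage) pairs.
anchors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
positives = [[2.0, 0.0], [0.0, 2.0], [1.5, 1.5]]
loss = multiple_negatives_ranking_loss(anchors, positives, scale=1.0)
```

With uninformative (all-zero) embeddings the loss equals the log of the batch size, so values well below that indicate the matching pairs are being ranked first.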
533
+
534
+ ### Training Hyperparameters
535
+ #### Non-Default Hyperparameters
536
+
537
+ - `eval_strategy`: steps
538
+ - `per_device_train_batch_size`: 32
539
+ - `per_device_eval_batch_size`: 32
540
+ - `learning_rate`: 2e-05
541
+ - `num_train_epochs`: 25
542
+ - `warmup_ratio`: 0.1
543
+ - `fp16`: True
544
+ - `load_best_model_at_end`: True
545
+ - `eval_on_start`: True
546
+ - `batch_sampler`: no_duplicates
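As a rough sketch of what `learning_rate`, `warmup_ratio`, and the `linear` scheduler listed under All Hyperparameters imply together, the snippet below computes the learning rate at a given step. It assumes the common linear-warmup, linear-decay rule and takes the total step count (19300) from the Training Logs table; the function and constant names are illustrative, not taken from the training script.

```python
import math

LEARNING_RATE = 2e-05  # `learning_rate` from the list above
WARMUP_RATIO = 0.1     # `warmup_ratio` from the list above
TOTAL_STEPS = 19300    # final step in the Training Logs table (25 epochs)

# Assumption: warmup length is the total step count times the ratio, rounded up.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)

def linear_schedule_lr(step):
    """Learning rate at `step`: linear ramp up during warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * (step / max(1, WARMUP_STEPS))
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * (remaining / max(1, TOTAL_STEPS - WARMUP_STEPS))
```

Under these assumptions the rate climbs for about the first 1,930 steps (roughly 2.5 epochs), peaks at 2e-05, and then decays linearly until the last step.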
547
+
548
+ #### All Hyperparameters
549
+ <details><summary>Click to expand</summary>
550
+
551
+ - `overwrite_output_dir`: False
552
+ - `do_predict`: False
553
+ - `eval_strategy`: steps
554
+ - `prediction_loss_only`: True
555
+ - `per_device_train_batch_size`: 32
556
+ - `per_device_eval_batch_size`: 32
557
+ - `per_gpu_train_batch_size`: None
558
+ - `per_gpu_eval_batch_size`: None
559
+ - `gradient_accumulation_steps`: 1
560
+ - `eval_accumulation_steps`: None
561
+ - `torch_empty_cache_steps`: None
562
+ - `learning_rate`: 2e-05
563
+ - `weight_decay`: 0.0
564
+ - `adam_beta1`: 0.9
565
+ - `adam_beta2`: 0.999
566
+ - `adam_epsilon`: 1e-08
567
+ - `max_grad_norm`: 1.0
568
+ - `num_train_epochs`: 25
569
+ - `max_steps`: -1
570
+ - `lr_scheduler_type`: linear
571
+ - `lr_scheduler_kwargs`: {}
572
+ - `warmup_ratio`: 0.1
573
+ - `warmup_steps`: 0
574
+ - `log_level`: passive
575
+ - `log_level_replica`: warning
576
+ - `log_on_each_node`: True
577
+ - `logging_nan_inf_filter`: True
578
+ - `save_safetensors`: True
579
+ - `save_on_each_node`: False
580
+ - `save_only_model`: False
581
+ - `restore_callback_states_from_checkpoint`: False
582
+ - `no_cuda`: False
583
+ - `use_cpu`: False
584
+ - `use_mps_device`: False
585
+ - `seed`: 42
586
+ - `data_seed`: None
587
+ - `jit_mode_eval`: False
588
+ - `use_ipex`: False
589
+ - `bf16`: False
590
+ - `fp16`: True
591
+ - `fp16_opt_level`: O1
592
+ - `half_precision_backend`: auto
593
+ - `bf16_full_eval`: False
594
+ - `fp16_full_eval`: False
595
+ - `tf32`: None
596
+ - `local_rank`: 0
597
+ - `ddp_backend`: None
598
+ - `tpu_num_cores`: None
599
+ - `tpu_metrics_debug`: False
600
+ - `debug`: []
601
+ - `dataloader_drop_last`: True
602
+ - `dataloader_num_workers`: 0
603
+ - `dataloader_prefetch_factor`: None
604
+ - `past_index`: -1
605
+ - `disable_tqdm`: False
606
+ - `remove_unused_columns`: True
607
+ - `label_names`: None
608
+ - `load_best_model_at_end`: True
609
+ - `ignore_data_skip`: False
610
+ - `fsdp`: []
611
+ - `fsdp_min_num_params`: 0
612
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
613
+ - `fsdp_transformer_layer_cls_to_wrap`: None
614
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
615
+ - `deepspeed`: None
616
+ - `label_smoothing_factor`: 0.0
617
+ - `optim`: adamw_torch
618
+ - `optim_args`: None
619
+ - `adafactor`: False
620
+ - `group_by_length`: False
621
+ - `length_column_name`: length
622
+ - `ddp_find_unused_parameters`: None
623
+ - `ddp_bucket_cap_mb`: None
624
+ - `ddp_broadcast_buffers`: False
625
+ - `dataloader_pin_memory`: True
626
+ - `dataloader_persistent_workers`: False
627
+ - `skip_memory_metrics`: True
628
+ - `use_legacy_prediction_loop`: False
629
+ - `push_to_hub`: False
630
+ - `resume_from_checkpoint`: None
631
+ - `hub_model_id`: None
632
+ - `hub_strategy`: every_save
633
+ - `hub_private_repo`: False
634
+ - `hub_always_push`: False
635
+ - `gradient_checkpointing`: False
636
+ - `gradient_checkpointing_kwargs`: None
637
+ - `include_inputs_for_metrics`: False
638
+ - `eval_do_concat_batches`: True
639
+ - `fp16_backend`: auto
640
+ - `push_to_hub_model_id`: None
641
+ - `push_to_hub_organization`: None
642
+ - `mp_parameters`:
643
+ - `auto_find_batch_size`: False
644
+ - `full_determinism`: False
645
+ - `torchdynamo`: None
646
+ - `ray_scope`: last
647
+ - `ddp_timeout`: 1800
648
+ - `torch_compile`: False
649
+ - `torch_compile_backend`: None
650
+ - `torch_compile_mode`: None
651
+ - `dispatch_batches`: None
652
+ - `split_batches`: None
653
+ - `include_tokens_per_second`: False
654
+ - `include_num_input_tokens_seen`: False
655
+ - `neftune_noise_alpha`: None
656
+ - `optim_target_modules`: None
657
+ - `batch_eval_metrics`: False
658
+ - `eval_on_start`: True
659
+ - `eval_use_gather_object`: False
660
+ - `batch_sampler`: no_duplicates
661
+ - `multi_dataset_batch_sampler`: proportional
662
+
663
+ </details>
664
+
665
+ ### Training Logs
666
+ <details><summary>Click to expand</summary>
667
+
668
+ | Epoch | Step | Training Loss | loss | dot_map@100 |
669
+ |:----------:|:--------:|:-------------:|:---------:|:-----------:|
670
+ | 0 | 0 | - | 1.8701 | 0.2095 |
671
+ | 0.1295 | 100 | 1.5494 | - | - |
672
+ | 0.2591 | 200 | 0.9993 | - | - |
673
+ | 0.3886 | 300 | 0.7225 | - | - |
674
+ | 0.5181 | 400 | 0.6533 | - | - |
675
+ | 0.6477 | 500 | 0.6618 | 0.5939 | 0.3722 |
676
+ | 0.7772 | 600 | 0.6454 | - | - |
677
+ | 0.9067 | 700 | 0.5568 | - | - |
678
+ | 1.0363 | 800 | 0.5435 | - | - |
679
+ | 1.1658 | 900 | 0.499 | - | - |
680
+ | 1.2953 | 1000 | 0.5386 | 0.4768 | 0.3842 |
681
+ | 1.4249 | 1100 | 0.5077 | - | - |
682
+ | 1.5544 | 1200 | 0.4929 | - | - |
683
+ | 1.6839 | 1300 | 0.5194 | - | - |
684
+ | 1.8135 | 1400 | 0.5157 | - | - |
685
+ | 1.9430 | 1500 | 0.4337 | 0.4455 | 0.3894 |
686
+ | 2.0725 | 1600 | 0.4373 | - | - |
687
+ | 2.2021 | 1700 | 0.4569 | - | - |
688
+ | 2.3316 | 1800 | 0.4084 | - | - |
689
+ | 2.4611 | 1900 | 0.42 | - | - |
690
+ | 2.5907 | 2000 | 0.4112 | 0.4578 | 0.3886 |
691
+ | 2.7202 | 2100 | 0.4498 | - | - |
692
+ | 2.8497 | 2200 | 0.415 | - | - |
693
+ | 2.9793 | 2300 | 0.3734 | - | - |
694
+ | 3.1088 | 2400 | 0.3359 | - | - |
695
+ | 3.2383 | 2500 | 0.3923 | 0.4339 | 0.3929 |
696
+ | 3.3679 | 2600 | 0.3345 | - | - |
697
+ | 3.4974 | 2700 | 0.3324 | - | - |
698
+ | 3.6269 | 2800 | 0.3574 | - | - |
699
+ | 3.7565 | 2900 | 0.4078 | - | - |
700
+ | 3.8860 | 3000 | 0.3221 | 0.4293 | 0.3904 |
701
+ | 4.0155 | 3100 | 0.2895 | - | - |
702
+ | 4.1451 | 3200 | 0.2821 | - | - |
703
+ | 4.2746 | 3300 | 0.3192 | - | - |
704
+ | 4.4041 | 3400 | 0.28 | - | - |
705
+ | 4.5337 | 3500 | 0.2716 | 0.4486 | 0.3885 |
706
+ | 4.6632 | 3600 | 0.3147 | - | - |
707
+ | 4.7927 | 3700 | 0.3565 | - | - |
708
+ | 4.9223 | 3800 | 0.2465 | - | - |
709
+ | 5.0518 | 3900 | 0.2436 | - | - |
710
+ | 5.1813 | 4000 | 0.2297 | 0.4486 | 0.3917 |
711
+ | 5.3109 | 4100 | 0.2538 | - | - |
712
+ | 5.4404 | 4200 | 0.2448 | - | - |
713
+ | 5.5699 | 4300 | 0.2433 | - | - |
714
+ | 5.6995 | 4400 | 0.3017 | - | - |
715
+ | 5.8290 | 4500 | 0.2958 | 0.4737 | 0.3934 |
716
+ | 5.9585 | 4600 | 0.2142 | - | - |
717
+ | 6.0881 | 4700 | 0.1939 | - | - |
718
+ | 6.2176 | 4800 | 0.2449 | - | - |
719
+ | 6.3472 | 4900 | 0.2026 | - | - |
720
+ | 6.4767 | 5000 | 0.2006 | 0.4901 | 0.3895 |
721
+ | 6.6062 | 5100 | 0.2118 | - | - |
722
+ | 6.7358 | 5200 | 0.3064 | - | - |
723
+ | 6.8653 | 5300 | 0.2276 | - | - |
724
+ | 6.9948 | 5400 | 0.1809 | - | - |
725
+ | 7.1244 | 5500 | 0.1782 | 0.4992 | 0.3915 |
726
+ | 7.2539 | 5600 | 0.2211 | - | - |
727
+ | 7.3834 | 5700 | 0.1728 | - | - |
728
+ | 7.5130 | 5800 | 0.1651 | - | - |
729
+ | 7.6425 | 5900 | 0.2158 | - | - |
730
+ | 7.7720 | 6000 | 0.2864 | 0.5113 | 0.3892 |
731
+ | 7.9016 | 6100 | 0.179 | - | - |
732
+ | 8.0311 | 6200 | 0.1677 | - | - |
733
+ | 8.1606 | 6300 | 0.1517 | - | - |
734
+ | 8.2902 | 6400 | 0.1851 | - | - |
735
+ | 8.4197 | 6500 | 0.1646 | 0.5030 | 0.3933 |
736
+ | 8.5492 | 6600 | 0.1608 | - | - |
737
+ | 8.6788 | 6700 | 0.217 | - | - |
738
+ | 8.8083 | 6800 | 0.2357 | - | - |
739
+ | 8.9378 | 6900 | 0.1404 | - | - |
740
+ | 9.0674 | 7000 | 0.1465 | 0.5153 | 0.3877 |
741
+ | 9.1969 | 7100 | 0.1791 | - | - |
742
+ | 9.3264 | 7200 | 0.1261 | - | - |
743
+ | 9.4560 | 7300 | 0.1406 | - | - |
744
+ | 9.5855 | 7400 | 0.1626 | - | - |
745
+ | 9.7150 | 7500 | 0.223 | 0.5326 | 0.3939 |
746
+ | 9.8446 | 7600 | 0.1806 | - | - |
747
+ | 9.9741 | 7700 | 0.1289 | - | - |
748
+ | 10.1036 | 7800 | 0.1269 | - | - |
749
+ | 10.2332 | 7900 | 0.1609 | - | - |
750
+ | 10.3627 | 8000 | 0.1279 | 0.5113 | 0.3933 |
751
+ | 10.4922 | 8100 | 0.1264 | - | - |
752
+ | 10.6218 | 8200 | 0.1453 | - | - |
753
+ | 10.7513 | 8300 | 0.2227 | - | - |
754
+ | 10.8808 | 8400 | 0.1314 | - | - |
755
+ | 11.0104 | 8500 | 0.1192 | 0.5444 | 0.3925 |
756
+ | 11.1399 | 8600 | 0.1164 | - | - |
757
+ | 11.2694 | 8700 | 0.1418 | - | - |
758
+ | 11.3990 | 8800 | 0.1202 | - | - |
759
+ | 11.5285 | 8900 | 0.1152 | - | - |
760
+ | **11.658** | **9000** | **0.1454** | **0.529** | **0.3963** |
761
+ | 11.7876 | 9100 | 0.1952 | - | - |
762
+ | 11.9171 | 9200 | 0.1079 | - | - |
763
+ | 12.0466 | 9300 | 0.1139 | - | - |
764
+ | 12.1762 | 9400 | 0.1067 | - | - |
765
+ | 12.3057 | 9500 | 0.1219 | 0.5257 | 0.3938 |
766
+ | 12.4352 | 9600 | 0.119 | - | - |
767
+ | 12.5648 | 9700 | 0.1195 | - | - |
768
+ | 12.6943 | 9800 | 0.158 | - | - |
769
+ | 12.8238 | 9900 | 0.156 | - | - |
770
+ | 12.9534 | 10000 | 0.0974 | 0.5434 | 0.3934 |
771
+ | 13.0829 | 10100 | 0.0928 | - | - |
772
+ | 13.2124 | 10200 | 0.1266 | - | - |
773
+ | 13.3420 | 10300 | 0.0964 | - | - |
774
+ | 13.4715 | 10400 | 0.1007 | - | - |
775
+ | 13.6010 | 10500 | 0.112 | 0.5789 | 0.3893 |
776
+ | 13.7306 | 10600 | 0.1699 | - | - |
777
+ | 13.8601 | 10700 | 0.1084 | - | - |
778
+ | 13.9896 | 10800 | 0.0967 | - | - |
779
+ | 14.1192 | 10900 | 0.0856 | - | - |
780
+ | 14.2487 | 11000 | 0.1142 | 0.5252 | 0.3933 |
781
+ | 14.3782 | 11100 | 0.0891 | - | - |
782
+ | 14.5078 | 11200 | 0.0911 | - | - |
783
+ | 14.6373 | 11300 | 0.1128 | - | - |
784
+ | 14.7668 | 11400 | 0.1686 | - | - |
785
+ | 14.8964 | 11500 | 0.0874 | 0.5874 | 0.3945 |
786
+ | 15.0259 | 11600 | 0.0909 | - | - |
787
+ | 15.1554 | 11700 | 0.0778 | - | - |
788
+ | 15.2850 | 11800 | 0.1055 | - | - |
789
+ | 15.4145 | 11900 | 0.0872 | - | - |
790
+ | 15.5440 | 12000 | 0.0884 | 0.5894 | 0.3934 |
791
+ | 15.6736 | 12100 | 0.1101 | - | - |
792
+ | 15.8031 | 12200 | 0.1354 | - | - |
793
+ | 15.9326 | 12300 | 0.0762 | - | - |
794
+ | 16.0622 | 12400 | 0.0782 | - | - |
795
+ | 16.1917 | 12500 | 0.0936 | 0.5589 | 0.3919 |
796
+ | 16.3212 | 12600 | 0.072 | - | - |
797
+ | 16.4508 | 12700 | 0.0806 | - | - |
798
+ | 16.5803 | 12800 | 0.0929 | - | - |
799
+ | 16.7098 | 12900 | 0.1215 | - | - |
800
+ | 16.8394 | 13000 | 0.1039 | 0.6025 | 0.3926 |
801
+ | 16.9689 | 13100 | 0.0738 | - | - |
802
+ | 17.0984 | 13200 | 0.0651 | - | - |
803
+ | 17.2280 | 13300 | 0.0943 | - | - |
804
+ | 17.3575 | 13400 | 0.0678 | - | - |
805
+ | 17.4870 | 13500 | 0.077 | 0.6002 | 0.3941 |
806
+ | 17.6166 | 13600 | 0.0839 | - | - |
807
+ | 17.7461 | 13700 | 0.1268 | - | - |
808
+ | 17.8756 | 13800 | 0.0764 | - | - |
809
+ | 18.0052 | 13900 | 0.0686 | - | - |
810
+ | 18.1347 | 14000 | 0.0697 | 0.5898 | 0.3913 |
811
+ | 18.2642 | 14100 | 0.0871 | - | - |
812
+ | 18.3938 | 14200 | 0.0699 | - | - |
813
+ | 18.5233 | 14300 | 0.0611 | - | - |
814
+ | 18.6528 | 14400 | 0.0872 | - | - |
815
+ | 18.7824 | 14500 | 0.1281 | 0.6087 | 0.3927 |
816
+ | 18.9119 | 14600 | 0.0583 | - | - |
817
+ | 19.0415 | 14700 | 0.0658 | - | - |
818
+ | 19.1710 | 14800 | 0.0595 | - | - |
819
+ | 19.3005 | 14900 | 0.0816 | - | - |
820
+ | 19.4301 | 15000 | 0.0699 | 0.6078 | 0.3965 |
821
+ | 19.5596 | 15100 | 0.0729 | - | - |
822
+ | 19.6891 | 15200 | 0.0908 | - | - |
823
+ | 19.8187 | 15300 | 0.0978 | - | - |
824
+ | 19.9482 | 15400 | 0.0585 | - | - |
825
+ | 20.0777 | 15500 | 0.0557 | 0.5861 | 0.3925 |
826
+ | 20.2073 | 15600 | 0.0787 | - | - |
827
+ | 20.3368 | 15700 | 0.061 | - | - |
828
+ | 20.4663 | 15800 | 0.0638 | - | - |
829
+ | 20.5959 | 15900 | 0.0656 | - | - |
830
+ | 20.7254 | 16000 | 0.1003 | 0.6032 | 0.3923 |
831
+ | 20.8549 | 16100 | 0.0718 | - | - |
832
+ | 20.9845 | 16200 | 0.0625 | - | - |
833
+ | 21.1140 | 16300 | 0.0532 | - | - |
834
+ | 21.2435 | 16400 | 0.0739 | - | - |
835
+ | 21.3731 | 16500 | 0.0552 | 0.6080 | 0.3942 |
836
+ | 21.5026 | 16600 | 0.0588 | - | - |
837
+ | 21.6321 | 16700 | 0.0716 | - | - |
838
+ | 21.7617 | 16800 | 0.1078 | - | - |
839
+ | 21.8912 | 16900 | 0.0559 | - | - |
840
+ | 22.0207 | 17000 | 0.0596 | 0.6044 | 0.3922 |
841
+ | 22.1503 | 17100 | 0.0512 | - | - |
842
+ | 22.2798 | 17200 | 0.0716 | - | - |
843
+ | 22.4093 | 17300 | 0.0574 | - | - |
844
+ | 22.5389 | 17400 | 0.058 | - | - |
845
+ | 22.6684 | 17500 | 0.07 | 0.6117 | 0.3942 |
846
+ | 22.7979 | 17600 | 0.0965 | - | - |
847
+ | 22.9275 | 17700 | 0.0507 | - | - |
848
+ | 23.0570 | 17800 | 0.0498 | - | - |
849
+ | 23.1865 | 17900 | 0.0524 | - | - |
850
+ | 23.3161 | 18000 | 0.0656 | 0.5936 | 0.3936 |
851
+ | 23.4456 | 18100 | 0.057 | - | - |
852
+ | 23.5751 | 18200 | 0.0619 | - | - |
853
+ | 23.7047 | 18300 | 0.0785 | - | - |
854
+ | 23.8342 | 18400 | 0.0729 | - | - |
855
+ | 23.9637 | 18500 | 0.0541 | 0.6174 | 0.3979 |
856
+ | 24.0933 | 18600 | 0.0456 | - | - |
857
+ | 24.2228 | 18700 | 0.0696 | - | - |
858
+ | 24.3523 | 18800 | 0.048 | - | - |
859
+ | 24.4819 | 18900 | 0.0547 | - | - |
860
+ | 24.6114 | 19000 | 0.0553 | 0.6146 | 0.3962 |
861
+ | 24.7409 | 19100 | 0.0936 | - | - |
862
+ | 24.8705 | 19200 | 0.0579 | - | - |
863
+ | 25.0 | 19300 | 0.0498 | 0.5290 | 0.3963 |
864
+
865
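+ * The bold row denotes the saved checkpoint. The final row (step 19300) repeats that checkpoint's evaluation loss and dot_map@100, which is consistent with `load_best_model_at_end: True`.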
866
+ </details>
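The Epoch column in the table above follows directly from the step count. A small sketch of that arithmetic, using only numbers reported in this card (19300 total steps, 25 epochs, batch size 32); the helper name `epoch_at` is hypothetical:

```python
TOTAL_STEPS = 19300  # last step in the table
NUM_EPOCHS = 25      # `num_train_epochs`
BATCH_SIZE = 32      # `per_device_train_batch_size`

# Optimizer steps per epoch (gradient_accumulation_steps is 1).
steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS

def epoch_at(step):
    """Fractional epoch reached after `step` optimizer steps, rounded as in the table."""
    return round(step / steps_per_epoch, 4)

# Pairs consumed per epoch on a single device; with `dataloader_drop_last: True`
# and the `no_duplicates` sampler, the dataset itself may be somewhat larger.
samples_per_epoch = steps_per_epoch * BATCH_SIZE
```

This gives 772 steps per epoch, which reproduces the logged values such as 0.1295 at step 100 and 11.658 at step 9000, and implies about 24,704 training pairs seen per epoch.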
867
+
868
+ ### Framework Versions
869
+ - Python: 3.11.9
870
+ - Sentence Transformers: 3.0.1
871
+ - Transformers: 4.43.3
872
+ - PyTorch: 2.3.1+cu121
873
+ - Accelerate: 0.30.1
874
+ - Datasets: 2.19.2
875
+ - Tokenizers: 0.19.1
876
+
877
+ ## Citation
878
+
879
+ ### BibTeX
880
+
881
+ #### Sentence Transformers
882
+ ```bibtex
883
+ @inproceedings{reimers-2019-sentence-bert,
884
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
885
+ author = "Reimers, Nils and Gurevych, Iryna",
886
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
887
+ month = "11",
888
+ year = "2019",
889
+ publisher = "Association for Computational Linguistics",
890
+ url = "https://arxiv.org/abs/1908.10084",
891
+ }
892
+ ```
893
+
894
+ #### MultipleNegativesRankingLoss
895
+ ```bibtex
896
+ @misc{henderson2017efficient,
897
+ title={Efficient Natural Language Response Suggestion for Smart Reply},
898
+ author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
899
+ year={2017},
900
+ eprint={1705.00652},
901
+ archivePrefix={arXiv},
902
+ primaryClass={cs.CL}
903
+ }
904
+ ```
905
+
906
+ <!--
907
+ ## Glossary
908
+
909
+ *Clearly define terms in order to be accessible across audiences.*
910
+ -->
911
+
912
+ <!--
913
+ ## Model Card Authors
914
+
915
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
916
+ -->
917
+
918
+ <!--
919
+ ## Model Card Contact
920
+
921
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
922
+ -->
config.json ADDED
@@ -0,0 +1,24 @@
1
+ {
2
+ "_name_or_path": "sentence-transformers/multi-qa-mpnet-base-dot-v1",
3
+ "architectures": [
4
+ "MPNetModel"
5
+ ],
6
+ "attention_probs_dropout_prob": 0.1,
7
+ "bos_token_id": 0,
8
+ "eos_token_id": 2,
9
+ "hidden_act": "gelu",
10
+ "hidden_dropout_prob": 0.1,
11
+ "hidden_size": 768,
12
+ "initializer_range": 0.02,
13
+ "intermediate_size": 3072,
14
+ "layer_norm_eps": 1e-05,
15
+ "max_position_embeddings": 514,
16
+ "model_type": "mpnet",
17
+ "num_attention_heads": 12,
18
+ "num_hidden_layers": 12,
19
+ "pad_token_id": 1,
20
+ "relative_attention_num_buckets": 32,
21
+ "torch_dtype": "float32",
22
+ "transformers_version": "4.43.3",
23
+ "vocab_size": 30527
24
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "3.0.1",
4
+ "transformers": "4.43.3",
5
+ "pytorch": "2.3.1+cu121"
6
+ },
7
+ "prompts": {},
8
+ "default_prompt_name": null,
9
+ "similarity_fn_name": "dot"
10
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:637f23fe331a894a00b3d3ca5b1adba864c0cc580d015e1831a50f31c42833ea
3
+ size 437967672
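As a sanity check, the file size above can be compared against a parameter count estimated from `config.json`. The sketch below assumes a standard MPNet-style encoder layout (word and position embeddings, a relative attention bias table, twelve layers each with four attention projections and a two-layer feed-forward block, plus a pooler); that breakdown is an assumption, not something stated in this commit.

```python
HIDDEN = 768         # hidden_size
LAYERS = 12          # num_hidden_layers
HEADS = 12           # num_attention_heads
INTERMEDIATE = 3072  # intermediate_size
VOCAB = 30527        # vocab_size
POSITIONS = 514      # max_position_embeddings
REL_BUCKETS = 32     # relative_attention_num_buckets
FILE_SIZE_BYTES = 437967672  # size reported for model.safetensors

def linear(n_in, n_out):
    """Weights plus bias of a dense layer."""
    return n_in * n_out + n_out

layer_norm = 2 * HIDDEN  # scale and shift vectors

embeddings = VOCAB * HIDDEN + POSITIONS * HIDDEN + layer_norm
relative_bias = REL_BUCKETS * HEADS
per_layer = (
    4 * linear(HIDDEN, HIDDEN)      # query, key, value, output projections
    + layer_norm
    + linear(HIDDEN, INTERMEDIATE)  # feed-forward up-projection
    + linear(INTERMEDIATE, HIDDEN)  # feed-forward down-projection
    + layer_norm
)
pooler = linear(HIDDEN, HIDDEN)

total_params = embeddings + relative_bias + LAYERS * per_layer + pooler
weight_bytes = 4 * total_params  # torch_dtype is float32
overhead_bytes = FILE_SIZE_BYTES - weight_bytes
```

Under these assumptions the estimate is about 109.5 million parameters, and the float32 weights account for all but a few tens of kilobytes of the file, a gap that would be consistent with the safetensors metadata header.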
modules.json ADDED
@@ -0,0 +1,14 @@
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ }
14
+ ]
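The two modules above run in sequence: the Transformer produces one vector per token, and the Pooling module reduces them to a single sentence vector. Given `pooling_mode_cls_token: true` in `1_Pooling/config.json` and `similarity_fn_name: "dot"` in `config_sentence_transformers.json`, the flow can be sketched without any library as follows. The tiny 3-d vectors and function names are hypothetical; real embeddings have 768 dimensions.

```python
def cls_pool(token_embeddings):
    """CLS pooling: keep only the first token's vector (the `<s>` position)."""
    return list(token_embeddings[0])

def dot_score(a, b):
    """Unnormalized dot-product similarity between two sentence vectors."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical per-token outputs of the Transformer module for two texts.
query_tokens = [[0.5, 1.0, -1.0], [9.0, 9.0, 9.0]]
passage_tokens = [[1.0, 2.0, 0.0], [-3.0, 0.0, 4.0], [7.0, 7.0, 7.0]]

query_vec = cls_pool(query_tokens)
passage_vec = cls_pool(passage_tokens)
score = dot_score(query_vec, passage_vec)
```

Because the score is a raw dot product rather than a cosine, vector length affects the ranking, so embeddings from this model should not be length-normalized before comparison unless that change is intended.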
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "max_seq_length": 512,
3
+ "do_lower_case": false
4
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
1
+ {
2
+ "bos_token": {
3
+ "content": "<s>",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "cls_token": {
10
+ "content": "<s>",
11
+ "lstrip": false,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "eos_token": {
17
+ "content": "</s>",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "mask_token": {
24
+ "content": "<mask>",
25
+ "lstrip": true,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "pad_token": {
31
+ "content": "<pad>",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ },
37
+ "sep_token": {
38
+ "content": "</s>",
39
+ "lstrip": false,
40
+ "normalized": false,
41
+ "rstrip": false,
42
+ "single_word": false
43
+ },
44
+ "unk_token": {
45
+ "content": "[UNK]",
46
+ "lstrip": false,
47
+ "normalized": false,
48
+ "rstrip": false,
49
+ "single_word": false
50
+ }
51
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1,72 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "<s>",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "1": {
12
+ "content": "<pad>",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "2": {
20
+ "content": "</s>",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "3": {
28
+ "content": "<unk>",
29
+ "lstrip": false,
30
+ "normalized": true,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ },
35
+ "104": {
36
+ "content": "[UNK]",
37
+ "lstrip": false,
38
+ "normalized": false,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": true
42
+ },
43
+ "30526": {
44
+ "content": "<mask>",
45
+ "lstrip": true,
46
+ "normalized": false,
47
+ "rstrip": false,
48
+ "single_word": false,
49
+ "special": true
50
+ }
51
+ },
52
+ "bos_token": "<s>",
53
+ "clean_up_tokenization_spaces": true,
54
+ "cls_token": "<s>",
55
+ "do_lower_case": true,
56
+ "eos_token": "</s>",
57
+ "mask_token": "<mask>",
58
+ "max_length": 250,
59
+ "model_max_length": 512,
60
+ "pad_to_multiple_of": null,
61
+ "pad_token": "<pad>",
62
+ "pad_token_type_id": 0,
63
+ "padding_side": "right",
64
+ "sep_token": "</s>",
65
+ "stride": 0,
66
+ "strip_accents": null,
67
+ "tokenize_chinese_chars": true,
68
+ "tokenizer_class": "MPNetTokenizer",
69
+ "truncation_side": "right",
70
+ "truncation_strategy": "longest_first",
71
+ "unk_token": "[UNK]"
72
+ }
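To make the tokenizer settings above concrete, here is a small sketch of how a single sequence is framed and a batch is padded. It uses the special-token ids from `added_tokens_decoder` (`<s>` = 0, `<pad>` = 1, `</s>` = 2), `model_max_length` of 512, and right-side padding and truncation. The input ids are made up and the helper names are hypothetical; the real tokenizer also handles subword splitting, which is omitted here.

```python
BOS_ID, PAD_ID, EOS_ID = 0, 1, 2  # <s>, <pad>, </s>
MODEL_MAX_LENGTH = 512

def wrap_and_truncate(token_ids, max_length=MODEL_MAX_LENGTH):
    """Frame a sequence as <s> ... </s>, truncating on the right to fit max_length."""
    body = token_ids[: max_length - 2]  # leave room for the two special tokens
    return [BOS_ID] + body + [EOS_ID]

def pad_batch(sequences):
    """Right-pad every sequence to the longest one and build the attention masks."""
    width = max(len(s) for s in sequences)
    input_ids = [s + [PAD_ID] * (width - len(s)) for s in sequences]
    attention_mask = [[1] * len(s) + [0] * (width - len(s)) for s in sequences]
    return input_ids, attention_mask

# Made-up token ids for two short texts.
batch = [wrap_and_truncate([2001, 2002, 2003]), wrap_and_truncate([2004])]
input_ids, attention_mask = pad_batch(batch)
```

Since `<s>` doubles as the CLS token, position 0 of every row is the vector that CLS pooling later selects.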
training_args.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9ef40408ecf22b716024d13c6a793a5b9557dc9168fca29e795898b19b67916a
3
+ size 5624
vocab.txt ADDED
The diff for this file is too large to render. See raw diff