RishuD7 committed on
Commit
30f7184
1 Parent(s): 24f0d85

Add new SentenceTransformer model.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
  "word_embedding_dimension": 768,
  "pooling_mode_cls_token": true,
  "pooling_mode_mean_tokens": false,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false,
  "include_prompt": true
}
README.md ADDED
@@ -0,0 +1,986 @@
---
base_model: BAAI/bge-base-en-v1.5
datasets: []
language:
- en
library_name: sentence-transformers
license: apache-2.0
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:3305
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: '

    Limitation of Liability


    CUSTOMER’S ENTIRE LIABILITY AND PACNET’S EXCLUSIVE REMEDIES AGAINST CUSTOMER FOR
    ANY DAMAGES ARISING

    FROM ANY ACT OR OMISSION RELATING TO THE SERVICES, REGARDLESS OF THE FORM OF ACTION,
    WHETHER IN CONTRACT,

    UNDER STATUTE, IN TORT OR OTHERWISE, INCLUDING NEGLIGENCE, WILL BE LIMITED, FOR
    EACH EVENT OR SERIES OF

    CONNECTED EVENTS, AS FOLLOWS:


    FOR PERSONAL INJURY OR DEATH, UNLIMITED, BUT SUBJECT TO PROVEN DIRECT DAMAGES;
    AND

    FOR ALL OTHER EVENTS, SUBJECT TO A MAXIMUM EQUAL TO THE AGGREGATE MONTHLY SERVICE
    CHARGES PAID OR

    PAYBALE BY THE CUSTOMER UNDER THE AGREEMENT.



    PACNET’S ENTIRE LIABILITY AND CUSTOMER’S EXCLUSIVE REMEDIES AGAINST PACNET OR
    ITS AFFILIATES FOR

    ANY DAMAGES ARISING FROM ANY ACT OR OMISSION RELATING TO THE AGREEMENT, REGARDLESS
    OF THE

    FORM OF ACTION, WHETHER IN CONTRACT, UNDER STATUTE, IN TORT OR OTHERWISE, INCLUDING

    NEGLIGENCE, WILL BE LIMITED, FOR EACH EVENT OR SERIES OF CONNECTED EVENTS, AS
    FOLLOWS:


    {i} | FOR PERSONAL INJURY OR DEATH, UNLIMITED, BUT SUBJECT TO PROVEN DIRECT DAMAGES;


    (ii) FOR FAILURE TO COMPLY WITH SERVICE LEVELS, TO THE AMOUNT OF CREDITS SET OUT
    IN THE

    RELEVANT SPECIFIC CONDITIONS OF THE RELEVANT SERVICE; AND


    (iii) FOR ALL OTHER EVENTS, SUBJECT TO A MAXIMUM EQUAL TO THE AGGREGATE MONTHLY
    SERVICE

    CHARGES PAID OR PAYABLE BY THE CUSTOMER UNDER THE AGREEMENT.

    .


    PACNET WILL IN NO CIRCUMSTANCES BE LIABLE FOR ANY DAMAGES (EXCEPT RESULTING IN
    PERSONAL INJURY

    OR DEATH) ATTRIBUTABLE TO ANY SERVICE, PRODUCT OR ACTIONS OF ANY PERSON OTHER
    THAN PACNET, ITS

    EMPLOYEES AND AGENTS.

    '
  sentences:
  - Auto Renewal Cancellation Notice Period
  - Assignment
  - Absolute Maximum Amount of Liability
- source_sentence: '

    Subcontracting


    (a) The Supplier must not subcontract any of its

    obligations under this Agreement, without the

    Company''s prior written consent (which will not

    be unreasonably withheld).


    (b) The Supplier remains fully responsible for acts

    and omissions of its subcontractors and Supplier

    Personnel in connection with this Agreement or a

    Statement of Work as if they were its acts and

    omissions.



    Personnel


    (a) At the Company''s reasonable request the

    Supplier must, at its cost, immediately (or by any

    date nominated by the Company) remove any

    person nominated by the Company from the

    performance of the Services and, if requested by

    the Company, provide an alternative person

    acceptable to the Company (acting reasonably).


    (b) The Supplier will not remove (temporarily or

    permanently) or replace a Key Personnel without

    the Company’s prior written consent (which must

    not be unreasonably withheld). Any substitute

    personnel must be at least equally qualified for

    the duties of the position as the person for whom

    they are substituted. The Supplier must use

    reasonable endeavours to provide uninterrupted

    transition between Key Personnel and their

    replacements.

    '
  sentences:
  - Audit Rights
  - Severability
  - Subcontracting
- source_sentence: All Intellectual Property shall be deemed to be owned by the Employer
    and Executive hereby relinquishes any right or claim to any such Intellectual
    Property except to the extent necessary to transfer the ownership of any such
    Intellectual Property to Employer. Executive shall promptly disclose to the Employer
    all Intellectual Property. Without royalty or separate consideration, Executive
    hereby assigns and agrees to assign to the Employer (or as otherwise directed
    by the Employer) Executive’s full right, title and interest in and to all Intellectual
    Property, including without limitation all copyright interests therein. Executive
    agrees to cooperate with Employer and to execute any and all applications for
    domestic and foreign patents, copyrights or other proprietary rights and to do
    such other acts (including, among other things, the execution and delivery of
    instruments of further assurance or confirmation) requested by the Employer to
    assign the Intellectual Property to the Employer and to permit the Employer to
    file, obtain and enforce any patents, copyrights or other proprietary rights in
    the Intellectual Property. Executive agrees that Executive’s obligation to cooperate
    and to execute, or cause to be executed, when it is in Executive’s power to do
    so, any such instrument or paper, will continue after termination of this Agreement.
    Executive agrees to make and maintain adequate and current written records of
    all Intellectual Property, in the form of notes, sketches, drawings, or reports
    relating hereto, which records shall be and remain the property of and available
    to the Employer at all times. The parties agree that the Intellectual Property
    does not include the items listed in the attached Exhibit A to this Agreement.
  sentences:
  - General Indemnities
  - Intellectual Property Ownership
  - Governing Law
- source_sentence: "CBRE\n.\n\nHEVERTECH LTD\n.\n\n.\n \n.\n\nPreferred Supplier\
    \ Light/Agreement\n.\n\n.\n \n.\n \n.\n\nAgreement; Number: NMS/16/050 |\n.\n\
    \nQUALIFIED SERVICE LEVEL AGREEMENT\nBETWEEN\nCBRE MANAGED SERVICES LIMITED\n\
    AND\nHEVERTECH LTDCBRE\n.\n\nHEVERTECH LTD\n.\n\n.\n \n.\n \n.\n\n.\n \n.\n\
    \ \n.\n\n.\n \n.\n \n.\n\nPreferred!Supplier Light Agreement\ni]\n.\n\nW\\\
    olaclelealelaters Ulin elsiea Niky AeyAOkLY)\n.\n\nTABLE OF CONTENTS:\n.\n\nQualified\
    \ Service Level Agreement Pages 03 to 10 inclusive\n.\n\nAppendix 1 — Schedule\
    \ of Rates Page\n.\n\nAppendix 2 — Key Contacts and Escalation Process Pages 8\
    \ to 9 inclusive\n.\n\nAppendix 3 - Working Capital Scheme Pages 10\n.\n\n.\n\
    \ \n.\n\nCBRE Managed Services Lid\nFebruary 2016 Page 2 of LOPreferred Supplier\
    \ Light'A greement HEVERTECH LTD\n.\n\nAgreement Number: NMS/16/050\n.\n\n.\n\
    \ \n\n\nTHIS AGREEMENT is made on 1° June 2016\nBETWEEN\n\n(1) CBRE Managed Services\
    \ Limited (Registered in England No. 1799580) whose registered\noffice is at City\
    \ Bridge House, 57 Southwark Street, London, SE1 1RU (“CBRE”); and\n\n(2) Hevertech\
    \ Ltd (Registered in England No. 2803522) whose registered office is at: Unit\
    \ 2\nTreefield Industrial Estate, Gildersome, Leeds, LS27 7JU (the “Supplier’).\n"
  sentences:
  - Non Solicitation
  - Intellectual Property Infringement Indemnity
  - Title of Agreement
- source_sentence: "\nThe management of each individual entity within a suppliers\
    \ organization is responsible for\nimplementing the VAT Supplier Code of Conduct\
    \ in their respective area of responsibility. They are\nobliged to take all appropriate\
    \ action and provide the required structures and resources to ensure\nthat all\
    \ employees in the entity are familiar with the VAT Supplier Code of Conduct and\
    \ that its\nprinciples are fully implemented.\n.\n\nAll VAT suppliers are encouraged\
    \ to direct any questions they might have with regard to the\ncontents, interpretation\
    \ or implementation of the VAT Supplier Code of Conduct to the VAT Strategic\n\
    Procurement function.\n.\n\n.\n \n.\n\nDocument created Release\nName Index Date\n\
    .\n\n.\n \n.\n\nFile name\n.\n\n.\n \n.\n\n.\n \n.\n\n.\n \n.\n\n.\n \n.\n\nPMS\
    \ Document BPO1FO30EA MEY A 18.11.2014\n.\n\n.\n \n.\n\n.\n \n.\n\n.\n \n.\n\n\
    .\n \n.\n\n.\n \n.\n\nWAT Strategic Procurement BP01FO30E\n.\n\nVakuumventile\
    \ AG Supplier Code of Conduct Page 3 of 3\n.\n\n.\n \n.\n\n.\n \n.\n\n.\n \n.\n\
    \nWe, the undersigned, hereby confirm and declare in the name and on behalf of\
    \ our company that\n.\n\n1. we have received the VAT Supplier Code of Condex;\n\
    .\n\n2. by signing this declaration, we accept and commit to complying with all\
    \ rules and requirements as\nlaid out in the VAT Supplier Code of Conduct;\n.\n\
    \n3. we accept that this declaration shall be exclusively governed by the material\
    \ laws of Switzerland,\nexcluding the UN Law of Sales (CISG).\n.\n\nPlacelDate\
    \ —-Singagore. / tone 2077\nCompany Kien Ann Engineering Pe ad\nStreet 3c 500\
    \ kovo Cirle\n.\n\nPost codelcity Singapore 627035\n.\n\nName of authorized signatory\
    \ Jameson Low\n.\n\nL. Ze\nSignature << : Ly eA\n* fh\n20,\nXn _A\nCETES\n.\n\n\
    1. Please sign one (1) original c Of this document.\n2. Please note that only\
    \ duly authorized personnel of your company may sign this document.\n3. Please\
    \ send the duly signed original copy by conventional mail to:\nVAT VAKUUMVENTILE\
    \ AG, SEELISTRASSE 1, STRATEGISCHER EINKAUF, CH-9469 HAAG\n.\n\n.\n \n.\n\n.\n\
    \ \n.\n\n.\n \n.\n\nDocument created Release\n.\n\n.\n \n.\n\nFile name\nName\
    \ Index Date\n.\n\n.\n \n.\n\n.\n \n.\n\n.\n \n.\n\n.\n \n.\n\n.\n \n.\n\n.\n\
    \ \n.\n\n.\n \n.\n\nPMS Document BPO1FO30EA MEY A 18.11.2014\n\n \n\n \n\nAll\
    \ business is conducted in compliance with governing national and international\
    \ laws and\nregulations. As a matter of principle, we honor agreements and obligations\
    \ we have entered into\nvoluntarily. All suppliers are obliged to carefully study\
    \ the rules and regulations pertinent to their\narea of responsibility and ensure\
    \ full compliance. In case of doubt or queries, they are obliged to\nseek additional\
    \ information and guidance from the appropriate channels or persons in charge.\
    \ VAT\nhas a zero tolerance policy with regard to violations of its Supplier Code\
    \ of Conduct. Violations may\nlead to appropriate action being taken against the\
    \ supplier.\n.\n\n2. Fair competition\n"
  sentences:
  - Absolute Maximum Amount of Liability
  - Governing Law
  - Third Party Beneficiary
model-index:
- name: BGE base Financial Matryoshka
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 768
      type: dim_768
    metrics:
    - type: cosine_accuracy@1
      value: 0.007263922518159807
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.021791767554479417
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.03026634382566586
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.06174334140435835
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.007263922518159807
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.007263922518159807
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.006053268765133172
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.006174334140435836
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.007263922518159807
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.021791767554479417
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.03026634382566586
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.06174334140435835
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.028939379669254476
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.019243149237095962
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.029673742520760122
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 512
      type: dim_512
    metrics:
    - type: cosine_accuracy@1
      value: 0.006053268765133172
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.021791767554479417
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.031476997578692496
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.06174334140435835
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.006053268765133172
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.007263922518159805
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.006295399515738498
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.006174334140435836
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.006053268765133172
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.021791767554479417
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.031476997578692496
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.06174334140435835
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.028312145815995213
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.018378876974518614
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.029262713498052723
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 256
      type: dim_256
    metrics:
    - type: cosine_accuracy@1
      value: 0.007263922518159807
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.018159806295399514
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.02784503631961259
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.05811138014527845
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.007263922518159807
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.006053268765133171
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.0055690072639225175
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.0058111380145278455
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.007263922518159807
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.018159806295399514
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.02784503631961259
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.05811138014527845
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.026798615255571104
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.017617414197317337
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.029447278389058605
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 128
      type: dim_128
    metrics:
    - type: cosine_accuracy@1
      value: 0.0036319612590799033
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.01694915254237288
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.02784503631961259
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.06295399515738499
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.0036319612590799033
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.005649717514124293
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.005569007263922519
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.006295399515738499
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.0036319612590799033
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.01694915254237288
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.02784503631961259
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.06295399515738499
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.027052121582546967
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.01650236365732733
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.028509723825826283
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 64
      type: dim_64
    metrics:
    - type: cosine_accuracy@1
      value: 0.004842615012106538
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.025423728813559324
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.03753026634382567
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.06053268765133172
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.004842615012106538
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.008474576271186439
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.007506053268765135
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.006053268765133172
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.004842615012106538
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.025423728813559324
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.03753026634382567
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.06053268765133172
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.028532073992406013
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.018836715477151305
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.03024491886170751
      name: Cosine Map@100
---

# BGE base Financial Matryoshka

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) <!-- at revision a5beb1e3e68b9ab74eb54cfd186867f64f240e1a -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
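
The module listing above is a standard BERT encoder followed by CLS-token pooling and L2 normalization. The following is a minimal sketch of that same pipeline written against plain `transformers`, included only to illustrate what each module does; for everyday use, prefer the Sentence Transformers API shown in the Usage section below.

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "RishuD7/exigent-bge-base-financial-matryoshka"
tokenizer = AutoTokenizer.from_pretrained(model_id)
encoder = AutoModel.from_pretrained(model_id)

batch = tokenizer(
    ["Limitation of Liability", "Governing Law"],
    padding=True, truncation=True, max_length=512, return_tensors="pt",
)
with torch.no_grad():
    token_embeddings = encoder(**batch).last_hidden_state   # (0) Transformer
cls_embeddings = token_embeddings[:, 0]                      # (1) Pooling: CLS token
embeddings = torch.nn.functional.normalize(cls_embeddings, p=2, dim=1)  # (2) Normalize
print(embeddings.shape)  # torch.Size([2, 768])
```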

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("RishuD7/exigent-bge-base-financial-matryoshka")
# Run inference
sentences = [
    '\nThe management of each individual entity within a suppliers organization is responsible for\nimplementing the VAT Supplier Code of Conduct in their respective area of responsibility. They are\nobliged to take all appropriate action and provide the required structures and resources to ensure\nthat all employees in the entity are familiar with the VAT Supplier Code of Conduct and that its\nprinciples are fully implemented.\n.\n\nAll VAT suppliers are encouraged to direct any questions they might have with regard to the\ncontents, interpretation or implementation of the VAT Supplier Code of Conduct to the VAT Strategic\nProcurement function.\n.\n\n.\n \n.\n\nDocument created Release\nName Index Date\n.\n\n.\n \n.\n\nFile name\n.\n\n.\n \n.\n\n.\n \n.\n\n.\n \n.\n\n.\n \n.\n\nPMS Document BPO1FO30EA MEY A 18.11.2014\n.\n\n.\n \n.\n\n.\n \n.\n\n.\n \n.\n\n.\n \n.\n\n.\n \n.\n\nWAT Strategic Procurement BP01FO30E\n.\n\nVakuumventile AG Supplier Code of Conduct Page 3 of 3\n.\n\n.\n \n.\n\n.\n \n.\n\n.\n \n.\n\nWe, the undersigned, hereby confirm and declare in the name and on behalf of our company that\n.\n\n1. we have received the VAT Supplier Code of Condex;\n.\n\n2. by signing this declaration, we accept and commit to complying with all rules and requirements as\nlaid out in the VAT Supplier Code of Conduct;\n.\n\n3. we accept that this declaration shall be exclusively governed by the material laws of Switzerland,\nexcluding the UN Law of Sales (CISG).\n.\n\nPlacelDate —-Singagore. / tone 2077\nCompany Kien Ann Engineering Pe ad\nStreet 3c 500 kovo Cirle\n.\n\nPost codelcity Singapore 627035\n.\n\nName of authorized signatory Jameson Low\n.\n\nL. Ze\nSignature << : Ly eA\n* fh\n20,\nXn _A\nCETES\n.\n\n1. Please sign one (1) original c Of this document.\n2. Please note that only duly authorized personnel of your company may sign this document.\n3. Please send the duly signed original copy by conventional mail to:\nVAT VAKUUMVENTILE AG, SEELISTRASSE 1, STRATEGISCHER EINKAUF, CH-9469 HAAG\n.\n\n.\n \n.\n\n.\n \n.\n\n.\n \n.\n\nDocument created Release\n.\n\n.\n \n.\n\nFile name\nName Index Date\n.\n\n.\n \n.\n\n.\n \n.\n\n.\n \n.\n\n.\n \n.\n\n.\n \n.\n\n.\n \n.\n\nPMS Document BPO1FO30EA MEY A 18.11.2014\n\n \n\n \n\nAll business is conducted in compliance with governing national and international laws and\nregulations. As a matter of principle, we honor agreements and obligations we have entered into\nvoluntarily. All suppliers are obliged to carefully study the rules and regulations pertinent to their\narea of responsibility and ensure full compliance. In case of doubt or queries, they are obliged to\nseek additional information and guidance from the appropriate channels or persons in charge. VAT\nhas a zero tolerance policy with regard to violations of its Supplier Code of Conduct. Violations may\nlead to appropriate action being taken against the supplier.\n.\n\n2. Fair competition\n',
    'Governing Law',
    'Absolute Maximum Amount of Liability',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
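
Because the model was trained with `MatryoshkaLoss` over the dimensions 768, 512, 256, 128 and 64 (see Training Details below), its embeddings can also be truncated to one of the smaller dimensions to save memory and speed up retrieval, at some cost in quality. A minimal sketch, assuming a `sentence-transformers` version (3.0+) that supports the `truncate_dim` option:

```python
from sentence_transformers import SentenceTransformer

# Keep only the first 256 dimensions; 256 is one of the Matryoshka
# dimensions this model was trained with (768, 512, 256, 128, 64).
model = SentenceTransformer(
    "RishuD7/exigent-bge-base-financial-matryoshka",
    truncate_dim=256,
)

embeddings = model.encode(["Subcontracting", "Governing Law"])
print(embeddings.shape)
# (2, 256)
```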

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Information Retrieval
* Dataset: `dim_768`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.0073     |
| cosine_accuracy@3   | 0.0218     |
| cosine_accuracy@5   | 0.0303     |
| cosine_accuracy@10  | 0.0617     |
| cosine_precision@1  | 0.0073     |
| cosine_precision@3  | 0.0073     |
| cosine_precision@5  | 0.0061     |
| cosine_precision@10 | 0.0062     |
| cosine_recall@1     | 0.0073     |
| cosine_recall@3     | 0.0218     |
| cosine_recall@5     | 0.0303     |
| cosine_recall@10    | 0.0617     |
| cosine_ndcg@10      | 0.0289     |
| cosine_mrr@10       | 0.0192     |
| **cosine_map@100**  | **0.0297** |

#### Information Retrieval
* Dataset: `dim_512`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.0061     |
| cosine_accuracy@3   | 0.0218     |
| cosine_accuracy@5   | 0.0315     |
| cosine_accuracy@10  | 0.0617     |
| cosine_precision@1  | 0.0061     |
| cosine_precision@3  | 0.0073     |
| cosine_precision@5  | 0.0063     |
| cosine_precision@10 | 0.0062     |
| cosine_recall@1     | 0.0061     |
| cosine_recall@3     | 0.0218     |
| cosine_recall@5     | 0.0315     |
| cosine_recall@10    | 0.0617     |
| cosine_ndcg@10      | 0.0283     |
| cosine_mrr@10       | 0.0184     |
| **cosine_map@100**  | **0.0293** |

#### Information Retrieval
* Dataset: `dim_256`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.0073     |
| cosine_accuracy@3   | 0.0182     |
| cosine_accuracy@5   | 0.0278     |
| cosine_accuracy@10  | 0.0581     |
| cosine_precision@1  | 0.0073     |
| cosine_precision@3  | 0.0061     |
| cosine_precision@5  | 0.0056     |
| cosine_precision@10 | 0.0058     |
| cosine_recall@1     | 0.0073     |
| cosine_recall@3     | 0.0182     |
| cosine_recall@5     | 0.0278     |
| cosine_recall@10    | 0.0581     |
| cosine_ndcg@10      | 0.0268     |
| cosine_mrr@10       | 0.0176     |
| **cosine_map@100**  | **0.0294** |

#### Information Retrieval
* Dataset: `dim_128`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.0036     |
| cosine_accuracy@3   | 0.0169     |
| cosine_accuracy@5   | 0.0278     |
| cosine_accuracy@10  | 0.063      |
| cosine_precision@1  | 0.0036     |
| cosine_precision@3  | 0.0056     |
| cosine_precision@5  | 0.0056     |
| cosine_precision@10 | 0.0063     |
| cosine_recall@1     | 0.0036     |
| cosine_recall@3     | 0.0169     |
| cosine_recall@5     | 0.0278     |
| cosine_recall@10    | 0.063      |
| cosine_ndcg@10      | 0.0271     |
| cosine_mrr@10       | 0.0165     |
| **cosine_map@100**  | **0.0285** |

#### Information Retrieval
* Dataset: `dim_64`
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.0048     |
| cosine_accuracy@3   | 0.0254     |
| cosine_accuracy@5   | 0.0375     |
| cosine_accuracy@10  | 0.0605     |
| cosine_precision@1  | 0.0048     |
| cosine_precision@3  | 0.0085     |
| cosine_precision@5  | 0.0075     |
| cosine_precision@10 | 0.0061     |
| cosine_recall@1     | 0.0048     |
| cosine_recall@3     | 0.0254     |
| cosine_recall@5     | 0.0375     |
| cosine_recall@10    | 0.0605     |
| cosine_ndcg@10      | 0.0285     |
| cosine_mrr@10       | 0.0188     |
| **cosine_map@100**  | **0.0302** |
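
The tables above were produced with `InformationRetrievalEvaluator`, run once per Matryoshka dimension (`dim_768` through `dim_64`). A minimal sketch of running such an evaluation on your own data follows; the query, corpus and relevance mappings are hypothetical placeholders, and `truncate_dim` restricts scoring to one of the trained dimensions:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("RishuD7/exigent-bge-base-financial-matryoshka")

# Placeholder data: query id -> clause title, corpus id -> clause text,
# query id -> set of relevant corpus ids.
queries = {"q1": "Governing Law"}
corpus = {"d1": "This Agreement shall be governed by the laws of ..."}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    name="dim_256",
    truncate_dim=256,
)
results = evaluator(model)
print(results)  # e.g. {'dim_256_cosine_map@100': ..., ...}
```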

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### Unnamed Dataset


* Size: 3,305 training samples
* Columns: <code>positive</code> and <code>anchor</code>
* Approximate statistics based on the first 1000 samples:
  |         | positive                                                                               | anchor                                                                          |
  |:--------|:---------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
  | type    | string                                                                                 | string                                                                          |
  | details | <ul><li>min: 123 tokens</li><li>mean: 353.07 tokens</li><li>max: 512 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 5.37 tokens</li><li>max: 8 tokens</li></ul> |
* Samples:
  | positive | anchor |
  |:---------|:-------|
  | <code>In no event shall CBRE, Client, or their respective affiliates incur liability under this agreement or otherwise relating to the Services beyond the insurance proceeds available with respect to the particular matter under the Insurance Policies required to be carried by CBRE AND Client under Article 6 above including, if applicable, proceeds of self-insurance. Each party shall and shall cause its affiliates to look solely to such insurance proceeds (and any such proceeds paid through self-insurance) to satisfy its claims against the released parties and agrees that it shall have no right of recovery beyond such proceeds; provided, however, that if insurance proceeds under such policies are not paid because a party has failed to maintain such policies, comply with policy requirements or, in the case of self-insurance, unreasonably denied a claim, such party shall be liable for the amounts that otherwise would have been payable under such policies had such party maintained such policies, complied with the policy requirement or not unreasonably denied such claim, as the case may be.</code> | <code>Absolute Maximum Amount of Liability</code> |
  | <code>4. Rent. <br>4.01 From and after the Commencement Date, Tenant shall pay Landlord, without any<br>setoff or deduction, unless expressly set forth in this Lease, all Base Rent and Additional Rent<br>due for the Term (collectively referred to as "Rent"). "Additional Rent" means all sums<br>(exclusive of Base Rent) that Tenant is required to pay Landlord under this Lease. Tenant shall<br>pay and be liable for all rental, sales and use taxes (but excluding income taxes), if any,<br>imposed upon or measured by Rent. Base Rent and recurring monthly charges of Additional<br>Rent shall be due and payable in advance on the first day of each calendar month without<br>notice or demand, provided that the installment of Base Rent attributable to the first (1st) full<br>calendar month of the Term following the Abatement Period shall be due concurrently with the<br>execution of this Lease by Tenant. All other items of Rent shall be due and payable on or<br>before thirty (30) days after billing by Landlord. Rent shall be made payable to the entity, and<br>sent to the address, that Landlord designates and shall be made by good and sufficient check or<br>by other means acceptable to Landlord. Landlord may return to Tenant, at any time within<br>fifteen (15) days after receiving same, any payment of Rent (a) made following any Default<br>(irrespective of whether Landlord has commenced the exercise of any remedy), or (b) that is<br>less than the amount due. Each such returned payment (whether made by returning Tenant's<br>actual check, or by issuing a refund in the event Tenant's check was deposited) shall be<br>conclusively presumed not to have been received or approved by Landlord. If Tenant does not<br>pay any Rent when due hereunder, Tenant shall pay Landlord an administration fee in the<br>amount of five percent (5%) of the past due amount. In addition, past due Rent shall accrue<br>interest at a rate equal to the lesser of (i) twelve percent (12%) per annum or (ii) the maximum<br>legal rate, and Tenant shall pay Landlord a fee for any checks returned by Tenant's bank for<br>any reason. Notwithstanding the foregoing, no such late charge or of interest shall be imposed<br>with respect to the first (1st) late payment in any calendar year, but not with respect to more<br>than three (3) such late payments during the initial Term of this Lease. </code> | <code>Late Payment Charges</code> |
  | <code>Term This Agreement shall come into force and shall last unlimited from such date. Either Party may however terminate this Agreement at any time by giving upon thirty (30) days' written notice to the other Party. The Receiving Party's obligations contained in this Agreement to keep confidential and restrict use of the Disclosing Party's Confidential Information shall sur- vive for a period of five (5) years from the date of its termination for any reason whatsoever. lX. Contractual penalty<br>For the purposes of this Non-Disclosure Agreement, " Confidential Information" includes all technical and/or commercial and/or financial information in the field designated in section 1., which a contracting Party (hereinafter referred to as the "EQ€i1gPedy") makes, or has made, accessible to the other contracting Party (hereinafter referred to as the ".&eiyi!g Partv") in oral, written, tangible or other form (e.9. disk, data carrier) directly or indirectly, in- cluding but not limited to, drawings, models, components, and other material. Confidential In- formation is to be identified as such. Orally communicated or visually, information having been designated as confidential at the time of disclosure will be confirmed as such in writing by the Disclosing Party within 30 (thirty) days from such disclosure being understood thatlhe ./A information will be considered Confidential Information during that period of 30 (thirty) days. /L t'-4 PF 0233 (September 2016) page 1 of 5 ä =.<br> PFEIFFER F<br>.<br> F<br>.<br> VACUUM<br></code> | <code>Termination for Convenience</code> |
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [
          768,
          512,
          256,
          128,
          64
      ],
      "matryoshka_weights": [
          1,
          1,
          1,
          1,
          1
      ],
      "n_dims_per_step": -1
  }
  ```
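
In code, this configuration corresponds to wrapping a `MultipleNegativesRankingLoss` in a `MatryoshkaLoss`. A minimal sketch of how such a loss can be constructed with Sentence Transformers:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("BAAI/bge-base-en-v1.5")

# Apply the ranking loss at every Matryoshka dimension with equal weight.
inner_loss = MultipleNegativesRankingLoss(model)
loss = MatryoshkaLoss(
    model,
    inner_loss,
    matryoshka_dims=[768, 512, 256, 128, 64],
    matryoshka_weights=[1, 1, 1, 1, 1],
    n_dims_per_step=-1,
)
```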

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: epoch
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `gradient_accumulation_steps`: 16
- `learning_rate`: 2e-05
- `num_train_epochs`: 5
- `lr_scheduler_type`: cosine
- `warmup_ratio`: 0.1
- `tf32`: False
- `load_best_model_at_end`: True
- `optim`: adamw_torch_fused
- `batch_sampler`: no_duplicates

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 16
- `eval_accumulation_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 5
- `max_steps`: -1
- `lr_scheduler_type`: cosine
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: False
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>
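
For reference, here is a minimal sketch of how a training run with the non-default hyperparameters above could be wired together using `SentenceTransformerTrainer`. The dataset below is a hypothetical placeholder with the `positive`/`anchor` columns described in the Training Dataset section; evaluation and best-model selection (`eval_strategy`, `load_best_model_at_end`) are omitted here because they additionally require an evaluator or evaluation dataset.

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss
from sentence_transformers.training_args import BatchSamplers, SentenceTransformerTrainingArguments

model = SentenceTransformer("BAAI/bge-base-en-v1.5")
loss = MatryoshkaLoss(
    model, MultipleNegativesRankingLoss(model), matryoshka_dims=[768, 512, 256, 128, 64]
)

# Hypothetical placeholder; the real dataset has 3,305 positive/anchor pairs.
train_dataset = Dataset.from_dict({
    "positive": ["The Supplier must not subcontract any of its obligations ..."],
    "anchor": ["Subcontracting"],
})

args = SentenceTransformerTrainingArguments(
    output_dir="bge-base-financial-matryoshka",
    num_train_epochs=5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=16,
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    tf32=False,
    optim="adamw_torch_fused",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```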

### Training Logs
| Epoch      | Step   | Training Loss | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_512_cosine_map@100 | dim_64_cosine_map@100 | dim_768_cosine_map@100 |
|:----------:|:------:|:-------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|:----------------------:|
| 1.5385     | 10     | 7.981         | -                      | -                      | -                      | -                     | -                      |
| 3.0769     | 20     | 0.9258        | -                      | -                      | -                      | -                     | -                      |
| **4.6154** | **30** | **0.1708**    | **0.0285**             | **0.0294**             | **0.0293**             | **0.0302**            | **0.0297**             |

* The bold row denotes the saved checkpoint.

### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.1
- Transformers: 4.41.2
- PyTorch: 2.1.2+cu121
- Accelerate: 0.32.1
- Datasets: 2.19.1
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss
```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,32 @@
{
  "_name_or_path": "BAAI/bge-base-en-v1.5",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
    "LABEL_0": 0
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.41.2",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
{
  "__version__": {
    "sentence_transformers": "3.0.1",
    "transformers": "4.41.2",
    "pytorch": "2.1.2+cu121"
  },
  "prompts": {},
  "default_prompt_name": null,
  "similarity_fn_name": null
}
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b6c14618cf815a56268598d34741b854e979b284a29bff453c9bdccdbadf5d86
size 437951328
modules.json ADDED
@@ -0,0 +1,20 @@
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  },
  {
    "idx": 2,
    "name": "2",
    "path": "2_Normalize",
    "type": "sentence_transformers.models.Normalize"
  }
]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
{
  "max_seq_length": 512,
  "do_lower_case": true
}
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
{
  "cls_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "[MASK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "[PAD]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "never_split": null,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "unk_token": "[UNK]"
}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff