philipp-zettl committed
Commit bbd7653
Parent: 3822642

Add new SentenceTransformer model.

Files changed (2):
  1. README.md +101 -103
  2. model.safetensors +1 -1
README.md CHANGED
@@ -22,31 +22,31 @@ metrics:
  - pearson_max
  - spearman_max
  widget:
- - source_sentence: Hilf mir, das Software-Update durchzuführen
+ - source_sentence: Help fix a problem with my device’s battery life
  sentences:
  - order query
- - support query
  - faq query
- - source_sentence: 马上给我提供这个商品的跟踪信息
- sentences:
- - payment query
  - technical support query
- - support query
- - source_sentence: Downgrade my subscription plan
+ - source_sentence: 订购一双运动鞋
  sentences:
- - support query
+ - service request
+ - feedback query
  - product query
- - product query
- - source_sentence: Help resolve issues with my operating system
+ - source_sentence: 告诉我如何更改我的密码
  sentences:
- - technical support query
- - product query
+ - support query
  - product query
- - source_sentence: Ayúdame a solucionar problemas de red
+ - faq query
+ - source_sentence: Get information on the next local festival
+ sentences:
+ - event inquiry
+ - service request
+ - account query
+ - source_sentence: Change the currency for my payment
  sentences:
  - product query
- - support query
- - product query
+ - payment query
+ - faq query
  pipeline_tag: sentence-similarity
  model-index:
  - name: SentenceTransformer based on sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
@@ -59,34 +59,34 @@ model-index:
  type: MiniLM-dev
  metrics:
  - type: pearson_cosine
- value: 0.7960441122484267
+ value: 0.7356955662825808
  name: Pearson Cosine
  - type: spearman_cosine
- value: 0.8189711310679958
+ value: 0.7320761390174187
  name: Spearman Cosine
  - type: pearson_manhattan
- value: 0.6824455970208276
+ value: 0.6240041985776243
  name: Pearson Manhattan
  - type: spearman_manhattan
- value: 0.701004701178111
+ value: 0.6179783414452009
  name: Spearman Manhattan
  - type: pearson_euclidean
- value: 0.6821384996384094
+ value: 0.6321466982201008
  name: Pearson Euclidean
  - type: spearman_euclidean
- value: 0.7065633287645454
+ value: 0.6296964936282937
  name: Spearman Euclidean
  - type: pearson_dot
- value: 0.7871337514786776
+ value: 0.7491168439451736
  name: Pearson Dot
  - type: spearman_dot
- value: 0.7979718712970215
+ value: 0.7592129124940543
  name: Spearman Dot
  - type: pearson_max
- value: 0.7960441122484267
+ value: 0.7491168439451736
  name: Pearson Max
  - type: spearman_max
- value: 0.8189711310679958
+ value: 0.7592129124940543
  name: Spearman Max
  - task:
  type: semantic-similarity
@@ -96,34 +96,34 @@ model-index:
  type: MiniLM-test
  metrics:
  - type: pearson_cosine
- value: 0.7614418952584415
+ value: 0.7687106130417081
  name: Pearson Cosine
  - type: spearman_cosine
- value: 0.7585961676423125
+ value: 0.7552108666502075
  name: Spearman Cosine
  - type: pearson_manhattan
- value: 0.620319727073133
+ value: 0.7462708006775693
  name: Pearson Manhattan
  - type: spearman_manhattan
- value: 0.6192118311486844
+ value: 0.7365483246407295
  name: Spearman Manhattan
  - type: pearson_euclidean
- value: 0.6116132687052156
+ value: 0.7545194410402545
  name: Pearson Euclidean
  - type: spearman_euclidean
- value: 0.6124276377795256
+ value: 0.7465016803791179
  name: Spearman Euclidean
  - type: pearson_dot
- value: 0.7670292333817905
+ value: 0.7251488155932073
  name: Pearson Dot
  - type: spearman_dot
- value: 0.7764817683428225
+ value: 0.7390366635753267
  name: Spearman Dot
  - type: pearson_max
- value: 0.7670292333817905
+ value: 0.7687106130417081
  name: Pearson Max
  - type: spearman_max
- value: 0.7764817683428225
+ value: 0.7552108666502075
  name: Spearman Max
  ---

@@ -176,9 +176,9 @@ from sentence_transformers import SentenceTransformer
  model = SentenceTransformer("philipp-zettl/MiniLM-similarity-small")
  # Run inference
  sentences = [
-     'Ayúdame a solucionar problemas de red',
-     'support query',
-     'product query',
+     'Change the currency for my payment',
+     'payment query',
+     'faq query',
  ]
  embeddings = model.encode(sentences)
  print(embeddings.shape)
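The updated snippet stops at `print(embeddings.shape)`. For readers following along, a minimal follow-on sketch for turning those embeddings into similarity scores; it assumes sentence-transformers >= 3.0, where `SentenceTransformer.similarity` is available:

```python
from sentence_transformers import SentenceTransformer

# Score the new widget example against its two candidate intents.
model = SentenceTransformer("philipp-zettl/MiniLM-similarity-small")
embeddings = model.encode([
    'Change the currency for my payment',  # source sentence
    'payment query',                       # candidate intents
    'faq query',
])

# model.similarity() assumes sentence-transformers >= 3.0;
# on older versions use sentence_transformers.util.cos_sim instead.
similarities = model.similarity(embeddings[:1], embeddings[1:])
print(similarities)  # shape [1, 2]; the higher score marks the better match
```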
@@ -222,18 +222,18 @@ You can finetune this model on your own dataset.
  * Dataset: `MiniLM-dev`
  * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

- | Metric              | Value     |
- |:--------------------|:----------|
- | pearson_cosine      | 0.796     |
- | **spearman_cosine** | **0.819** |
- | pearson_manhattan   | 0.6824    |
- | spearman_manhattan  | 0.701     |
- | pearson_euclidean   | 0.6821    |
- | spearman_euclidean  | 0.7066    |
- | pearson_dot         | 0.7871    |
- | spearman_dot        | 0.798     |
- | pearson_max         | 0.796     |
- | spearman_max        | 0.819     |
+ | Metric              | Value      |
+ |:--------------------|:-----------|
+ | pearson_cosine      | 0.7357     |
+ | **spearman_cosine** | **0.7321** |
+ | pearson_manhattan   | 0.624      |
+ | spearman_manhattan  | 0.618      |
+ | pearson_euclidean   | 0.6321     |
+ | spearman_euclidean  | 0.6297     |
+ | pearson_dot         | 0.7491     |
+ | spearman_dot        | 0.7592     |
+ | pearson_max         | 0.7491     |
+ | spearman_max        | 0.7592     |

  #### Semantic Similarity
  * Dataset: `MiniLM-test`
@@ -241,16 +241,16 @@ You can finetune this model on your own dataset.

  | Metric              | Value      |
  |:--------------------|:-----------|
- | pearson_cosine      | 0.7614     |
- | **spearman_cosine** | **0.7586** |
- | pearson_manhattan   | 0.6203     |
- | spearman_manhattan  | 0.6192     |
- | pearson_euclidean   | 0.6116     |
- | spearman_euclidean  | 0.6124     |
- | pearson_dot         | 0.767      |
- | spearman_dot        | 0.7765     |
- | pearson_max         | 0.767      |
- | spearman_max        | 0.7765     |
+ | pearson_cosine      | 0.7687     |
+ | **spearman_cosine** | **0.7552** |
+ | pearson_manhattan   | 0.7463     |
+ | spearman_manhattan  | 0.7365     |
+ | pearson_euclidean   | 0.7545     |
+ | spearman_euclidean  | 0.7465     |
+ | pearson_dot         | 0.7251     |
+ | spearman_dot        | 0.739      |
+ | pearson_max         | 0.7687     |
+ | spearman_max        | 0.7552     |

  <!--
  ## Bias, Risks and Limitations
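Both metric tables above come from `EmbeddingSimilarityEvaluator` runs. A minimal sketch of how such an evaluation is wired up; the sentence pairs and gold scores below are placeholders, not the model's actual dev split:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("philipp-zettl/MiniLM-similarity-small")

# Placeholder pairs; the real MiniLM-dev split is not included in the card.
dev_evaluator = EmbeddingSimilarityEvaluator(
    sentences1=["Change the currency for my payment", "帮我修复系统错误"],
    sentences2=["payment query", "support query"],
    scores=[1.0, 1.0],  # gold similarity labels in [0, 1]
    name="MiniLM-dev",
)
results = dev_evaluator(model)  # recent versions return a dict of pearson/spearman metrics
print(results)
```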
@@ -274,16 +274,16 @@ You can finetune this model on your own dataset.
  * Size: 844 training samples
  * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
  * Approximate statistics based on the first 1000 samples:
- |         | sentence1 | sentence2 | score |
- |:--------|:----------|:----------|:------|
- | type    | string    | string    | float |
- | details | <ul><li>min: 6 tokens</li><li>mean: 10.83 tokens</li><li>max: 19 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 5.34 tokens</li><li>max: 6 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.5</li><li>max: 1.0</li></ul> |
+ |         | sentence1 | sentence2 | score |
+ |:--------|:----------|:----------|:------|
+ | type    | string    | string    | float |
+ | details | <ul><li>min: 6 tokens</li><li>mean: 10.8 tokens</li><li>max: 19 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 5.33 tokens</li><li>max: 6 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.49</li><li>max: 1.0</li></ul> |
  * Samples:
- | sentence1 | sentence2 | score |
- |:----------|:----------|:------|
- | <code>Покажите мне доступные гостиницы в Москве</code> | <code>product query</code> | <code>1.0</code> |
- | <code>أرني العروض المتاحة على الهواتف الذكية</code> | <code>product query</code> | <code>1.0</code> |
- | <code>Tengo problemas con el micrófono, ¿puedes ayudarme?</code> | <code>product query</code> | <code>0.0</code> |
+ | sentence1 | sentence2 | score |
+ |:----------|:----------|:------|
+ | <code>Update the payment method for my order</code> | <code>order query</code> | <code>1.0</code> |
+ | <code>Не могу установить новое обновление, помогите!</code> | <code>support query</code> | <code>1.0</code> |
+ | <code>Помогите мне изменить настройки конфиденциальности</code> | <code>support query</code> | <code>1.0</code> |
  * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
  ```json
  {
@@ -303,13 +303,13 @@ You can finetune this model on your own dataset.
  |         | sentence1 | sentence2 | score |
  |:--------|:----------|:----------|:------|
  | type    | string    | string    | float |
- | details | <ul><li>min: 7 tokens</li><li>mean: 10.63 tokens</li><li>max: 17 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 5.32 tokens</li><li>max: 6 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.47</li><li>max: 1.0</li></ul> |
+ | details | <ul><li>min: 6 tokens</li><li>mean: 10.79 tokens</li><li>max: 15 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 5.27 tokens</li><li>max: 6 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.51</li><li>max: 1.0</li></ul> |
  * Samples:
- | sentence1 | sentence2 | score |
- |:----------|:----------|:------|
- | <code>Help me with device driver installation</code> | <code>product query</code> | <code>0.0</code> |
- | <code>Check the status of my account verification</code> | <code>product query</code> | <code>0.0</code> |
- | <code>我怎样重置我的密码?</code> | <code>product query</code> | <code>0.0</code> |
+ | sentence1 | sentence2 | score |
+ |:----------|:----------|:------|
+ | <code>帮我修复系统错误</code> | <code>support query</code> | <code>1.0</code> |
+ | <code>Je veux commander une pizza</code> | <code>product query</code> | <code>1.0</code> |
+ | <code>Fix problems with my device’s Bluetooth connection</code> | <code>technical support query</code> | <code>1.0</code> |
  * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
  ```json
  {
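Both dataset hunks keep the same `CoSENTLoss` setup. A minimal sketch of how a `(sentence1, sentence2, score)` dataset pairs with this loss; the two rows are borrowed from the samples table above for illustration:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, losses

model = SentenceTransformer("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")

# Column names must match the card: sentence1, sentence2, score.
train_dataset = Dataset.from_dict({
    "sentence1": ["帮我修复系统错误", "Je veux commander une pizza"],
    "sentence2": ["support query", "product query"],
    "score": [1.0, 1.0],
})

# CoSENTLoss ranks pairs by cosine similarity of their embeddings; defaults to scale=20.0.
loss = losses.CoSENTLoss(model)
```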
@@ -322,10 +322,8 @@ You can finetune this model on your own dataset.
  #### Non-Default Hyperparameters

  - `eval_strategy`: steps
- - `per_device_train_batch_size`: 32
- - `per_device_eval_batch_size`: 32
  - `learning_rate`: 2e-05
- - `num_train_epochs`: 8
+ - `num_train_epochs`: 2
  - `warmup_ratio`: 0.1
  - `fp16`: True
  - `batch_sampler`: no_duplicates
@@ -337,8 +335,8 @@ You can finetune this model on your own dataset.
  - `do_predict`: False
  - `eval_strategy`: steps
  - `prediction_loss_only`: True
- - `per_device_train_batch_size`: 32
- - `per_device_eval_batch_size`: 32
+ - `per_device_train_batch_size`: 8
+ - `per_device_eval_batch_size`: 8
  - `per_gpu_train_batch_size`: None
  - `per_gpu_eval_batch_size`: None
  - `gradient_accumulation_steps`: 1
@@ -349,7 +347,7 @@ You can finetune this model on your own dataset.
  - `adam_beta2`: 0.999
  - `adam_epsilon`: 1e-08
  - `max_grad_norm`: 1.0
- - `num_train_epochs`: 8
+ - `num_train_epochs`: 2
  - `max_steps`: -1
  - `lr_scheduler_type`: linear
  - `lr_scheduler_kwargs`: {}
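The non-default values above map directly onto sentence-transformers v3 training arguments. A sketch, with a hypothetical `output_dir`; everything else is taken from the lists in this hunk:

```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="MiniLM-similarity-small",  # hypothetical path
    eval_strategy="steps",
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    learning_rate=2e-5,
    num_train_epochs=2,
    warmup_ratio=0.1,
    fp16=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)
```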
@@ -447,28 +445,28 @@ You can finetune this model on your own dataset.
  ### Training Logs
  | Epoch | Step | Training Loss | loss | MiniLM-dev_spearman_cosine | MiniLM-test_spearman_cosine |
  |:------:|:----:|:-------------:|:------:|:--------------------------:|:---------------------------:|
- | 0.3704 | 10 | 5.613 | 1.4994 | 0.2761 | - |
- | 0.7407 | 20 | 4.8872 | 1.3690 | 0.3483 | - |
- | 1.1111 | 30 | 3.2993 | 1.0579 | 0.4657 | - |
- | 1.4815 | 40 | 2.1968 | 0.6858 | 0.5935 | - |
- | 1.8519 | 50 | 0.7306 | 0.5191 | 0.6528 | - |
- | 2.2222 | 60 | 0.9746 | 0.3735 | 0.6998 | - |
- | 2.5926 | 70 | 0.3889 | 0.3532 | 0.7393 | - |
- | 2.9630 | 80 | 0.1857 | 0.3598 | 0.7554 | - |
- | 3.3333 | 90 | 0.2923 | 0.2795 | 0.7714 | - |
- | 3.7037 | 100 | 0.6776 | 0.2881 | 0.7825 | - |
- | 4.0741 | 110 | 0.2404 | 0.2679 | 0.7887 | - |
- | 4.4444 | 120 | 0.0168 | 0.2583 | 0.7918 | - |
- | 4.8148 | 130 | 0.0179 | 0.2273 | 0.7980 | - |
- | 5.1852 | 140 | 0.0006 | 0.2196 | 0.8023 | - |
- | 5.5556 | 150 | 0.0276 | 0.2068 | 0.8066 | - |
- | 5.9259 | 160 | 0.061 | 0.2063 | 0.8103 | - |
- | 6.2963 | 170 | 0.0265 | 0.2259 | 0.8103 | - |
- | 6.6667 | 180 | 0.0105 | 0.2236 | 0.8165 | - |
- | 7.0370 | 190 | 0.0008 | 0.2208 | 0.8177 | - |
- | 7.4074 | 200 | 0.361 | 0.2340 | 0.8171 | - |
- | 7.7778 | 210 | 0.0 | 0.2345 | 0.8190 | - |
- | 8.0 | 216 | - | - | - | 0.7586 |
+ | 0.0943 | 10 | 4.0771 | 2.2054 | 0.2529 | - |
+ | 0.1887 | 20 | 4.4668 | 1.8221 | 0.3528 | - |
+ | 0.2830 | 30 | 2.5459 | 1.5545 | 0.4638 | - |
+ | 0.3774 | 40 | 2.1926 | 1.3145 | 0.5569 | - |
+ | 0.4717 | 50 | 0.9001 | 1.1653 | 0.6285 | - |
+ | 0.5660 | 60 | 1.4049 | 1.0734 | 0.6834 | - |
+ | 0.6604 | 70 | 0.7204 | 0.9951 | 0.6988 | - |
+ | 0.7547 | 80 | 1.4023 | 1.1213 | 0.6945 | - |
+ | 0.8491 | 90 | 0.2315 | 1.2931 | 0.6414 | - |
+ | 0.9434 | 100 | 0.0018 | 1.3904 | 0.6180 | - |
+ | 1.0377 | 110 | 0.0494 | 1.2889 | 0.6322 | - |
+ | 1.1321 | 120 | 0.3156 | 1.2461 | 0.6402 | - |
+ | 1.2264 | 130 | 1.8153 | 1.0844 | 0.6716 | - |
+ | 1.3208 | 140 | 0.2638 | 0.9939 | 0.6957 | - |
+ | 1.4151 | 150 | 0.5454 | 0.9545 | 0.7056 | - |
+ | 1.5094 | 160 | 0.3421 | 0.9699 | 0.7062 | - |
+ | 1.6038 | 170 | 0.0035 | 0.9521 | 0.7093 | - |
+ | 1.6981 | 180 | 0.0401 | 0.8988 | 0.7160 | - |
+ | 1.7925 | 190 | 0.8138 | 0.8619 | 0.7271 | - |
+ | 1.8868 | 200 | 0.0236 | 0.8449 | 0.7315 | - |
+ | 1.9811 | 210 | 0.0012 | 0.8438 | 0.7321 | - |
+ | 2.0 | 212 | - | - | - | 0.7552 |


  ### Framework Versions
 
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:9f39ac4242378e0521f1f13befca327f07a17029f5b7262ca4a7f5dcd050d435
+ oid sha256:944d69b0e22c70edbadcb4a35df9b7c8243f8601d9962798cbea41342b1c6406
  size 470637416