philipp-zettl
committed on
Commit
•
bbd7653
1
Parent(s):
3822642
Add new SentenceTransformer model.
- README.md +101 -103
- model.safetensors +1 -1
README.md
CHANGED
@@ -22,31 +22,31 @@ metrics:
|
|
22 |
- pearson_max
|
23 |
- spearman_max
|
24 |
widget:
|
25 |
-
- source_sentence:
|
26 |
sentences:
|
27 |
- order query
|
28 |
-
- support query
|
29 |
- faq query
|
30 |
-
- source_sentence: 马上给我提供这个商品的跟踪信息
|
31 |
-
sentences:
|
32 |
-
- payment query
|
33 |
- technical support query
|
34 |
-
|
35 |
-
- source_sentence: Downgrade my subscription plan
|
36 |
sentences:
|
37 |
-
-
|
|
|
38 |
- product query
|
39 |
-
|
40 |
-
- source_sentence: Help resolve issues with my operating system
|
41 |
sentences:
|
42 |
-
-
|
43 |
-
- product query
|
44 |
- product query
|
45 |
-
-
|
|
|
|
|
|
|
|
|
|
|
|
|
46 |
sentences:
|
47 |
- product query
|
48 |
-
-
|
49 |
-
-
|
50 |
pipeline_tag: sentence-similarity
|
51 |
model-index:
|
52 |
- name: SentenceTransformer based on sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
|
@@ -59,34 +59,34 @@ model-index:
|
|
59 |
type: MiniLM-dev
|
60 |
metrics:
|
61 |
- type: pearson_cosine
|
62 |
-
value: 0.
|
63 |
name: Pearson Cosine
|
64 |
- type: spearman_cosine
|
65 |
-
value: 0.
|
66 |
name: Spearman Cosine
|
67 |
- type: pearson_manhattan
|
68 |
-
value: 0.
|
69 |
name: Pearson Manhattan
|
70 |
- type: spearman_manhattan
|
71 |
-
value: 0.
|
72 |
name: Spearman Manhattan
|
73 |
- type: pearson_euclidean
|
74 |
-
value: 0.
|
75 |
name: Pearson Euclidean
|
76 |
- type: spearman_euclidean
|
77 |
-
value: 0.
|
78 |
name: Spearman Euclidean
|
79 |
- type: pearson_dot
|
80 |
-
value: 0.
|
81 |
name: Pearson Dot
|
82 |
- type: spearman_dot
|
83 |
-
value: 0.
|
84 |
name: Spearman Dot
|
85 |
- type: pearson_max
|
86 |
-
value: 0.
|
87 |
name: Pearson Max
|
88 |
- type: spearman_max
|
89 |
-
value: 0.
|
90 |
name: Spearman Max
|
91 |
- task:
|
92 |
type: semantic-similarity
|
@@ -96,34 +96,34 @@ model-index:
|
|
96 |
type: MiniLM-test
|
97 |
metrics:
|
98 |
- type: pearson_cosine
|
99 |
-
value: 0.
|
100 |
name: Pearson Cosine
|
101 |
- type: spearman_cosine
|
102 |
-
value: 0.
|
103 |
name: Spearman Cosine
|
104 |
- type: pearson_manhattan
|
105 |
-
value: 0.
|
106 |
name: Pearson Manhattan
|
107 |
- type: spearman_manhattan
|
108 |
-
value: 0.
|
109 |
name: Spearman Manhattan
|
110 |
- type: pearson_euclidean
|
111 |
-
value: 0.
|
112 |
name: Pearson Euclidean
|
113 |
- type: spearman_euclidean
|
114 |
-
value: 0.
|
115 |
name: Spearman Euclidean
|
116 |
- type: pearson_dot
|
117 |
-
value: 0.
|
118 |
name: Pearson Dot
|
119 |
- type: spearman_dot
|
120 |
-
value: 0.
|
121 |
name: Spearman Dot
|
122 |
- type: pearson_max
|
123 |
-
value: 0.
|
124 |
name: Pearson Max
|
125 |
- type: spearman_max
|
126 |
-
value: 0.
|
127 |
name: Spearman Max
|
128 |
---
|
129 |
|
@@ -176,9 +176,9 @@ from sentence_transformers import SentenceTransformer
|
|
176 |
model = SentenceTransformer("philipp-zettl/MiniLM-similarity-small")
|
177 |
# Run inference
|
178 |
sentences = [
|
179 |
-
'
|
180 |
-
'
|
181 |
-
'
|
182 |
]
|
183 |
embeddings = model.encode(sentences)
|
184 |
print(embeddings.shape)
|
@@ -222,18 +222,18 @@ You can finetune this model on your own dataset.
|
|
222 |
* Dataset: `MiniLM-dev`
|
223 |
* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
|
224 |
|
225 |
-
| Metric | Value
|
226 |
-
|
227 |
-
| pearson_cosine | 0.
|
228 |
-
| **spearman_cosine** | **0.
|
229 |
-
| pearson_manhattan | 0.
|
230 |
-
| spearman_manhattan | 0.
|
231 |
-
| pearson_euclidean | 0.
|
232 |
-
| spearman_euclidean | 0.
|
233 |
-
| pearson_dot | 0.
|
234 |
-
| spearman_dot | 0.
|
235 |
-
| pearson_max | 0.
|
236 |
-
| spearman_max | 0.
|
237 |
|
238 |
#### Semantic Similarity
|
239 |
* Dataset: `MiniLM-test`
|
@@ -241,16 +241,16 @@ You can finetune this model on your own dataset.
|
|
241 |
|
242 |
| Metric | Value |
|
243 |
|:--------------------|:-----------|
|
244 |
-
| pearson_cosine | 0.
|
245 |
-
| **spearman_cosine** | **0.
|
246 |
-
| pearson_manhattan | 0.
|
247 |
-
| spearman_manhattan | 0.
|
248 |
-
| pearson_euclidean | 0.
|
249 |
-
| spearman_euclidean | 0.
|
250 |
-
| pearson_dot | 0.
|
251 |
-
| spearman_dot | 0.
|
252 |
-
| pearson_max | 0.
|
253 |
-
| spearman_max | 0.
|
254 |
|
255 |
<!--
|
256 |
## Bias, Risks and Limitations
|
@@ -274,16 +274,16 @@ You can finetune this model on your own dataset.
|
|
274 |
* Size: 844 training samples
|
275 |
* Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
|
276 |
* Approximate statistics based on the first 1000 samples:
|
277 |
-
| | sentence1
|
278 |
-
|
279 |
-
| type | string
|
280 |
-
| details | <ul><li>min: 6 tokens</li><li>mean: 10.
|
281 |
* Samples:
|
282 |
-
| sentence1
|
283 |
-
|
284 |
-
| <code
|
285 |
-
| <code
|
286 |
-
| <code
|
287 |
* Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
|
288 |
```json
|
289 |
{
|
@@ -303,13 +303,13 @@ You can finetune this model on your own dataset.
|
|
303 |
| | sentence1 | sentence2 | score |
|
304 |
|:--------|:----------------------------------------------------------------------------------|:--------------------------------------------------------------------------------|:---------------------------------------------------------------|
|
305 |
| type | string | string | float |
|
306 |
-
| details | <ul><li>min:
|
307 |
* Samples:
|
308 |
-
| sentence1
|
309 |
-
|
310 |
-
| <code
|
311 |
-
| <code>
|
312 |
-
| <code
|
313 |
* Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
|
314 |
```json
|
315 |
{
|
@@ -322,10 +322,8 @@ You can finetune this model on your own dataset.
|
|
322 |
#### Non-Default Hyperparameters
|
323 |
|
324 |
- `eval_strategy`: steps
|
325 |
-
- `per_device_train_batch_size`: 32
|
326 |
-
- `per_device_eval_batch_size`: 32
|
327 |
- `learning_rate`: 2e-05
|
328 |
-
- `num_train_epochs`:
|
329 |
- `warmup_ratio`: 0.1
|
330 |
- `fp16`: True
|
331 |
- `batch_sampler`: no_duplicates
|
@@ -337,8 +335,8 @@ You can finetune this model on your own dataset.
|
|
337 |
- `do_predict`: False
|
338 |
- `eval_strategy`: steps
|
339 |
- `prediction_loss_only`: True
|
340 |
-
- `per_device_train_batch_size`:
|
341 |
-
- `per_device_eval_batch_size`:
|
342 |
- `per_gpu_train_batch_size`: None
|
343 |
- `per_gpu_eval_batch_size`: None
|
344 |
- `gradient_accumulation_steps`: 1
|
@@ -349,7 +347,7 @@ You can finetune this model on your own dataset.
|
|
349 |
- `adam_beta2`: 0.999
|
350 |
- `adam_epsilon`: 1e-08
|
351 |
- `max_grad_norm`: 1.0
|
352 |
-
- `num_train_epochs`:
|
353 |
- `max_steps`: -1
|
354 |
- `lr_scheduler_type`: linear
|
355 |
- `lr_scheduler_kwargs`: {}
|
@@ -447,28 +445,28 @@ You can finetune this model on your own dataset.
|
|
447 |
### Training Logs
|
448 |
| Epoch | Step | Training Loss | loss | MiniLM-dev_spearman_cosine | MiniLM-test_spearman_cosine |
|
449 |
|:------:|:----:|:-------------:|:------:|:--------------------------:|:---------------------------:|
|
450 |
-
| 0.
|
451 |
-
| 0.
|
452 |
-
|
|
453 |
-
|
|
454 |
-
|
|
455 |
-
|
|
456 |
-
|
|
457 |
-
|
|
458 |
-
|
|
459 |
-
|
|
460 |
-
|
|
461 |
-
|
|
462 |
-
|
|
463 |
-
|
|
464 |
-
|
|
465 |
-
|
|
466 |
-
|
|
467 |
-
|
|
468 |
-
|
|
469 |
-
|
|
470 |
-
|
|
471 |
-
|
|
472 |
|
473 |
|
474 |
### Framework Versions
|
|
|
 - pearson_max
 - spearman_max
 widget:
+- source_sentence: Help fix a problem with my device’s battery life
   sentences:
   - order query
   - faq query
   - technical support query
+- source_sentence: 订购一双运动鞋
   sentences:
+  - service request
+  - feedback query
   - product query
+- source_sentence: 告诉我如何更改我的密码
   sentences:
+  - support query
   - product query
+  - faq query
+- source_sentence: Get information on the next local festival
+  sentences:
+  - event inquiry
+  - service request
+  - account query
+- source_sentence: Change the currency for my payment
   sentences:
   - product query
+  - payment query
+  - faq query
 pipeline_tag: sentence-similarity
 model-index:
 - name: SentenceTransformer based on sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
|
|
       type: MiniLM-dev
     metrics:
     - type: pearson_cosine
+      value: 0.7356955662825808
       name: Pearson Cosine
     - type: spearman_cosine
+      value: 0.7320761390174187
       name: Spearman Cosine
     - type: pearson_manhattan
+      value: 0.6240041985776243
       name: Pearson Manhattan
     - type: spearman_manhattan
+      value: 0.6179783414452009
       name: Spearman Manhattan
     - type: pearson_euclidean
+      value: 0.6321466982201008
       name: Pearson Euclidean
     - type: spearman_euclidean
+      value: 0.6296964936282937
       name: Spearman Euclidean
     - type: pearson_dot
+      value: 0.7491168439451736
       name: Pearson Dot
     - type: spearman_dot
+      value: 0.7592129124940543
       name: Spearman Dot
     - type: pearson_max
+      value: 0.7491168439451736
       name: Pearson Max
     - type: spearman_max
+      value: 0.7592129124940543
       name: Spearman Max
   - task:
       type: semantic-similarity
|
|
       type: MiniLM-test
     metrics:
     - type: pearson_cosine
+      value: 0.7687106130417081
       name: Pearson Cosine
     - type: spearman_cosine
+      value: 0.7552108666502075
       name: Spearman Cosine
     - type: pearson_manhattan
+      value: 0.7462708006775693
       name: Pearson Manhattan
     - type: spearman_manhattan
+      value: 0.7365483246407295
       name: Spearman Manhattan
     - type: pearson_euclidean
+      value: 0.7545194410402545
       name: Pearson Euclidean
     - type: spearman_euclidean
+      value: 0.7465016803791179
       name: Spearman Euclidean
     - type: pearson_dot
+      value: 0.7251488155932073
       name: Pearson Dot
     - type: spearman_dot
+      value: 0.7390366635753267
       name: Spearman Dot
     - type: pearson_max
+      value: 0.7687106130417081
       name: Pearson Max
     - type: spearman_max
+      value: 0.7552108666502075
       name: Spearman Max
 ---

|
|
 model = SentenceTransformer("philipp-zettl/MiniLM-similarity-small")
 # Run inference
 sentences = [
+    'Change the currency for my payment',
+    'payment query',
+    'faq query',
 ]
 embeddings = model.encode(sentences)
 print(embeddings.shape)
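The usage snippet encodes a query together with two candidate labels, mirroring the widget's `source_sentence` / `sentences` pairs. A minimal sketch of how such embeddings might then be ranked by cosine similarity is shown here. The 3-dimensional vectors are invented toy values standing in for real `model.encode` output, and `cosine_similarity` and `rank_labels` are hypothetical helpers, not part of the library.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_labels(query_vec, label_vecs):
    """Return (label, score) pairs sorted from most to least similar."""
    scored = [(label, cosine_similarity(query_vec, vec))
              for label, vec in label_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors, invented for illustration only (not real model output)
query = [0.9, 0.1, 0.2]  # stands in for 'Change the currency for my payment'
labels = {
    "payment query": [0.8, 0.2, 0.1],
    "faq query": [0.1, 0.9, 0.3],
}

ranking = rank_labels(query, labels)
for label, score in ranking:
    print(f"{label}: {score:.3f}")
```

With real embeddings, the first entry of `ranking` would be the label the model considers closest to the query.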
|
|
 * Dataset: `MiniLM-dev`
 * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

+| Metric              | Value      |
+|:--------------------|:-----------|
+| pearson_cosine      | 0.7357     |
+| **spearman_cosine** | **0.7321** |
+| pearson_manhattan   | 0.624      |
+| spearman_manhattan  | 0.618      |
+| pearson_euclidean   | 0.6321     |
+| spearman_euclidean  | 0.6297     |
+| pearson_dot         | 0.7491     |
+| spearman_dot        | 0.7592     |
+| pearson_max         | 0.7491     |
+| spearman_max        | 0.7592     |

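The highlighted `spearman_cosine` row is a rank correlation between predicted cosine similarities and the gold `score` column. The following is a minimal pure-Python sketch of that idea, not the `EmbeddingSimilarityEvaluator` implementation. The `predicted` and `gold` lists are hypothetical values chosen for illustration, and the helper names are assumptions.

```python
def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


predicted = [0.91, 0.15, 0.78, 0.30]  # hypothetical cosine similarities
gold = [1.0, 0.0, 1.0, 0.0]           # hypothetical gold scores

print(round(spearman(predicted, gold), 4))
```

Because only ranks matter, any monotonic rescaling of the predicted similarities leaves the Spearman value unchanged, which is one reason it is a common headline metric for similarity models.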
 #### Semantic Similarity
 * Dataset: `MiniLM-test`
|
|

 | Metric              | Value      |
 |:--------------------|:-----------|
+| pearson_cosine      | 0.7687     |
+| **spearman_cosine** | **0.7552** |
+| pearson_manhattan   | 0.7463     |
+| spearman_manhattan  | 0.7365     |
+| pearson_euclidean   | 0.7545     |
+| spearman_euclidean  | 0.7465     |
+| pearson_dot         | 0.7251     |
+| spearman_dot        | 0.739      |
+| pearson_max         | 0.7687     |
+| spearman_max        | 0.7552     |

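In both tables the `pearson_max` and `spearman_max` rows equal the largest value among the cosine, Manhattan, Euclidean, and dot rows. A short check of that reading, using the rounded values from the two tables, might look as follows. The dictionary layout and the `best` helper are hypothetical, and the interpretation of `*_max` as a maximum over the four similarity functions is an assumption based on the numbers shown here.

```python
# (pearson, spearman) per similarity function, copied from the tables
dev = {
    "cosine": (0.7357, 0.7321),
    "manhattan": (0.624, 0.618),
    "euclidean": (0.6321, 0.6297),
    "dot": (0.7491, 0.7592),
}
test = {
    "cosine": (0.7687, 0.7552),
    "manhattan": (0.7463, 0.7365),
    "euclidean": (0.7545, 0.7465),
    "dot": (0.7251, 0.739),
}


def best(scores, index):
    """Return (similarity_name, value) with the highest score at `index`
    (0 selects Pearson, 1 selects Spearman)."""
    name, pair = max(scores.items(), key=lambda item: item[1][index])
    return name, pair[index]


for split_name, scores in (("dev", dev), ("test", test)):
    print(split_name, "pearson_max:", best(scores, 0))
    print(split_name, "spearman_max:", best(scores, 1))
```

On the dev split the dot product gives the highest correlations, while on the test split cosine similarity does.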
 <!--
 ## Bias, Risks and Limitations
|
|
 * Size: 844 training samples
 * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
 * Approximate statistics based on the first 1000 samples:
+| | sentence1 | sentence2 | score |
+|:--------|:---------------------------------------------------------------------------------|:--------------------------------------------------------------------------------|:---------------------------------------------------------------|
+| type | string | string | float |
+| details | <ul><li>min: 6 tokens</li><li>mean: 10.8 tokens</li><li>max: 19 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 5.33 tokens</li><li>max: 6 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.49</li><li>max: 1.0</li></ul> |
 * Samples:
+| sentence1 | sentence2 | score |
+|:----------------------------------------------------------------|:---------------------------|:-----------------|
+| <code>Update the payment method for my order</code> | <code>order query</code> | <code>1.0</code> |
+| <code>Не могу установить новое обновление, помогите!</code> | <code>support query</code> | <code>1.0</code> |
+| <code>Помогите мне изменить настройки конфиденциальности</code> | <code>support query</code> | <code>1.0</code> |
 * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
 ```json
 {
|
|
 | | sentence1 | sentence2 | score |
 |:--------|:----------------------------------------------------------------------------------|:--------------------------------------------------------------------------------|:---------------------------------------------------------------|
 | type | string | string | float |
+| details | <ul><li>min: 6 tokens</li><li>mean: 10.79 tokens</li><li>max: 15 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 5.27 tokens</li><li>max: 6 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.51</li><li>max: 1.0</li></ul> |
 * Samples:
+| sentence1 | sentence2 | score |
+|:----------------------------------------------------------------|:-------------------------------------|:-----------------|
+| <code>帮我修复系统错误</code> | <code>support query</code> | <code>1.0</code> |
+| <code>Je veux commander une pizza</code> | <code>product query</code> | <code>1.0</code> |
+| <code>Fix problems with my device’s Bluetooth connection</code> | <code>technical support query</code> | <code>1.0</code> |
 * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
 ```json
 {
|
|
 #### Non-Default Hyperparameters

 - `eval_strategy`: steps
 - `learning_rate`: 2e-05
+- `num_train_epochs`: 2
 - `warmup_ratio`: 0.1
 - `fp16`: True
 - `batch_sampler`: no_duplicates
|
|
 - `do_predict`: False
 - `eval_strategy`: steps
 - `prediction_loss_only`: True
+- `per_device_train_batch_size`: 8
+- `per_device_eval_batch_size`: 8
 - `per_gpu_train_batch_size`: None
 - `per_gpu_eval_batch_size`: None
 - `gradient_accumulation_steps`: 1
|
|
 - `adam_beta2`: 0.999
 - `adam_epsilon`: 1e-08
 - `max_grad_norm`: 1.0
+- `num_train_epochs`: 2
 - `max_steps`: -1
 - `lr_scheduler_type`: linear
 - `lr_scheduler_kwargs`: {}
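The card combines `learning_rate: 2e-05`, `warmup_ratio: 0.1`, and `lr_scheduler_type: linear`. A rough sketch of the learning-rate curve those settings imply is given below, assuming 212 total optimizer steps (the final step in the training log) and a warm-up length equal to the ratio times the total, rounded up. The `lr_at` function is a hypothetical illustration, not the trainer's own scheduler code.

```python
import math

BASE_LR = 2e-05      # `learning_rate` from the card
WARMUP_RATIO = 0.1   # `warmup_ratio` from the card
TOTAL_STEPS = 212    # assumption: last step listed in the training log

# Assumption: warm-up covers ratio * total steps, rounded up
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step):
    """Linear warm-up from 0 to BASE_LR, then linear decay back to 0."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


for step in (0, 11, WARMUP_STEPS, 100, TOTAL_STEPS):
    print(step, lr_at(step))
```

Under these assumptions the peak rate is reached after roughly a tenth of training, so most of the run happens on the decaying part of the schedule.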
|
|
 ### Training Logs
 | Epoch  | Step | Training Loss | loss   | MiniLM-dev_spearman_cosine | MiniLM-test_spearman_cosine |
 |:------:|:----:|:-------------:|:------:|:--------------------------:|:---------------------------:|
+| 0.0943 | 10   | 4.0771        | 2.2054 | 0.2529                     | -                           |
+| 0.1887 | 20   | 4.4668        | 1.8221 | 0.3528                     | -                           |
+| 0.2830 | 30   | 2.5459        | 1.5545 | 0.4638                     | -                           |
+| 0.3774 | 40   | 2.1926        | 1.3145 | 0.5569                     | -                           |
+| 0.4717 | 50   | 0.9001        | 1.1653 | 0.6285                     | -                           |
+| 0.5660 | 60   | 1.4049        | 1.0734 | 0.6834                     | -                           |
+| 0.6604 | 70   | 0.7204        | 0.9951 | 0.6988                     | -                           |
+| 0.7547 | 80   | 1.4023        | 1.1213 | 0.6945                     | -                           |
+| 0.8491 | 90   | 0.2315        | 1.2931 | 0.6414                     | -                           |
+| 0.9434 | 100  | 0.0018        | 1.3904 | 0.6180                     | -                           |
+| 1.0377 | 110  | 0.0494        | 1.2889 | 0.6322                     | -                           |
+| 1.1321 | 120  | 0.3156        | 1.2461 | 0.6402                     | -                           |
+| 1.2264 | 130  | 1.8153        | 1.0844 | 0.6716                     | -                           |
+| 1.3208 | 140  | 0.2638        | 0.9939 | 0.6957                     | -                           |
+| 1.4151 | 150  | 0.5454        | 0.9545 | 0.7056                     | -                           |
+| 1.5094 | 160  | 0.3421        | 0.9699 | 0.7062                     | -                           |
+| 1.6038 | 170  | 0.0035        | 0.9521 | 0.7093                     | -                           |
+| 1.6981 | 180  | 0.0401        | 0.8988 | 0.7160                     | -                           |
+| 1.7925 | 190  | 0.8138        | 0.8619 | 0.7271                     | -                           |
+| 1.8868 | 200  | 0.0236        | 0.8449 | 0.7315                     | -                           |
+| 1.9811 | 210  | 0.0012        | 0.8438 | 0.7321                     | -                           |
+| 2.0    | 212  | -             | -      | -                          | 0.7552                      |


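The Epoch and Step columns in the training log are consistent with the dataset size and batch size reported elsewhere in the card. The arithmetic can be sketched as follows, assuming a single device and that every one of the 844 samples is batched once per epoch (the `no_duplicates` batch sampler could in principle change this). The `epoch_at` helper is hypothetical.

```python
import math

TRAIN_SAMPLES = 844  # "Size: 844 training samples"
BATCH_SIZE = 8       # `per_device_train_batch_size`; assumes one device
EPOCHS = 2           # `num_train_epochs`

# The last batch of an epoch may be partial, hence the ceiling
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS


def epoch_at(step):
    """Fractional epoch reached after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)


print(steps_per_epoch, total_steps)
print(epoch_at(10), epoch_at(210), epoch_at(total_steps))
```

The computed values line up with the logged rows for steps 10, 210, and 212, which suggests evaluation ran every 10 steps with a final pass at the end of training.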
 ### Framework Versions
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:944d69b0e22c70edbadcb4a35df9b7c8243f8601d9962798cbea41342b1c6406
 size 470637416