Update README.md
README.md
CHANGED
@@ -572,7 +572,7 @@ model-index:
 
 # SentenceTransformer based on intfloat/multilingual-e5-large
 
-This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [intfloat/multilingual-e5-large](https://huggingface.co/intfloat/multilingual-e5-large) on
+This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [intfloat/multilingual-e5-large](https://huggingface.co/intfloat/multilingual-e5-large) on an augmented version of `stsb_multi_es` dataset. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 
 ## Model Details
 
@@ -583,7 +583,7 @@ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [i
 - **Output Dimensionality:** 1024 tokens
 - **Similarity Function:** Cosine Similarity
 - **Training Dataset:**
-
+- stsb_multi_es_aug
 <!-- - **Language:** Unknown -->
 <!-- - **License:** Unknown -->
 
@@ -917,9 +917,9 @@ You can finetune this model on your own dataset.
 
 ### Training Dataset
 
-####
+#### stsb_multi_es_aug
 
-* Dataset:
+* Dataset: stsb_multi_es_aug
 * Size: 2,697 training samples
 * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
 * Approximate statistics based on the first 1000 samples:
@@ -961,9 +961,9 @@ You can finetune this model on your own dataset.
 
 ### Evaluation Dataset
 
-####
+#### stsb_multi_es_aug
 
-* Dataset:
+* Dataset: stsb_multi_es_aug
 * Size: 697 evaluation samples
 * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
 * Approximate statistics based on the first 1000 samples:
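
Note on the README being changed: it lists Cosine Similarity as the model's similarity function over 1024-dimensional embeddings. As a rough illustration of what that score measures, here is a minimal standard-library sketch. The function name `cosine_similarity` and the toy 4-dimensional vectors are assumptions made for illustration; they are not taken from the README or from the sentence-transformers API, and real embeddings would come from the model itself.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors standing in for 1024-dimensional embeddings.
emb_1 = [0.1, 0.3, -0.2, 0.7]
emb_2 = [0.2, 0.6, -0.4, 1.4]   # same direction as emb_1, twice the length
emb_3 = [-0.3, 0.1, 0.7, 0.2]   # orthogonal to emb_1

print(cosine_similarity(emb_1, emb_2))  # close to 1: same direction
print(cosine_similarity(emb_1, emb_3))  # close to 0: orthogonal
```

Because the score depends only on direction, scaling an embedding leaves it unchanged, which is why the first pair scores as near-identical despite the different vector lengths.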