omarelshehy committed
Commit • 9389a36
1 Parent(s): 52b9509
Update README.md
README.md CHANGED
@@ -113,11 +113,17 @@ language:

# SentenceTransformer based on FacebookAI/xlm-roberta-large

-This is a
+This is a **Bilingual** (Arabic-English) [sentence-transformers](https://www.SBERT.net) model fine-tuned from [FacebookAI/xlm-roberta-large](https://huggingface.co/FacebookAI/xlm-roberta-large). It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for **semantic textual similarity, semantic search, paraphrase mining, text classification, clustering**, and more.

-The model
+The model handles both languages separately 🌐 and also interchangeably, which unlocks flexible applications for developers and researchers who want to build further on Arabic models! 💡

-
+📊 Metrics from MTEB are promising, but don't rely on them alone: test the model yourself and see if it fits your needs! ✅
+
+## Matryoshka Embeddings 🪆
+
+This model supports Matryoshka embeddings, which let you truncate embeddings to smaller sizes to reduce memory usage and speed up downstream computation, depending on your task requirements. Available truncation sizes are **1024, 768, 512, 256, 128, and 64**.
+
+You can select the embedding size that fits your use case and resource budget.

## Model Details

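The Matryoshka truncation described in the added README text can be illustrated with a minimal sketch. It assumes that truncation means keeping the leading components of the 1024-dimensional vector and rescaling to unit length before comparing with cosine similarity; the README does not spell this out. The vectors below are toy stand-ins rather than real model outputs, and the names `truncate_embedding` and `cosine_similarity` are hypothetical helpers written for this example.

```python
import math

# Truncation sizes listed in the README; the full embedding has 1024 dimensions.
MATRYOSHKA_DIMS = (1024, 768, 512, 256, 128, 64)


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale the result to unit length."""
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"unsupported truncation size: {dim}")
    if dim > len(vec):
        raise ValueError("cannot truncate to more dimensions than the vector has")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero vector cannot be normalized
    return [x / norm for x in head]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy stand-ins for two 1024-dimensional embeddings (assumption: not model output).
emb_a = [math.sin(0.01 * i) for i in range(1024)]
emb_b = [math.sin(0.01 * i + 0.1) for i in range(1024)]

# Compare the same pair of vectors at every supported size.
for dim in MATRYOSHKA_DIMS:
    a = truncate_embedding(emb_a, dim)
    b = truncate_embedding(emb_b, dim)
    print(dim, round(cosine_similarity(a, b), 4))
```

Running the same comparison at several sizes on your own data is one way to follow the README's advice to test the model rather than rely on benchmark numbers. With the sentence-transformers library, the `truncate_dim` argument of `SentenceTransformer` is the usual way to request a smaller size at load time; the model id is not stated in this commit, so it is left out here.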