---
base_model: aubmindlab/bert-base-arabertv02
datasets:
- akhooli/arabic-triplets-1m-curated-sims-len
language:
- ar
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- transformers.js
- transformers
- sentence-similarity
- feature-extraction
- dataset_size:75000
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
- mteb
model-index:
- name: Omartificial-Intelligence-Space/Arabert-matro-v4
  results:
  - dataset:
      config: ar-ar
      name: MTEB STS17 (ar-ar)
      revision: faeb762787bd10488a50c8b5be4a3b82e411949c
      split: test
      type: mteb/sts17-crosslingual-sts
    metrics:
    - type: cosine_pearson
      value: 84.66883392015258
    - type: cosine_spearman
      value: 85.30520907141938
    - type: euclidean_pearson
      value: 82.04306779342852
    - type: euclidean_spearman
      value: 84.58744201847996
    - type: main_score
      value: 85.30520907141938
    - type: manhattan_pearson
      value: 82.08829357724328
    - type: manhattan_spearman
      value: 84.49254541383544
    task:
      type: STS
license: apache-2.0
---
# Arabic-Triplet-Matryoshka-V2-Model
- This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [aubmindlab/bert-base-arabertv02](https://huggingface.co/aubmindlab/bert-base-arabertv02).
- It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
- This model was trained on 1M samples from the [akhooli/arabic-triplets-1m-curated-sims-len](https://huggingface.co/datasets/akhooli/arabic-triplets-1m-curated-sims-len) dataset.
- It was trained for 3 epochs with MatryoshkaLoss, reaching a final training loss of 0.718.
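
## Usage sketch

The points above can be illustrated with a minimal sketch. Because the model was trained with MatryoshkaLoss, the leading components of each 768-dimensional embedding are intended to remain useful on their own, so vectors can be shortened and re-normalized before comparison. Several things here are assumptions rather than facts from this card: the Hub ID is taken from the `model-index` metadata and may differ from the published repository name, the `truncate_dim` argument is only available in recent `sentence-transformers` releases, and the helper names (`truncate_and_normalize`, `cosine_similarity`, `encode`) are hypothetical. The toy vectors stand in for real embeddings so the arithmetic can be followed without downloading the weights.

```python
import math

# Assumption: Hub ID copied from this card's model-index metadata;
# adjust it if the repository is published under a different name.
MODEL_ID = "Omartificial-Intelligence-Space/Arabert-matro-v4"


def truncate_and_normalize(vector, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    head = [float(x) for x in vector[:dim]]
    norm = math.sqrt(sum(x * x for x in head))
    # An all-zero prefix cannot be normalized, so it is returned unchanged.
    return [x / norm for x in head] if norm > 0 else head


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def encode(sentences, dim=768):
    """Embed sentences with the model, truncated to `dim` dimensions.

    Needs `pip install sentence-transformers` and network access to fetch
    the weights, so the import is deferred until the function is called.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(MODEL_ID, truncate_dim=dim)
    return model.encode(sentences, normalize_embeddings=True)


# Toy 4-dimensional vectors standing in for real embeddings.
u = [3.0, 4.0, 1.0, 2.0]
v = [6.0, 8.0, -5.0, 0.5]

# Compare the vectors at full size and after keeping only 2 dimensions.
full_score = cosine_similarity(u, v)
u2 = truncate_and_normalize(u, 2)
v2 = truncate_and_normalize(v, 2)
short_score = cosine_similarity(u2, v2)
print(f"full: {full_score:.3f}  truncated: {short_score:.3f}")
```

With the library installed, `encode(sentences, dim=256)` would return shortened embeddings for a list of Arabic sentences. The value 256 is only an illustration, since this card does not list which truncation sizes were used during training.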
```markdown
## Citation
If you use the Arabic Matryoshka Embeddings Model, please cite it as follows:
@misc{nacar2024enhancingsemanticsimilarityunderstanding,
title={Enhancing Semantic Similarity Understanding in Arabic NLP with Nested Embedding Learning},
author={Omer Nacar and Anis Koubaa},
year={2024},
eprint={2407.21139},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2407.21139},
}