Update README.md
README.md
CHANGED
@@ -6,7 +6,7 @@ pipeline_tag: sentence-similarity
 tags:
 - ColBERT
 base_model:
-- aubmindlab/bert-base-
+- aubmindlab/bert-base-arabertv02
 license: mit
 library_name: RAGatouille
 ---
@@ -14,6 +14,8 @@ library_name: RAGatouille
 
 # Arabic-ColBERT-100k
 
-First version of Arabic ColBERT.
-
+First version of Arabic ColBERT.
+This model was trained on 100K random triplets of the [mMARCO dataset](https://huggingface.co/datasets/unicamp-dl/mmarco), which has around 39M Arabic (translated) triplets.
+mMARCO is the multilingual version of [Microsoft's MARCO dataset](https://microsoft.github.io/msmarco/).
+
 See https://www.linkedin.com/posts/akhooli_this-is-probably-the-first-arabic-colbert-activity-7217969205197848576-l8Cy
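
The updated README says the model was trained on 100K random triplets drawn from roughly 39M, but it does not say how they were selected. As a minimal sketch of one way to draw such a subset without loading the full set into memory, the snippet below uses reservoir sampling. The function name `sample_triplets`, the `(query, positive, negative)` tuple layout, and the `fake_stream` helper are all hypothetical, chosen for illustration only; this is not the author's sampling code.

```python
import random
from typing import Iterable, Iterator, List, Tuple

# Assumed layout for illustration: (query, positive passage, negative passage).
Triplet = Tuple[str, str, str]


def sample_triplets(triplets: Iterable[Triplet], k: int, seed: int = 0) -> List[Triplet]:
    """Reservoir-sample up to k triplets uniformly from a stream of unknown length."""
    rng = random.Random(seed)
    reservoir: List[Triplet] = []
    for i, triplet in enumerate(triplets):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(triplet)
        else:
            # Keep the new item with probability k / (i + 1).
            j = rng.randint(0, i)  # both bounds inclusive
            if j < k:
                reservoir[j] = triplet
    return reservoir


def fake_stream(n: int) -> Iterator[Triplet]:
    """Hypothetical stand-in for iterating over the real triplet file."""
    for i in range(n):
        yield (f"query {i}", f"positive {i}", f"negative {i}")


# Small-scale demo: 100 out of 1,000 instead of 100K out of ~39M.
sample = sample_triplets(fake_stream(1000), k=100, seed=42)
print(len(sample))
```

Fixing the seed makes the subset reproducible, which helps if the same 100K triplets need to be regenerated for a later training run.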