upload
README.md
CHANGED
@@ -7,7 +7,7 @@ tags:
 - transformers
 ---
 
-# msmarco-distilbert-
+# msmarco-distilbert-dot-v4
 This is a [sentence-transformers](https://www.SBERT.net) model: It maps sentences & paragraphs to a 768 dimensional dense vector space and was designed for **semantic search**. It has been trained on 500K (query, answer) pairs from the [MS MARCO dataset](https://github.com/microsoft/MSMARCO-Passage-Ranking/). For an introduction to semantic search, have a look at: [SBERT.net - Semantic Search](https://www.sbert.net/examples/applications/semantic-search/README.html)
 
 
@@ -26,7 +26,7 @@ query = "How many people live in London?"
 docs = ["Around 9 Million people live in London", "London is known for its financial district"]
 
 #Load the model
-model = SentenceTransformer('sentence-transformers/msmarco-distilbert-
+model = SentenceTransformer('sentence-transformers/msmarco-distilbert-dot-v4')
 
 #Encode query and documents
 query_emb = model.encode(query)
@@ -42,6 +42,7 @@ doc_score_pairs = list(zip(docs, scores))
 doc_score_pairs = sorted(doc_score_pairs, key=lambda x: x[1], reverse=True)
 
 #Output passages & scores
+print("Query:", query)
 for doc, score in doc_score_pairs:
     print(score, doc)
 ```
@@ -81,8 +82,8 @@ query = "How many people live in London?"
 docs = ["Around 9 Million people live in London", "London is known for its financial district"]
 
 # Load model from HuggingFace Hub
-tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/msmarco-distilbert-
-model = AutoModel.from_pretrained("sentence-transformers/msmarco-distilbert-
+tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/msmarco-distilbert-dot-v4")
+model = AutoModel.from_pretrained("sentence-transformers/msmarco-distilbert-dot-v4")
 
 #Encode query and docs
 query_emb = encode(query)
@@ -98,6 +99,7 @@ doc_score_pairs = list(zip(docs, scores))
 doc_score_pairs = sorted(doc_score_pairs, key=lambda x: x[1], reverse=True)
 
 #Output passages & scores
+print("Query:", query)
 for doc, score in doc_score_pairs:
     print(score, doc)
 ```
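The hunks in this diff do not show the README lines that turn the embeddings into `scores`. As a rough sketch of that step, assuming the model is scored with a dot product (which the `dot` in its name suggests, but which this diff does not confirm), the ranking can be reproduced on toy vectors in plain Python. The three-dimensional vectors and the helper names `dot` and `rank_by_dot_score` are hypothetical stand-ins; real embeddings from this model have 768 dimensions.

```python
# Toy illustration of dot-product ranking.
# The vectors below are made up for illustration; they are NOT model output.

def dot(a, b):
    """Dot product of two equal-length vectors (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def rank_by_dot_score(query_emb, doc_embs, docs):
    """Return (doc, score) pairs sorted by descending dot-product score."""
    scores = [dot(query_emb, emb) for emb in doc_embs]
    doc_score_pairs = list(zip(docs, scores))
    return sorted(doc_score_pairs, key=lambda x: x[1], reverse=True)


query = "How many people live in London?"
docs = ["Around 9 Million people live in London", "London is known for its financial district"]

# Assumed 3-dimensional stand-ins for the 768-dimensional embeddings.
query_emb = [1.0, 2.0, 0.5]
doc_embs = [[0.9, 1.8, 0.4], [0.2, 0.1, 1.5]]

doc_score_pairs = rank_by_dot_score(query_emb, doc_embs, docs)

# Output passages & scores, mirroring the README's print loop
print("Query:", query)
for doc, score in doc_score_pairs:
    print(score, doc)
```

With real embeddings, `query_emb` and `doc_embs` would come from `model.encode(...)` as in the README snippet; only the scoring and sorting logic is illustrated here.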