michaelfeil
committed on
Commit bde25c8 • 1 Parent(s): c15e519
Upload sentence-transformers/all-MiniLM-L6-v2 ctranslate2 weights
README.md
CHANGED
@@ -38,7 +38,7 @@ Speedup inference while reducing memory by 2x-4x using int8 inference in C++ on
 
 quantized version of [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
 ```bash
-pip install hf-hub-ctranslate2>=2.12.0 ctranslate2>=3.
+pip install hf-hub-ctranslate2>=2.12.0 ctranslate2>=3.17.1
 ```
 
 ```python
@@ -78,16 +78,20 @@ embeddings = model.encode(
 print(embeddings.shape, embeddings)
 scores = (embeddings @ embeddings.T) * 100
 
+# Hint: you can also host this code via REST API and
+# via github.com/michaelfeil/infinity
+
+
 ```
 
-Checkpoint compatible to [ctranslate2>=3.
+Checkpoint compatible to [ctranslate2>=3.17.1](https://github.com/OpenNMT/CTranslate2)
 and [hf-hub-ctranslate2>=2.12.0](https://github.com/michaelfeil/hf-hub-ctranslate2)
 - `compute_type=int8_float16` for `device="cuda"`
 - `compute_type=int8` for `device="cpu"`
 
-Converted on 2023-
+Converted on 2023-10-13 using
 ```
-
+LLama-2 -> removed <pad> token.
 ```
 
 # Licence and other remarks:
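The README hunk above pairs each device with a recommended `compute_type`. A minimal sketch of that pairing follows. The helper name `pick_compute_type` is hypothetical and is not part of ctranslate2 or hf-hub-ctranslate2; only the two device/compute_type pairs come from the README.

```python
# Hypothetical helper: encodes only the two pairings listed in the README.
# It is not an API of ctranslate2 or hf-hub-ctranslate2.
README_COMPUTE_TYPES = {
    "cuda": "int8_float16",  # README: compute_type=int8_float16 for device="cuda"
    "cpu": "int8",           # README: compute_type=int8 for device="cpu"
}


def pick_compute_type(device: str) -> str:
    """Return the compute_type the README recommends for the given device."""
    try:
        return README_COMPUTE_TYPES[device]
    except KeyError:
        raise ValueError(
            f"the README gives no recommendation for device {device!r}"
        ) from None


if __name__ == "__main__":
    for device in ("cuda", "cpu"):
        print(device, "->", pick_compute_type(device))
```

The returned string could then be passed as the `compute_type` argument wherever the README example loads the model. Other devices raise an error instead of guessing, because the README documents only these two.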
model.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:8e02198a1a1480129f35fede1751d0406a43e5ea8e7abb618ac58285e974cd6e
+size 45430860
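The `model.bin` entry is a Git LFS pointer: it records the SHA-256 digest and byte size of the real weights file. A minimal sketch, using only the Python standard library, of how a downloaded `model.bin` could be checked against the new pointer values shown in this diff. The function names `parse_lfs_pointer` and `matches_pointer` and the local path `model.bin` are assumptions made for illustration.

```python
import hashlib
from pathlib import Path

# New pointer contents, copied from the model.bin hunk of this commit.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:8e02198a1a1480129f35fede1751d0406a43e5ea8e7abb618ac58285e974cd6e
size 45430860
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split 'key value' lines of an LFS pointer into a small dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer: dict) -> bool:
    """Check a local file's size and hash against a parsed pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first, avoids hashing a wrong-sized file
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    pointer = parse_lfs_pointer(POINTER_TEXT)
    local_file = Path("model.bin")  # assumed download location
    if local_file.exists():
        print("model.bin matches pointer:", matches_pointer(local_file, pointer))
    else:
        print("model.bin not found; expected size is", pointer["size"], "bytes")
```

Comparing the size before hashing rejects truncated downloads without reading the whole file.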