Commit a883b02
Parent(s): f47e3b5
update readme for instructions for usage with infinity (#39)
- update readme for instructions for usage with infinity (a70a1b22455b13f27202aecaa41dc8fc7f3f46df)
Co-authored-by: Michael <michaelfeil@users.noreply.huggingface.co>
README.md CHANGED
@@ -5622,6 +5622,18 @@ scores = (embeddings[:2] @ embeddings[2:].T) * 100
 print(scores.tolist())
 ```

+## Infinity_emb
+
+Usage via [infinity](https://github.com/michaelfeil/infinity), a MIT Licensed inference server.
+
+```
+# requires ~16-32GB VRAM NVIDIA Compute Capability >= 8.0
+docker run \
+-v $PWD/data:/app/.cache --gpus "0" -p "7997":"7997" \
+michaelf34/infinity:0.0.68-trt-onnx \
+v2 --model-id Alibaba-NLP/gte-Qwen2-7B-instruct --revision "refs/pr/38" --dtype bfloat16 --batch-size 8 --device cuda --engine torch --port 7997 --no-bettertransformer
+```
+
 ## Evaluation

 ### MTEB & C-MTEB
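
The added README section starts the server but does not show a client call. As a minimal sketch, assuming the container is reachable at `localhost:7997` (mirroring the `-p "7997":"7997"` mapping) and that the server exposes an OpenAI-style `/embeddings` route returning a `data` list of `embedding` entries, a request could be built like this. The route, payload shape, and the helper names `build_embeddings_request`, `parse_embeddings`, and `embed` are illustrative assumptions, not part of this commit.

```python
import json
import urllib.request

# Assumptions: host and port mirror the docker port mapping in the diff;
# the /embeddings route and JSON shapes are assumed, not taken from the commit.
BASE_URL = "http://localhost:7997"
MODEL_ID = "Alibaba-NLP/gte-Qwen2-7B-instruct"


def build_embeddings_request(texts, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a POST request for the assumed /embeddings route."""
    payload = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embeddings(response_body):
    """Extract embedding vectors from an OpenAI-style response body (assumed shape)."""
    parsed = json.loads(response_body)
    return [item["embedding"] for item in parsed["data"]]


def embed(texts):
    """Send the request; this only works while the container is running."""
    request = build_embeddings_request(texts)
    with urllib.request.urlopen(request, timeout=60) as response:
        return parse_embeddings(response.read())
```

Building the request separately from sending it keeps the payload inspectable without a running GPU container; `embed(["how much protein should a female eat"])` would perform the actual call once the server is up.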