michaelfeil commited on
Commit
c36c8d2
1 Parent(s): 992e13d

Update instructions for usage with infinity


```
docker run --gpus all -v $PWD/data:/app/.cache -e HF_TOKEN=$HF_TOKEN -p "7995":"7997" \
  michaelf34/infinity:0.0.68 \
  v2 --model-id BAAI/bge-multilingual-gemma2 --revision "main" --dtype bfloat16 --batch-size 4 \
  --device cuda --engine torch --port 7997 --no-bettertransformer
INFO: Started server process [1]
INFO: Waiting for application startup.
INFO 2024-11-13 00:08:17,113 infinity_emb INFO: infinity_server.py:89
Creating 1 engines:
engines=['BAAI/bge-multilingual-gemma2']
INFO 2024-11-13 00:08:17,117 infinity_emb INFO: Anonymized telemetry.py:30
telemetry can be disabled via environment variable
`DO_NOT_TRACK=1`.
INFO 2024-11-13 00:08:17,124 infinity_emb INFO: select_model.py:64
model=`BAAI/bge-multilingual-gemma2` selected, using
engine=`torch` and device=`cuda`
INFO 2024-11-13 00:08:17,241 SentenceTransformer.py:216
sentence_transformers.SentenceTransformer
INFO: Load pretrained SentenceTransformer:
BAAI/bge-multilingual-gemma2
INFO 2024-11-13 00:08:26,938 SentenceTransformer.py:355
sentence_transformers.SentenceTransformer
INFO: 1 prompts are loaded, with the keys:
['web_search_query']
INFO 2024-11-13 00:08:29,054 infinity_emb INFO: Getting select_model.py:97
timings for batch_size=4 and avg tokens per
sentence=2
0.49 ms tokenization
36.43 ms inference
0.09 ms post-processing
37.01 ms total
embeddings/sec: 108.08
```
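
Once the container is up, Infinity serves an OpenAI-compatible REST API on the mapped host port (7995 in the command above). As a minimal sketch, assuming Infinity's default route layout (`/embeddings` for embedding requests and `/models` for listing loaded models), a request could look like this; adjust the port if you change the `-p` mapping:

```bash
# Sketch of a request against the running container, assuming Infinity's
# default OpenAI-compatible routes. Host port 7995 matches the -p mapping
# in the docker command above; change it if you use a different mapping.
curl -s http://localhost:7995/embeddings \
  -H "Content-Type: application/json" \
  -d '{
        "model": "BAAI/bge-multilingual-gemma2",
        "input": ["Which planet is known as the Red Planet?"]
      }'

# List the models the server has loaded (handy as a quick smoke test).
curl -s http://localhost:7995/models
```

The response should follow the OpenAI embeddings schema, with the vectors under `data[*].embedding`.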

Files changed (1)
  1. README.md +8 -0
README.md CHANGED
@@ -8924,6 +8924,14 @@ print(scores.tolist())
  # [[55.92064666748047, 1.6549524068832397], [-0.2698777914047241, 49.95653533935547]]
  ```
 
+ ### Usage with infinity
+
+ Instructions for usage with [Infinity](https://github.com/michaelfeil/infinity)
+ ```bash
+ docker run --gpus all -v $PWD/data:/app/.cache -e HF_TOKEN=$HF_TOKEN -p "7997":"7997" \
+ michaelf34/infinity:0.0.68 \
+ v2 --model-id BAAI/bge-multilingual-gemma2 --revision "b67c20b19cabf41f74b2aa1469047dade1f42738" --dtype bfloat16 --batch-size 4 --device cuda --engine torch --port 7997 --no-bettertransformer
+ ```
 
  ## Evaluation