setting max model length to a reasonable number (max_pos_encodings), e.g. 8192

#11
Files changed (2)
  1. README.md +10 -0
  2. tokenizer_config.json +1 -1
README.md CHANGED
@@ -172,6 +172,16 @@ with torch.no_grad():
  print(scores)
  ```
 
+ ## Infinity
+
+ For an OpenAI API-compatible local deployment, use [Infinity](https://github.com/michaelfeil/infinity):
+
+ ```
+ docker run -it --gpus all -v $volume:/app/.cache -p 7997:7997 \
+ michaelf34/infinity:0.0.70 \
+ v2 --model-id BAAI/bge-reranker-v2.5-gemma2-lightweight --device cuda --no-bettertransformer
+ ```
+
  ## Load model in local
 
  1. make sure `gemma_config.py` and `gemma_model.py` from [BAAI/bge-reranker-v2.5-gemma2-lightweight](https://huggingface.co/BAAI/bge-reranker-v2.5-gemma2-lightweight/tree/main) in your local path.
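Once the container is up, the server listens on port 7997 and exposes a rerank endpoint. The snippet below is a minimal client sketch, not part of this PR: it assumes the default `/rerank` route and localhost, and the `results`/`relevance_score` response fields of Infinity's Cohere-style rerank API; verify both against your Infinity version.

```python
# Minimal client sketch for the Infinity rerank endpoint.
# Assumes the container above is running on localhost:7997 and that the
# route/field names match Infinity's rerank API (check your version).
import requests

resp = requests.post(
    "http://localhost:7997/rerank",
    json={
        "model": "BAAI/bge-reranker-v2.5-gemma2-lightweight",
        "query": "what is a panda?",
        "documents": [
            "hi",
            "The giant panda is a bear species endemic to China.",
        ],
    },
    timeout=60,
)
resp.raise_for_status()
# Each result carries the index of the input document and its relevance score.
for item in resp.json()["results"]:
    print(item["index"], item["relevance_score"])
```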
tokenizer_config.json CHANGED
@@ -1746,7 +1746,7 @@
  "bos_token": "<bos>",
  "clean_up_tokenization_spaces": false,
  "eos_token": "<eos>",
- "model_max_length": 1000000000000000019884624838656,
+ "model_max_length": 8192,
  "pad_token": "<pad>",
  "sp_model_kwargs": {},
  "spaces_between_special_tokens": false,