Update README.md
README.md CHANGED
@@ -2604,34 +2604,15 @@ model-index:
       value: 78.25741142443962
 ---
 
-
-
-<p align="center">
-<img src="https://console.llmrails.com/assets/img/logo-black.svg" width="150px">
-</p>
+<h1 align="center">ember-v1</h1>
 
 This model has been trained on an extensive corpus of text pairs that encompass a broad spectrum of domains, including finance, science, medicine, law, and various others. During the training process, we incorporated techniques derived from the [RetroMAE](https://arxiv.org/abs/2205.12035) and [SetFit](https://arxiv.org/abs/2209.11055) research papers.
 
-We are pleased to offer this model as an API service through our platform, [LLMRails](https://llmrails.com/?ref=ember-v1). If you are interested, please don't hesitate to sign up.
-
 ### Plans
 - The research paper will be published soon.
 - The v2 of the model is currently in development and will feature an extended maximum sequence length of 4,000 tokens.
 
 ## Usage
-Use with API request:
-```bash
-curl --location 'https://api.llmrails.com/v1/embeddings' \
---header 'X-API-KEY: {token}' \
---header 'Content-Type: application/json' \
---data '{
-    "input": ["This is an example sentence"],
-    "model": "embedding-english-v1" # same as ember-v1
-}'
-```
-API docs: https://docs.llmrails.com/embedding/embed-text<br>
-Langchain plugin: https://python.langchain.com/docs/integrations/text_embedding/llm_rails
-
 Use with transformers:
 ```python
 import torch.nn.functional as F
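The hunk is cut off just after the snippet's first import, so the rest of the transformers example is not visible in this diff. For context, here is a minimal sketch of how an e5-style mean-pooling usage snippet for an embedding model like this one typically continues; the hub id `llmrails/ember-v1` and the `average_pool` helper are illustrative assumptions, not lines taken from this diff:

```python
# Sketch only: assumes the model is published as "llmrails/ember-v1"
# and uses mean pooling over non-padding tokens, as e5-style models do.
import torch.nn.functional as F
from torch import Tensor
from transformers import AutoModel, AutoTokenizer

def average_pool(last_hidden_states: Tensor, attention_mask: Tensor) -> Tensor:
    # Zero out padding positions, then average the remaining token vectors.
    last_hidden = last_hidden_states.masked_fill(~attention_mask[..., None].bool(), 0.0)
    return last_hidden.sum(dim=1) / attention_mask.sum(dim=1)[..., None]

texts = [
    "This is an example sentence",
    "Each input sentence becomes one fixed-size vector",
]

tokenizer = AutoTokenizer.from_pretrained("llmrails/ember-v1")  # assumed hub id
model = AutoModel.from_pretrained("llmrails/ember-v1")          # assumed hub id

# Inputs longer than 512 tokens are cut off, matching the card's note below.
batch = tokenizer(texts, max_length=512, padding=True, truncation=True, return_tensors="pt")
outputs = model(**batch)
embeddings = average_pool(outputs.last_hidden_state, batch["attention_mask"])

# L2-normalize so that dot products equal cosine similarities.
embeddings = F.normalize(embeddings, p=2, dim=1)
print(embeddings @ embeddings.T)
```

With normalized embeddings, the final matrix product is a cosine-similarity matrix, which is the usual way such models are scored for retrieval.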
@@ -2692,4 +2673,15 @@ Our model achieve state-of-the-art performance on [MTEB leaderboard](https://hug
 
 This model exclusively caters to English texts, and any lengthy texts will be truncated to a maximum of 512 tokens.
 
-
+## License
+MIT
+
+## Citation
+
+```bibtex
+@misc{nur2023emberv1,
+  title={ember-v1: SOTA embedding model},
+  author={Enrike Nur and Anar Aliyev},
+  year={2023},
+}
+```
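The context line in this hunk repeats the card's note that inputs beyond 512 tokens are truncated. A quick sketch of what that cap means in practice; the hub id `llmrails/ember-v1` is again an assumption, and note that with plain `transformers` the truncation must be requested explicitly at tokenization time:

```python
# Sketch only: demonstrates the 512-token cap mentioned above,
# assuming the tokenizer ships under the hub id "llmrails/ember-v1".
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("llmrails/ember-v1")

long_text = "embedding " * 2000  # far beyond the model's window
ids = tokenizer(long_text, max_length=512, truncation=True)["input_ids"]

# Everything past the 512th token is dropped before encoding.
print(len(ids))  # 512
```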