---
license: mit
---

This classification model is based on [sberbank-ai/ruRoberta-large](https://huggingface.co/sberbank-ai/ruRoberta-large).

It scores the relevance and specificity of the last message in a dialog, given the preceding context.

It was pretrained on a corpus of dialog data from social networks and fine-tuned on [tinkoff-ai/context_similarity](https://huggingface.co/tinkoff-ai/context_similarity).

Performance on the validation split of [tinkoff-ai/context_similarity](https://huggingface.co/tinkoff-ai/context_similarity), using the thresholds that performed best on the validation samples:

<table>
<thead>
<tr>
<td colspan="2">relevance</td>
<td colspan="2">specificity</td>
</tr>
</thead>
<tbody>
<tr>
<td>f0.5</td>
<td>roc-auc</td>
<td>f0.5</td>
<td>roc-auc</td>
</tr>
<tr>
<td>0.86</td>
<td>0.83</td>
<td>0.85</td>
<td>0.86</td>
</tr>
</tbody>
</table>
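
The card does not say how the thresholds were selected. As a rough illustration of what "best thresholds" for f0.5 can mean, the sketch below assumes a plain grid search that maximizes the F-beta score with beta = 0.5 (which weights precision more heavily than recall). The helpers `f_beta`, `f_beta_at_threshold`, and `best_threshold` are hypothetical names, and the scores and labels are toy values, not outputs of this model.

```python
def f_beta(precision, recall, beta=0.5):
    """F-beta score; beta < 1 weights precision more heavily than recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def f_beta_at_threshold(scores, labels, threshold, beta=0.5):
    """Binarize scores at `threshold` and compute F-beta against 0/1 labels."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return f_beta(precision, recall, beta)


def best_threshold(scores, labels, beta=0.5, steps=99):
    """Grid-search evenly spaced thresholds in (0, 1); return (threshold, score)."""
    candidates = [i / (steps + 1) for i in range(1, steps + 1)]
    return max(
        ((t, f_beta_at_threshold(scores, labels, t, beta)) for t in candidates),
        key=lambda pair: pair[1],
    )


# Toy data for illustration only (not real model outputs).
toy_scores = [0.95, 0.80, 0.60, 0.40, 0.20]
toy_labels = [1, 1, 0, 1, 0]
threshold, score = best_threshold(toy_scores, toy_labels)
```

Under this assumption the search would be run once per label, giving one threshold for relevance and another for specificity. Thresholds tuned on the validation split tend to give optimistic scores on that same split.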

The model can be loaded as follows:

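```python
# pip install transformers torch
from transformers import AutoTokenizer, AutoModel

# Download the tokenizer and weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("tinkoff-ai/context_similarity")
# Note: AutoModel returns the bare encoder, without a task-specific head.
model = AutoModel.from_pretrained("tinkoff-ai/context_similarity")
# model.cuda()  # optional: move the model to a GPU
```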
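
The card does not show how to turn model outputs into relevance and specificity scores, so the following is only a sketch under stated assumptions: that the checkpoint carries a two-logit classification head loadable with `AutoModelForSequenceClassification`, that the logits are ordered (relevance, specificity), and that each label is scored independently with a sigmoid. The helpers `logits_to_scores` and `score_dialog`, and the `max_length=128` limit, are hypothetical. How the dialog context and the last message are serialized into `text` is not specified in the card, so that is left to the caller.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def logits_to_scores(logits):
    """Map two raw logits to independent probabilities.

    Assumption: one logit per label, ordered (relevance, specificity),
    scored independently (sigmoid rather than softmax).
    """
    relevance_logit, specificity_logit = logits
    return {
        "relevance": sigmoid(relevance_logit),
        "specificity": sigmoid(specificity_logit),
    }


def score_dialog(text, model_name="tinkoff-ai/context_similarity"):
    """Hypothetical end-to-end helper; needs torch and network access."""
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    # max_length=128 is an assumed limit, not a value taken from the card.
    inputs = tokenizer(text, truncation=True, max_length=128, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return logits_to_scores(logits)
```

If the head turns out to emit a different number of logits, the unpacking in `logits_to_scores` raises a `ValueError`, which makes a wrong assumption visible early. Each resulting probability would then be compared against its own per-label threshold.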