This is a large Longformer model designed for Russian language. It was initialized from ai-forever/ruRoberta-large weights and has been modified to support a context length of up to 4096 tokens. We fine-tuned it on a dataset of Russian books. For a detailed information check out our post on Habr.
Model attributes:
- 16 attention heads
- 24 hidden layers
- 4096 tokens length of context
The model can be used as-is to produce text embeddings or it can be further fine-tuned for a specific downstream task.
Text embeddings can be produced as follows:
# pip install transformers sentencepiece
import torch
from transformers import LongformerForMaskedLM, LongformerTokenizerFast
model = LongformerModel.from_pretrained('kazzand/ru-longformer-large-4096')
tokenizer = LongformerTokenizerFast.from_pretrained('kazzand/ru-longformer-large-4096')
def get_cls_embedding(text, model, tokenizer, device='cuda'):
model.to(device)
batch = tokenizer(text, return_tensors='pt')
#set global attention for cls token
global_attention_mask = [
[1 if token_id == tokenizer.cls_token_id else 0 for token_id in input_ids]
for input_ids in batch["input_ids"]
]
#add global attention mask to batch
batch["global_attention_mask"] = torch.tensor(global_attention_mask)
with torch.no_grad():
output = model(**batch.to(device))
return output.last_hidden_state[:,0,:]
- Downloads last month
- 119
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.