# Core implementation of Jina XLM-RoBERTa

This implementation is adapted from [XLM-RoBERTa](https://huggingface.co/docs/transformers/en/model_doc/xlm-roberta). Unlike the original implementation, this model uses rotary positional encodings (RoPE) and supports FlashAttention-2.
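
To illustrate what rotary positional encodings do, here is a minimal pure-Python sketch. It is not the code used in this repository: the function name `rotary_embed`, the base of 10000, and the pairing of adjacent dimensions are assumptions chosen for readability, and the actual implementation may pair dimensions differently and operate on batched tensors. The sketch rotates each pair of dimensions by an angle proportional to the token position, so the dot product between a rotated query and a rotated key depends only on their relative offset.

```python
import math


def rotary_embed(vec, position, base=10000.0):
    """Rotate each pair (vec[i], vec[i+1]) by position * base**(-i / d).

    Hypothetical helper for illustration; assumes an even head dimension
    and an adjacent-pair layout.
    """
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("head dimension must be even")
    out = []
    for i in range(0, d, 2):
        # Lower dimensions rotate quickly, higher dimensions slowly.
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


# Toy query and key vectors with head dimension 4.
q = [0.3, -1.2, 0.7, 0.5]
k = [1.0, 0.4, -0.6, 0.9]

# Both pairs of positions are 3 tokens apart, so the scores should match
# up to floating-point error.
score_a = dot(rotary_embed(q, 5), rotary_embed(k, 2))
score_b = dot(rotary_embed(q, 13), rotary_embed(k, 10))
print(abs(score_a - score_b) < 1e-9)
```

Because the position enters only through these rotations of queries and keys, no learned absolute position embedding table is needed in this scheme.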

### Models that use this implementation

To be added soon.


### Converting weights

Weights from an [original XLM-RoBERTa model](https://huggingface.co/FacebookAI/xlm-roberta-large) can be converted with the `convert_roberta_weights_to_flash.py` script in the model repository.