File size: 1,940 Bytes
f890be9 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 |
---
language: sw
license: mit
---
# roberta-base-wechsel-swahili
Model trained with WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.
See the code here: https://github.com/CPJKU/wechsel
And the paper here: https://arxiv.org/abs/2112.06598
## Performance
### RoBERTa
| Model | NLI Score | NER Score | Avg Score |
|---|---|---|---|
| `roberta-base-wechsel-french` | **82.43** | **90.88** | **86.65** |
| `camembert-base` | 80.88 | 90.26 | 85.57 |
| Model | NLI Score | NER Score | Avg Score |
|---|---|---|---|
| `roberta-base-wechsel-german` | **81.79** | **89.72** | **85.76** |
| `deepset/gbert-base` | 78.64 | 89.46 | 84.05 |
| Model | NLI Score | NER Score | Avg Score |
|---|---|---|---|
| `roberta-base-wechsel-chinese` | **78.32** | 80.55 | **79.44** |
| `bert-base-chinese` | 76.55 | **82.05** | 79.30 |
| Model | NLI Score | NER Score | Avg Score |
|---|---|---|---|
| `roberta-base-wechsel-swahili` | **75.05** | **87.39** | **81.22** |
| `xlm-roberta-base` | 69.18 | 87.37 | 78.28 |
### GPT2
| Model | PPL |
|---|---|
| `gpt2-wechsel-french` | **19.71** |
| `gpt2` (retrained from scratch) | 20.47 |
| Model | PPL |
|---|---|
| `gpt2-wechsel-german` | **26.8** |
| `gpt2` (retrained from scratch) | 27.63 |
| Model | PPL |
|---|---|
| `gpt2-wechsel-chinese` | **51.97** |
| `gpt2` (retrained from scratch) | 52.98 |
| Model | PPL |
|---|---|
| `gpt2-wechsel-swahili` | **10.14** |
| `gpt2` (retrained from scratch) | 10.58 |
See our paper for details.
## Citation
Please cite WECHSEL as
```
@misc{minixhofer2021wechsel,
title={WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models},
author={Benjamin Minixhofer and Fabian Paischer and Navid Rekabsaz},
year={2021},
eprint={2112.06598},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
|