---
license: cc-by-nc-4.0
datasets:
- Babelscape/wikineural
language:
- de
- fr
- it
- rm
- multilingual
inference: false
tags:
  - named-entity-recognition
---

This model is [SwissBERT](https://huggingface.co/ZurichNLP/swissbert) fine-tuned on the [WikiNEuRal](https://huggingface.co/datasets/Babelscape/wikineural) dataset for multilingual named entity recognition (NER).

It supports German, French, and Italian as supervised languages, and Romansh Grischun as a zero-shot language.

## Usage

```python
from transformers import pipeline

# "simple" aggregation merges sub-word tokens into whole entity spans.
token_classifier = pipeline(
    "token-classification",
    model="ZurichNLP/swissbert-ner",
    aggregation_strategy="simple",
)
```

### German example
```python
token_classifier.model.set_default_language("de_CH")
token_classifier("Mein Name sei Gantenbein.")
```
Output:
```
[{'entity_group': 'PER',
  'score': 0.5002625,
  'word': 'Gantenbein',
  'start': 13,
  'end': 24}]
```
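
Note that `start` in the output above is 13, which is the index of the space before "Gantenbein", so slicing the input with the raw offsets yields `" Gantenbein"`. If exact character spans are needed, the offsets can be tightened after the fact. The following is a minimal sketch; `trim_span` is a hypothetical helper written for illustration, not part of `transformers`.

```python
def trim_span(text, entity):
    """Return a copy of `entity` whose start/end exclude surrounding whitespace.

    `entity` is one dict from the pipeline output; `trim_span` is a
    hypothetical helper for illustration, not a transformers API.
    """
    start, end = entity["start"], entity["end"]
    # Move the start forward past leading whitespace.
    while start < end and text[start].isspace():
        start += 1
    # Move the end backward past trailing whitespace.
    while end > start and text[end - 1].isspace():
        end -= 1
    return {**entity, "start": start, "end": end}


text = "Mein Name sei Gantenbein."
entity = {"entity_group": "PER", "score": 0.5002625,
          "word": "Gantenbein", "start": 13, "end": 24}
trimmed = trim_span(text, entity)
print(text[trimmed["start"]:trimmed["end"]])  # Gantenbein
```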

### French example
```python
token_classifier.model.set_default_language("fr_CH")
token_classifier("J'habite à Lausanne.")
```
Output:
```
[{'entity_group': 'LOC',
  'score': 0.99955386,
  'word': 'Lausanne',
  'start': 10,
  'end': 19}]
```
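
Both examples follow the same pattern: select the language on the model, then call the pipeline. A small wrapper can make that step explicit and harder to forget. This is only a sketch: `tag_entities` and `LANGUAGE_CODES` are hypothetical names, and the codes `it_CH` and `rm_CH` are assumed by analogy with `de_CH` and `fr_CH` above, so they should be checked against the SwissBERT model card before use.

```python
# "de_CH" and "fr_CH" appear in the examples above;
# "it_CH" and "rm_CH" are assumed to follow the same pattern.
LANGUAGE_CODES = {"de": "de_CH", "fr": "fr_CH", "it": "it_CH", "rm": "rm_CH"}


def tag_entities(classifier, text, language):
    """Select the language on the model, then run the pipeline on `text`.

    `classifier` is a token-classification pipeline such as the
    `token_classifier` created in the Usage section.
    """
    if language not in LANGUAGE_CODES:
        raise ValueError(f"Unsupported language: {language!r}")
    classifier.model.set_default_language(LANGUAGE_CODES[language])
    return classifier(text)
```

With the `token_classifier` from the Usage section, `tag_entities(token_classifier, "J'habite à Lausanne.", "fr")` should be equivalent to the French example above.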

## Citation
```bibtex
@article{vamvas-etal-2023-swissbert,
      title={Swiss{BERT}: The Multilingual Language Model for Switzerland}, 
      author={Jannis Vamvas and Johannes Gra\"en and Rico Sennrich},
      year={2023},
      eprint={2303.13310},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2303.13310}
}
```