---
license: agpl-3.0
language:
- de
base_model:
- deepset/gbert-base
pipeline_tag: token-classification
---
# MEDNER.DE: Medicinal Product Entity Recognition in German-Specific Contexts
Released in December 2024, MEDNER.DE is a German BERT language model obtained by further pretraining `deepset/gbert-base` on a corpus of pharmacovigilance-related case summaries. The model was then fine-tuned for Named Entity Recognition (NER) on an automatically annotated dataset to recognize medicinal products such as medications and vaccines.
In our paper, we outline the steps taken to train this model and compare its performance with previous approaches, which it outperforms.
---
## Overview
- **Paper**: [https://...
- **Architecture**: MLM-based BERT Base
- **Language**: German
- **Supported Labels**: Medicinal Product
- **Model Name**: MEDNER.DE
---
## How to Use
### Use a pipeline as a high-level helper
```python
from transformers import pipeline
# Load the NER pipeline; aggregation_strategy="simple" groups subword tokens into entity spans
ner_pipeline = pipeline("ner", model="pei-germany/MEDNER-de-fp-gbert", aggregation_strategy="simple")
# Input text
text = "Der Patient bekam den COVID-Impfstoff und nahm danach Aspirin."
# Get predictions
predictions = ner_pipeline(text)
print(predictions)
```
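
With `aggregation_strategy="simple"`, the pipeline returns a list of dictionaries with the keys `entity_group`, `score`, `word`, `start`, and `end`. The following is a minimal sketch of one way to post-process that output, assuming you only want confident predictions. The helper name `extract_entities`, the `0.5` threshold, the label string `MEDICINAL_PRODUCT`, and the sample predictions are hypothetical and are not real output of this model.

```python
def extract_entities(text, predictions, min_score=0.5):
    """Return (surface_text, label, score) for predictions at or above min_score."""
    entities = []
    for pred in predictions:
        if pred["score"] < min_score:
            continue
        # Slice the original text so the surface form keeps its exact spelling
        surface = text[pred["start"]:pred["end"]]
        entities.append((surface, pred["entity_group"], round(float(pred["score"]), 3)))
    return entities


text = "Der Patient bekam den COVID-Impfstoff und nahm danach Aspirin."

# Hypothetical predictions in the pipeline's output format (not real model output)
sample_predictions = [
    {"entity_group": "MEDICINAL_PRODUCT", "score": 0.98, "word": "COVID - Impfstoff", "start": 22, "end": 37},
    {"entity_group": "MEDICINAL_PRODUCT", "score": 0.31, "word": "Aspirin", "start": 54, "end": 61},
]

print(extract_entities(text, sample_predictions))
```

Slicing `text` with `start` and `end` instead of reading `word` avoids the spacing that the tokenizer may introduce around hyphens or subword pieces.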
### Load model directly
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("pei-germany/MEDNER-de-fp-gbert")
model = AutoModelForTokenClassification.from_pretrained("pei-germany/MEDNER-de-fp-gbert")

text = "Der Patient bekam den COVID-Impfstoff und nahm danach Aspirin."
inputs = tokenizer(text, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Pick the highest-scoring label per token (argmax over logits; softmax would not change the result)
label_ids = torch.argmax(outputs.logits, dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])

# Map predictions to labels, skipping special tokens such as [CLS] and [SEP]
predictions = [
    (token, model.config.id2label[label_id.item()])
    for token, label_id in zip(tokens, label_ids)
    if token not in tokenizer.all_special_tokens
]
print(predictions)
```
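
The direct-loading example prints one label per subword token, so a word split by the tokenizer appears as several pieces. The sketch below shows one possible way to merge those pieces back into words, assuming the tokenizer marks continuation pieces with a `##` prefix, as BERT WordPiece tokenizers do. The helper `merge_wordpieces`, the sample tokenization, and the label names are hypothetical; check `model.config.id2label` for the labels this model actually uses.

```python
def merge_wordpieces(token_label_pairs):
    """Merge WordPiece continuation tokens ("##...") into whole words.

    Each merged word keeps the label predicted for its first piece.
    """
    words = []
    for token, label in token_label_pairs:
        if token.startswith("##") and words:
            # Append the continuation piece (without "##") to the previous word
            prev_word, prev_label = words[-1]
            words[-1] = (prev_word + token[2:], prev_label)
        else:
            words.append((token, label))
    return words


# Hypothetical (token, label) pairs; the real tokenization and labels may differ
pairs = [
    ("nahm", "O"),
    ("danach", "O"),
    ("As", "B-MED"),
    ("##pir", "I-MED"),
    ("##in", "I-MED"),
    (".", "O"),
]

print(merge_wordpieces(pairs))
```

Taking the label of the first piece is a common convention for word-level NER output, but it is a design choice. Majority voting over the pieces is an alternative.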
---
## Authors
Farnaz Zeidi, Manuela Messelhäußer, Roman Christof, Xing David Wang, Ulf Leser, Dirk Mentzer, Renate König, Liam Childs.
---
## License
This model is shared under the [GNU Affero General Public License v3.0](https://choosealicense.com/licenses/agpl-3.0/).