ClinicalNER
Model Description
This is a multilingual clinical NER model that extracts DRUG, STRENGTH, FREQUENCY, DURATION, DOSAGE and FORM entities from medical text.
It consists of XLM-R Base fine-tuned on n2c2 (English). It is the model that obtains the best results on MedNERF, our French evaluation test set, in a zero-shot cross-lingual transfer setting.
Evaluation Metrics on MedNERF dataset
- Loss: 0.692
- Accuracy: 0.859
- Precision: 0.817
- Recall: 0.791
- micro-F1: 0.804
- macro-F1: 0.819
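The two F1 figures above aggregate differently: micro-F1 pools counts over all entity types before computing precision and recall, while macro-F1 averages the per-type F1 scores, so rare types weigh as much as frequent ones. The sketch below illustrates the difference and checks that the reported precision and recall are consistent with the reported micro-F1, assuming both are micro-averaged. The per-type counts are hypothetical and chosen only for illustration; they are not MedNERF results.

```python
from typing import Dict, Tuple


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return f1_score(precision, recall)


# Precision and recall as reported in the list above.
reported_f1 = f1_score(0.817, 0.791)

# Hypothetical (tp, fp, fn) counts per entity type, for illustration only.
counts: Dict[str, Tuple[int, int, int]] = {
    "DRUG": (90, 10, 10),
    "DURATION": (5, 5, 5),
}

# Micro: sum the counts over all types, then compute a single F1.
total_tp = sum(tp for tp, _, _ in counts.values())
total_fp = sum(fp for _, fp, _ in counts.values())
total_fn = sum(fn for _, _, fn in counts.values())
micro_f1 = f1_from_counts(total_tp, total_fp, total_fn)

# Macro: compute one F1 per type, then take the unweighted mean.
macro_f1 = sum(f1_from_counts(*c) for c in counts.values()) / len(counts)

print(f"F1 from reported precision/recall: {reported_f1:.3f}")
print(f"micro-F1 (hypothetical counts): {micro_f1:.3f}")
print(f"macro-F1 (hypothetical counts): {macro_f1:.3f}")
```

In this toy example the frequent DRUG type dominates the micro average, while the weaker, rarer DURATION type pulls the macro average down.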
Usage
from transformers import AutoModelForTokenClassification, AutoTokenizer
model = AutoModelForTokenClassification.from_pretrained("Posos/ClinicalNER")
tokenizer = AutoTokenizer.from_pretrained("Posos/ClinicalNER")
inputs = tokenizer("Take 2 pills every morning", return_tensors="pt")
outputs = model(**inputs)
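The snippet above stops at the raw model outputs. To obtain entities, one would typically take the argmax of outputs.logits over the label dimension, map each index to a label name through model.config.id2label, and merge consecutive tokens that belong to the same entity. The following is a minimal sketch of that last merging step only, assuming the labels follow a BIO scheme such as B-DRUG / I-DRUG / O (an assumption; check model.config.id2label for the actual label set). The helper group_bio_tags and the example labels are hypothetical, written by hand rather than produced by the model.

```python
from typing import List, Tuple


def group_bio_tags(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Merge per-token BIO labels into (entity_type, text) spans."""
    entities: List[Tuple[str, str]] = []
    current_type = None
    current_tokens: List[str] = []
    for token, label in zip(tokens, labels):
        # "B-DRUG" -> ("B", "DRUG"); "O" -> ("O", "").
        prefix, _, entity_type = label.partition("-")
        if prefix == "I" and entity_type == current_type:
            # Continuation of the span that is currently open.
            current_tokens.append(token)
            continue
        # Any other label closes the open span, if there is one.
        if current_type is not None:
            entities.append((current_type, " ".join(current_tokens)))
        if prefix in ("B", "I"):
            # A stray "I-" tag with no matching open span starts a new span.
            current_type, current_tokens = entity_type, [token]
        else:
            current_type, current_tokens = None, []
    if current_type is not None:
        entities.append((current_type, " ".join(current_tokens)))
    return entities


# Hypothetical word-level labels for illustration; not actual model output.
tokens = ["Take", "2", "pills", "every", "morning"]
labels = ["O", "B-DOSAGE", "B-FORM", "B-FREQUENCY", "I-FREQUENCY"]
entities = group_bio_tags(tokens, labels)
print(entities)
```

Note that the tokenizer produces subword pieces, so in practice the predicted labels would first need to be aligned back to words (for example via the tokenizer's offset mapping) before grouping.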
Citation information
@inproceedings{mednerf,
title = "Multilingual Clinical NER: Translation or Cross-lingual Transfer?",
author = "Gaschi, Félix and Fontaine, Xavier and Rastin, Parisa and Toussaint, Yannick",
booktitle = "Proceedings of the 5th Clinical Natural Language Processing Workshop",
publisher = "Association for Computational Linguistics",
year = "2023"
}