---
tags:
- spacy
- arxiv:2408.06930
- medical
language:
- nl
license: gpl-3.0
model-index:
- name: Echocardiogram_Multimodel_bespoke
  results:
  - task:
      type: text-classification
    dataset:
      type: test
      name: internal test set
    metrics:
    - name: Macro f1
      type: f1
      value: 0.922
      verified: false
    - name: Macro precision
      type: precision
      value: 0.931
      verified: false
    - name: Macro recall
      type: recall
      value: 0.915
      verified: false
pipeline_tag: text-classification
metrics:
- f1
- precision
- recall
---
# Description
This model is a [MedRoBERTa.nl](https://huggingface.co/CLTL/MedRoBERTa.nl) model fine-tuned on Dutch echocardiogram reports sourced from Electronic Health Records.
The associated publication, which covers both span- and document-level classification, can be found at https://arxiv.org/abs/2408.06930.
The config file for training the model can be found at https://github.com/umcu/echolabeler.
# Minimum working example
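```python
from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub
le_pipe = pipeline("text-classification", model="UMCU/Echocardiogram_Multimodel_bespoke")

# Replace the placeholder with the text of a Dutch echocardiogram report
document = "Lorem ipsum"
results = le_pipe(document)
```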
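The `text-classification` pipeline returns a list of dictionaries with `label` and `score` keys. The following is a minimal sketch of reading that output. It uses a hand-written stand-in for `results` so that it runs without downloading the model; the score shown is hypothetical, not actual model output, and `top_prediction` is an illustrative helper rather than part of any library.

```python
# Hand-written stand-in for the pipeline output; the score is hypothetical.
results = [{"label": "pe_Present", "score": 0.97}]


def top_prediction(results):
    """Return (label, score) for the highest-scoring prediction."""
    best = max(results, key=lambda r: r["score"])
    return best["label"], best["score"]


label, score = top_prediction(results)
print(f"{label}: {score:.2f}")
```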
# Label Scheme
<details>
<summary>View label scheme</summary>
| Component | Labels |
| --- | --- |
| **`bespoke`** | `pe_Present`, `rv_dil_Present`, `wma_Present`, `lv_dil_Present`, `aortic_valve_native_stenosis_Present`, `mitral_valve_native_regurgitation_Present`, `lv_sys_func_Present`, `rv_sys_func_Present`, `aortic_valve_native_regurgitation_Present`, `lv_dias_func_Present`, `Normal_or_No_Label`, `tricuspid_valve_native_regurgitation_Present` |
| **`reduced`** | `Normal_or_No_Label`, `Present` |
</details>
For the reduced labels, `Present` means that *at least one* of the pathologies has a positive result.
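The relation between the two label sets can be sketched as a small function. This is an illustration of the rule stated above, not the authors' training code; `reduce_labels` is a hypothetical helper, and it assumes that positive bespoke labels always end in `_Present`, as they do in the label scheme.

```python
def reduce_labels(bespoke_labels):
    """Collapse a document's bespoke labels into the reduced scheme.

    Returns "Present" if any pathology label is positive,
    otherwise "Normal_or_No_Label".
    """
    if any(label.endswith("_Present") for label in bespoke_labels):
        return "Present"
    return "Normal_or_No_Label"


print(reduce_labels(["pe_Present", "wma_Present"]))
print(reduce_labels(["Normal_or_No_Label"]))
```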
The pathology abbreviations used in the labels are:
<details>
<summary>View pathologies</summary>
| Annotation | Pathology |
| --- | --- |
| pe | Pericardial Effusion |
| wma | Wall Motion Abnormality |
| lv_dil | Left Ventricle Dilation |
| rv_dil | Right Ventricle Dilation |
| lv_sys_func | Left Ventricle Systolic Dysfunction |
| rv_sys_func | Right Ventricle Systolic Dysfunction |
| lv_dias_func | Diastolic Dysfunction |
| aortic_valve_native_stenosis | Aortic Stenosis |
| mitral_valve_native_regurgitation | Mitral Valve Regurgitation |
| tricuspid_valve_native_regurgitation | Tricuspid Regurgitation |
| aortic_valve_native_regurgitation | Aortic Regurgitation |
</details>
Note: the label `lv_dias_func` should have been named `dias_func`.
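To turn a predicted label into a readable pathology name, the table above can be encoded as a lookup. The sketch below is illustrative: `PATHOLOGIES` and `describe` are hypothetical names, and the dictionary keys follow the spelling used in the label scheme (for example `lv_sys_func`) so that they match the labels the model emits.

```python
# Annotation prefix -> pathology name, keyed by the label-scheme spelling.
PATHOLOGIES = {
    "pe": "Pericardial Effusion",
    "wma": "Wall Motion Abnormality",
    "lv_dil": "Left Ventricle Dilation",
    "rv_dil": "Right Ventricle Dilation",
    "lv_sys_func": "Left Ventricle Systolic Dysfunction",
    "rv_sys_func": "Right Ventricle Systolic Dysfunction",
    "lv_dias_func": "Diastolic Dysfunction",
    "aortic_valve_native_stenosis": "Aortic Stenosis",
    "mitral_valve_native_regurgitation": "Mitral Valve Regurgitation",
    "tricuspid_valve_native_regurgitation": "Tricuspid Regurgitation",
    "aortic_valve_native_regurgitation": "Aortic Regurgitation",
}

SUFFIX = "_Present"


def describe(label):
    """Map a bespoke label to its pathology name; unknown labels pass through."""
    annotation = label[: -len(SUFFIX)] if label.endswith(SUFFIX) else label
    return PATHOLOGIES.get(annotation, label)


print(describe("aortic_valve_native_stenosis_Present"))
```

Labels without a matching entry, such as `Normal_or_No_Label`, are returned unchanged.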
# Intended use
The model is developed for *document* classification of Dutch clinical echocardiogram reports.
Since it is a domain-specific model trained on medical data, it is **only** meant to be used on medical NLP tasks for *Dutch echocardiogram reports*.
# Data
The model was trained on approximately 4,000 manually annotated echocardiogram reports from the University Medical Centre Utrecht.
The training data was anonymized before starting the training procedure.
| Feature | Description |
| --- | --- |
| **Name** | `Echocardiogram_Multimodel_bespoke` |
| **Version** | `1.0.0` |
| **transformers** | `>=4.40.0` |
| **Default Pipeline** | `pipeline`, `text-classification` |
| **Components** | `RobertaForSequenceClassification` |
| **License** | `cc-by-sa-4.0` |
| **Author** | Bram van Es |
# Contact
If you run into problems with this model, please open an issue on our GitHub repository: https://github.com/umcu/echolabeler/issues
# Usage
If you use the model in your work, please cite the following publication: https://doi.org/10.48550/arXiv.2408.06930
# References
Paper: Bauke Arends, Melle Vessies, Dirk van Osch, Arco Teske, Pim van der Harst, René van Es, Bram van Es (2024). Diagnosis extraction from unstructured Dutch echocardiogram reports using span- and document-level characteristic classification. arXiv:2408.06930. https://arxiv.org/abs/2408.06930