---
tags:
- spacy
- arxiv:2408.06930
- medical
language:
- nl
license: gpl-3.0
model-index:
- name: Echocardiogram_Multimodel_bespoke
  results:
  - task:
      type: text-classification
    dataset:
      type: test
      name: internal test set
    metrics:
    - name: Macro f1
      type: f1
      value: 0.922
      verified: false
    - name: Macro precision
      type: precision
      value: 0.931
      verified: false
    - name: Macro recall
      type: recall
      value: 0.915
      verified: false
pipeline_tag: text-classification
metrics:
- f1
- precision
- recall
---

# Description
This model is a [MedRoBERTa.nl](https://huggingface.co/CLTL/MedRoBERTa.nl) model fine-tuned on Dutch echocardiogram reports sourced from electronic health records.
The associated publication, which covers span- and document-level classification, can be found at https://arxiv.org/abs/2408.06930.
The config file for training the model can be found at https://github.com/umcu/echolabeler.

# Minimum working example
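```python
from transformers import pipeline

# Load the fine-tuned classifier from the Hugging Face Hub
le_pipe = pipeline(model="UMCU/Echocardiogram_Multimodel_bespoke")

# Replace the placeholder with the text of a Dutch echocardiogram report
document = "Lorem ipsum"
results = le_pipe(document)
```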
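
A `text-classification` pipeline normally returns a list of dictionaries with `label` and `score` keys. The sketch below picks the highest-scoring entry from such a list, assuming that output shape. `best_prediction` is a hypothetical helper, not part of this model or of `transformers`, and the example scores are made up for illustration.

```python
def best_prediction(results):
    """Return (label, score) for the highest-scoring prediction.

    Assumes `results` is a list of dicts with "label" and "score" keys,
    the shape normally returned by a text-classification pipeline.
    """
    if not results:
        raise ValueError("no predictions to choose from")
    top = max(results, key=lambda r: r["score"])
    return top["label"], top["score"]


# Mocked pipeline output for illustration only (not real model scores)
example = [
    {"label": "Normal_or_No_Label", "score": 0.10},
    {"label": "pe_Present", "score": 0.85},
]
label, score = best_prediction(example)
```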

# Label Scheme

<details>

<summary>View label scheme</summary>

| Component | Labels |
| --- | --- |
| **`bespoke`** | `pe_Present`, `rv_dil_Present`, `wma_Present`, `lv_dil_Present`, `aortic_valve_native_stenosis_Present`, `mitral_valve_native_regurgitation_Present`, `lv_sys_func_Present`, `rv_sys_func_Present`, `aortic_valve_native_regurgitation_Present`, `lv_dias_func_Present`, `Normal_or_No_Label`, `tricuspid_valve_native_regurgitation_Present` |
| **`reduced`** | `Normal_or_No_Label`, `Present` |
</details>

In the reduced scheme, `Present` means that *one or more* of the pathologies are positive.
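
The relationship between the two schemes can be sketched as a simple collapse. This assumes that every bespoke label other than `Normal_or_No_Label` maps to `Present`; `to_reduced_label` is a hypothetical helper and not necessarily how the reduced labels were produced for training.

```python
NORMAL_LABEL = "Normal_or_No_Label"


def to_reduced_label(bespoke_label):
    """Collapse a bespoke label into the reduced scheme.

    Assumption: any label other than the normal label counts as "Present".
    """
    return NORMAL_LABEL if bespoke_label == NORMAL_LABEL else "Present"


reduced = [to_reduced_label(l) for l in ("pe_Present", NORMAL_LABEL)]
```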

The annotation prefixes in the bespoke labels correspond to the following pathologies:

<details>

<summary>View pathologies</summary>

| Annotation | Pathology |
| --- | --- |
| pe  | Pericardial Effusion |
| wma | Wall Motion Abnormality |
| lv_dil | Left Ventricle Dilation |
| rv_dil | Right Ventricle Dilation |
| lv_sys_func | Left Ventricle Systolic Dysfunction |
| rv_sys_func | Right Ventricle Systolic Dysfunction |
| lv_dias_func | Diastolic Dysfunction |
| aortic_valve_native_stenosis | Aortic Stenosis |
| mitral_valve_native_regurgitation | Mitral Valve Regurgitation |
| tricuspid_valve_native_regurgitation | Tricuspid Regurgitation |
| aortic_valve_native_regurgitation | Aortic Regurgitation |
</details>

Note: `lv_dias_func` should have been named `dias_func`.
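
To turn a predicted label into a readable pathology name, a lookup along these lines may help. It is a minimal sketch: the dictionary keys follow the spelling used in the label scheme (for example `lv_sys_func`), and `describe_label` is a hypothetical helper that is not shipped with the model.

```python
# Annotation prefix -> pathology name, following the table above
PATHOLOGIES = {
    "pe": "Pericardial Effusion",
    "wma": "Wall Motion Abnormality",
    "lv_dil": "Left Ventricle Dilation",
    "rv_dil": "Right Ventricle Dilation",
    "lv_sys_func": "Left Ventricle Systolic Dysfunction",
    "rv_sys_func": "Right Ventricle Systolic Dysfunction",
    "lv_dias_func": "Diastolic Dysfunction",
    "aortic_valve_native_stenosis": "Aortic Stenosis",
    "mitral_valve_native_regurgitation": "Mitral Valve Regurgitation",
    "tricuspid_valve_native_regurgitation": "Tricuspid Regurgitation",
    "aortic_valve_native_regurgitation": "Aortic Regurgitation",
}
SUFFIX = "_Present"


def describe_label(label):
    """Return the pathology name for a `*_Present` label, else None."""
    if label.endswith(SUFFIX):
        prefix = label[: -len(SUFFIX)]
        return PATHOLOGIES.get(prefix)
    # `Normal_or_No_Label` and unknown labels have no pathology name
    return None
```

For example, `describe_label("pe_Present")` yields `"Pericardial Effusion"`, while `describe_label("Normal_or_No_Label")` yields `None`.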

# Intended use
The model was developed for *document* classification of Dutch clinical echocardiogram reports.
Because it is a domain-specific model trained on medical data, it is meant **only** for medical NLP tasks on *Dutch echocardiogram reports*.

# Data
The model was trained on approximately 4,000 manually annotated echocardiogram reports from the University Medical Centre Utrecht.
The training data was anonymized before starting the training procedure.

| Feature | Description |
| --- | --- |
| **Name** | `Echocardiogram_Multimodel_bespoke` |
| **Version** | `1.0.0` |
| **transformers** | `>=4.40.0` |
| **Default Pipeline** | `pipeline`, `text-classification` |
| **Components** | `RobertaForSequenceClassification` |
| **License** | `cc-by-sa-4.0` |
| **Author** | Bram van Es |

# Contact
If you run into problems with this model, please open an issue on our GitHub repository: https://github.com/umcu/echolabeler/issues

# Usage
If you use the model in your work, please cite: https://doi.org/10.48550/arXiv.2408.06930

# References
Paper: Bauke Arends, Melle Vessies, Dirk van Osch, Arco Teske, Pim van der Harst, René van Es, Bram van Es (2024). Diagnosis extraction from unstructured Dutch echocardiogram reports using span- and document-level characteristic classification. arXiv. https://arxiv.org/abs/2408.06930