Commit e2bbe1b by baukearends (1 parent: efa7adb)
Update README.md

README.md CHANGED
@@ -1,28 +1,58 @@
 ---
 tags:
 - spacy
 language:
 - nl
 license: cc-by-sa-4.0
 model-index:
-- name:
-  results:
 ---
-Package to classify spans for the presence and severity of aortic stenosis in Dutch echocardiogram reports.

-
-
-| **Name** | `nl_Echocardiogram_SpanCategorizer_aortic_stenosis` |
-| **Version** | `1.0.0` |
-| **spaCy** | `>=3.7.4,<3.8.0` |
-| **Default Pipeline** | `tok2vec`, `spancat` |
-| **Components** | `tok2vec`, `spancat` |
-| **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
-| **Sources** | n/a |
-| **License** | `cc-ny-sa-4.0` |
-| **Author** | [Bauke Arends]() |

-

 <details>

@@ -34,12 +64,30 @@ Package to classify spans for the presence and severity of aortic stenosis in Du

 </details>

-### Accuracy

-
 | --- | --- |
-
-
-
-
-
 ---
 tags:
 - spacy
+- arxiv:2408.06930
+- medical
 language:
 - nl
 license: cc-by-sa-4.0
 model-index:
+- name: Echocardiogram_SpanCategorizer_aortic_stenosis
+  results:
+  - task:
+      type: token-classification
+    dataset:
+      type: test
+      name: "internal test set"
+    metrics:
+    - name: "Weighted f1"
+      type: f1
+      value: 0.864
+      verified: false
+    - name: "Weighted precision"
+      type: precision
+      value: 0.823
+      verified: false
+    - name: "Weighted recall"
+      type: recall
+      value: 0.786
+      verified: false
+
+pipeline_tag: token-classification
+metrics:
+- f1
+- precision
+- recall
 ---
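
The front matter reports "weighted" F1, precision, and recall without defining the weighting. A common convention, assumed here rather than confirmed by the README, is to average per-label scores weighted by each label's support (its number of gold spans). The sketch below shows that calculation in plain Python. The label names and counts are made up for illustration and are not the model's results.

```python
from typing import Dict, Tuple

# (true positives, false positives, false negatives) for one label
Counts = Tuple[int, int, int]


def prf(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for a single label (0.0 when undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def weighted_prf(per_label: Dict[str, Counts]) -> Tuple[float, float, float]:
    """Support-weighted average of per-label precision, recall, and F1."""
    supports = {label: tp + fn for label, (tp, _fp, fn) in per_label.items()}
    total = sum(supports.values())
    if total == 0:
        return 0.0, 0.0, 0.0
    sums = [0.0, 0.0, 0.0]
    for label, counts in per_label.items():
        weight = supports[label] / total
        for i, value in enumerate(prf(*counts)):
            sums[i] += weight * value
    return sums[0], sums[1], sums[2]


# Hypothetical counts for two made-up labels, purely for illustration.
example = {"label_a": (8, 2, 2), "label_b": (3, 1, 2)}
print(weighted_prf(example))
```

Under this convention, frequent labels dominate the reported figure, so a weighted score can hide weak performance on rare labels.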

+# Description
+This is a spaCy SpanCategorizer model, trained from scratch on Dutch echocardiogram reports sourced from electronic health records. The publication describing the span classification task is available at https://arxiv.org/abs/2408.06930. The config file used to train the model is available at https://github.com/umcu/echolabeler.

+# Minimum working example
+```python
+# Notebook syntax; in a shell, run the same command without the leading "!".
+!pip install https://huggingface.co/baukearends/Echocardiogram-SpanCategorizer-aortic-stenosis/resolve/main/nl_Echocardiogram_SpanCategorizer_aortic_stenosis-any-py3-none-any.whl
+```
+```python
+import spacy
+nlp = spacy.load("nl_Echocardiogram_SpanCategorizer_aortic_stenosis")
+```
+```python
+prediction = nlp("Op dit echo geen duidelijke WMA te zien, goede systolische L.V. functie, wel L.V.H., diastolische dysfunctie graad 1A tot 2. Geringe aortastenose en - matige -insufficientie. Geringe M.I.")
+for span, score in zip(prediction.spans['sc'], prediction.spans['sc'].attrs['scores']):
+    print(f"Span: {span}, label: {span.label_}, score: {score[0]:.3f}")
+```
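
To work with the predictions from the example above as plain data, the spans and their scores can be collected into dictionaries and filtered by score. This is a minimal sketch: `spans_to_records` and the `min_score` cut-off of 0.5 are hypothetical names and values, not part of spaCy or this package. It assumes only that the span group exposes `attrs["scores"]` in the same order as its spans, as the example above does, and it accepts scores that are either bare numbers or length-1 arrays.

```python
from typing import Any, Dict, List


def _as_float(score: Any) -> float:
    """Accept either a bare number or a length-1 sequence/array."""
    try:
        return float(score[0])
    except (TypeError, IndexError):
        return float(score)


def spans_to_records(span_group: Any, min_score: float = 0.5) -> List[Dict[str, Any]]:
    """Turn a span group with parallel ``attrs["scores"]`` into plain dicts.

    ``min_score`` is an illustrative cut-off, not a value from the model card.
    Records are returned in descending score order.
    """
    records = []
    for span, score in zip(span_group, span_group.attrs["scores"]):
        value = _as_float(score)
        if value >= min_score:
            records.append(
                {
                    "text": span.text,
                    "label": span.label_,
                    "start": span.start_char,
                    "end": span.end_char,
                    "score": value,
                }
            )
    return sorted(records, key=lambda record: record["score"], reverse=True)


# Usage with the pipeline loaded above (requires the installed model):
# records = spans_to_records(prediction.spans["sc"], min_score=0.5)
```

Keeping character offsets alongside the label makes it straightforward to map a prediction back to the original report text.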
+
+# Label Scheme

 <details>

@@ -34,12 +64,30 @@ Package to classify spans for the presence and severity of aortic stenosis in Du

 </details>


+# Intended use
+The model was developed for span classification on Dutch clinical text. Because it is a domain-specific model trained on medical data, it is intended for Dutch medical NLP tasks.
+
+# Data
+The model was trained on approximately 4,000 manually annotated echocardiogram reports from the University Medical Centre Utrecht. The training data was anonymized before training.
+
+| Feature | Description |
 | --- | --- |
+| **Name** | `Echocardiogram_SpanCategorizer_aortic_stenosis` |
+| **Version** | `1.0.0` |
+| **spaCy** | `>=3.7.4,<3.8.0` |
+| **Default Pipeline** | `tok2vec`, `spancat` |
+| **Components** | `tok2vec`, `spancat` |
+| **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
+| **Sources** | n/a |
+| **License** | `cc-by-sa-4.0` |
+| **Author** | [Bauke Arends]() |
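
The table pins spaCy to `>=3.7.4,<3.8.0`, so it can help to check the installed version before loading the package. The following is a small sketch using only the standard library. The helper names `parse` and `in_supported_range` are hypothetical, and the parser handles plain `X.Y.Z` release strings only.

```python
from importlib import metadata
from typing import Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Parse a plain 'X.Y.Z' release string; pre-release suffixes are not handled."""
    return tuple(int(part) for part in version.split(".")[:3])


def in_supported_range(version: str) -> bool:
    """True if the version satisfies >=3.7.4,<3.8.0, the range listed in the table."""
    return (3, 7, 4) <= parse(version) < (3, 8, 0)


try:
    installed = metadata.version("spacy")
    print(f"spaCy {installed} in supported range: {in_supported_range(installed)}")
except metadata.PackageNotFoundError:
    print("spaCy is not installed")
except ValueError:
    print("Could not parse the installed spaCy version")
```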
+
+# Contact
+If you have problems with this model, please open an issue on our GitHub repository: https://github.com/umcu/echolabeler/issues
+
+# Usage
+If you use the model in your work, please cite the following reference: https://doi.org/10.48550/arXiv.2408.06930
+
+# References
+Paper: Bauke Arends, Melle Vessies, Dirk van Osch, Arco Teske, Pim van der Harst, René van Es, Bram van Es (2024). Diagnosis extraction from unstructured Dutch echocardiogram reports using span- and document-level characteristic classification. arXiv: https://arxiv.org/abs/2408.06930