baukearends
/

Echocardiogram-SpanCategorizer-aortic-stenosis

Token Classification

spaCy

Dutch

medical

Eval Results


baukearends committed on Aug 15

Commit e2bbe1b (1 parent: efa7adb)

Update README.md


Files changed (1)

README.md +70 -22

README.md CHANGED

@@ -1,28 +1,58 @@
 ---
 tags:
 - spacy
 language:
 - nl
 license: cc-by-sa-4.0
 model-index:
-- name: nl_Echocardiogram_SpanCategorizer_aortic_stenosis
-  results: []
 ---
-Package to classify spans for the presence and severity of aortic stenosis in Dutch echocardiogram reports.
-| Feature | Description |
-| --- | --- |
-| **Name** | `nl_Echocardiogram_SpanCategorizer_aortic_stenosis` |
-| **Version** | `1.0.0` |
-| **spaCy** | `>=3.7.4,<3.8.0` |
-| **Default Pipeline** | `tok2vec`, `spancat` |
-| **Components** | `tok2vec`, `spancat` |
-| **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
-| **Sources** | n/a |
-| **License** | `cc-ny-sa-4.0` |
-| **Author** | [Bauke Arends]() |
-### Label Scheme
 <details>
@@ -34,12 +64,30 @@ Package to classify spans for the presence and severity of aortic stenosis in Du
 </details>
-### Accuracy
-| Type | Score |
 | --- | --- |
-| `SPANS_SC_F` | 82.33 |
-| `SPANS_SC_P` | 86.42 |
-| `SPANS_SC_R` | 78.61 |
-| `TOK2VEC_LOSS` | 78.47 |
-| `SPANCAT_LOSS` | 48628.90 |
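The `SPANS_SC_F` value in the removed accuracy table can be cross-checked against the precision and recall listed next to it, since an F1 score is the harmonic mean of the two. A minimal sketch, assuming all three scores were computed over the same set of spans (the helper name `f1_from_pr` is ours, not part of the package):

```python
def f1_from_pr(precision, recall):
    """Return the harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the removed accuracy table (in percent).
f1 = f1_from_pr(86.42, 78.61)
print(round(f1, 2))
```

The result rounds to 82.33, which matches the reported `SPANS_SC_F`.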

 ---
 tags:
 - spacy
+- arxiv:2408.06930
+- medical
 language:
 - nl
 license: cc-by-sa-4.0
 model-index:
+- name: Echocardiogram_SpanCategorizer_aortic_stenosis
+  results:
+  - task:
+      type: token-classification
+    dataset:
+      type: test
+      name: "internal test set"
+    metrics:
+    - name: "Weighted f1"
+      type: f1
+      value: 0.823
+      verified: false
+    - name: "Weighted precision"
+      type: precision
+      value: 0.864
+      verified: false
+    - name: "Weighted recall"
+      type: recall
+      value: 0.786
+      verified: false
+pipeline_tag: token-classification
+metrics:
+- f1
+- precision
+- recall
 ---
+# Description
+This model is a spaCy SpanCategorizer trained from scratch on Dutch echocardiogram reports sourced from electronic health records. The publication describing the span classification task is available at https://arxiv.org/abs/2408.06930. The config file used to train the model is available at https://github.com/umcu/echolabeler.
+# Minimum working example
+```python
+# Notebook cell syntax; from a shell, drop the leading "!".
+!pip install https://huggingface.co/baukearends/Echocardiogram-SpanCategorizer-aortic-stenosis/resolve/main/nl_Echocardiogram_SpanCategorizer_aortic_stenosis-any-py3-none-any.whl
+```
+```python
+import spacy
+nlp = spacy.load("nl_Echocardiogram_SpanCategorizer_aortic_stenosis")
+```
+```python
+prediction = nlp("Op dit echo geen duidelijke WMA te zien, goede systolische L.V. functie, wel L.V.H., diastolische dysfunctie graad 1A tot 2. Geringe aortastenose en - matige -insufficientie. Geringe M.I.")
+for span, score in zip(prediction.spans['sc'], prediction.spans['sc'].attrs['scores']):
+    print(f"Span: {span}, label: {span.label_}, score: {score[0]:.3f}")
+```
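The loop above prints one line per predicted span. As a minimal sketch of a possible next step, the snippet below filters `(text, label, score)` triples by a score threshold and groups the surviving span texts by label. It runs on hand-written tuples rather than on a real `Doc`, so it needs no installed model. The label names, the scores, the `group_spans` helper, and the 0.5 threshold are all hypothetical and are not taken from the model's label scheme.

```python
from collections import defaultdict

def group_spans(triples, min_score=0.5):
    """Group span texts by label, keeping only spans scoring >= min_score."""
    grouped = defaultdict(list)
    for text, label, score in triples:
        if score >= min_score:
            grouped[label].append(text)
    return dict(grouped)

# Hypothetical predictions; labels and scores are made up for illustration.
example = [
    ("Geringe aortastenose", "aortic_stenosis_mild", 0.91),
    ("matige -insufficientie", "aortic_stenosis_moderate", 0.12),
]
print(group_spans(example))
```

With real predictions, the triples could be built from the same span, label, and score values that the loop above prints.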
+# Label Scheme
 <details>
 </details>
+# Intended use
+The model was developed for span classification on Dutch clinical text. Because it is a domain-specific model trained on medical data, it is intended for Dutch medical NLP tasks.
+# Data
+The model was trained on approximately 4,000 manually annotated echocardiogram reports from the University Medical Centre Utrecht. The training data was anonymized before starting the training procedure.
+| Feature | Description |
 | --- | --- |
+| **Name** | `Echocardiogram_SpanCategorizer_aortic_stenosis` |
+| **Version** | `1.0.0` |
+| **spaCy** | `>=3.7.4,<3.8.0` |
+| **Default Pipeline** | `tok2vec`, `spancat` |
+| **Components** | `tok2vec`, `spancat` |
+| **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
+| **Sources** | n/a |
+| **License** | `cc-by-sa-4.0` |
+| **Author** | [Bauke Arends]() |
+# Contact
+If you have problems with this model, please open an issue on our GitHub repository: https://github.com/umcu/echolabeler/issues
+# Usage
+If you use the model in your work, please cite the following reference: https://doi.org/10.48550/arXiv.2408.06930
+# References
+Paper: Bauke Arends, Melle Vessies, Dirk van Osch, Arco Teske, Pim van der Harst, René van Es, Bram van Es (2024). Diagnosis extraction from unstructured Dutch echocardiogram reports using span- and document-level characteristic classification. arXiv:2408.06930. https://arxiv.org/abs/2408.06930