Add training set info
README.md
@@ -42,6 +42,7 @@ metrics:
 # SpanMarker for Named Entity Recognition
 
 This is a [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) model that can be used for Named Entity Recognition. In particular, this SpanMarker model uses [bert-base-cased](https://huggingface.co/bert-base-cased) as the underlying encoder. See [train.py](train.py) for the training script.
+It is trained on [P3ps/Cross_ner](https://huggingface.co/datasets/P3ps/Cross_ner), which I believe is a variant of [DFKI-SLT/cross_ner](https://huggingface.co/datasets/DFKI-SLT/cross_ner) that merged the validation set into the training set and applied deduplication.
 
 Is your data not (always) capitalized correctly? Then consider using the uncased variant of this model instead for better performance:
 [tomaarsen/span-marker-bert-base-uncased-cross-ner](https://huggingface.co/tomaarsen/span-marker-bert-base-uncased-cross-ner).
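For readers wondering what "merging the validation set into the training set and applying deduplication" could look like in practice, here is a minimal sketch in plain Python. It is not the actual procedure used to build P3ps/Cross_ner, which the commit itself only describes as a belief. The helper `merge_and_deduplicate`, the `tokens` / `ner_tags` field names, and the toy examples are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical example layout: one dict per sentence (assumed field names).
Example = Dict[str, List[str]]


def merge_and_deduplicate(train: List[Example], validation: List[Example]) -> List[Example]:
    """Concatenate two splits and drop exact duplicates, keeping first occurrences."""
    seen = set()
    merged: List[Example] = []
    for example in [*train, *validation]:
        # Two examples count as duplicates only if tokens AND tags both match.
        key = (tuple(example["tokens"]), tuple(example["ner_tags"]))
        if key not in seen:
            seen.add(key)
            merged.append(example)
    return merged


# Toy data, invented for this sketch.
train = [
    {"tokens": ["Paris", "is", "nice"], "ner_tags": ["B-location", "O", "O"]},
    {"tokens": ["Ada", "Lovelace"], "ner_tags": ["B-person", "I-person"]},
]
validation = [
    {"tokens": ["Ada", "Lovelace"], "ner_tags": ["B-person", "I-person"]},
    {"tokens": ["BERT", "works"], "ner_tags": ["B-misc", "O"]},
]

merged = merge_and_deduplicate(train, validation)
print(len(merged))  # 3: the repeated "Ada Lovelace" example is kept once
```

Keying on both tokens and tags is a deliberate choice in this sketch: it keeps sentences that appear twice with conflicting annotations, so any such conflict stays visible instead of being silently resolved.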