mdermentzi
committed on
Commit
cb9a19d
1 Parent(s): ef688a0
Update README.md
README.md
CHANGED
@@ -96,27 +96,28 @@ given that they are semantically close.
 
 This model was envisioned to work as part of EHRI-related editorial and publishing pipelines and may not be suitable for
 the purposes of other users/organizations.
-
+
 ### Recommendations
 
 For more information, we encourage potential users to read the paper accompanying this model:
-Dermentzi, M., & Scheithauer, H. (2024, May
-
+Dermentzi, M., & Scheithauer, H. (2024, May). Repurposing Holocaust-Related Digital Scholarly Editions to Develop Multilingual Domain-Specific Named Entity Recognition Tools. LREC-COLING 2024 - Joint International Conference on Computational Linguistics, Language Resources and Evaluation. HTRes@LREC-COLING 2024, Torino, Italy. https://hal.science/hal-04547222
 
 ## Citation
 
 **BibTeX:**
 @inproceedings{dermentzi_repurposing_2024,
-address = {
+address = {Torino, Italy},
 title = {Repurposing {Holocaust}-{Related} {Digital} {Scholarly} {Editions} to {Develop} {Multilingual} {Domain}-{Specific} {Named} {Entity} {Recognition} {Tools}},
-
+url = {https://hal.science/hal-04547222},
+abstract = {The European Holocaust Research Infrastructure (EHRI) aims to support Holocaust research by making information about dispersed Holocaust material accessible and interconnected through its services. Creating a tool capable of detecting named entities in texts such as Holocaust testimonies or archival descriptions would make it easier to link more material with relevant identifiers in domain-specific controlled vocabularies, semantically enriching it, and making it more discoverable. With this paper, we release EHRI-NER, a multilingual dataset (Czech, German, English, French, Hungarian, Dutch, Polish, Slovak, Yiddish) for Named Entity Recognition (NER) in Holocaust-related texts. EHRI-NER is built by aggregating all the annotated documents in the EHRI Online Editions and converting them to a format suitable for training NER models. We leverage this dataset to fine-tune the multilingual Transformer-based language model XLM-RoBERTa (XLM-R) to determine whether a single model can be trained to recognize entities across different document types and languages. The results of our experiments show that despite our relatively small dataset, in a multilingual experiment setup, the overall F1 score achieved by XLM-R fine-tuned on multilingual annotations is 81.5{\textbackslash}\%. We argue that this score is sufficiently high to consider the next steps towards deploying this model.},
+urldate = {2024-04-29},
+booktitle = {{LREC}-{COLING} 2024 - {Joint} {International} {Conference} on {Computational} {Linguistics}, {Language} {Resources} and {Evaluation}},
+publisher = {ELRA Language Resources Association (ELRA); International Committee on Computational Linguistics (ICCL)},
 author = {Dermentzi, Maria and Scheithauer, Hugo},
 month = may,
 year = {2024},
-
+keywords = {Digital Editions, Holocaust Testimonies, Multilingual, Named Entity Recognition, Transfer Learning, Transformers},
 }
 
-
 **APA:**
-Dermentzi, M., & Scheithauer, H. (2024, May
--->
+Dermentzi, M., & Scheithauer, H. (2024, May). Repurposing Holocaust-Related Digital Scholarly Editions to Develop Multilingual Domain-Specific Named Entity Recognition Tools. LREC-COLING 2024 - Joint International Conference on Computational Linguistics, Language Resources and Evaluation. HTRes@LREC-COLING 2024, Torino, Italy. https://hal.science/hal-04547222