mdermentzi committed on
Commit
cb9a19d
1 Parent(s): ef688a0

Update README.md

Files changed (1)
  1. README.md +10 -9
README.md CHANGED
@@ -96,27 +96,28 @@ given that they are semantically close.

  This model was envisioned to work as part of EHRI-related editorial and publishing pipelines and may not be suitable for
  the purposes of other users/organizations.
- <!--
  ### Recommendations

  For more information, we encourage potential users to read the paper accompanying this model:
- Dermentzi, M., & Scheithauer, H. (2024, May 21). Repurposing Holocaust-Related Digital Scholarly Editions to Develop Multilingual Domain-Specific Named Entity Recognition Tools. Proceedings of the LREC-COLING 2024 Workshop on Holocaust Testimonies as Language Resources. HTRes@LREC-COLING 2024, Turin, Italy.
-

  ## Citation

  **BibTeX:**
  @inproceedings{dermentzi_repurposing_2024,
- address = {Turin, Italy},
  title = {Repurposing {Holocaust}-{Related} {Digital} {Scholarly} {Editions} to {Develop} {Multilingual} {Domain}-{Specific} {Named} {Entity} {Recognition} {Tools}},
- booktitle = {Proceedings of the {LREC}-{COLING} 2024 {Workshop} on {Holocaust} {Testimonies} as {Language} {Resources}},
  author = {Dermentzi, Maria and Scheithauer, Hugo},
  month = may,
  year = {2024},
- pubstate={forthcoming},
  }

-
  **APA:**
- Dermentzi, M., & Scheithauer, H. (2024, May 21). Repurposing Holocaust-Related Digital Scholarly Editions to Develop Multilingual Domain-Specific Named Entity Recognition Tools. Proceedings of the LREC-COLING 2024 Workshop on Holocaust Testimonies as Language Resources. HTRes@LREC-COLING 2024, Turin, Italy.
- -->
 

  This model was envisioned to work as part of EHRI-related editorial and publishing pipelines and may not be suitable for
  the purposes of other users/organizations.
+
  ### Recommendations

  For more information, we encourage potential users to read the paper accompanying this model:
+ Dermentzi, M., & Scheithauer, H. (2024, May). Repurposing Holocaust-Related Digital Scholarly Editions to Develop Multilingual Domain-Specific Named Entity Recognition Tools. LREC-COLING 2024 - Joint International Conference on Computational Linguistics, Language Resources and Evaluation. HTRes@LREC-COLING 2024, Torino, Italy. https://hal.science/hal-04547222

  ## Citation

  **BibTeX:**
  @inproceedings{dermentzi_repurposing_2024,
+ address = {Torino, Italy},
  title = {Repurposing {Holocaust}-{Related} {Digital} {Scholarly} {Editions} to {Develop} {Multilingual} {Domain}-{Specific} {Named} {Entity} {Recognition} {Tools}},
+ url = {https://hal.science/hal-04547222},
+ abstract = {The European Holocaust Research Infrastructure (EHRI) aims to support Holocaust research by making information about dispersed Holocaust material accessible and interconnected through its services. Creating a tool capable of detecting named entities in texts such as Holocaust testimonies or archival descriptions would make it easier to link more material with relevant identifiers in domain-specific controlled vocabularies, semantically enriching it, and making it more discoverable. With this paper, we release EHRI-NER, a multilingual dataset (Czech, German, English, French, Hungarian, Dutch, Polish, Slovak, Yiddish) for Named Entity Recognition (NER) in Holocaust-related texts. EHRI-NER is built by aggregating all the annotated documents in the EHRI Online Editions and converting them to a format suitable for training NER models. We leverage this dataset to fine-tune the multilingual Transformer-based language model XLM-RoBERTa (XLM-R) to determine whether a single model can be trained to recognize entities across different document types and languages. The results of our experiments show that despite our relatively small dataset, in a multilingual experiment setup, the overall F1 score achieved by XLM-R fine-tuned on multilingual annotations is 81.5{\textbackslash}\%. We argue that this score is sufficiently high to consider the next steps towards deploying this model.},
+ urldate = {2024-04-29},
+ booktitle = {{LREC}-{COLING} 2024 - {Joint} {International} {Conference} on {Computational} {Linguistics}, {Language} {Resources} and {Evaluation}},
+ publisher = {ELRA Language Resources Association (ELRA); International Committee on Computational Linguistics (ICCL)},
  author = {Dermentzi, Maria and Scheithauer, Hugo},
  month = may,
  year = {2024},
+ keywords = {Digital Editions, Holocaust Testimonies, Multilingual, Named Entity Recognition, Transfer Learning, Transformers},
  }

  **APA:**
+ Dermentzi, M., & Scheithauer, H. (2024, May). Repurposing Holocaust-Related Digital Scholarly Editions to Develop Multilingual Domain-Specific Named Entity Recognition Tools. LREC-COLING 2024 - Joint International Conference on Computational Linguistics, Language Resources and Evaluation. HTRes@LREC-COLING 2024, Torino, Italy. https://hal.science/hal-04547222