UGARIT committed on
Commit
4df187f
1 Parent(s): e3d87b9

Update README.md

Files changed (1)
  1. README.md +15 -2
README.md CHANGED
@@ -2,11 +2,24 @@
 license: cc-by-4.0
 ---
 # Automatic Translation Alignment of Ancient Greek Texts
-GRC-ALIGNMENT model is an XLM-RoBERTa-based model, trained on 12 million monolingual ancient Greek tokens with Masked Language Model (MLM) training objective. Further, the model is fine-tuned on 45k parallel sentences, mainly in ancient Greek-English, Greek-Latin, and Greek-Georgian.
+GRC-ALIGNMENT model is an XLM-RoBERTa-based model, fine-tuned for automatic multilingual text alignment at the word level.
+The model is trained on 12 million monolingual ancient Greek tokens with Masked Language Model (MLM) training objective. Further, the model is fine-tuned on 45k parallel sentences, mainly in ancient Greek-English, Greek-Latin, and Greek-Georgian.
 
 ### Multilingual Training Dataset
 | Languages |Sentences | Source |
 |:---------------------------------------|:-----------:|:--------------------------------------------------------------------------------|
 | GRC-ENG | 32.500 | Perseus Digital Library (Iliad, Odyssey, Xenophon, New Testament) |
 | GRC-LAT | 8.200 | [Digital Fragmenta Historicorum Graecorum project](https://www.dfhg-project.org/) |
-| GRC-KAT <br>GRC-ENG <br>GRC-LAT<br>GRC-ITA<br>GRC-POR | 4.000 | [UGARIT Translation Alignment Editor](https://ugarit.ialigner.com/ ) |
+| GRC-KAT <br>GRC-ENG <br>GRC-LAT<br>GRC-ITA<br>GRC-POR | 4.000 | [UGARIT Translation Alignment Editor](https://ugarit.ialigner.com/ ) |
+
+If you use this model, please cite our paper:
+<pre>
+@misc{yousef_palladino_wright_berti_2022,
+title={Automatic Translation Alignment for Ancient Greek and Latin},
+url={osf.io/8epsy},
+DOI={10.31219/osf.io/8epsy},
+publisher={OSF Preprints},
+author={Yousef, Tariq and Palladino, Chiara and Wright, David J and Berti, Monica},
+year={2022},
+month={Apr}
+}</pre>
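
The README describes the model as fine-tuned for word-level alignment but does not show how alignment links are read off token embeddings. The sketch below illustrates one common extraction strategy (mutual argmax over a cosine-similarity matrix); it is an assumption for illustration, not necessarily the method used by GRC-ALIGNMENT. The function names and the three-dimensional toy vectors are hypothetical stand-ins for real contextual embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def mutual_argmax_alignments(src_vecs, tgt_vecs):
    """Return (src_index, tgt_index) pairs that are each other's best match."""
    sim = [[cosine(s, t) for t in tgt_vecs] for s in src_vecs]
    # Best target token for every source token (row-wise argmax).
    best_tgt = [max(range(len(tgt_vecs)), key=lambda j: row[j]) for row in sim]
    # Best source token for every target token (column-wise argmax).
    best_src = [max(range(len(src_vecs)), key=lambda i: sim[i][j])
                for j in range(len(tgt_vecs))]
    # Keep a link only when both directions agree.
    return [(i, j) for i, j in enumerate(best_tgt) if best_src[j] == i]

# Hypothetical toy embeddings, NOT produced by the model.
SRC_TOKENS = ["μῆνιν", "ἄειδε", "θεά"]
TGT_TOKENS = ["sing", "goddess", "wrath"]
SRC_VECS = [[1.0, 0.0, 0.1], [0.0, 1.0, 0.1], [0.1, 0.0, 1.0]]
TGT_VECS = [[0.0, 0.9, 0.2], [0.2, 0.1, 0.9], [0.9, 0.1, 0.0]]

if __name__ == "__main__":
    for i, j in mutual_argmax_alignments(SRC_VECS, TGT_VECS):
        print(SRC_TOKENS[i], "<->", TGT_TOKENS[j])
```

Requiring agreement in both directions favors precision over recall: tokens without a clear counterpart are left unaligned rather than forced into a link.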