UGARIT committed on
Commit d2428fa · 1 Parent(s): d071c0b

Update README.md

Files changed (1)
  1. README.md +5 -4
README.md CHANGED
@@ -5,7 +5,8 @@ license: cc-by-4.0
  GRC-ALIGNMENT model is an XLM-RoBERTa-based model, trained on 12 million monolingual ancient Greek tokens with Masked Language Model (MLM) training objective. Further, the model is fine-tuned on 45k parallel sentences mainly in ancient Greek-English, ancient Greek-Latin, and ancient Greek-Georgian.
 
  ### Multilingual Training Dataset
- | Languages | # Sentences | Source |
- |:---------:|:-----------:|:--------------------------------------------------------------------------------:|
- | GRC-ENG | 32.500 | Perseus Digital Library (Iliad, Odyssey, Xenophon, New Testament) |
- | GRC-LAT | 8.200 | Digital Fragmenta Historicorum Graecorum project (https://www.dfhg-project.org/) |
+ | Languages | # Sentences | Source |
+ |:---------------------------------------:|:-----------:|:--------------------------------------------------------------------------------:|
+ | GRC-ENG | 32.500 | Perseus Digital Library (Iliad, Odyssey, Xenophon, New Testament) |
+ | GRC-LAT | 8.200 | [Digital Fragmenta Historicorum Graecorum project](https://www.dfhg-project.org/) |
+ | GRC-KAT GRC-ENG GRC-LAT GRC-ITA GRC-POR | 4.000 | [UGARIT Translation Alignment Editor](https://ugarit.ialigner.com/ ) |