thodel committed on
Commit
4122205
1 Parent(s): f70b570

Update README.md

  1. README.md +21 -0
README.md CHANGED
@@ -7,6 +7,11 @@ tags:
  - medieval
  - ocr
  - htr
+ language:
+ - de
+ - fr
+ - la
+ - nl
  ---
  # TrOCR Medieval Model with linemasks generated in eScriptorium (https://de.wikipedia.org/wiki/EScriptorium)
  Base model: **microsoft/trocr-base-handwritten**
@@ -14,5 +19,21 @@ Base model: **microsoft/trocr-base-handwritten**
  Epochs: 19.05 / 20
  Eval CER: 0.0329

+ This is a combined model of ground truth of different **charter** and **book scripts** from a variety of projects and institutions, aiming at building a generic model for Latin scripts of the Middle Ages.
+ It is mainly based on documents from the project CREMMA Manuscrits médiévaux latins, HIMANIS (CNRS), Itinera Nova (Stadsarchief Leuven), and Charters and Records of Königsfelden (Universität Zürich).
+
+ Based on the following data:
+ CREMMA Manuscrits médiévaux latins has been produced by Clérice, Thibault and Chagué, Alix and Vlachou Efstathiou, Malamatenia. It is licensed under a CC-BY 4.0 license.
+ URL: https://github.com/HTR-United/CREMMA-Medieval-LAT
+
+ HIMANIS is partially published as HIMANIS Guérin produced by Stutzmann, Dominique; Hamel, Sébastien; Kernier, Iseut de; Mühlberger, Günter; Hackl, Günter. Licensed under a CC-BY 4.0 license.
+ DOI: 10.5281/zenodo.5535306
+
+ Charters and Records of Königsfelden Abbey and Bailiwick (1308-1662) has been produced by Halter-Pernet, Colette; Teuscher, Simon; Hodel, Tobias; Barwitzki, Lukas; Egloff, Salome; Henggeler, Fabian; Nadig, Michael; Steinmann, Anina; Stettler, Sabine; Prada Ziegler, Ismail. Licensed under a CC-BY 4.0 license.
+ DOI: 10.5281/zenodo.5179361
+
+ The model is based on the same data as the following PyLaia model (available on Transkribus):
+ https://readcoop.eu/model/charter-scripts-german-latin-french/
+
  The model has not been extensively tested.
  Potential biases are still to be identified.
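
For readers unfamiliar with the metric, the "Eval CER: 0.0329" line in the README means roughly 3.3 character errors per 100 reference characters. The commit does not say which tool produced that figure, so the following is only a minimal sketch of how a character error rate is commonly computed: total Levenshtein edit distance divided by total reference length. The function names `levenshtein` and `cer` and the sample strings are hypothetical and not taken from the repository.

```python
from typing import Sequence


def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of character insertions, deletions and substitutions
    needed to turn `hyp` into `ref` (classic dynamic programming)."""
    prev = list(range(len(hyp) + 1))  # distances against the empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance against the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # character missing from the hypothesis
                curr[j - 1] + 1,     # extra character in the hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level character error rate: total edits / total reference chars."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars


if __name__ == "__main__":
    # Hypothetical line transcription: the recognizer reads "m" as "rn".
    refs = ["in nomine domini amen"]
    hyps = ["in nomine dornini amen"]
    print(f"CER: {cer(refs, hyps):.4f}")
```

Evaluation scripts may normalize text first (whitespace, abbreviations, Unicode forms), so a figure computed this way is not necessarily comparable to the reported one.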