titae committed on
Commit 5901392 · verified · 1 Parent(s): 43eab28

Update README.md

  1. README.md +38 -33
README.md CHANGED
@@ -15,26 +15,33 @@ base_model:
15
  ---
16
 
17
  # Model Card for Sprakbanken/trocr_smi_nor_pred_synth
18
- This is a TrOCR-model for OCR (optical character recognition) of Sámi languages.
19
  It can be used to recognize text in images of printed text (scanned books, magazines, etc.) in North Sámi, South Sámi, Lule Sámi, and Inari Sámi.
20
 
21
- ## Collection details
22
- This model is a part of our collection of OCR models for Sámi languages.
23
 
24
- The following TrOCR models are available:
25
- - [Sprakbanken/trocr_smi](https://huggingface.co/Sprakbanken/trocr_smi): [microsoft/trocr-base-printed](https://huggingface.co/microsoft/trocr-base-printed) fine-tuned on manually annotated Sámi data
26
- - [Sprakbanken/trocr_smi_nor](https://huggingface.co/Sprakbanken/trocr_smi_nor): microsoft/trocr-base-printed fine-tuned on manually annotated Sámi and Norwegian data
27
- - [Sprakbanken/trocr_smi_pred](https://huggingface.co/Sprakbanken/trocr_smi_pred): microsoft/trocr-base-printed fine-tuned on manually annotated and automatically transcribed Sámi data
28
- - [Sprakbanken/trocr_smi_nor_pred](https://huggingface.co/Sprakbanken/trocr_smi_nor_pred): microsoft/trocr-base-printed fine-tuned on manually annotated and automatically transcribed Sámi data, and manually annotated Norwegian data
29
- - [Sprakbanken/trocr_smi_synth](https://huggingface.co/Sprakbanken/trocr_smi_synth): microsoft/trocr-base-printed fine-tuned on [Sprakbanken/synthetic_sami_ocr_data](https://huggingface.co/datasets/Sprakbanken/synthetic_sami_ocr_data), and then on manually annotated Sámi data
30
- - [Sprakbanken/trocr_smi_pred_synth](https://huggingface.co/Sprakbanken/trocr_smi_pred_synth): microsoft/trocr-base-printed fine-tuned on Sprakbanken/synthetic_sami_ocr_data, and then fine-tuned on manually annotated and automatically transcribed Sámi data
31
- - [Sprakbanken/trocr_smi_nor_pred_synth](https://huggingface.co/Sprakbanken/trocr_smi_nor_pred_synth): microsoft/trocr-base-printed fine-tuned on Sprakbanken/synthetic_sami_ocr_data, and then fine-tuned on manually annotated and automatically transcribed Sámi data, and manually annotated Norwegian
32
 
33
- [Sprakbanken/trocr_smi_pred_synth](https://huggingface.co/Sprakbanken/trocr_smi_pred_synth) is the model that achieved the best results (of the TrOCR models) on our test dataset.
34
 
35
  ## Model Details
36
- This model is TrOCR-printed base model trained on Sprakbanken/synthetic_sami_ocr_data for 5 epochs,
37
- and then fine-tuned on manually annotated and automatically transcribed Sámi data, and manually annotated Norwegian. See our paper for more details.
 
38
 
39
 
40
  ### Model Description
@@ -43,39 +50,37 @@ and then fine-tuned on manually annotated and automatically transcribed Sámi da
43
  - **Model type:** TrOCR
44
  - **Languages:** North Sámi (sme), South Sámi (sma), Lule Sámi (smj), and Inari Sámi (smn)
45
  - **License:** [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)
46
- - **Finetuned from model :** [TrOCR-printed base model](https://huggingface.co/microsoft/trocr-base-printed)
47
 
48
  ### Model Sources
49
 
50
  - **Repository:** https://github.com/Sprakbanken/nodalida25_sami_ocr
51
  - **Paper:** "Enstad T, Trosterud T, Røsok MI, Beyer Y, Roald M. Comparative analysis of optical character recognition methods for Sámi texts from the National Library of Norway. Accepted for publication in Proceedings of the 25th Nordic Conference on Computational Linguistics (NoDaLiDa) 2025." (preprint coming soon.)
52
 
53
- ## Uses
54
- You can use the raw model for optical character recognition (OCR) on single text-line images in North Sámi, South Sámi, Lule Sámi, and Inari Sámi.
55
-
56
- ### Out-of-Scope Use
57
- The model only works with images of lines of text.
58
- If you have images of entire pages of text, you must segment the text into lines first to benefit from this model.
59
 
 
 
60
 
61
- ## How to Get Started with the Model
62
 
63
- Use the code below to get started with the model.
64
 
65
- ```python
66
- from transformers import TrOCRProcessor, VisionEncoderDecoderModel
67
- from PIL import Image
68
 
69
- processor = TrOCRProcessor.from_pretrained("Sprakbanken/trocr_smi_nor_pred_synth")
70
- model = VisionEncoderDecoderModel.from_pretrained("Sprakbanken/trocr_smi_nor_pred_synth")
71
 
72
- image = Image.open("path_to_image.jpg").convert("RGB")
 
 
73
 
74
- pixel_values = processor(image, return_tensors="pt").pixel_values
75
- generated_ids = model.generate(pixel_values)
76
 
77
- generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
78
- ```
79
  ## Citation
80
 
81
  **APA:**
 
15
  ---
16
 
17
  # Model Card for Sprakbanken/trocr_smi_nor_pred_synth
18
+ This is a TrOCR-model for OCR (optical character recognition) of Sámi languages.
19
  It can be used to recognize text in images of printed text (scanned books, magazines, etc.) in North Sámi, South Sámi, Lule Sámi, and Inari Sámi.
20
 
 
 
21
 
22
+ ## How to Get Started with the Model
23
 
24
+ Use the code below to get started with the model.
25
+
26
+ ```python
27
+ from transformers import TrOCRProcessor, VisionEncoderDecoderModel
28
+ from PIL import Image
29
+
30
+ processor = TrOCRProcessor.from_pretrained("Sprakbanken/trocr_smi_nor_pred_synth")
31
+ model = VisionEncoderDecoderModel.from_pretrained("Sprakbanken/trocr_smi_nor_pred_synth")
32
+
33
+ image = Image.open("path_to_image.jpg").convert("RGB")
34
+
35
+ pixel_values = processor(image, return_tensors="pt").pixel_values
36
+ generated_ids = model.generate(pixel_values)
37
+
38
+ generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
39
+ ```
40
 
41
  ## Model Details
42
+ This model is [microsoft/trocr-base-printed](https://huggingface.co/microsoft/trocr-base-printed) trained on [Sprakbanken/synthetic_sami_ocr_data](https://huggingface.co/datasets/Sprakbanken/synthetic_sami_ocr_data) for 5 epochs,
43
+ and then fine-tuned on manually annotated and automatically transcribed Sámi data, and manually annotated Norwegian.
44
+ See our paper for more details.
45
 
46
 
47
  ### Model Description
 
50
  - **Model type:** TrOCR
51
  - **Languages:** North Sámi (sme), South Sámi (sma), Lule Sámi (smj), and Inari Sámi (smn)
52
  - **License:** [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)
53
+ - **Finetuned from model :** [microsoft/trocr-base-printed](https://huggingface.co/microsoft/trocr-base-printed)
54
 
55
  ### Model Sources
56
 
57
  - **Repository:** https://github.com/Sprakbanken/nodalida25_sami_ocr
58
  - **Paper:** "Enstad T, Trosterud T, Røsok MI, Beyer Y, Roald M. Comparative analysis of optical character recognition methods for Sámi texts from the National Library of Norway. Accepted for publication in Proceedings of the 25th Nordic Conference on Computational Linguistics (NoDaLiDa) 2025." (preprint coming soon.)
59
 
 
 
 
 
 
 
60
 
61
+ ## Collection details
62
+ This model is a part of our collection of OCR models for Sámi languages.
63
 
64
+ The following TrOCR models are available:
65
+ - [Sprakbanken/trocr_smi](https://huggingface.co/Sprakbanken/trocr_smi): [microsoft/trocr-base-printed](https://huggingface.co/microsoft/trocr-base-printed) fine-tuned on manually annotated Sámi data
66
+ - [Sprakbanken/trocr_smi_nor](https://huggingface.co/Sprakbanken/trocr_smi_nor): microsoft/trocr-base-printed fine-tuned on manually annotated Sámi and Norwegian data
67
+ - [Sprakbanken/trocr_smi_pred](https://huggingface.co/Sprakbanken/trocr_smi_pred): microsoft/trocr-base-printed fine-tuned on manually annotated and automatically transcribed Sámi data
68
+ - [Sprakbanken/trocr_smi_nor_pred](https://huggingface.co/Sprakbanken/trocr_smi_nor_pred): microsoft/trocr-base-printed fine-tuned on manually annotated and automatically transcribed Sámi data, and manually annotated Norwegian data
69
+ - [Sprakbanken/trocr_smi_synth](https://huggingface.co/Sprakbanken/trocr_smi_synth): microsoft/trocr-base-printed fine-tuned on [Sprakbanken/synthetic_sami_ocr_data](https://huggingface.co/datasets/Sprakbanken/synthetic_sami_ocr_data), and then on manually annotated Sámi data
70
+ - [Sprakbanken/trocr_smi_pred_synth](https://huggingface.co/Sprakbanken/trocr_smi_pred_synth): microsoft/trocr-base-printed fine-tuned on Sprakbanken/synthetic_sami_ocr_data, and then fine-tuned on manually annotated and automatically transcribed Sámi data
71
+ - [Sprakbanken/trocr_smi_nor_pred_synth](https://huggingface.co/Sprakbanken/trocr_smi_nor_pred_synth): microsoft/trocr-base-printed fine-tuned on Sprakbanken/synthetic_sami_ocr_data, and then fine-tuned on manually annotated and automatically transcribed Sámi data, and manually annotated Norwegian
72
 
73
+ [Sprakbanken/trocr_smi_pred_synth](https://huggingface.co/Sprakbanken/trocr_smi_pred_synth) is the model that achieved the best results (of the TrOCR models) on our test dataset.
74
 
 
 
 
75
 
76
+ ## Uses
77
+ You can use the raw model for optical character recognition (OCR) on single text-line images in North Sámi, South Sámi, Lule Sámi, and Inari Sámi.
78
 
79
+ ### Out-of-Scope Use
80
+ The model only works with images of lines of text.
81
+ If you have images of entire pages of text, you must segment the text into lines first to benefit from this model.
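The README does not say how pages should be segmented into lines. As a rough illustration only, the sketch below uses a horizontal projection profile: rows that contain dark pixels are grouped into consecutive runs, and each run is treated as one text line. The function name `find_text_lines`, its parameters, and the toy page are all hypothetical and not part of the model repository. It also assumes the page has already been binarized and deskewed, which real scans usually are not.

```python
from typing import List, Tuple


def find_text_lines(
    binary_page: List[List[int]],
    min_ink: int = 1,
    min_height: int = 2,
) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges (bottom exclusive) for each run of inked rows.

    binary_page is a list of pixel rows where 1 means dark (ink) and 0 means background.
    Runs shorter than min_height rows are discarded as noise.
    """
    lines: List[Tuple[int, int]] = []
    start = None  # first row of the run currently being tracked, if any
    for y, row in enumerate(binary_page):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y  # a new run of inked rows begins here
        elif not has_ink and start is not None:
            if y - start >= min_height:
                lines.append((start, y))
            start = None
    # Close a run that reaches the bottom edge of the page.
    if start is not None and len(binary_page) - start >= min_height:
        lines.append((start, len(binary_page)))
    return lines


# Toy "page" with two text lines separated by blank rows (hypothetical data).
page = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]
line_boxes = find_text_lines(page)
print(line_boxes)
```

Each `(top, bottom)` pair could then be used to crop the page image, for example with Pillow's `Image.crop((0, top, width, bottom))`, and the crop passed to the processor as in the getting-started snippet. Multi-column layouts, skewed scans, and touching lines will likely need a dedicated layout-analysis tool rather than this simple heuristic.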
82
 
 
 
83
 
 
 
84
  ## Citation
85
 
86
  **APA:**