leobertolazzi committed
Commit: f5778e7
Parent(s): ff621fc
Update README.md

README.md CHANGED
@@ -1,46 +1,46 @@

Previous version:

---
tags:
- generated_from_keras_callback
model-index:
- name: medieval-it5-base
  results: []
---

<!-- This model card has been generated automatically according to the information Keras had access to. You should
probably proofread and complete it, then remove this comment. -->

# medieval-it5-base

This model
It achieves the following results on the evaluation set:

More information needed

##

## Training procedure

- optimizer: None
- training_precision: float32

### Framework versions

- Transformers 4.26.
- TensorFlow 2.11.0
- Tokenizers 0.13.2

Updated version:

---
model-index:
- name: medieval-it5-base
  results: []
language:
- it
---

# medieval-it5-base

This model is a version of [gsarti/it5-base](https://huggingface.co/gsarti/it5-base) fine-tuned on a dataset called [ita2medieval](https://huggingface.co/datasets/leobertolazzi/ita2medieval). The dataset contains sentences in medieval Italian along with paraphrases in contemporary Italian (approximately 6.5k pairs in total).

The fine-tuning task is text style transfer from contemporary to medieval Italian.

## Using the model

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("leobertolazzi/medieval-it5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("leobertolazzi/medieval-it5-base")
```

Flax and TensorFlow versions of the model are also available:

```python
from transformers import FlaxT5ForConditionalGeneration, TFT5ForConditionalGeneration

model_flax = FlaxT5ForConditionalGeneration.from_pretrained("leobertolazzi/medieval-it5-base")
model_tf = TFT5ForConditionalGeneration.from_pretrained("leobertolazzi/medieval-it5-base")
```
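
Once the model and tokenizer are loaded, the style transfer itself runs through `generate`. The following is a minimal sketch, assuming the model takes the plain contemporary Italian sentence as input with no task prefix; the input sentence is purely illustrative:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("leobertolazzi/medieval-it5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("leobertolazzi/medieval-it5-base")

# A contemporary Italian sentence (illustrative example)
text = "Mi trovai in una foresta buia perché avevo smarrito la strada."

# Encode, generate with beam search, and decode the medieval-style paraphrase
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=128, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```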

## Training procedure

The code used for the fine-tuning is available in this [repo](https://github.com/leobertolazzi/medievalIT5).

## Intended uses & limitations

The biggest limitation for this project is the size of the ita2medieval dataset: it consists of only 6.5k sentence pairs, whereas [gsarti/it5-base](https://huggingface.co/gsarti/it5-base) has 220M parameters.

For this reason the results can be far from perfect, but some nice style transfers can still be obtained.

It would be nice to expand ita2medieval with text and paraphrases from more medieval Italian authors!

### Framework versions

- Transformers 4.26.0
- Tokenizers 0.13.2