Update README.md
README.md
CHANGED
@@ -1,54 +1,97 @@
---
tags:
datasets:
metrics:
- rouge
model-index:
- name: it5-efficient-small-el32-
  results:
  - task:
    dataset:
      args: fst
    metrics:
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

This model achieves the following results on the evaluation set:
- Loss: 2.2160
- Rouge1: 56.585
- Rouge2: 36.9335
- Rougel: 53.7782
- Rougelsum: 53.7779
- Gen Len: 13.0891

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
@@ -61,43 +104,10 @@ The following hyperparameters were used during training:

- lr_scheduler_type: linear
- num_epochs: 10.0

### Training results

| Training Loss | Epoch | Step   | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:------:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 2.9377        | 0.35  | 5000   | 2.5157          | 54.6148 | 35.1518 | 51.8908 | 51.8957   | 12.8717 |
| 2.803         | 0.7   | 10000  | 2.4086          | 55.641  | 36.1214 | 52.8683 | 52.8572   | 12.7513 |
| 2.5483        | 1.05  | 15000  | 2.3420          | 55.6604 | 36.0085 | 52.9599 | 52.9433   | 12.7754 |
| 2.4978        | 1.39  | 20000  | 2.3145          | 56.204  | 36.5896 | 53.338  | 53.3351   | 12.8804 |
| 2.5383        | 1.74  | 25000  | 2.2697          | 56.1356 | 36.6963 | 53.3579 | 53.3664   | 12.795  |
| 2.3368        | 2.09  | 30000  | 2.2603          | 56.0271 | 36.4249 | 53.3113 | 53.3272   | 12.7478 |
| 2.371         | 2.44  | 35000  | 2.2328          | 56.5041 | 36.8718 | 53.8064 | 53.7995   | 12.8243 |
| 2.3567        | 2.79  | 40000  | 2.2079          | 56.5318 | 36.9437 | 53.8359 | 53.8254   | 12.6851 |
| 2.1753        | 3.14  | 45000  | 2.2168          | 56.3831 | 36.8896 | 53.6542 | 53.6708   | 12.67   |
| 2.2069        | 3.48  | 50000  | 2.2055          | 56.7171 | 37.1665 | 53.9299 | 53.9259   | 12.8014 |
| 2.2396        | 3.83  | 55000  | 2.1801          | 56.936  | 37.5465 | 54.1064 | 54.1125   | 12.7989 |
| 2.0657        | 4.18  | 60000  | 2.1915          | 56.6312 | 37.1618 | 53.8646 | 53.8791   | 12.6987 |
| 2.0806        | 4.53  | 65000  | 2.1809          | 56.6599 | 37.1282 | 53.8838 | 53.8781   | 12.715  |
| 2.0933        | 4.88  | 70000  | 2.1771          | 56.5891 | 36.9461 | 53.8058 | 53.8087   | 12.6593 |
| 1.9949        | 5.23  | 75000  | 2.1932          | 56.4956 | 36.9679 | 53.7634 | 53.7731   | 12.6723 |
| 1.9954        | 5.57  | 80000  | 2.1813          | 56.4827 | 36.8319 | 53.6397 | 53.6399   | 12.6599 |
| 1.9912        | 5.92  | 85000  | 2.1755          | 56.6723 | 37.0432 | 53.8339 | 53.8233   | 12.7534 |
| 1.9068        | 6.27  | 90000  | 2.1849          | 56.6574 | 37.0691 | 53.9029 | 53.892    | 12.7037 |
| 1.9173        | 6.62  | 95000  | 2.1787          | 56.5701 | 36.861  | 53.6855 | 53.6699   | 12.6467 |
| 1.9131        | 6.97  | 100000 | 2.1862          | 56.7175 | 37.0749 | 53.8761 | 53.8794   | 12.7072 |
| 1.8164        | 7.32  | 105000 | 2.1999          | 56.6104 | 37.0809 | 53.8098 | 53.8216   | 12.6364 |
| 1.8489        | 7.66  | 110000 | 2.1945          | 56.6645 | 37.1267 | 53.9009 | 53.9008   | 12.5741 |
| 1.82          | 8.01  | 115000 | 2.2075          | 56.6075 | 37.0359 | 53.8792 | 53.8833   | 12.6428 |
| 1.772         | 8.36  | 120000 | 2.2067          | 56.4716 | 36.8675 | 53.6826 | 53.6742   | 12.6591 |
| 1.7795        | 8.71  | 125000 | 2.2056          | 56.4112 | 36.9011 | 53.6554 | 53.6495   | 12.608  |
| 1.72          | 9.06  | 130000 | 2.2197          | 56.4735 | 36.9255 | 53.6592 | 53.6463   | 12.6758 |
| 1.7174        | 9.41  | 135000 | 2.2169          | 56.4209 | 36.8139 | 53.5778 | 53.5685   | 12.6568 |
| 1.7466        | 9.75  | 140000 | 2.2165          | 56.3715 | 36.767  | 53.555  | 53.5468   | 12.6416 |

### Framework versions

- Transformers 4.15.0
- Pytorch 1.10.0+cu102
- Datasets 1.17.0
- Tokenizers 0.10.3

---
language:
- it
license: apache-2.0
tags:
- italian
- sequence-to-sequence
- style-transfer
- efficient
- formality-style-transfer
datasets:
- yahoo/xformal_it
widget:
- text: "Questa performance è a dir poco spiacevole."
- text: "In attesa di un Suo cortese riscontro, Le auguriamo un piacevole proseguimento di giornata."
- text: "Questa visione mi procura una goduria indescrivibile."
- text: "qualora ciò possa interessarti, ti pregherei di contattarmi."
metrics:
- rouge
- bertscore
model-index:
- name: it5-efficient-small-el32-formal-to-informal
  results:
  - task:
      type: formality-style-transfer
      name: "Formal-to-informal Style Transfer"
    dataset:
      type: xformal_it
      name: "XFORMAL (Italian Subset)"
    metrics:
    - type: rouge1
      value: 0.459
      name: "Avg. Test Rouge1"
    - type: rouge2
      value: 0.244
      name: "Avg. Test Rouge2"
    - type: rougeL
      value: 0.435
      name: "Avg. Test RougeL"
    - type: bertscore
      value: 0.739
      name: "Avg. Test BERTScore"
      args:
      - model_type: "dbmdz/bert-base-italian-xxl-uncased"
      - lang: "it"
      - num_layers: 10
      - rescale_with_baseline: True
      - baseline_path: "bertscore_baseline_ita.tsv"
---

# IT5 Cased Small Efficient EL32 for Formal-to-informal Style Transfer 🤗

*Shout-out to [Stefan Schweter](https://github.com/stefan-it) for contributing the pre-trained efficient model!*

This repository contains the checkpoint for the [IT5 Cased Small Efficient EL32](https://huggingface.co/it5/it5-efficient-small-el32) model fine-tuned for formal-to-informal style transfer on the Italian subset of the XFORMAL dataset, as part of the experiments of the paper [IT5: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation](https://arxiv.org/abs/2203.03759) by [Gabriele Sarti](https://gsarti.com) and [Malvina Nissim](https://malvinanissim.github.io).
Efficient IT5 models differ from the standard ones by adopting a different vocabulary that enables cased text generation and an [optimized model architecture](https://arxiv.org/abs/2109.10686) that improves performance while reducing the parameter count. Small-EL32 replaces the original encoder of the T5 Small architecture with a 32-layer deep encoder, showing improved performance over the base model.
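
The deep-encoder setup can be checked directly from the checkpoint's configuration. A minimal sketch, assuming the model exposes the standard `T5Config` fields (`num_layers` for the encoder, `num_decoder_layers` for the decoder):

```python
from transformers import AutoConfig

# Load only the configuration (no weights) and inspect the encoder/decoder depths.
config = AutoConfig.from_pretrained("it5/it5-efficient-small-el32-formal-to-informal")
print(config.num_layers)          # encoder layers; 32 is expected for the EL32 variant
print(config.num_decoder_layers)  # decoder layers
```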

A comprehensive overview of other released materials is provided in the [gsarti/it5](https://github.com/gsarti/it5) repository. Refer to the paper for additional details concerning the reported scores and the evaluation approach.
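
The BERTScore arguments listed in the metadata above can be plugged straight into the [bert-score](https://github.com/Tiiiger/bert_score) library. A minimal sketch, assuming that library's `score` API; the prediction/reference strings are illustrative placeholders, and `bertscore_baseline_ita.tsv` is the baseline file referenced in the metadata:

```python
from bert_score import score

# Illustrative placeholder outputs and references (not from the paper's test set).
predictions = ["e grazie per la vostra disponibilità!"]
references = ["grazie mille per la vostra disponibilità!"]

# BERTScore configured with the args from the model card metadata.
P, R, F1 = score(
    predictions,
    references,
    model_type="dbmdz/bert-base-italian-xxl-uncased",
    lang="it",
    num_layers=10,
    rescale_with_baseline=True,
    baseline_path="bertscore_baseline_ita.tsv",
)
print(F1.mean().item())
```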

## Using the model

Model checkpoints are available for use in TensorFlow, PyTorch and JAX. They can be used directly with pipelines:

```python
from transformers import pipeline

f2i = pipeline("text2text-generation", model='it5/it5-efficient-small-el32-formal-to-informal')
f2i("Vi ringrazio infinitamente per vostra disponibilità")
>>> [{"generated_text": "e grazie per la vostra disponibilità!"}]
```

or loaded using autoclasses:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("it5/it5-efficient-small-el32-formal-to-informal")
model = AutoModelForSeq2SeqLM.from_pretrained("it5/it5-efficient-small-el32-formal-to-informal")
```
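
Once loaded with the autoclasses, generation follows the usual seq2seq pattern. A minimal sketch continuing from the snippet above (the input sentence and `max_new_tokens` value are illustrative choices, not from the model card):

```python
# Tokenize a formal sentence, generate, and decode the informal rewrite.
inputs = tokenizer("Vi ringrazio infinitamente per la vostra disponibilità.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```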

If you use this model in your research, please cite our work as:

```bibtex
@article{sarti-nissim-2022-it5,
    title={{IT5}: Large-scale Text-to-text Pretraining for Italian Language Understanding and Generation},
    author={Sarti, Gabriele and Nissim, Malvina},
    journal={ArXiv preprint 2203.03759},
    url={https://arxiv.org/abs/2203.03759},
    year={2022},
    month={mar}
}
```

### Training hyperparameters

- lr_scheduler_type: linear
- num_epochs: 10.0

### Framework versions

- Transformers 4.15.0
- Pytorch 1.10.0+cu102
- Datasets 1.17.0
- Tokenizers 0.10.3
|