antortl/ludwig-t5-based

text2text-generation


antortl committed on Sep 13

Commit be939dc • 1 Parent(s): b293815

Update README.md

Files changed (1)

README.md +7 -14

README.md CHANGED

@@ -38,22 +38,15 @@ Abstraction Level: The model tends to be more extractive than abstractive in its
 ## Training and evaluation data
-News Articles Dataset:
-Source: CNN/Daily Mail dataset (version 3.0.0)
-Size: Approximately 200,000 articles
-Time Range: 2007-2021
+Dataset:
+Source: PARANMT-50M
+Size: Approximately 50M
+Time Range: 2007-2017
 Language: English
-Content: Wide range of topics including politics, sports, entertainment, and world events
-Academic Articles Dataset:
-Source: arXiv and PubMed Open Access Subset
-Size: Approximately 150,000 articles
-Time Range: 2010-2022
-Language: English
-Content: Research papers from various scientific fields including physics, mathematics, computer science, and biomedical sciences
+Content: more than 50 million English-English
+sentential paraphrase pairs
+https://arxiv.org/pdf/1711.05732v2
 Pre-processing Steps: