José Ángel González committed · Commit 217e0bf · 1 Parent(s): e006739
Update README.md
README.md CHANGED
@@ -9,6 +9,6 @@ widget:
---


-News Abstractive Summarization for Catalan (NASCA) is a Transformer encoder-decoder model, with the same hyper-parameters as BART, to perform summarization of Catalan news articles. It is pre-trained on a combination of several self-supervised tasks that help to increase the abstractivity of the generated summaries. Four objectives have been combined: sentence permutation, text infilling, Gap Sentence Generation, and Next Segment Generation. Catalan newspapers, the Catalan subset of the OSCAR corpus, and Wikipedia articles in Catalan were used to pre-train the model.
+News Abstractive Summarization for Catalan (NASCA) is a Transformer encoder-decoder model, with the same hyper-parameters as BART, to perform summarization of Catalan news articles. It is pre-trained on a combination of several self-supervised tasks that help to increase the abstractivity of the generated summaries. Four objectives have been combined: sentence permutation, text infilling, Gap Sentence Generation, and Next Segment Generation. Catalan newspapers, the Catalan subset of the OSCAR corpus, and Wikipedia articles in Catalan were used to pre-train the model (9.3 GB of raw text, 2.5 million documents).

For the summarization task, it is trained on 636,596 documents from the Dataset for Automatic summarization of Catalan and Spanish newspaper Articles (DACSA) corpus.
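
Since the card describes NASCA as a BART-style encoder-decoder, it should load through the standard Hugging Face `transformers` seq2seq API. The sketch below is a minimal, hedged example: the model identifier `ELiRF/NASCA` and the generation settings are assumptions for illustration, not details taken from this commit.

```python
# Minimal sketch: summarizing a Catalan news article with a BART-style
# encoder-decoder via the transformers seq2seq API.
# "ELiRF/NASCA" is an assumed repository id; replace it with the actual one.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "ELiRF/NASCA"  # assumption, not stated in the card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

article = "Text d'una notícia en català..."  # placeholder Catalan news article

# Tokenize and generate an abstractive summary with beam search.
inputs = tokenizer(article, truncation=True, max_length=512, return_tensors="pt")
summary_ids = model.generate(
    **inputs,
    num_beams=4,        # illustrative defaults, not values from the card
    max_length=128,
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

Beam search with a moderate `max_length` is just one reasonable default for abstractive news summaries; any decoding configuration the authors actually used would take precedence.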