José Ángel González committed
Commit b2a2542 · 1 parent: 0121c8c
Update README.md
README.md CHANGED
@@ -11,3 +11,20 @@ widget:
News Abstractive Summarization for Spanish (NASES) is a Transformer encoder-decoder model, with the same hyperparameters as BART, that summarizes Spanish news articles. It is pre-trained on a combination of self-supervised tasks that help increase the abstractivity of the generated summaries. Four pre-training tasks are combined: sentence permutation, text infilling, Gap Sentence Generation, and Next Segment Generation. Spanish newspapers and Spanish Wikipedia articles were used to pre-train the model (21 GB of raw text, 8.5 million documents).

NASES is fine-tuned for the summarization task on 1,802,919 (document, summary) pairs from the Dataset for Automatic summarization of Catalan and Spanish newspaper Articles (DACSA).
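
Two of the pre-training tasks named above, sentence permutation and text infilling, can be illustrated with a toy sketch. This is not the NASES training code: the function names, the regex sentence splitter, the `<mask>` token, the uniform span lengths, and the `mask_prob` and `max_span` parameters are all assumptions chosen for illustration.

```python
import random
import re
from typing import List


def permute_sentences(text: str, rng: random.Random) -> str:
    """Toy sentence permutation: shuffle the sentences of a document."""
    # Naive split after sentence-final punctuation (an assumption; a real
    # pipeline would likely use a proper sentence splitter).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    rng.shuffle(sentences)
    return " ".join(sentences)


def infill_text(
    tokens: List[str],
    rng: random.Random,
    mask_token: str = "<mask>",  # hypothetical mask symbol
    mask_prob: float = 0.15,     # hypothetical chance of starting a span
    max_span: int = 3,           # hypothetical maximum span length
) -> List[str]:
    """Toy text infilling: replace random token spans with one mask token."""
    out: List[str] = []
    i = 0
    while i < len(tokens):
        if rng.random() < mask_prob:
            # A whole span of 1..max_span tokens collapses into a single mask,
            # so the model must also infer how many tokens are missing.
            out.append(mask_token)
            i += rng.randint(1, max_span)
        else:
            out.append(tokens[i])
            i += 1
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    doc = "El gobierno aprobó la ley. La oposición votó en contra. Entrará en vigor en enero."
    print(permute_sentences(doc, rng))
    print(" ".join(infill_text(doc.split(), rng)))
```

In both cases the corrupted text would serve as the encoder input and the original text as the decoder target, which is the usual denoising setup for BART-style models.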
+
+
+More details about the pre-training/fine-tuning datasets and the models are coming soon:
+
+@unpublished{DACSA,
+author = "Vicent Ahuir and Lluís-F. Hurtado and José Ángel González and Encarna Segarra",
+title = "DACSA: a Dataset for Automatic summarization of Catalan and Spanish
+newspaper Articles",
+note = "Unsubmitted",
+}
+
+@unpublished{NAS,
+author = "Vicent Ahuir and Lluís-F. Hurtado and José Ángel González and Encarna Segarra",
+title = "NASCA and NASES: Two monolingual pre-trained models for
+abstractive summarization in Catalan and Spanish",
+note = "Submitted to the Special Issue on Current Approaches and Applications in Natural Language Processing (Applied Sciences)",
+}