José Ángel González committed on
Commit
b2a2542
·
1 Parent(s): 0121c8c

Update README.md

  1. README.md +17 -0
README.md CHANGED
@@ -11,3 +11,20 @@ widget:
  News Abstractive Summarization for Spanish (NASES) is a Transformer encoder-decoder model, with the same hyper-parameters as BART, that performs summarization of Spanish news articles. It is pre-trained on a combination of several self-supervised tasks that help to increase the abstractivity of the generated summaries. Four pre-training tasks are combined: sentence permutation, text infilling, Gap Sentence Generation, and Next Segment Generation. Spanish newspapers and Spanish Wikipedia articles were used to pre-train the model (21 GB of raw text, 8.5 million documents).
 
  NASES is fine-tuned for the summarization task on 1,802,919 (document, summary) pairs from the Dataset for Automatic summarization of Catalan and Spanish newspaper Articles (DACSA).
+
+
+ More details about the pre-training/fine-tuning datasets and the models will be available soon:
+
+ @unpublished{DACSA,
+ author = "Vicent Ahuir and Lluís-F. Hurtado and José Ángel González and Encarna Segarra",
+ title = "DACSA: a Dataset for Automatic summarization of Catalan and Spanish
+ newspaper Articles",
+ note = "Unsubmitted",
+ }
+
+ @unpublished{NAS,
+ author = "Vicent Ahuir and Lluís-F. Hurtado and José Ángel González and Encarna Segarra",
+ title = "NASCA and NASES: Two monolingual pre-trained models for
+ abstractive summarization in Catalan and Spanish",
+ note = "Submitted to the Special Issue on Current Approaches and Applications in Natural Language Processing (Applied Sciences)",
+ }
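
The README text in this diff names sentence permutation and text infilling among the pre-training tasks but does not show what they do to an input. As a rough illustration only, the sketch below applies both corruptions to a toy Spanish passage. It is not the authors' implementation: the `MASK` symbol, the `mask_ratio` and `max_span` parameters, and the uniform span-length sampling are all assumptions made for brevity (BART samples span lengths from a Poisson distribution).

```python
import random

MASK = "<mask>"  # hypothetical mask symbol; the real tokenizer's token may differ


def sentence_permutation(sentences, rng):
    """Return the sentences in random order; the model learns to restore the original order."""
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    return shuffled


def text_infilling(tokens, rng, mask_ratio=0.3, max_span=3):
    """Replace random token spans with a single MASK each.

    Simplification (assumption): span lengths are uniform in [1, max_span].
    """
    budget = int(len(tokens) * mask_ratio)  # total number of tokens that may be hidden
    out, i = [], 0
    while i < len(tokens):
        if budget > 0 and rng.random() < mask_ratio:
            span = min(rng.randint(1, max_span), budget, len(tokens) - i)
            out.append(MASK)  # the whole span collapses into one mask token
            i += span
            budget -= span
        else:
            out.append(tokens[i])
            i += 1
    return out


rng = random.Random(0)  # fixed seed so the toy run is repeatable
sentences = [
    "El gobierno aprobó la ley.",
    "La oposición votó en contra.",
    "La norma entra en vigor en enero.",
]
corrupted = sentence_permutation(sentences, rng)
noisy_tokens = text_infilling(" ".join(sentences).split(), rng)
print(corrupted)
print(" ".join(noisy_tokens))
```

In both cases the training target would be the original, uncorrupted text, so the decoder has to reorder sentences and regenerate the hidden spans.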