I’m fine-tuning a BART model from this tutorial (Google Colab), and I’m having a lot of trouble figuring out whether I trained my model for the right document length.
I truncated my training data to 1028 tokens, but the default generation length is 12 tokens. If I bump this up to 512, will the model still generate meaningful sentences? Is there another parameter I need to add to Seq2SeqTrainingArguments? Do the GenerationConfig parameters have any impact on training?
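For reference, here is roughly what I was thinking of adding. generation_max_length is my guess at the relevant parameter, and the output directory and batch settings are just placeholders:

from transformers import Seq2SeqTrainingArguments, GenerationConfig

# My guess: generation_max_length only affects the generate() calls used for
# evaluation when predict_with_generate=True, not the training loss itself.
training_args = Seq2SeqTrainingArguments(
    output_dir="bart-summarizer",     # placeholder
    predict_with_generate=True,
    generation_max_length=512,        # bumped up from the default
    generation_num_beams=4,
    per_device_train_batch_size=4,
    num_train_epochs=3,
)

# Or should this live on the model's generation config instead?
generation_config = GenerationConfig(max_length=512, min_length=12, num_beams=4)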
The BartConfig has the following task_specific_params section:
"task_specific_params": {
"summarization": {
"length_penalty": 1.0,
"max_length": 128,
"min_length": 12,
"num_beams": 4
},
"summarization_cnn": {
"length_penalty": 2.0,
"max_length": 142,
"min_length": 56,
"num_beams": 4
},
"summarization_xsum": {
"length_penalty": 1.0,
"max_length": 62,
"min_length": 11,
"num_beams": 6
}
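At inference time I was planning to override these defaults explicitly, along the lines of the sketch below (the checkpoint name is just a placeholder for my fine-tuned model):

from transformers import BartForConditionalGeneration, BartTokenizer

model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")  # placeholder checkpoint
tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")

text = "..."  # a long input document
inputs = tokenizer(text, truncation=True, return_tensors="pt")

# Explicitly override the config defaults at generation time; my question is
# whether any of these values needed to be set during training as well.
summary_ids = model.generate(
    **inputs,
    max_length=512,
    min_length=56,
    num_beams=4,
    length_penalty=2.0,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))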
Thanks!