I’m fine-tuning a BART model from this tutorial (Google Colab), and I’m having a lot of trouble figuring out whether I trained my model for the right document length.
I truncated my training data to 1028 tokens, but the default generation length is 12 tokens. If I bump this up to 512, will the model still generate meaningful sentences? Is there another parameter I need to add to Seq2SeqTrainingArguments? Do the GenerationConfig parameters have any impact on training?
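For reference, here is roughly what I was thinking of adding. generation_max_length is my guess at the relevant parameter, and the output directory and batch settings are just placeholders:

from transformers import Seq2SeqTrainingArguments, GenerationConfig

# My guess: generation_max_length only affects the generate() calls used for
# evaluation when predict_with_generate=True, not the training loss itself.
training_args = Seq2SeqTrainingArguments(
    output_dir="bart-summarizer",     # placeholder
    predict_with_generate=True,
    generation_max_length=512,        # bumped up from the default
    generation_num_beams=4,
    per_device_train_batch_size=4,
    num_train_epochs=3,
)

# Or should this live on the model's generation config instead?
generation_config = GenerationConfig(max_length=512, min_length=12, num_beams=4)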
The BartConfig has the following task_specific_params section:
"task_specific_params": {
"summarization": {
"length_penalty": 1.0,
"max_length": 128,
"min_length": 12,
"num_beams": 4
},
"summarization_cnn": {
"length_penalty": 2.0,
"max_length": 142,
"min_length": 56,
"num_beams": 4
},
"summarization_xsum": {
"length_penalty": 1.0,
"max_length": 62,
"min_length": 11,
"num_beams": 6
}
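At inference time I was planning to override these defaults explicitly, along the lines of the sketch below (the checkpoint name is just a placeholder for my fine-tuned model):

from transformers import BartForConditionalGeneration, BartTokenizer

model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")  # placeholder checkpoint
tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")

text = "..."  # a long input document
inputs = tokenizer(text, truncation=True, return_tensors="pt")

# Explicitly override the config defaults at generation time; my question is
# whether any of these values needed to be set during training as well.
summary_ids = model.generate(
    **inputs,
    max_length=512,
    min_length=56,
    num_beams=4,
    length_penalty=2.0,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))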
Thanks!