I’m using a fine-tuned BartForConditionalGeneration model and trying to generate long output sequences.
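For context, the model and inputs are set up roughly like this (a minimal sketch; the checkpoint path and source text are placeholders for my actual fine-tuned model and data), and then I call generate as shown below:

```python
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

# Placeholder path; my real model is a fine-tuned BART checkpoint saved locally.
model = BartForConditionalGeneration.from_pretrained("path/to/my-finetuned-bart").to("cuda")
tokenizer = BartTokenizer.from_pretrained("path/to/my-finetuned-bart")

source_text = "..."  # placeholder for the long input document

# Tokenize the source document and move the tensors to the GPU.
inputs = tokenizer(source_text, return_tensors="pt", truncation=True, max_length=1024).to("cuda")
input_ids = inputs["input_ids"]
attention_mask = inputs["attention_mask"]
```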
```python
outputs = model.generate(
    input_ids,
    attention_mask=attention_mask,
    num_beams=3,
    min_new_tokens=1500,
    max_new_tokens=2500,
    early_stopping=True,
)
```
However, when I run this I get a warning: “This is a friendly reminder - the current text generation call will exceed the model's predefined maximum length (1024). Depending on the model, you may observe exceptions, performance degradation, or nothing at all.” Shortly afterwards, CUDA raises a RuntimeError.
Why? Shouldn’t BART be able to generate arbitrarily long sequences autoregressively?
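For what it’s worth, I assume the 1024 in the warning comes from BART’s learned position embeddings rather than from my generation arguments; a quick way to check the configured limit (assuming the model and tokenizer are loaded as above) would be:

```python
# BART uses learned position embeddings, so the decoder cannot index positions
# beyond max_position_embeddings; for the standard BART checkpoints this is 1024.
print(model.config.max_position_embeddings)  # expected to print 1024
print(tokenizer.model_max_length)            # the tokenizer's matching limit
```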