Padding side: "left" or "right"?

#148
by GoominDev - opened

When training a Llama 3.1 model, should the padding side be left or right?

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(model_name, padding_side="left", add_eos_token=True)
tokenizer.pad_token = tokenizer.eos_token  # reuse EOS as the padding token
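
To make the two options concrete, here is a minimal pure-Python sketch of what padding_side changes in a batch. pad_batch is a hypothetical helper written for illustration, not a transformers API, and the token ids are made up. As far as I know, right padding is the usual choice for training, while left padding mostly matters for batched generation with decoder-only models, since new tokens are appended after the last position of each row.

```python
def pad_batch(sequences, pad_id, side="right"):
    """Pad lists of token ids to equal length.

    Returns (input_ids, attention_mask); mask is 1 for real tokens, 0 for padding.
    """
    if side not in ("left", "right"):
        raise ValueError(f"side must be 'left' or 'right', got {side!r}")
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        pads, real = [pad_id] * n_pad, [1] * len(seq)
        if side == "left":
            input_ids.append(pads + list(seq))
            attention_mask.append([0] * n_pad + real)
        else:
            input_ids.append(list(seq) + pads)
            attention_mask.append(real + [0] * n_pad)
    return input_ids, attention_mask


batch = [[11, 12, 13], [21]]  # made-up token ids
right_ids, right_mask = pad_batch(batch, pad_id=0, side="right")
left_ids, left_mask = pad_batch(batch, pad_id=0, side="left")

print(right_ids)  # [[11, 12, 13], [21, 0, 0]]
print(left_ids)   # [[11, 12, 13], [0, 0, 21]]
```

Either side should work for training as long as the attention mask is passed and the padded positions are excluded from the loss.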

Did you find any answers? And why add the line tokenizer.pad_token = tokenizer.eos_token? My fine-tuned Llama never stops generating (it never emits EOS).

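from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, use_fast=True)
# Register a dedicated pad token instead of reusing EOS (MODEL_NAME and PAD_TOKEN are defined elsewhere)
tokenizer.add_special_tokens({"pad_token": PAD_TOKEN})
tokenizer.padding_side = "right"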
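
A possible explanation for the model not stopping, assuming the training pipeline masks every label equal to pad_token_id (as I understand it, DataCollatorForLanguageModeling with mlm=False does this): if the pad token is the EOS token, the real EOS at the end of each example is masked too, so the model is never trained to emit it. The sketch below shows the effect in plain Python. mask_pad_labels is a hypothetical helper and the token ids are made up.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss ignores by default


def mask_pad_labels(input_ids, pad_id):
    """Copy input_ids into labels, replacing every pad_id with IGNORE_INDEX."""
    return [IGNORE_INDEX if tok == pad_id else tok for tok in input_ids]


EOS_ID, PAD_ID = 2, 3      # made-up ids for illustration
seq = [11, 12, EOS_ID]     # one training example that ends with EOS

# Case 1: pad token is the EOS token, so the real EOS label is masked as well.
shared = mask_pad_labels(seq + [EOS_ID, EOS_ID], pad_id=EOS_ID)

# Case 2: dedicated pad token, so only the padding is masked.
dedicated = mask_pad_labels(seq + [PAD_ID, PAD_ID], pad_id=PAD_ID)

print(shared)     # [11, 12, -100, -100, -100]
print(dedicated)  # [11, 12, 2, -100, -100]
```

With a dedicated pad token the EOS label survives, which is what the snippet above sets up. If add_special_tokens actually adds a new token to the vocabulary, the model's embeddings most likely need resizing as well, e.g. model.resize_token_embeddings(len(tokenizer)).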
