new tokenizer contains the cutoff date and today date by default

#74
by yuchenlin - opened

Is this the desired behavior?

It seems that you update the tokenizer config file and update the chat template a few days ago and here are the difference when I apply the chat templates from the two versions of tokenizers. (left: before; right: now). There is a new default system message for the cut-off date for the knowledge and today's date.

image.png

cc @Rocketknight1 even though they have that as an example here https://llama.meta.com/docs/model-cards-and-prompt-formats/llama3_1, I don't think we should include "Cutting Knowledge Date" by default in every system prompt. What do you think about updating the chat template here?

Implicitly introducing a prefix is causing us massive headaches. Would prefer this was explicitly added by the user.

Has anyone found a way to avoid a way to not have this information as part of the prompt?

How did "Cutting Knowledge Date" and "Today Date" make it into production? Shouldn't it be "Knowledge cutoff date" and "Today's date"?

Sign up or log in to comment