About the BOS token warning
I'm getting the warning below while testing the model on Colab. What should I do?
This is my code:
from llama_cpp import Llama

def load_model(repo_id, filename):
    model = Llama.from_pretrained(
        repo_id=repo_id,
        filename=filename,
        n_gpu_layers=-1,      # offload all layers to the GPU
        chat_format='llama-3',
    )
    return model

model = load_model(
    'bartowski/Llama-3-8B-Instruct-Gradient-1048k-GGUF',
    filename='Llama-3-8B-Instruct-Gradient-1048k-Q4_K_M.gguf',
)

output = model.create_chat_completion(
    messages=[
        {"role": "system", "content": 'you are a helpful assistant'},
        {"role": "user", "content": 'hello'},
    ]
)
This is the warning:

llama_tokenize_internal: Added a BOS token to the prompt as specified by the model but the prompt also starts with a BOS token. So now the final prompt starts with 2 BOS tokens. Are you sure this is what you want?
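I tried to reproduce it by tokenizing a formatted prompt by hand. This is just a minimal sketch: tokenize() and token_bos() are llama-cpp-python's public API, but the prompt string is only my guess at what chat_format='llama-3' produces (it already starts with <|begin_of_text|>, which is the llama-3 BOS marker):

# Reuses the `model` loaded above. The prompt string below is an assumption
# about the llama-3 chat format; the point is that it already begins with
# the BOS marker while add_bos=True asks the tokenizer to add another one.
prompt = "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nhello<|eot_id|>"
tokens = model.tokenize(prompt.encode("utf-8"), add_bos=True, special=True)
bos = model.token_bos()
print("BOS id:", bos)
print("first tokens:", tokens[:3])
print("leading BOS count:", len([t for t in tokens[:2] if t == bos]))  # I expect this to print 2

This prints the same warning for me, so it looks like the chat format inserts a BOS and the tokenizer adds one on top of it, but I don't know which side I'm supposed to change.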
I really can't figure out why this is happening.
What should I change in this code to fix it? Thanks in advance!