Did you solve the issue? I'm encountering the same error. I have an OpenAssistant model that works via an Inference Endpoint, but Llama 2 throws the exception mentioned in the question.
When I test it in the Hugging Face Inference UI it works fine there. What could be the reason?