Hi All.
I have uploaded my fine-tuned model to Hugging Face. I want to use the serverless Inference API on the model page, but the request times out. What should I do?
Here is the metadata I put in the README:
pipeline_tag: text-generation
inference:
  parameters:
    max_new_tokens: 300
    stop:
      - <|end_of_text|>
      - <|eot_id|>
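
To check whether the timeout is specific to the model page widget, the same parameters can be sent from a script. This is only a minimal sketch, not a confirmed fix. It assumes the classic serverless endpoint URL `https://api-inference.huggingface.co/models/<repo_id>`, and the helper names `build_payload` and `query`, the `wait_for_model` option, and the 120-second client timeout are my own choices for illustration. The `parameters` block mirrors the README metadata above.

```python
import json
import urllib.request

# Assumption: the classic serverless Inference API endpoint.
# Adjust this if your account or model is served from a different URL.
API_URL = "https://api-inference.huggingface.co/models/{repo_id}"


def build_payload(prompt):
    """Build a request body that mirrors the README inference parameters."""
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": 300,
            "stop": ["<|end_of_text|>", "<|eot_id|>"],
        },
        # Assumption: ask the API to wait while the model loads
        # instead of returning an error right away.
        "options": {"wait_for_model": True},
    }


def query(repo_id, token, prompt, timeout=120):
    """POST the payload and return the decoded JSON response."""
    request = urllib.request.Request(
        API_URL.format(repo_id=repo_id),
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    # The timeout is in seconds and applies on the client side only.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (placeholders, not real values):
# print(query("your-username/your-model", "hf_xxx", "Hello"))
```

If the script also times out, the problem is probably on the model loading side rather than in the widget; if it returns text, the README parameters themselves are likely fine.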