403 Forbidden error when accessing the model
from langchain_huggingface import HuggingFaceEndpoint

model_id = "elyza/ELYZA-japanese-Llama-2-7b-instruct"
llm_hub = HuggingFaceEndpoint(
    repo_id=model_id,
    temperature=0.1,
    max_new_tokens=600,
    model_kwargs={"max_length": 600},
)
I am using the above code to load the model. Since the model is larger than my RAM, I guess it won't be possible to load it locally, so I want to use the Inference API instead.
I am setting the Hugging Face token via os.environ["HUGGINGFACEHUB_API_TOKEN"], but I am getting the error below:

requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: /static-proxy?url=https%3A%2F%2Fapi-inference.huggingface.co%2Fmodels%2Felyza%2FELYZA-japanese-Llama-2-7b-instruct

The same code works for other heavy models. I even tried changing the access token type from Inference to Read & Write, but that did not work. Does this have something to do with the Hugging Face plan?
Can anyone please help me with this?