I'm trying to deploy the Hugging Face model TheBloke/Luna-AI-Llama2-Uncensored-GGML to AWS SageMaker.
I created a domain, launched SageMaker Studio, and opened a new notebook:
Image: Data Science 3.0
Kernel: Python 3
I tried running the following code:
import json
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']

# Hub Model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'TheBloke/Luna-AI-Llama2-Uncensored-GGML',
    'SM_NUM_GPUS': json.dumps(1)
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface", version="0.9.3"),
    env=hub,
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    container_startup_health_check_timeout=300,
)

# send request
predictor.predict({
    "inputs": "My name is Clara and I am",
})
I'm getting the following errors and a warning:
UnexpectedStatusException: Error hosting endpoint huggingface-pytorch-tgi-inference-2023-09-10-11-59-20-948: Failed. Reason: The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint…
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
distributed 2022.7.0 requires tornado<6.2,>=6.0.3, but you have tornado 6.3.2 which is incompatible.
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
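I don't know whether the tornado conflict is related to the endpoint failure, but if it needs clearing up, I assume pinning tornado back to the range distributed expects would do it (the version bounds are taken straight from the error message). Note this only touches the Studio notebook kernel, not the deployed container:

    # Pin tornado to the range distributed 2022.7.0 expects (bounds from the pip error above).
    # This changes the notebook kernel environment only, not the TGI container.
    %pip install "tornado>=6.0.3,<6.2"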
I checked the CloudWatch logs as instructed by the first error (a sketch of how I pulled them is below the example), and I found many DownloadError entries for different files. For example:
Error: DownloadError
  File "/opt/conda/bin/text-generation-server", line 8, in <module>
    sys.exit(app())
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/cli.py", line 182, in download_weights
    utils.convert_files(local_pt_files, local_st_files, discard_names)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/convert.py", line 106, in convert_files
    convert_file(pt_file, sf_file, discard_names)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/convert.py", line 65, in convert_file
    loaded = torch.load(pt_file, map_location="cpu")
  File "/opt/conda/lib/python3.9/site-packages/torch/serialization.py", line 815, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/opt/conda/lib/python3.9/site-packages/torch/serialization.py", line 1033, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: could not find MARK
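For reference, this is roughly how I pulled the logs (a minimal sketch with the boto3 CloudWatch Logs client; the log group name follows SageMaker's /aws/sagemaker/Endpoints/<endpoint-name> convention, and the endpoint name comes from the error above):

    import boto3

    # SageMaker writes endpoint container logs to /aws/sagemaker/Endpoints/<endpoint-name>
    endpoint_name = "huggingface-pytorch-tgi-inference-2023-09-10-11-59-20-948"
    log_group = f"/aws/sagemaker/Endpoints/{endpoint_name}"

    logs = boto3.client("logs")

    # One log stream per container instance/variant
    for stream in logs.describe_log_streams(logGroupName=log_group)["logStreams"]:
        events = logs.get_log_events(
            logGroupName=log_group,
            logStreamName=stream["logStreamName"],
            startFromHead=True,
        )
        for event in events["events"]:
            print(event["message"])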
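Since the traceback dies inside torch.load while TGI tries to convert the downloaded weights to safetensors, my guess is that the repo's .bin files are GGML blobs rather than regular PyTorch pickles, which would explain the UnpicklingError. A quick way to check which weight formats the repo actually ships (a sketch, assuming the huggingface_hub client is available in the kernel):

    from huggingface_hub import list_repo_files

    # List everything in the hub repo to see which weight formats it contains
    files = list_repo_files("TheBloke/Luna-AI-Llama2-Uncensored-GGML")
    print([f for f in files if f.endswith(".bin")])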