Hello,
I’m trying to perform custom inference, where I need to use a model and a tokenizer hosted in two different repositories on HuggingFace. I have looked at the sample custom inference notebook, but it only uses a single model.
Similarly, the code summarization notebook also loads its tokenizer and model from the same directory. I want to implement something similar, but with the model and tokenizer hosted in two different HuggingFace repositories.
If I load the model and the tokenizer from their two different HuggingFace repos, package them into a tar.gz file, and push that archive to S3, what should the directory structure of the tar.gz be? I have tried the following with no success:
model.tar.gz
    /model1
        model_config.json (along with other model files)
    /code
        inference.py
    /tokenizer
        tokenizer_config.json
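For reference, here is roughly how I build and upload this archive; the repo IDs below are placeholders for the two actual HuggingFace repositories, and package/code/inference.py is the script shown further down:

    import tarfile
    import sagemaker
    from transformers import AutoModel, T5Tokenizer

    # Placeholder repo IDs standing in for the two real HuggingFace repositories.
    AutoModel.from_pretrained("some-org/some-model").save_pretrained("package/model1")
    T5Tokenizer.from_pretrained("some-org/some-tokenizer").save_pretrained("package/tokenizer")

    # package/code/inference.py already holds the inference script.
    with tarfile.open("model.tar.gz", "w:gz") as tar:
        tar.add("package", arcname=".")  # keep model1/, code/, tokenizer/ at the archive root

    s3_location = sagemaker.Session().upload_data("model.tar.gz", key_prefix="custom-inference")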
My custom model-loading function in inference.py looks like:

    from transformers import AutoModel, T5Tokenizer

    def model_fn(model_dir):
        # Load the model and tokenizer from their separate subdirectories.
        model = AutoModel.from_pretrained(f"{model_dir}/model1")
        tokenizer = T5Tokenizer.from_pretrained(f"{model_dir}/tokenizer")
        return model, tokenizer
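Because model_fn returns a (model, tokenizer) tuple rather than a bare model, I also override predict_fn to unpack it. This is only a rough sketch: it assumes the model is an encoder-style model that exposes last_hidden_state, and the "inputs" key is simply what I send in the request payload:

    import torch

    def predict_fn(data, model_and_tokenizer):
        # Unpack the tuple returned by model_fn above.
        model, tokenizer = model_and_tokenizer
        inputs = tokenizer(data["inputs"], return_tensors="pt")
        with torch.no_grad():
            outputs = model(**inputs)
        # Return mean-pooled hidden states as a JSON-serializable embedding.
        return {"embedding": outputs.last_hidden_state.mean(dim=1).tolist()}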
And my model creation code looks like:

    from sagemaker.huggingface.model import HuggingFaceModel

    huggingface_model = HuggingFaceModel(
        entry_point='inference.py',
        source_dir='code',
        model_data=s3_location,
        role=role,
        pytorch_version='1.7.1',
        py_version='py36',
        transformers_version='4.6.1',
    )
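For completeness, this is how I deploy and call the endpoint afterwards (the instance type is just what I happen to test with):

    predictor = huggingface_model.deploy(
        initial_instance_count=1,
        instance_type="ml.m5.xlarge",
    )
    result = predictor.predict({"inputs": "some text to encode"})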
The error I’m seeing is:
OSError: file /.sagemaker/mms/models/model/config.json not found.
I’d appreciate some help figuring out whether my directory structure is incorrect and, if so, what it should be. If there is a better way to achieve this, please suggest it.