Hello,
I’m trying to perform custom inference, where I need to use a model and a tokenizer hosted in two different repositories on HuggingFace. I have looked at the sample custom inference notebook, but it only uses a single model.
Similarly, the code summarization notebook also loads its tokenizer and model from the same directory. I want to implement something similar, but with the model and tokenizer hosted in two different HuggingFace repositories.
If I load the model and the tokenizer from their two different HuggingFace repos, package them into a tar.gz file, and push that archive to S3, what should the directory structure of the tar.gz be? I have tried the following with no success:
model.tar.gz
    /model1
        model_config.json (along with other model files)
    /code
        inference.py
    /tokenizer
        tokenizer_config.json
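For reference, here is roughly how I build and upload this archive; the repo IDs below are placeholders for the two actual HuggingFace repositories, and package/code/inference.py is the script shown further down:

    import tarfile
    import sagemaker
    from transformers import AutoModel, T5Tokenizer

    # Placeholder repo IDs standing in for the two real HuggingFace repositories.
    AutoModel.from_pretrained("some-org/some-model").save_pretrained("package/model1")
    T5Tokenizer.from_pretrained("some-org/some-tokenizer").save_pretrained("package/tokenizer")

    # package/code/inference.py already holds the inference script.
    with tarfile.open("model.tar.gz", "w:gz") as tar:
        tar.add("package", arcname=".")  # keep model1/, code/, tokenizer/ at the archive root

    s3_location = sagemaker.Session().upload_data("model.tar.gz", key_prefix="custom-inference")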
My custom model-loading function in inference.py looks like:

    from transformers import AutoModel, T5Tokenizer

    def model_fn(model_dir):
        # Load the model and tokenizer from their separate subdirectories.
        model = AutoModel.from_pretrained(f"{model_dir}/model1")
        tokenizer = T5Tokenizer.from_pretrained(f"{model_dir}/tokenizer")
        return model, tokenizer
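Because model_fn returns a (model, tokenizer) tuple rather than a bare model, I also override predict_fn to unpack it. This is only a rough sketch: it assumes the model is an encoder-style model that exposes last_hidden_state, and the "inputs" key is simply what I send in the request payload:

    import torch

    def predict_fn(data, model_and_tokenizer):
        # Unpack the tuple returned by model_fn above.
        model, tokenizer = model_and_tokenizer
        inputs = tokenizer(data["inputs"], return_tensors="pt")
        with torch.no_grad():
            outputs = model(**inputs)
        # Return mean-pooled hidden states as a JSON-serializable embedding.
        return {"embedding": outputs.last_hidden_state.mean(dim=1).tolist()}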
And my model creation code looks like:

    from sagemaker.huggingface.model import HuggingFaceModel

    huggingface_model = HuggingFaceModel(
        entry_point='inference.py',
        source_dir='code',
        model_data=s3_location,
        role=role,
        pytorch_version='1.7.1',
        py_version='py36',
        transformers_version='4.6.1',
    )
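For completeness, this is how I deploy and call the endpoint afterwards (the instance type is just what I happen to test with):

    predictor = huggingface_model.deploy(
        initial_instance_count=1,
        instance_type="ml.m5.xlarge",
    )
    result = predictor.predict({"inputs": "some text to encode"})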
The error I’m seeing is:
OSError: file /.sagemaker/mms/models/model/config.json not found.
I’d appreciate some help figuring out whether my directory structure is incorrect and, if so, what it should be. If there is a better way to achieve this, please suggest it.