Hello, I previously fine-tuned a sentiment analysis model with PyTorch. When I saved the model, I used a .pth extension, as recommended by PyTorch.
I want to use this model remotely, so I uploaded it to the Hugging Face Hub, but when I load it with “AutoModelForSequenceClassification” I get an error saying the weights must be in a .bin file.
What should I do to use this model?
Hello! Can you link to the model on the Hub? I can take a quick look at it.
I believe that the .bin extension is just a convention. You should be able to just rename your .pth file to be pytorch_model.bin. Can you try that and see if it loads the weights?
Hello
I tried renaming the file, but it didn’t work, and now I have even more doubts. As I understand it, when loading the model with AutoModelForSequenceClassification, it takes the config.json file for the configuration and then the .bin file to load the weights.
When I trained the model, I created a class like the following to define it:
class BERTSentimentClassifier(nn.Module):
    def __init__(self, n_clases):
        super(BERTSentimentClassifier, self).__init__()
        self.bert = BertModel.from_pretrained(model_name, return_dict=False)
        self.drop = nn.Dropout(p=0.35)
        self.linear = nn.Linear(self.bert.config.hidden_size, n_clases)

    def forward(self, input_ids, attention_mask):
        _, cls_output = self.bert(
            input_ids=input_ids,
            attention_mask=attention_mask
        )
        drop_out = self.drop(cls_output)
        output = self.linear(drop_out)
        return output
Defined this way, my model doesn’t have a config attribute. It has a bert layer that does have one, but that isn’t the configuration of the entire model. Because of this, and because of the file extension, I can’t load my model from the Hub.
I don’t know what I have to do to use my model from the Hub. This is the link to the model: https://huggingface.co/SickBoy/analisis-sentimiento-spanish-eds
Ah I see! The BertForSequenceClassification class is basically the same as yours, so I think instead of creating your own class when training the model, you would need to create it as
model = BertForSequenceClassification.from_pretrained(model_name, num_labels=n_classes)
(Pulled from Fine-tune a pretrained model)
If you want to set a value for nn.Dropout you can also pass in a custom BertConfig to from_pretrained, and that’s where you would set those parameters (see the BERT documentation).
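As a rough illustration of the custom-config suggestion, here is a minimal sketch. It assumes a reasonably recent transformers release in which BertConfig exposes classifier_dropout (the dropout applied just before the classification layer); the helper name load_classifier and the default of 0.35 (taken from the custom class earlier in the thread) are only illustrative.

```python
from transformers import BertConfig, BertForSequenceClassification


def load_classifier(model_name: str, n_classes: int, dropout: float = 0.35):
    """Load a pretrained BERT with a classification head and a custom head dropout.

    `load_classifier` is a hypothetical helper, not part of transformers.
    """
    # Extra keyword arguments to from_pretrained override fields of the stored config.
    config = BertConfig.from_pretrained(
        model_name,
        num_labels=n_classes,
        classifier_dropout=dropout,  # assumption: available in your transformers version
    )
    # The encoder weights come from the checkpoint; the classification head
    # is newly initialized and still has to be fine-tuned.
    return BertForSequenceClassification.from_pretrained(model_name, config=config)
```

If classifier_dropout is left unset, the head falls back to hidden_dropout_prob, which also controls dropout inside the encoder, so changing that value instead affects more than the final layer.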
If you had other custom stuff that you needed to add to your model, maybe this would be useful? Sharing custom models
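For the custom-model route, one possible shape is sketched below. This is an assumption on my part rather than something confirmed in the thread: it rebuilds the earlier class on top of BertPreTrainedModel so that it carries a config and gains save_pretrained / from_pretrained. The class name is hypothetical, and the layer names and dropout value simply mirror the original class.

```python
import torch.nn as nn
from transformers import BertModel, BertPreTrainedModel


class BertSentimentClassifierHF(BertPreTrainedModel):
    """Hypothetical rewrite of the thread's classifier as a PreTrainedModel subclass."""

    def __init__(self, config):
        super().__init__(config)
        self.bert = BertModel(config)
        self.drop = nn.Dropout(p=0.35)
        # The number of classes now lives in the config instead of a constructor argument.
        self.linear = nn.Linear(config.hidden_size, config.num_labels)
        self.post_init()  # initialize weights the way transformers expects

    def forward(self, input_ids, attention_mask=None):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        # pooler_output is the transformed [CLS] representation used by the original class.
        return self.linear(self.drop(outputs.pooler_output))
```

With this shape, BertSentimentClassifierHF.from_pretrained(model_name, num_labels=n_classes) should load the pretrained encoder and leave the head to be trained. Loading it from the Hub through the Auto classes needs the extra steps described in the Sharing custom models guide.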
For your current model, I can’t unpickle the pytorch_model.bin file because it looks for your BERTSentimentClassifier class, but since you’ve already trained the model maybe it’s possible for you to unpickle that locally, edit the state dict manually, and use that state dict on a model created with BertForSequenceClassification.from_pretrained? (I haven’t tried doing that myself before, so I don’t know how easy/possible it is.)
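The state dict edit described above could look roughly like the sketch below. It assumes the only mismatch is the head name (linear in the custom class versus classifier in BertForSequenceClassification) and that the bert.* keys already line up; the helper name, PREFIX_MAP, and the file names in the comments are placeholders, not values from the thread.

```python
from collections import OrderedDict


def rename_state_dict_keys(state_dict, prefix_map):
    """Return a new ordered dict with key prefixes rewritten according to prefix_map."""
    renamed = OrderedDict()
    for key, tensor in state_dict.items():
        for old_prefix, new_prefix in prefix_map.items():
            if key.startswith(old_prefix):
                key = new_prefix + key[len(old_prefix):]
                break  # apply at most one rename per key
        renamed[key] = tensor
    return renamed


# Assumed mapping: the custom head `linear` becomes `classifier`.
# The dropout layer has no weights, so it contributes no keys.
PREFIX_MAP = {"linear.": "classifier."}

# Hypothetical usage (run locally, where BERTSentimentClassifier is importable):
#   import torch
#   from transformers import BertForSequenceClassification
#   # If the whole model object was pickled, recent PyTorch needs weights_only=False.
#   old_model = torch.load("model.pth", map_location="cpu", weights_only=False)
#   new_state = rename_state_dict_keys(old_model.state_dict(), PREFIX_MAP)
#   model = BertForSequenceClassification.from_pretrained(model_name, num_labels=n_classes)
#   print(model.load_state_dict(new_state, strict=False))  # check missing/unexpected keys
#   model.save_pretrained("converted_model")  # writes config.json plus the weights
```

Passing strict=False is a hedge against small differences between library versions; the printed missing and unexpected keys should be checked before trusting the converted model.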
Thanks @NimaBoscarino,
I chose to do the training with the Hugging Face Trainer, instantiating the model as model = BertForSequenceClassification.from_pretrained(model_name, num_labels=n_classes) as the tutorial shows, and it worked.
If the models are very similar, modifying the keys in the state_dict is probably a quicker way to go! That’s what I did to load a locally fine-tuned HubertForSequenceClassification model into HF’s HubertForSequenceClassification class.