Merged and saved model not giving the same results after loading

Hey there, this was my general workflow for saving a PEFT model after training it:


from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model on CPU and attach the fine-tuned adapter.
base_model = AutoModelForCausalLM.from_pretrained('/content/gemma-2-2b-it/gemma-2-2b-it', device_map="cpu")
merged_model = PeftModel.from_pretrained(base_model, '/content/gemma-2-2b-it-fine_tuned')

# Save the result, then load it back for inference.
merged_model.save_pretrained("/content/gemma-2-2b-it-merged")

model = AutoModelForCausalLM.from_pretrained("/content/gemma-2-2b-it-merged", device_map='cpu')
tokenizer = AutoTokenizer.from_pretrained("/content/gemma-2-2b-it-merged")
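For what it's worth (and not shown in the original post), save_pretrained on a PeftModel writes only the adapter weights, not a standalone merged checkpoint. A minimal sketch of explicitly merging before saving, assuming that is the intent here, using PEFT's merge_and_unload():

# merge_and_unload() folds the LoRA weights into the base model and
# returns a plain transformers model that can be saved and reloaded
# with AutoModelForCausalLM alone.
merged = merged_model.merge_and_unload()
merged.save_pretrained("/content/gemma-2-2b-it-merged")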

However, if you compare the inference result I get after loading the merged-and-saved model against the in-memory merged model's inference, the output is badly distorted and not usable at all.


question = "कुछ एक रीसाइक्लिंग अभियान के लिए एक नारा सुझाव दें।"  # roughly: "Suggest a slogan for a recycling campaign."

# Move the tokenized question to the same device as the model (CPU).
inputs = tokenizer(question, return_tensors="pt").to('cpu')

# Generate with the in-memory merged model.
generated_ids = merged_model.generate(**inputs,
                                      max_new_tokens=128,
                                      do_sample=True,
                                      temperature=1,
                                      top_p=0.95,
                                      top_k=50,
                                      repetition_penalty=1,
                                      use_cache=False)

print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0])

कुछ एक रीसाइक्लिंग अभियान के लिए एक नारा सुझाव दें।
"
model
“हर बार दान करना, पुनर्युत्पादन करना, रीसाइक्ल करना - हमारे पर्यावरण को संरक्षित करना।”
(roughly: "Donate, regenerate, recycle every time - preserve our environment.")


question = "कुछ एक रीसाइक्लिंग अभियान के लिए एक नारा सुझाव दें।"  # same question as above

# Move the tokenized question to the same device as the model (CPU).
inputs = tokenizer(question, return_tensors="pt").to('cpu')

# Generate with the model reloaded from the saved directory.
generated_ids = model.generate(**inputs,
                               max_new_tokens=128,
                               do_sample=True,
                               temperature=1,
                               top_p=0.95,
                               top_k=50,
                               repetition_penalty=1,
                               use_cache=False)

print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0])

कुछ एक रीसाइक्लिंग अभियान के लिए एक नारा सुझाव दें।
"
Réponses: रीसाइक्लिंग अभियान के लिए एक नारा सुझाव दें।"‌آمباردا
(garbled: the model echoes the prompt, then drifts into French and Arabic-script tokens)


I don’t understand what is causing this behaviour. Can anyone give me a clue? There is literally no other way I know of to save this model so that I can load it again, especially since I have to fine-tune it on more data. Can anyone help me out with this?


Try

do_sample=False,
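A minimal sketch of that check, reusing the variable names from the thread: with greedy decoding, both models should produce identical output if the weights really match, so any remaining difference points at the checkpoint rather than at sampling noise.

# Greedy decoding removes sampling randomness, so the two models can be
# compared token-for-token.
inputs = tokenizer(question, return_tensors="pt").to('cpu')

out_merged = merged_model.generate(**inputs, max_new_tokens=128, do_sample=False)
out_loaded = model.generate(**inputs, max_new_tokens=128, do_sample=False)

print(tokenizer.batch_decode(out_merged, skip_special_tokens=True)[0])
print(tokenizer.batch_decode(out_loaded, skip_special_tokens=True)[0])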

Sure, I’ll test it.


I solved the inference error by saving the state dictionary from merged_model and loading it back into the model after loading the model with from_pretrained.
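A minimal sketch of that workaround, under some assumptions: the paths come from the first post, the .pt file name is illustrative, and since the exact reload call isn't shown in the thread, the model is rebuilt the same way it was originally constructed so the state-dict keys line up.

import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Save the weights of the in-memory model that produced good output.
# "merged_state_dict.pt" is just an illustrative file name.
torch.save(merged_model.state_dict(), "/content/merged_state_dict.pt")

# Later: rebuild the model exactly as before, then overwrite its weights
# with the saved state dict (keys only match if the model is constructed
# the same way it was when the state dict was saved).
base = AutoModelForCausalLM.from_pretrained('/content/gemma-2-2b-it/gemma-2-2b-it', device_map="cpu")
reloaded = PeftModel.from_pretrained(base, '/content/gemma-2-2b-it-fine_tuned')
reloaded.load_state_dict(torch.load("/content/merged_state_dict.pt"))
reloaded.eval()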
