Re-downloaded the Llama 11B HF-format weights but hit an error when loading the model
#16 · opened by hxgy610
The error message is as below:

```
File "/opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4507, in _load_pretrained_model
    raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")
RuntimeError: Error(s) in loading state_dict for MllamaForConditionalGeneration:
    size mismatch for vision_model.gated_positional_embedding.embedding: copying a param with shape torch.Size([1025, 1280]) from checkpoint, the shape in current model is torch.Size([1601, 1280]).
    size mismatch for vision_model.gated_positional_embedding.tile_embedding.weight: copying a param with shape torch.Size([9, 5248000]) from checkpoint, the shape in current model is torch.Size([9, 8197120]).
```
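As a sanity check, both mismatched shapes are consistent with an image-size change from 448 to 560 pixels. This sketch assumes a patch size of 14 and a factor of 4 tiles folded into the embedding width (both inferred from the numbers in the traceback, not stated in the thread):

```python
# Reconstruct the mismatched shapes from the patch-grid arithmetic.
# Assumptions: patch_size = 14, 4 tiles folded into the embedding width.
patch_size = 14
hidden_dim = 1280
tile_factor = 4

def num_positions(image_size):
    # One position per 14x14 patch, plus one class token.
    return (image_size // patch_size) ** 2 + 1

print(num_positions(448))  # 1025 -> the checkpoint's embedding rows
print(num_positions(560))  # 1601 -> the current model's embedding rows

# Tile-embedding width: positions * hidden_dim * tile_factor
print(num_positions(448) * hidden_dim * tile_factor)  # 5248000
print(num_positions(560) * hidden_dim * tile_factor)  # 8197120
```

So the old checkpoint was produced for 448-px images, while the current model config expects 560-px images.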
I used the latest transformers wheel, transformers-20240924-4.45.0.dev0-py3-none-any.whl.
Should be fixed now. There was a change in the configuration (image size 448 → 560) that required additional adjustments to the tile embeddings. You'll have to download the weights again (sorry), but it should work with the same wheel.
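For anyone hitting this, the stale cached weights can be bypassed with `force_download=True`, which `from_pretrained` supports; the repo id below is an assumption, substitute the checkpoint you are actually using:

```python
from transformers import MllamaForConditionalGeneration, AutoProcessor

# Repo id is an assumption for illustration; use your own checkpoint.
model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"

# force_download=True ignores the local cache, so the updated weights
# (matching the new 560-px configuration) are fetched again.
model = MllamaForConditionalGeneration.from_pretrained(model_id, force_download=True)
processor = AutoProcessor.from_pretrained(model_id, force_download=True)
```

Alternatively, deleting the model's folder from the Hugging Face cache directory before calling `from_pretrained` again has the same effect.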
osanseviero changed discussion status to closed
Hi @pcuenq, I confirmed that model loading is working again, but do you know of any other factor that could affect memory usage? The same finetuning code now goes OOM on the same hardware. I am using transformers 4.45.0.
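One observation grounded in the shapes above: the new configuration has 1601 vision positions per tile instead of 1025, so every image now yields a noticeably longer vision sequence, which by itself increases activation memory during finetuning. While debugging, the standard memory-reduction knobs may help; this is a hedged sketch (the repo id is an assumption), not a confirmed fix:

```python
import torch
from transformers import MllamaForConditionalGeneration

model = MllamaForConditionalGeneration.from_pretrained(
    "meta-llama/Llama-3.2-11B-Vision-Instruct",  # repo id is an assumption
    torch_dtype=torch.bfloat16,                  # bf16 halves memory vs fp32
)
# Trade extra compute for lower activation memory during backprop.
model.gradient_checkpointing_enable()
```

Reducing the per-device batch size (with gradient accumulation to keep the effective batch size) is the other obvious lever.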