Re-downloaded the Llama 11B HF format but suddenly hit an error when loading the model

#16
by hxgy610 - opened

The error message is below:

File "/opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4507, in _load_pretrained_model
    raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")
RuntimeError: Error(s) in loading state_dict for MllamaForConditionalGeneration:
        size mismatch for vision_model.gated_positional_embedding.embedding: copying a param with shape torch.Size([1025, 1280]) from checkpoint, the shape in current model is torch.Size([1601, 1280]).
        size mismatch for vision_model.gated_positional_embedding.tile_embedding.weight: copying a param with shape torch.Size([9, 5248000]) from checkpoint, the shape in current model is torch.Size([9, 8197120]).

I am using the latest transformers wheel, transformers-20240924-4.45.0.dev0-py3-none-any.whl.

Meta Llama org

Should be fixed now. There was a change in the configuration (image size went from 448 to 560) that required some additional adjustments in the tile embeddings. You have to download the weights again (sorry), but it should work with the same wheel.
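For anyone curious why the shapes in the traceback differ, the numbers line up with the 448-to-560 image-size change. This is a sketch that back-derives them, assuming a patch size of 14, hidden size 1280, 4 max tiles, and one extra CLS position (values inferred from the shapes in the error, not taken from the official config):

```python
# Sketch: derive the tensor shapes seen in the loading error from the
# image size. Assumed (not confirmed from the config): patch_size=14,
# hidden=1280, max_tiles=4, and a +1 CLS position.

def vision_shapes(image_size, patch_size=14, hidden=1280, max_tiles=4):
    num_positions = (image_size // patch_size) ** 2 + 1  # patches + CLS
    pos_emb = (num_positions, hidden)
    # tile_embedding rows index aspect-ratio configurations; each row is a
    # positional embedding flattened over max_tiles * num_positions * hidden
    tile_emb_cols = max_tiles * num_positions * hidden
    return pos_emb, tile_emb_cols

# Old config (448) matches the checkpoint shapes in the error:
assert vision_shapes(448) == ((1025, 1280), 5_248_000)
# New config (560) matches the current model's expected shapes:
assert vision_shapes(560) == ((1601, 1280), 8_197_120)
```

So a checkpoint exported under the old 448 config cannot be loaded into a model built with the new 560 config, which is why re-downloading fixes it.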

osanseviero changed discussion status to closed

Hi @pcuenq, I confirmed that model loading is working again, but do you know of any other factor that could impact memory usage? The same finetuning code now goes OOM on the same hardware. I am using transformers 4.45.0.
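One plausible contributor (an assumption based on this thread, not a confirmed diagnosis): the 448-to-560 image-size change itself makes the vision sequence longer, so activation memory during finetuning grows even with identical code. A quick back-of-the-envelope check, assuming a patch size of 14:

```python
# Rough sketch: patch-count growth from the 448 -> 560 image-size change.
# This assumes patch_size=14; the OOM link is a hypothesis, not confirmed.
old_tokens = (448 // 14) ** 2  # patches per tile under the old config
new_tokens = (560 // 14) ** 2  # patches per tile under the new config
growth = new_tokens / old_tokens - 1
print(old_tokens, new_tokens, f"{growth:.0%} more vision tokens per tile")
```

Roughly half again as many vision tokens per tile, which could be enough to tip a previously borderline setup into OOM.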
