Re-downloaded the Llama 11B HF format but suddenly hit an error when loading the model

#16
by hxgy610 - opened

The error message is below:

File "/opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4507, in _load_pretrained_model
    raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")
RuntimeError: Error(s) in loading state_dict for MllamaForConditionalGeneration:
        size mismatch for vision_model.gated_positional_embedding.embedding: copying a param with shape torch.Size([1025, 1280]) from checkpoint, the shape in current model is torch.Size([1601, 1280]).
        size mismatch for vision_model.gated_positional_embedding.tile_embedding.weight: copying a param with shape torch.Size([9, 5248000]) from checkpoint, the shape in current model is torch.Size([9, 8197120]).

I am using the latest transformers wheel, transformers-20240924-4.45.0.dev0-py3-none-any.whl.

Meta Llama org

Should be fixed now. There was a change in the configuration (image size went from 448 to 560) that required some additional adjustments in the tile embeddings. You have to download the weights again (sorry), but it should work with the same wheel.
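For anyone curious why the shapes in the traceback differ, the numbers line up with the 448-to-560 image-size change. This is a sketch that back-derives them, assuming a patch size of 14, hidden size 1280, 4 max tiles, and one extra CLS position (values inferred from the shapes in the error, not taken from the official config):

```python
# Sketch: derive the tensor shapes seen in the loading error from the
# image size. Assumed (not confirmed from the config): patch_size=14,
# hidden=1280, max_tiles=4, and a +1 CLS position.

def vision_shapes(image_size, patch_size=14, hidden=1280, max_tiles=4):
    num_positions = (image_size // patch_size) ** 2 + 1  # patches + CLS
    pos_emb = (num_positions, hidden)
    # tile_embedding rows index aspect-ratio configurations; each row is a
    # positional embedding flattened over max_tiles * num_positions * hidden
    tile_emb_cols = max_tiles * num_positions * hidden
    return pos_emb, tile_emb_cols

# Old config (448) matches the checkpoint shapes in the error:
assert vision_shapes(448) == ((1025, 1280), 5_248_000)
# New config (560) matches the current model's expected shapes:
assert vision_shapes(560) == ((1601, 1280), 8_197_120)
```

So a checkpoint exported under the old 448 config cannot be loaded into a model built with the new 560 config, which is why re-downloading fixes it.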

osanseviero changed discussion status to closed

Hi @pcuenq, I confirmed that model loading is working again, but do you know of any other factor that could impact memory usage? The same finetuning code now goes OOM on the same hardware. I am using transformers 4.45.0.
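One plausible contributor (an assumption based on this thread, not a confirmed diagnosis): the 448-to-560 image-size change itself makes the vision sequence longer, so activation memory during finetuning grows even with identical code. A quick back-of-the-envelope check, assuming a patch size of 14:

```python
# Rough sketch: patch-count growth from the 448 -> 560 image-size change.
# This assumes patch_size=14; the OOM link is a hypothesis, not confirmed.
old_tokens = (448 // 14) ** 2  # patches per tile under the old config
new_tokens = (560 // 14) ** 2  # patches per tile under the new config
growth = new_tokens / old_tokens - 1
print(old_tokens, new_tokens, f"{growth:.0%} more vision tokens per tile")
```

Roughly half again as many vision tokens per tile, which could be enough to tip a previously borderline setup into OOM.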
