Error "shape '[1, 9, 3072]' is invalid for input of size 36864" while running Gemma 7b using torch.float16
Hello,
I'm trying to run the Gemma 7b example from this model's card using torch.float16, but I keep getting the error: shape '[1, 9, 3072]' is invalid for input of size 36864
I just copy/pasted the example from the card page to a Google Colab notebook (and installed the necessary dependencies of course).
Am I doing something wrong?
EDIT: tried using 8-bit precision but got the same error.
I got the same error on Google Colab with a T4.
I found that gemma-2b and gemma-2b-it worked, but gemma-7b and gemma-7b-it failed with RuntimeError: shape '[1, 9, 3072]' is invalid for input of size 36864
!pip3 install -q -U bitsandbytes==0.42.0
!pip3 install -q -U peft==0.8.2
!pip3 install -q -U trl==0.7.10
!pip3 install -q -U accelerate==0.27.1
!pip3 install -q -U datasets==2.17.0
!pip3 install -q -U transformers==4.38.0
import os
from google.colab import userdata
os.environ["HF_TOKEN"] = userdata.get('HF_TOKEN')
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
quantization_config = BitsAndBytesConfig(load_in_4bit=True)
# quantization_config = BitsAndBytesConfig(load_in_8bit=True)
model_id = "google/gemma-7b" # gemma-2b and gemma-2b-it worked, but gemma-7b and gemma-7b-it got error
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=quantization_config)
# model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.bfloat16)
# model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.float16)
# model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids)
print(tokenizer.decode(outputs[0]))
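For readers wondering where 36864 comes from, here is a rough arithmetic sketch. It assumes the hidden size, head count and head dimension from the published Gemma configs (these values are not stated in this thread), and it assumes the 9 in the error is the prompt length in tokens. Under those assumptions, the attention output of gemma-7b is wider than its hidden size, which would explain why a reshape to [1, 9, 3072] fails while gemma-2b is unaffected. Treat it as an illustration of the numbers, not a confirmed diagnosis.

```python
# Config values below are assumptions taken from the published Gemma configs,
# not from this thread.
GEMMA_CONFIGS = {
    "gemma-2b": {"hidden_size": 2048, "num_attention_heads": 8, "head_dim": 256},
    "gemma-7b": {"hidden_size": 3072, "num_attention_heads": 16, "head_dim": 256},
}


def reshape_check(name, batch=1, seq_len=9):
    """Compare the attention output size with a [batch, seq_len, hidden_size] view."""
    cfg = GEMMA_CONFIGS[name]
    # Width of the concatenated attention heads for one token.
    attn_width = cfg["num_attention_heads"] * cfg["head_dim"]
    # Number of elements actually produced by the attention layer.
    numel = batch * seq_len * attn_width
    # Number of elements a [batch, seq_len, hidden_size] tensor would need.
    target = batch * seq_len * cfg["hidden_size"]
    return numel, target, numel == target


for name in GEMMA_CONFIGS:
    numel, target, ok = reshape_check(name)
    status = "reshape fits" if ok else "reshape fails"
    print(f"{name}: input size {numel}, target size {target}: {status}")
```

With these assumed values, gemma-7b produces 9 * 16 * 256 = 36864 elements against a target of 9 * 3072 = 27648, matching the number in the error message, while gemma-2b produces 9 * 8 * 256 = 18432 on both sides.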
Hey all! We're looking into it! Things work with torch 2.2.0 but not 2.1.0. We'll update here once we find the issue.
Hey all! The source of the issue is a difference in the attention implementation. Any torch version before 2.1.1 will use eager, as sdpa isn't supported in torch in these versions. We will fix the models to work with these versions in transformers ASAP and release a patch; in the meantime, we recommend using a torch version that satisfies torch>=2.1.1 in order to leverage the sdpa attention implementation, which works correctly.
Here is the necessary line to install the relevant pytorch version in colab:
pip install "torch>=2.1.1" -U
Please restart your runtime afterwards for it to leverage the updated pytorch version!
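To confirm after the restart that the runtime really picked up the new version, a small check like the following may help. It is only a sketch: the 2.1.1 threshold comes from the recommendation above, and the helper names parse_version and meets_recommendation are hypothetical, made up for this example.

```python
import re

# Minimum torch version recommended in this thread for the sdpa code path.
MIN_TORCH = (2, 1, 1)


def parse_version(version):
    """Turn a string such as '2.1.0+cu121' into a tuple such as (2, 1, 0)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", str(version))
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def meets_recommendation(version):
    """Return True if the given torch version satisfies torch>=2.1.1."""
    return parse_version(version) >= MIN_TORCH


# Report on the installed torch, if any; the helpers above also work without it.
try:
    import torch
except ImportError:
    print("torch is not installed in this environment")
else:
    if meets_recommendation(torch.__version__):
        print(f"torch {torch.__version__} satisfies torch>=2.1.1")
    else:
        print(f"torch {torch.__version__} is older than 2.1.1; upgrade and restart")
```

If this still prints the old version right after the pip install, the runtime most likely has not been restarted yet.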
Thank you so much!
https://huggingface.co/google/gemma-7b-it/discussions/13
@osanseviero @lysandre Thank you!
I tested on Google Colab on T4 and confirmed that it works without error by adding this cell at the top of the notebook.
!pip3 install -q -U torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 torchdata==0.7.1 torchtext==0.16.1 --index-url https://download.pytorch.org/whl/cu121
By the way, the example prompt "Write me a poem about Machine Learning." does not seem suitable for the non-instruct model gemma-7b: it generates nonsense output, so it is hard to tell whether the model is working or not.
<bos>Write me a poem about Machine Learning.
<bos><bos><bos><bos><bos><bos><bos><bos><bos><bos>
But it actually works well with the prompt "Write me a poem about Machine Learning. Because":
<bos>Write me a poem about Machine Learning. Because I’m a poet. And I’m
Hi all! We just did a new release in transformers that fixes the issue being discussed in this thread. Make sure to upgrade. Thanks everyone!
@osanseviero Thank you so much!
I tested on Google Colab (torch 2.1.0+cu121) with transformers==4.38.1, and confirmed that the example works.
!pip3 install -q -U bitsandbytes==0.42.0
!pip3 install -q -U peft==0.8.2
!pip3 install -q -U trl==0.7.10
!pip3 install -q -U accelerate==0.27.1
!pip3 install -q -U datasets==2.17.0
!pip3 install -q -U transformers==4.38.1 # NOT 4.38.0
Great to hear! I'll close this discussion, but feel free to comment if you still face the issue!