Model cannot be loaded with HF transformers

#1
by bowenbaoamd
from transformers import AutoModelForCausalLM

MODEL_DIR = "grok-1"

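# Load the local grok-1 checkpoint; device_map="auto" lets Accelerate place the weights across available devices.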
model = AutoModelForCausalLM.from_pretrained(MODEL_DIR, device_map="auto", trust_remote_code=True)

Running this hits the following error:

OSError: Error no file named pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory grok-1.

The first error can be resolved by generating a pytorch_model.bin.index.json file; a sketch of one way to do that is below.
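For reference, here is a minimal sketch of generating such an index file. It assumes the directory already contains PyTorch shard files ending in .bin (the filenames and layout are my assumption, not something stated in this thread):

import json
import os

import torch

MODEL_DIR = "grok-1"

weight_map = {}
total_size = 0
for fname in sorted(os.listdir(MODEL_DIR)):
    if not fname.endswith(".bin"):
        continue
    # Load each shard on CPU just to enumerate tensor names and byte sizes.
    shard = torch.load(os.path.join(MODEL_DIR, fname), map_location="cpu")
    for name, tensor in shard.items():
        weight_map[name] = fname
        total_size += tensor.numel() * tensor.element_size()

index = {"metadata": {"total_size": total_size}, "weight_map": weight_map}
with open(os.path.join(MODEL_DIR, "pytorch_model.bin.index.json"), "w") as f:
    json.dump(index, f, indent=2)

With an index like this in place, from_pretrained can at least locate the shards, which matches the report that the first error goes away.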

However, loading then hits another error:

ValueError: Trying to set a tensor of shape torch.Size([4096, 6144]) in "weight" (which has shape torch.Size([32768, 6144])), this looks incorrect.
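For illustration only (the checkpoint layout here is an assumption, not something confirmed in this thread): 32768 = 8 × 4096, so the mismatched tensor looks like one of eight slices of the full parameter. If that is the case, the slices would need to be concatenated along dim 0 before they match the shape the model expects, roughly like this:

import torch

# Stand-ins for eight shard tensors of shape (4096, 6144) read from the checkpoint.
slices = [torch.empty(4096, 6144) for _ in range(8)]

# Concatenate along dim 0 to recover a (32768, 6144) parameter.
merged = torch.cat(slices, dim=0)
assert merged.shape == (32768, 6144)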
