ValueError: Unknown quantization type

#1 opened by grg

Hello,
Thanks for the model!

I am having an issue running the model. Here is the code snippet:

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "tencent-community/Hunyuan-A52B-Instruct-FP8"

print("Loading tokenizer")
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

print("Loading model")
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", trust_remote_code=True)

Loading the model fails with the following error:

ValueError: Unknown quantization type, got fp8 - supported types are: ['awq', 'bitsandbytes_4bit', 'bitsandbytes_8bit', 'gptq', 'aqlm', 'quanto', 'eetq', 'hqq', 'compressed-tensors', 'fbgemm_fp8', 'torchao', 'bitnet']
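For context, the unsupported type appears to come from the quantization_config block in the repository's config.json. A minimal sketch to inspect it, assuming the same model id and that the settings live under quantization_config as transformers expects:

from transformers import AutoConfig

config = AutoConfig.from_pretrained(
    "tencent-community/Hunyuan-A52B-Instruct-FP8",
    trust_remote_code=True,
)
# Expected to print a dict whose "quant_method" is "fp8", which is not
# among the types transformers recognizes.
print(getattr(config, "quantization_config", None))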

The problem can most probably be resolved by updating config.json, where the quantization method is currently set to

"quant_method": "fp8"

or by documenting which additional library and/or version needs to be installed.
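Purely as an illustration (not an official fix), one possible workaround is to download the checkpoint locally and patch quant_method to a type transformers already supports, for example fbgemm_fp8. Whether the fp8 weights in this checkpoint are actually laid out the way that quantizer expects is an assumption on my side, and fbgemm_fp8 also needs the fbgemm-gpu package, so this sketch may still fail at weight-loading time:

import json
from huggingface_hub import snapshot_download
from transformers import AutoModelForCausalLM

# Download the checkpoint to a local folder so config.json can be edited.
# "./Hunyuan-A52B-Instruct-FP8" is just an example path.
local_dir = snapshot_download(
    "tencent-community/Hunyuan-A52B-Instruct-FP8",
    local_dir="./Hunyuan-A52B-Instruct-FP8",
)

cfg_path = f"{local_dir}/config.json"
with open(cfg_path) as f:
    cfg = json.load(f)

# Assumption: the fp8 weights are compatible with a quant method transformers
# already knows; "fbgemm_fp8" is an untested guess.
cfg["quantization_config"]["quant_method"] = "fbgemm_fp8"

with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=2)

model = AutoModelForCausalLM.from_pretrained(
    local_dir, device_map="auto", trust_remote_code=True
)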
