Anyone succeeded in finetuning?
#9 opened by echooooooooo
I'm trying to finetune this model for my task, but Quanto quantization doesn't support finetuning. I switched the quantization method to BitsAndBytes and added llm_int8_skip_modules entries corresponding to the model's modules_to_not_convert:
from transformers import AutoConfig, AutoModelForCausalLM, BitsAndBytesConfig

base_model_name = "MiniMaxAI/MiniMax-Text-01"
hf_config = AutoConfig.from_pretrained(base_model_name, trust_remote_code=True)
device_map = "auto"  # or an explicit device placement dict

# Skip quantization for the lm_head, the embeddings, and each layer's
# coefficient and MoE gate modules (mirroring modules_to_not_convert).
bnb_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_skip_modules=["lm_head", "embed_tokens"]
    + [f"model.layers.{i}.coefficient" for i in range(hf_config.num_hidden_layers)]
    + [f"model.layers.{i}.block_sparse_moe.gate" for i in range(hf_config.num_hidden_layers)],
)

# Load the bfloat16 checkpoint, dispatch it across devices, and quantize on load.
model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype="bfloat16",
    device_map=device_map,
    quantization_config=bnb_config,
    trust_remote_code=True,
    offload_buffers=True,
)
But I get TypeError: cannot unpack non-iterable NoneType object when computing self-attention during training:
File "/root/.cache/huggingface/modules/transformers_modules/MiniMaxAI/MiniMax-Text-01/372fb1d2051619593bfc3b7ef553745615bbbd5d/modeling_minimax_text_01.py", line 1028, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
TypeError: cannot unpack non-iterable NoneType object
Some sample code for finetuning would be nice.
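In the meantime, here is roughly the training setup I'm attempting on top of the quantized model. This is only a minimal sketch: the LoRA target_modules, output_dir, train_dataset, and data_collator are placeholders I would expect to adapt, not names taken from the MiniMax code.

from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoTokenizer, Trainer, TrainingArguments

# Make the 8-bit model trainable (casts norms, enables input grads for
# gradient checkpointing).
model = prepare_model_for_kbit_training(model, use_gradient_checkpointing=True)

# Attach LoRA adapters; the target_modules below are guesses and should be
# checked against the names printed by model.named_modules().
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # placeholder names
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

tokenizer = AutoTokenizer.from_pretrained(base_model_name, trust_remote_code=True)

training_args = TrainingArguments(
    output_dir="minimax-text-01-lora",  # placeholder output path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=1e-4,
    num_train_epochs=1,
    bf16=True,
    logging_steps=10,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,  # placeholder: your tokenized dataset
    data_collator=data_collator,  # placeholder: e.g. a causal-LM collator
)
trainer.train()  # training step; the TypeError above is raised here during self-attention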
Thank you for your feedback. Currently, we have only released the code for inference, not for training.