Anyone succeeded in finetuning?

#9
by echooooooooo - opened

I'm trying to finetune this model for my task, but Quanto quantization doesn't support finetuning.
I switched the quantization method to BitsAndBytes and added llm_int8_skip_modules entries corresponding to modules_to_not_convert.

    from transformers import AutoConfig, AutoModelForCausalLM, BitsAndBytesConfig

    base_model_name = "MiniMaxAI/MiniMax-Text-01"
    hf_config = AutoConfig.from_pretrained(base_model_name, trust_remote_code=True)
    device_map = "auto"  # adjust to your hardware

    bnb_config = BitsAndBytesConfig(
        load_in_8bit=True,
        # keep the head, embeddings, per-layer coefficients, and MoE gates unquantized
        llm_int8_skip_modules=["lm_head", "embed_tokens"]
                              + [f"model.layers.{i}.coefficient" for i in range(hf_config.num_hidden_layers)]
                              + [f"model.layers.{i}.block_sparse_moe.gate" for i in range(hf_config.num_hidden_layers)],
    )
    # load the bfloat16 checkpoint and quantize to 8-bit on load
    model = AutoModelForCausalLM.from_pretrained(
        base_model_name,
        torch_dtype="bfloat16",
        device_map=device_map,
        quantization_config=bnb_config,
        trust_remote_code=True,
        offload_buffers=True,
    )
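
For context, this is roughly how I then make the 8-bit model trainable before starting training (a sketch assuming PEFT LoRA; the target_modules names are my guesses and should be checked against model.named_modules() for this architecture):

    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    # freeze the quantized weights and prepare the model for k-bit training
    model = prepare_model_for_kbit_training(model)

    lora_config = LoraConfig(
        r=16,
        lora_alpha=32,
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
        # guessed projection names -- verify against model.named_modules()
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()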

But I get TypeError: cannot unpack non-iterable NoneType object when computing self-attention during training.

  File "/root/.cache/huggingface/modules/transformers_modules/MiniMaxAI/MiniMax-Text-01/372fb1d2051619593bfc3b7ef553745615bbbd5d/modeling_minimax_text_01.py", line 1028, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
TypeError: cannot unpack non-iterable NoneType object
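
To see where the None comes from, I wrapped one attention module's forward on the plain quantized model, before any adapter wrapping (a quick debugging hack of my own; the attribute path is an assumption based on the llm_int8_skip_modules names):

    # hypothetical debugging wrapper: log what self_attn actually returns,
    # to tell whether it yields None or a short tuple during training
    attn = model.model.layers[0].self_attn  # path assumed from the skip-module naming
    orig_forward = attn.forward

    def logged_forward(*args, **kwargs):
        out = orig_forward(*args, **kwargs)
        print("self_attn returned:", type(out), None if out is None else len(out))
        return out

    attn.forward = logged_forward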

Some sample code for finetuning would be nice.

Thank you for your feedback. Currently, we have only released the code for inference, not for training.
