BioGptTokenizer, BioGptLMHeadModel don't exist yet in transformers
Hi @kamalkraj ,
Thanks a lot for the contribution; however, it seems that BioGptTokenizer and BioGptLMHeadModel are not implemented in transformers yet. Is this expected?
Thanks in advance for the help,
Kind regards,
tdekelver
Hi @tdekelver ,
The PR has not yet been merged into the main branch. For experiments, you can install transformers directly from https://github.com/huggingface/transformers/pull/20420
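For example, a sketch of one common way to install directly from a pull request is to point pip at the PR's pull ref on GitHub (the branch install shown in the next message works equally well; the sacremoses line is only needed because the BioGPT tokenizer depends on it):
! pip install git+https://github.com/huggingface/transformers.git@refs/pull/20420/head
! pip install sacremoses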
Thanks,
Kamal
Hi Kamal,
Thanks! I just tried it out and wanted to fine-tune the model on my own dataset (2 classes), but I get an error when I try to train it. Can you help me?
See my code below:
! pip install git+https://github.com/kamalkraj/transformers.git@BioGPT
! pip install sacremoses
from transformers import BioGptTokenizer, BioGptForCausalLM, TrainingArguments, Trainer
import evaluate
model = BioGptForCausalLM.from_pretrained("kamalkraj/biogpt", num_labels=2)
tokenizer = BioGptTokenizer.from_pretrained("kamalkraj/biogpt", use_fast=True)
clf_metrics = evaluate.combine(["accuracy", "f1", "precision", "recall"])
args = TrainingArguments(
    "biogpt-finetuned",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    num_train_epochs=5,
    weight_decay=0.01,
    load_best_model_at_end=True,
    metric_for_best_model="f1",
    push_to_hub=False,
    report_to="mlflow",
)

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    predictions = predictions[:, 0]
    return clf_metrics.compute(predictions=predictions, references=labels)

trainer = Trainer(
    model,
    args,
    train_dataset=encoded_dataset["train"],
    eval_dataset=encoded_dataset["valid"],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
trainer.train()
The last line (trainer.train()) gives me the following error:
The following columns in the training set don't have a corresponding argument in `BioGptForCausalLM.forward` and have been ignored: text, abstract, title, BERT_txt, authors, journals, keywords, sources, file. If text, abstract, title, BERT_txt, authors, journals, keywords, sources, file are not expected by `BioGptForCausalLM.forward`, you can safely ignore this message.
/usr/local/lib/python3.7/dist-packages/transformers/optimization.py:310: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
FutureWarning,
***** Running training *****
Num examples = 2820
Num Epochs = 5
Instantaneous batch size per device = 4
Total train batch size (w. parallel, distributed & accumulation) = 4
Gradient Accumulation steps = 1
Total optimization steps = 3525
Number of trainable parameters = 346763264
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-20-3435b262f1ae> in <module>
----> 1 trainer.train()
5 frames
/usr/local/lib/python3.7/dist-packages/transformers/trainer.py in train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
1549 resume_from_checkpoint=resume_from_checkpoint,
1550 trial=trial,
-> 1551 ignore_keys_for_eval=ignore_keys_for_eval,
1552 )
1553
/usr/local/lib/python3.7/dist-packages/transformers/trainer.py in _inner_training_loop(self, batch_size, args, resume_from_checkpoint, trial, ignore_keys_for_eval)
1793 tr_loss_step = self.training_step(model, inputs)
1794 else:
-> 1795 tr_loss_step = self.training_step(model, inputs)
1796
1797 if (
/usr/local/lib/python3.7/dist-packages/transformers/trainer.py in training_step(self, model, inputs)
2552
2553 with self.compute_loss_context_manager():
-> 2554 loss = self.compute_loss(model, inputs)
2555
2556 if self.args.n_gpu > 1:
/usr/local/lib/python3.7/dist-packages/transformers/trainer.py in compute_loss(self, model, inputs, return_outputs)
2584 else:
2585 labels = None
-> 2586 outputs = model(**inputs)
2587 # Save past state if it exists
2588 # TODO: this needs to be fixed and made cleaner later.
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1129 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130 return forward_call(*input, **kwargs)
1131 # Do not call functions when jit is used
1132 full_backward_hooks, non_full_backward_hooks = [], []
/usr/local/lib/python3.7/dist-packages/transformers/models/biogpt/modeling_biogpt.py in forward(self, input_ids, attention_mask, head_mask, inputs_embeds, past_key_values, labels, use_cache, output_attentions, output_hidden_states, return_dict)
685 # we are doing next-token prediction; shift prediction scores and input ids by one
686 shifted_prediction_scores = prediction_scores[:, :-1, :].contiguous()
--> 687 labels = labels[:, 1:].contiguous()
688 loss_fct = CrossEntropyLoss()
689 lm_loss = loss_fct(shifted_prediction_scores.view(-1, self.config.vocab_size), labels.view(-1))
IndexError: too many indices for tensor of dimension 1
Can you help me?
Hi @tdekelver ,
BioGptForCausalLM is not meant for sequence classification tasks; it is only for generating text. That is also why training fails here: your dataset supplies one class label per example (a 1-D labels tensor), whereas BioGptForCausalLM expects token-level labels for next-token prediction, hence the IndexError at labels[:, 1:].
Neither the original implementation nor the current HF port includes a sequence classification head. Once the PR is merged, I will add support for it.
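In the meantime, here is a minimal generation sketch with the checkpoint used above (the prompt and generation settings are only illustrative):
import torch
from transformers import BioGptTokenizer, BioGptForCausalLM

tokenizer = BioGptTokenizer.from_pretrained("kamalkraj/biogpt")
model = BioGptForCausalLM.from_pretrained("kamalkraj/biogpt")

# Encode a prompt and generate a continuation; next-token prediction is what this head is trained for.
inputs = tokenizer("COVID-19 is", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=20, num_beams=5, early_stopping=True)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))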
Thanks.
Ah okay, thanks!