tokenizer.vocab_size != model.vocab_size

#4
by jluckyboyj - opened

Hi, I'm fine-tuning PhoBERT for a question-answering task and have run into a problem. I'm using the model's own tokenizer to tokenize the UIT-ViQuAD 2.0 dataset, but after tokenization some token IDs fall outside the model's vocabulary. This happens both with and without word segmentation by pyvi's ViTokenizer.
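One way to narrow this down might be to list exactly which positions hold IDs that the embedding matrix cannot index. The sketch below is only an illustration: `find_out_of_range_ids` is a hypothetical helper name, and the example IDs and the size of 10 are made-up values, not numbers taken from PhoBERT.

```python
from typing import Iterable, List, Tuple


def find_out_of_range_ids(
    input_ids: Iterable[int], num_embeddings: int
) -> List[Tuple[int, int]]:
    """Return (position, token_id) pairs that cannot index an embedding
    matrix with `num_embeddings` rows (valid IDs are 0..num_embeddings-1)."""
    return [
        (pos, tok)
        for pos, tok in enumerate(input_ids)
        if tok < 0 or tok >= num_embeddings
    ]


# Made-up values for illustration only (hypothetical, not PhoBERT's real sizes).
example_ids = [0, 3, 9, 10, 12]
print(find_out_of_range_ids(example_ids, num_embeddings=10))
```

With the transformers library, the IDs would presumably come from `tokenizer(text)["input_ids"]` and the row count from `model.get_input_embeddings().num_embeddings`. If the mismatch comes from tokens added to the tokenizer, `model.resize_token_embeddings(len(tokenizer))` is the usual remedy, though whether that applies here is an assumption.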

jluckyboyj changed discussion status to closed
