tokenizer.vocab_size != model.vocab_size

by jluckyboyj - opened Mar 24, 2023

Hi, I'm fine-tuning PhoBERT for a question-answering task, but I've run into a problem. I'm using the model's own tokenizer to tokenize the UIT-ViQuad-2.0 dataset, but after tokenization some token IDs fall outside the model's vocabulary. This happens both with and without word segmentation using Pyvi's ViTokenizer.
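For anyone hitting the same symptom, here is a minimal sketch of how the offending IDs could be located before they reach the embedding layer. It is plain Python on toy data, not the code from my training script; the function name `find_out_of_range_ids` and the toy values are made up for illustration. With Hugging Face `transformers`, I'd assume the real inputs would come from the tokenizer output's `input_ids` and from `model.get_input_embeddings().num_embeddings`.

```python
from typing import List, Sequence, Tuple


def find_out_of_range_ids(
    batch_input_ids: Sequence[Sequence[int]],
    num_embeddings: int,
) -> List[Tuple[int, int, int]]:
    """Return (row, position, token_id) for every ID the embedding
    matrix cannot index, i.e. any ID outside [0, num_embeddings)."""
    offenders = []
    for row, ids in enumerate(batch_input_ids):
        for pos, token_id in enumerate(ids):
            if token_id < 0 or token_id >= num_embeddings:
                offenders.append((row, pos, token_id))
    return offenders


# Toy data (hypothetical): two tokenized examples and an embedding
# matrix with 10 rows, so valid IDs are 0..9.
toy_batch = [[0, 5, 9, 2], [0, 10, 3, 12]]
toy_num_embeddings = 10

offenders = find_out_of_range_ids(toy_batch, toy_num_embeddings)
print(offenders)
```

Note that in `transformers`, `tokenizer.vocab_size` does not count tokens added after training, while `len(tokenizer)` does, so comparing `len(tokenizer)` against the embedding size is usually the more telling check. If the tokenizer turns out to be larger than the embedding matrix, `model.resize_token_embeddings(len(tokenizer))` is one possible fix; whether that applies to this exact case is an assumption.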

jluckyboyj changed discussion status to closed Mar 25, 2023
