from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("nlptown/bert-base-multilingual-uncased-sentiment")
model = AutoModelForSequenceClassification.from_pretrained("nlptown/bert-base-multilingual-uncased-sentiment")
I then run a for loop to get predictions over 10k sentences on a G4 instance (T4 GPU). GPU usage (averaged per minute) is a flat 0.0%. What is wrong? How do I use the GPU with Transformers?
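For reference, the loop looks roughly like this (simplified; `sentences` is a placeholder for the actual list of 10k sentences):

import torch

predictions = []
for sentence in sentences:  # `sentences` stands in for the 10k input sentences
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    predictions.append(logits.argmax(dim=-1).item())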
I had the same issue. To answer this question: if PyTorch is installed with CUDA support, a class such as transformers.Trainer will automatically use the CUDA (GPU) device without any additional configuration.
(You can check whether PyTorch was installed with CUDA support by confirming that a CUDA build of PyTorch, e.g. the conda pytorch-cuda package, is installed and that torch.cuda.is_available() returns True.)
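Note that this only applies to the Trainer. For a plain inference loop like the one in the question, the model and every input tensor have to be moved to the GPU explicitly. A minimal sketch, reusing the tokenizer and model from the question (the example sentence is just an illustration):

import torch

print(torch.cuda.is_available())  # should print True if PyTorch can see the T4

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)  # move the model weights onto the GPU
model.eval()

inputs = tokenizer("This product was great", return_tensors="pt")
inputs = {k: v.to(device) for k, v in inputs.items()}  # inputs must be on the same device as the model
with torch.no_grad():
    logits = model(**inputs).logits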
Is there a way to explicitly prevent the Trainer from using the GPU? I see something about place_model_on_device on Trainer, but it is unclear how to set it to False.
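For what it's worth, here is a sketch of two workarounds that should keep everything on the CPU; this assumes the no_cuda flag in TrainingArguments behaves as documented and is not a confirmed answer about place_model_on_device:

# Option 1: hide the GPU from PyTorch entirely; must be set before torch is imported
import os
os.environ["CUDA_VISIBLE_DEVICES"] = ""

# Option 2: keep the Trainer on the CPU via its training arguments
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(output_dir="out", no_cuda=True)
# trainer = Trainer(model=model, args=training_args, ...)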