I’m using Colab Pro to try out a few Hugging Face models, but even loading a 2 GB model completely fills up the GPU memory (16 GB Tesla), and then as the data loads I frequently run out of memory. Can anyone explain why such a small model would max out the GPU memory?
My dataset is about 1 GB between train and test, if that makes a difference.
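Here is roughly how I’m loading the model, in case it helps (just a sketch; the model name below is a placeholder, not the exact checkpoint I’m using):

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "bert-base-uncased"  # placeholder for the ~2 GB model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name).to("cuda")

# The weights alone should only be a fraction of the 16 GB; the rest
# gets eaten later by gradients, optimizer states, and activations.
print(f"allocated after load: {torch.cuda.memory_allocated() / 1e9:.2f} GB")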
Can you share the parameters you are training with, i.e. batch size, epochs, etc.? I have run into memory issues with a 1 GB model; 2 GB is actually pretty big when training.
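In the meantime, the usual knobs are a smaller batch size, gradient accumulation, and fp16. With the Trainer API that would look roughly like this (all values are placeholders, tune them for your run):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=4,   # smaller batches cut activation memory a lot
    gradient_accumulation_steps=8,   # keeps the effective batch size at 32
    fp16=True,                       # half precision roughly halves activation/gradient memory
    num_train_epochs=3,
)

Most of the memory during training usually goes to activations, gradients, and optimizer states rather than the weights themselves, which is why a 2 GB model can still OOM on a 16 GB card.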
To check GPU resources in Google Colab you can try something like this:
# On the left side you can open a Terminal ('>_' icon with a black background).
# You can run commands from there even while a cell is running.
# This command shows GPU usage in real time:
$ watch nvidia-smi
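If you would rather check from inside a notebook cell, running !nvidia-smi in a cell works too, or something along these lines in Python (just a sketch):

import torch

free, total = torch.cuda.mem_get_info()   # bytes free / total on the current device
print(f"free: {free / 1e9:.2f} GB of {total / 1e9:.2f} GB")
print(f"allocated by PyTorch: {torch.cuda.memory_allocated() / 1e9:.2f} GB")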