I’ve been unsuccessful in freezing the lower pretrained BERT layers when training a classifier with Hugging Face. I’m using AutoModelForSequenceClassification in particular, via the code below, and I want to freeze the lower X layers (e.g. the lower 9 layers). Is this possible in Hugging Face, and if so, what code would I add to make this work?
Yes, in PyTorch freezing layers is quite easy. It can be done as follows:
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=1)

for name, param in model.named_parameters():
    if name.startswith("..."):  # choose whatever you like here
        param.requires_grad = False
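For the specific case in the question (freezing the lower 9 layers of bert-base, which has 12 encoder layers), a minimal sketch along those lines, reusing the model loaded above, could look like this; also freezing the embeddings is an assumption on my part, not something required:

# Freeze the embeddings and encoder layers 0-8; layers 9-11 and the
# classification head stay trainable. The trailing "." keeps a prefix like
# "bert.encoder.layer.1." from also matching layers 10 and 11.
frozen_prefixes = ["bert.embeddings."] + [f"bert.encoder.layer.{i}." for i in range(9)]

for name, param in model.named_parameters():
    if any(name.startswith(prefix) for prefix in frozen_prefixes):
        param.requires_grad = False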
Thank you so much nielsr for the quick and useful reply. I believe I got this to work. So to verify: that can be written prior to the Trainer call and will freeze whichever parameters are specified? So for example, I could write the code below to freeze the first two layers.
for name, param in model.named_parameters():
    if name.startswith("bert.encoder.layer.1"):
        param.requires_grad = False
    if name.startswith("bert.encoder.layer.2"):
        param.requires_grad = False
This question shows my ignorance, but is there a way to print model settings prior to training to verify which layers/parameters are frozen?
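One way to check, as a minimal sketch: before calling Trainer, loop over named_parameters() again and print each parameter's requires_grad flag, or count the trainable parameters.

for name, param in model.named_parameters():
    print(name, param.requires_grad)

# Or just compare trainable vs. total parameter counts:
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable parameters: {trainable} / {total}")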
Would just add to this: you probably also want to freeze layer 0, and you don’t want to accidentally freeze layers 10 and 11 (in a 12-layer model, where layers are numbered 0–11), since startswith("bert.encoder.layer.1") also matches those names. Using "bert.encoder.layer.1." (with a trailing dot) rather than "bert.encoder.layer.1" avoids that.
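A quick illustration of the prefix pitfall, assuming the usual bert-base parameter naming:

name = "bert.encoder.layer.11.attention.self.query.weight"
print(name.startswith("bert.encoder.layer.1"))   # True  -- layer 11 is matched by accident
print(name.startswith("bert.encoder.layer.1."))  # False -- the trailing dot limits it to layer 1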
May I know whether subsequent operations such as model.train() and model.eval() change the param.requires_grad set above? Or do I have to redo the above every time I switch between training and eval mode?
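For what it’s worth, a quick sanity check of this (assuming standard PyTorch behavior, where train()/eval() only toggle the module’s training flag for things like dropout) could look like:

frozen_before = {n for n, p in model.named_parameters() if not p.requires_grad}
model.eval()
model.train()
frozen_after = {n for n, p in model.named_parameters() if not p.requires_grad}
print(frozen_before == frozen_after)  # True: requires_grad is not touched by train()/eval()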