How to debug NaN logits during training

We did it, boys. The accuracy is no longer NaN. I thank thee all.

The last thing I did was revert this code back to its tutorial example:

from torchvision.transforms import ColorJitter
from transformers import SegformerImageProcessor

# The processor handles resizing, rescaling, and ImageNet mean/std normalization
processor = SegformerImageProcessor()
# Light color augmentation for the training split only
jitter = ColorJitter(brightness=0.25, contrast=0.25, saturation=0.25, hue=0.1)

def train_transforms(example_batch):
    images = [jitter(x) for x in example_batch['pixel_values']]
    labels = [x for x in example_batch['label']]
    inputs = processor(images, labels)
    return inputs


def val_transforms(example_batch):
    images = [x for x in example_batch['pixel_values']]
    labels = [x for x in example_batch['label']]
    inputs = processor(images, labels)
    return inputs


# Set transforms
train_ds.set_transform(train_transforms)
test_ds.set_transform(val_transforms)

@Alanturner2 @John6666 Thank you both. We should keep the torch.nn.functional.normalize fix here for posterity. I think John's normalization solution is a bit simple, but it works.
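I don't have John's exact snippet handy, so here's only a minimal sketch of what a torch.nn.functional.normalize-based fix could look like (the l2_normalize helper is mine, not from the thread): each image is flattened and L2-normalized so its values stay bounded before reaching the model.

import torch
import torch.nn.functional as F

# Sketch only, not John's actual code: L2-normalize each flattened image
# so the pixel magnitudes stay bounded before they reach the model.
def l2_normalize(img):
    t = torch.as_tensor(img, dtype=torch.float32)
    return F.normalize(t.flatten(), dim=0).reshape(t.shape)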

Btw, what I meant by the “tutorial version” of that code is that John and I had previously modified it to convert the images to B/W. Do you think that conversion affects the normalized values? Maybe I should check after training, because I really don't know which part of the other code produces the NaN after normalization.
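For context, a hypothetical reconstruction of that B/W step (Grayscale is torchvision's; our actual code may have differed) is below. Keeping num_output_channels=3 matters here: the processor normalizes with 3-channel ImageNet mean/std, so a 1-channel grayscale image would no longer line up with those stats.

from torchvision.transforms import Grayscale

# Hypothetical version of our earlier B/W step; keep 3 output channels so
# the processor's 3-channel mean/std normalization still applies.
to_gray = Grayscale(num_output_channels=3)

def train_transforms_bw(example_batch):
    images = [to_gray(jitter(x)) for x in example_batch['pixel_values']]
    labels = [x for x in example_batch['label']]
    return processor(images, labels)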

For future readers, the answer is to always normalize your dataset.
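A quick sanity check you can run (just a sketch, reusing the train_ds from above): pull one processed sample and confirm the normalized pixel values are finite and in a sane range before training.

import torch

sample = train_ds[0]
pv = torch.as_tensor(sample['pixel_values'])
print(pv.min().item(), pv.max().item())  # roughly [-3, 3] after ImageNet normalization
print(torch.isnan(pv).any().item())      # should be False
print(torch.isinf(pv).any().item())      # should be False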

Oh, one more thing. Is there any reading on judging these scores, i.e. when a score becomes abnormally bad or good? mIoU and other evaluation metrics rarely have any free readings, except maybe this one
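In the meantime, running the metric on a toy pair can help build intuition for what the number means. A small sketch with the Hugging Face evaluate library's mean_iou metric (the toy arrays are made up): IoU is intersection over union per class, and mIoU averages it over classes, so 1.0 is perfect and values near 0 mean almost no overlap.

import evaluate
import numpy as np

# Toy example: 2x2 segmentation maps with 2 classes; arrays are made up.
mean_iou = evaluate.load("mean_iou")
preds = [np.array([[0, 1], [1, 1]])]
refs = [np.array([[0, 1], [0, 1]])]
results = mean_iou.compute(predictions=preds, references=refs,
                           num_labels=2, ignore_index=255)
print(results["mean_iou"])  # per-class IoU averaged over classes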
