Determining the size of logits

My dataset consists of two rows of images: one is in color, and the other is a black-and-white mask. In this snippet of code from the tutorial, what is size=labels.shape[-2:] supposed to be if I get an "out of bounds" error such as "Target 2 is out of bounds"?

I would be happy to add more details, such as the labels2id json, if needed.


import evaluate
import torch
from torch import nn

# Define mean intersection over union (IoU) as the evaluation metric
metric = evaluate.load("mean_iou")

# Function to calculate evaluation metrics for model predictions
def compute_metrics(eval_pred):
    with torch.no_grad():
        logits, labels = eval_pred

        # Convert 'logits' into a tensor and resize it to match the shape of "labels"
        logits_tensor = torch.from_numpy(logits)
        logits_tensor = nn.functional.interpolate(
            logits_tensor,
            size=labels.shape[-2:],
            mode="bilinear",  # 4D (batch, class, H, W) input; "linear" only accepts 3D
            align_corners=True,
        ).argmax(dim=1)

        # Convert the predicted labels to a NumPy array
        pred_labels = logits_tensor.detach().cpu().numpy()

        # Calculate metrics using the 'mean_iou' evaluation metric
        metrics = metric._compute(
            predictions=pred_labels,
            references=labels,
            num_labels=len(id2label),  # Number of unique labels in the dataset
            ignore_index=0,
            reduce_labels=processor.do_reduce_labels,
        )

        # Extract per-category accuracy and IoU scores
        per_category_accuracy = metrics.pop("per_category_accuracy").tolist()
        per_category_iou = metrics.pop("per_category_iou").tolist()

        # Update the metrics dictionary with accuracy and IoU scores for each category
        metrics.update({f"accuracy_{id2label[i]}": v for i, v in enumerate(per_category_accuracy)})
        metrics.update({f"iou_{id2label[i]}": v for i, v in enumerate(per_category_iou)})

        return metrics
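
To make the size argument concrete, here is a minimal sketch of what labels.shape[-2:] evaluates to. The shapes are made up for illustration (a batch of 2 masks at 128x128 and logits for 3 classes at a quarter of that resolution); they are assumptions, not values from my dataset or from the tutorial.

```python
import numpy as np
import torch
from torch import nn

# Hypothetical shapes, for illustration only
labels = np.zeros((2, 128, 128), dtype=np.int64)           # (batch, height, width)
logits = np.random.rand(2, 3, 32, 32).astype(np.float32)   # (batch, num_classes, h, w)

# The last two axes of the labels are the mask height and width
target_size = labels.shape[-2:]

# Upsample the logits to the mask resolution, then pick the best class per pixel
upsampled = nn.functional.interpolate(
    torch.from_numpy(logits),
    size=target_size,
    mode="bilinear",
    align_corners=False,
)
pred = upsampled.argmax(dim=1)

print(target_size, tuple(upsampled.shape), tuple(pred.shape))
```

If this reading is right, size is only the spatial resolution of the masks and says nothing about how many classes there are.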

According to the NumPy documentation, an array can have 2, 3, or even 5 dimensions. If my dataset is a two-dimensional table whose cells each hold a single image, which dimension should I be looking at?
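
As far as I can tell, "Target 2 is out of bounds" is the kind of message PyTorch's cross-entropy loss raises when a label value is not smaller than the number of classes, so the values inside the mask may matter more than its dimensions. Below is a rough sketch of how the mask values could be checked. The helper out_of_range_values is hypothetical (not part of any library), and ignore_index=255 is an assumption about which value is skipped by the loss.

```python
import numpy as np

def out_of_range_values(mask, num_labels, ignore_index=255):
    """Return the label values in `mask` that fall outside [0, num_labels)."""
    values = np.unique(mask)
    # Assumption: pixels equal to ignore_index are skipped by the loss
    values = values[values != ignore_index]
    return [int(v) for v in values if v < 0 or v >= num_labels]

# Toy mask containing the classes 0, 1 and 2
mask = np.array([[0, 1], [2, 1]])

print(out_of_range_values(mask, num_labels=2))  # [2]
print(out_of_range_values(mask, num_labels=3))  # []
```

Running something like this over a few masks with num_labels=len(id2label) would show whether the masks contain values the model has no class for.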
