Trained a model with a 0.0566 loss and an empty mIoU

Dear community members and @John6666 ,

As you can see in my profile, and continuing from this question, I have successfully trained a two-label model from segformer-b0-finetuned-ade-512-512, a model with hundreds of labels. The output looks fairly satisfying visually. I can't post it right now (I will add it in an update or reply, because the laptop in question is frozen, busy with a larger-batch training run), but the final summary stats seem a bit incomplete. Does this often happen when training with a small batch size and few epochs, or is there a mistake in my program?

The stats are as follows:

edit 2: The final output is surprisingly… very similar across 10 random images. Is this what they call overfitting?

I mean… not that similar, and it shouldn't be that rounded either. I haven't tested it side by side with the source image or colored it like most segmentation outputs. This run was done with:

    epochs = 0.1
    lr = 0.00006
    batch_size = 8

That's all the detail I can give for now; feel free to ask for more information. Thanks in advance for the community's help.
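For reference, here's roughly the setup (a simplified sketch, not my exact script; the label names and output_dir are placeholders):

    from transformers import SegformerForSemanticSegmentation, TrainingArguments

    # Swap the 150-label ADE20k head for a 2-label one; the classifier
    # size mismatch is expected, hence ignore_mismatched_sizes=True.
    id2label = {0: "background", 1: "object"}  # placeholder label names
    model = SegformerForSemanticSegmentation.from_pretrained(
        "nvidia/segformer-b0-finetuned-ade-512-512",
        num_labels=2,
        id2label=id2label,
        label2id={v: k for k, v in id2label.items()},
        ignore_mismatched_sizes=True,
    )

    args = TrainingArguments(
        output_dir="segformer-2label",  # placeholder
        learning_rate=0.00006,
        per_device_train_batch_size=8,
        num_train_epochs=0.1,           # fractions of an epoch are allowed
    )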


I think you probably don't have enough epochs. I've heard it's fine to train for up to around 20 epochs. Overfitting can occur if you train on the same data too much, but right now I think you're simply short on epochs. It would be a good idea to take the model you've already trained and train it again using the same procedure.
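If it helps, a minimal sketch of what I mean (the checkpoint path and the train_ds/eval_ds variables are placeholders for your own checkpoint and datasets):

    from transformers import SegformerForSemanticSegmentation, Trainer, TrainingArguments

    # Start from your own saved checkpoint instead of the original ADE20k
    # weights, then just give the Trainer more epochs.
    model = SegformerForSemanticSegmentation.from_pretrained("segformer-2label/checkpoint-500")  # placeholder path

    args = TrainingArguments(
        output_dir="segformer-2label-cont",
        learning_rate=0.00006,
        per_device_train_batch_size=8,
        num_train_epochs=20,
    )

    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=train_ds,  # your datasets
        eval_dataset=eval_ds,
        # pass compute_metrics=... here if you want mIoU reported during eval
    )
    trainer.train()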

I was going to use a large epoch count, but even doing 1 is quite constraining, and Colab doesn't seem to be a proper solution. Is there a holy triangle of epochs, batch size, and learning rate that keeps result quality without exceeding what my computer can handle?


If the model is small, I think you can train it in a free HF CPU Space, or in a Colab environment without a GPU. It will take time, though…
Your model is a little unusual, so it's harder to apply, but AutoTrainAdvanced and the like actually work even in a CPU Space. It uses HF Trainer internally.
Also, among other companies' services, Lightning.ai lets you use a small GPU for free.
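As for the epoch/batch-size trade-off: there's no magic triangle, but one standard trick when memory is the bottleneck is gradient accumulation, which keeps the effective batch size while lowering the per-device batch (a rough sketch, not tied to your script):

    from transformers import TrainingArguments

    # Effective batch = per_device_train_batch_size * gradient_accumulation_steps.
    # 2 * 4 = 8, so optimization behaves like batch_size=8 with far less memory.
    args = TrainingArguments(
        output_dir="out",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=0.00006,
        num_train_epochs=20,
    )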

For some reason Lightning.ai doesn't offer service in my area. Are there actually many services like this besides Google Colab? I ran my model with a large epoch count and it's still running 24 hours later… going with CPU doesn't look good, even though it works.

Perhaps I'll try AutoTrainAdvanced, even though it doesn't have a segmentation preset…


I think GitHub also offers a free virtual machine service, though without a GPU. On HF you can create as many weak, GPU-less VMs as you like for free, so it takes time but costs no money.
By the way, the program below is a sample I made for a different project; it adds a simple GUI to HF Trainer.
It basically does manually what AutoTrainAdvanced does automatically, but that makes it more convenient when there are parts you want to control yourself. In your case, it might be faster to modify this.
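Conceptually it's just a thin Gradio wrapper around a Trainer run; the pattern looks something like this (a stripped-down sketch, not the actual script; run_training stands in for the real Trainer setup):

    import gradio as gr

    def run_training(model_id: str, epochs: float) -> str:
        # Stand-in for the real work: build TrainingArguments/Trainer here,
        # call trainer.train(), and return the final metrics as text.
        return f"would train {model_id} for {epochs} epochs"

    with gr.Blocks() as demo:
        model_id = gr.Textbox(label="Model ID", value="nvidia/segformer-b0-finetuned-ade-512-512")
        epochs = gr.Number(label="Epochs", value=1)
        log = gr.Textbox(label="Log")
        gr.Button("Train").click(run_training, inputs=[model_id, epochs], outputs=log)

    demo.launch()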

You are truly a godsend; let me try that right away.


“Error occured: ‘Image’ object has no attribute ‘names’” — is this from your code, or…

Ohh, is it from my dataset format?
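For what it's worth, this is how I'd check (a quick sketch; the dataset repo id is a placeholder):

    from datasets import load_dataset

    ds = load_dataset("my-user/my-segmentation-dataset", split="train")  # placeholder
    print(ds.features)
    # A ClassLabel feature has a .names attribute; an Image feature does not,
    # which is exactly the AttributeError above if the script expects ClassLabel.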


The key names and structure of the dataset are hard-coded, so think of it as an ordinary, model-dependent script with a GUI added so it works in an HF Space, rather than a real app. A concept model, so to speak? :sweat_smile:

I don't understand. Should I convert the images to another format? Where should I look in the hard-coded parts so I can adapt my dataset to it?


I think the collate function is what's causing the problem, so it would be better to rewrite it or just stop using it:

            # data_collator=collate_fn,
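If you'd rather rewrite it than drop it, a typical collator for this kind of model just stacks the processed tensors (a rough sketch; the key names are assumptions, adjust them to your preprocessing):

    import torch

    def collate_fn(batch):
        # Each item is expected to already hold processed tensors;
        # change the keys to whatever your preprocessing produces.
        return {
            "pixel_values": torch.stack([item["pixel_values"] for item in batch]),
            "labels": torch.stack([item["labels"] for item in batch]),
        }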

For better or worse, the GUI “does nothing”.
You can effectively think of it as a CLI; that's what it was made for. (A dummy GUI.)