I’m using Hiera as a base for an image generation model (I want to use it to generate missing parts in images). To evaluate the feasibility of this model for that task, I started with the HieraForPreTraining model and tried to reconstruct an image from the model logits.
I basically set the mask_ratio low, set the noise so that specific non-random patches are picked, and wrote some code to transform the logits back into “non-normalized” pixels (i.e. I reversed the normalization done in get_pixel_label_2d, as suggested in the git repo). After that I also undo the image processor’s normalization (it’s a BitImageProcessor), and the resulting image looks reasonably OK.
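For reference, here is a minimal sketch of the two un-normalization steps I mean. It assumes MAE-style per-patch target normalization, `(patch - mean) / sqrt(var + eps)`, and ImageNet mean/std for the processor step; the actual values should be taken from your BitImageProcessor config (`image_mean` / `image_std`), so treat these constants as placeholders:

```python
import numpy as np

EPS = 1e-6
# Assumed ImageNet statistics -- check your BitImageProcessor's
# image_mean / image_std for the values actually used.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])

def unnormalize_patch(pred_patch, orig_patch):
    """Reverse MAE-style per-patch target normalization.

    The model predicts (patch - mean) / sqrt(var + eps), where mean and
    var are per-patch statistics of the *original* image. The model never
    outputs them, so they have to be recomputed from the input patch --
    using wrong or mismatched statistics is one plausible source of the
    color shift.
    """
    mean = orig_patch.mean()
    var = orig_patch.var()
    return pred_patch * np.sqrt(var + EPS) + mean

def undo_processor_normalization(img):
    """Reverse the image processor's (img - mean) / std step."""
    return img * IMAGENET_STD + IMAGENET_MEAN

# Round-trip sanity check on a dummy 16x16 RGB patch.
rng = np.random.default_rng(0)
orig = rng.random((16, 16, 3))
normed = (orig - orig.mean()) / np.sqrt(orig.var() + EPS)
restored = unnormalize_patch(normed, orig)
print(np.allclose(restored, orig))  # True
```

This is only a sketch of the logic, not the exact code I run against the model outputs.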
The issue is that I can see artifacts around the patches (i.e. the patch boundaries are clearly visible), and the rest of the image, outside the patch areas, has a slightly different color…
I wonder if I am missing something in my logic that causes the artifacts around the patches, or is this expected from MAE reconstruction?
Here’s an example of an image with the bottom part of the face masked and the rest unmasked. You can see the color difference (the bottom color is closer to the real picture’s color) and the artifacts around the patches.