AICurious Logo

What is: OASIS?

SourceYou Only Need Adversarial Supervision for Semantic Image Synthesis
Year2000
Data SourceCC BY-SA - https://paperswithcode.com

OASIS is a GAN-based model to translate semantic label maps into realistic-looking images. The model builds on preceding work such as Pix2Pix and SPADE. OASIS introduces the following innovations:

  1. The method is not dependent on the perceptual loss, which is commonly used for the semantic image synthesis task. A VGG network trained on ImageNet is routinely employed as the perceptual loss to strongly improve the synthesis quality. The authors show that this perceptual loss also has negative effects: First, it reduces the diversity of the generated images. Second, it negatively influences the color distribution to be more biased towards ImageNet. OASIS eliminates the dependence on the perceptual loss by changing the common discriminator design: The OASIS discriminator segments an image into one of the real classes or an additional fake class. In doing so, it makes more efficient use of the label maps that the discriminator normally receives. This distinguishes the discriminator from the commonly used encoder-shaped discriminators, which concatenate the label maps to the input image and predict a single score per image. With the more fine-grained supervision through the loss of the OASIS discriminator, the perceptual loss is shown to become unnecessary.

  2. A user can generate a diverse set of images per label map by simply resampling noise. This is achieved by conditioning the spatially-adaptive denormalization module in each layer of the GAN generator directly on spatially replicated input noise. A side effect of this conditioning is that at inference time an image can be resampled either globally or locally (either the complete image changes or a restricted region in the image).