AICurious Logo

What is: COCO-FUNIT?

SourceCOCO-FUNIT: Few-Shot Unsupervised Image Translation with a Content Conditioned Style Encoder
Year2000
Data SourceCC BY-SA - https://paperswithcode.com

COCO-FUNIT is few-shot image translation model which computes the style embedding of the example images conditioned on the input image and a new module called the constant style bias. It builds on top of FUNIT by identifying the content loss problem and then addressing it with a novel content-conditioned style encoder architecture.

The FUNIT method suffers from the content loss problem—the translation result is not well-aligned with the input image. While a direct theoretical analysis is likely elusive, we conduct an empirical study, aiming at identify the cause of the content loss problem. In analyses, the authors show that the FUNIT style encoder produces very different style codes using different crops -- suggesting the style code contains other information about the style image such as the object pose.

To make the style embedding more robust to small variations in the style image, a new style encoder architecture, the Content-Conditioned style encoder (COCO), is introduced. The most distinctive feature of this new encoder is the conditioning in the content image as illustrated in the top-right of the Figure. Unlike the style encoder in FUNIT, COCO takes both content and style image as input. With this content-conditioning scheme, a direct feedback path is created during learning to let the content image influence how the style code is computed. It also helps reduce the direct influence of the style image to the extract style code.