What is: Bootstrap Your Own Latent?

BYOL (Bootstrap Your Own Latent) is a new approach to self-supervised learning. BYOL’s goal is to learn a representation $y_θ$ which can then be used for downstream tasks. BYOL uses two neural networks to learn: the online and target networks. The online network is defined by a set of weights $θ$ and is comprised of three stages: an encoder $f_θ$ , a projector $g_θ$ and a predictor $q_θ$ . The target network has the same architecture as the online network, but uses a different set of weights $ξ$ . The target network provides the regression targets to train the online network, and its parameters $ξ$ are an exponential moving average of the online parameters $θ$ .

Given the architecture diagram on the right, BYOL minimizes a similarity loss between $q_θ(z_θ)$ and $sg(z'{_ξ})$ , where $θ$ are the trained weights, $ξ$ are an exponential moving average of $θ$ and $sg$ means stop-gradient. At the end of training, everything but $f_θ$ is discarded, and $y_θ$ is used as the image representation.

Source: Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning

Image credit: Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning

Source	Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning
Year	2000
Data Source	CC BY-SA - https://paperswithcode.com