What is: Nouveau VAE?

Source: NVAE: A Deep Hierarchical Variational Autoencoder
Year: 2020
Data Source: CC BY-SA - https://paperswithcode.com

NVAE, or Nouveau VAE, is a deep hierarchical variational autoencoder. It can be trained with the original VAE objective, unlike alternatives such as VQ-VAE-2. NVAE’s design tackles two main challenges: (i) designing expressive neural networks specifically for VAEs, and (ii) scaling training up to a large number of hierarchical groups and large image sizes while keeping training stable.
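For reference, the "original VAE objective" for a hierarchical model can be written as the usual evidence lower bound, with the KL term split across latent groups. The form below is a sketch in assumed notation: $L$ is the number of latent groups, $\mathbf{z}_{<l}$ denotes all groups above group $l$, and $q$ and $p$ are the approximate posterior and the generative model.

```latex
\mathcal{L}(\mathbf{x}) =
  \mathbb{E}_{q(\mathbf{z}\mid\mathbf{x})}\big[\log p(\mathbf{x}\mid\mathbf{z})\big]
  - \mathrm{KL}\big(q(\mathbf{z}_1\mid\mathbf{x}) \,\|\, p(\mathbf{z}_1)\big)
  - \sum_{l=2}^{L} \mathbb{E}_{q(\mathbf{z}_{<l}\mid\mathbf{x})}
      \Big[\mathrm{KL}\big(q(\mathbf{z}_l\mid\mathbf{x},\mathbf{z}_{<l}) \,\|\, p(\mathbf{z}_l\mid\mathbf{z}_{<l})\big)\Big]
```

The first term is the reconstruction term. Each remaining term penalises the mismatch between posterior and prior for one group, so deeper hierarchies add more KL terms rather than changing the form of the objective.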

To capture long-range correlations in the data, the model employs hierarchical multi-scale modelling. The generative model starts from a small group of spatially arranged latent variables, $\mathbf{z}_1$, and samples from the hierarchy group by group while gradually doubling the spatial dimensions. This multi-scale approach enables NVAE to capture global long-range correlations at the top of the hierarchy and local fine-grained dependencies in the lower groups.
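The group-by-group sampling described above can be illustrated with a toy NumPy sketch. It is not the NVAE implementation. The function names (`upsample2x`, `sample_hierarchy`), the 4x4 starting grid, and the fixed linear rule standing in for a learned prior network are all assumptions made for illustration. For simplicity each group conditions only on the previous one, and the resolution doubles at every group.

```python
import numpy as np

def upsample2x(h):
    """Nearest-neighbour upsampling of a (C, H, W) array by a factor of 2."""
    return h.repeat(2, axis=1).repeat(2, axis=2)

def sample_hierarchy(num_groups=4, channels=8, base_size=4, seed=0):
    """Toy top-down ancestral sampling: z_1 first, then finer groups."""
    rng = np.random.default_rng(seed)
    # z_1: a small spatial grid drawn from a standard normal prior.
    z = rng.standard_normal((channels, base_size, base_size))
    latents = [z]
    for _ in range(1, num_groups):
        # Carry the coarser group to the next scale (spatial size doubles).
        context = upsample2x(latents[-1])
        # Hypothetical stand-in for a learned network that outputs the
        # prior mean and log standard deviation of the next group.
        mu = 0.5 * context
        log_sigma = np.full_like(context, -1.0)
        # Sample the next group conditioned on the coarser one.
        z = mu + np.exp(log_sigma) * rng.standard_normal(context.shape)
        latents.append(z)
    return latents

shapes = [z.shape for z in sample_hierarchy()]
print(shapes)
```

The coarse groups at the top fix global structure, and each finer group only has to add detail on top of the upsampled context, which is the intuition behind the multi-scale design.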

Additional design choices include residual cells for both the generative model and the encoder, which combine modules such as batch normalization, Swish activations, and squeeze-and-excitation to achieve good performance, and residual normal distributions, which parameterize the approximate posterior relative to the prior to smooth optimization. See the components section for more details.
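As a rough illustration of a residual normal distribution, assume the prior for one latent variable is N(mu, sigma^2) and the approximate posterior is expressed relative to it as N(mu + delta_mu, (sigma * delta_sigma)^2). Under that assumption the KL term reduces to a closed form that depends on the prior only through sigma. The sketch below (function names are hypothetical) checks this closed form against the generic Gaussian KL.

```python
import numpy as np

def residual_normal_kl(delta_mu, delta_sigma, prior_sigma):
    """KL(q || p) when q = N(mu + delta_mu, (sigma * delta_sigma)^2), p = N(mu, sigma^2)."""
    return 0.5 * (delta_mu**2 / prior_sigma**2
                  + delta_sigma**2
                  - np.log(delta_sigma**2)
                  - 1.0)

def gaussian_kl(mu_q, sigma_q, mu_p, sigma_p):
    """Generic KL between two univariate Gaussians, used as a reference."""
    return (np.log(sigma_p / sigma_q)
            + (sigma_q**2 + (mu_q - mu_p)**2) / (2.0 * sigma_p**2)
            - 0.5)

rng = np.random.default_rng(0)
mu_p = rng.standard_normal(5)                      # prior means
sigma_p = np.exp(0.1 * rng.standard_normal(5))     # prior standard deviations
delta_mu = 0.1 * rng.standard_normal(5)            # relative shift of the posterior
delta_sigma = np.exp(0.1 * rng.standard_normal(5)) # relative scale of the posterior

kl_residual = residual_normal_kl(delta_mu, delta_sigma, sigma_p)
kl_generic = gaussian_kl(mu_p + delta_mu, sigma_p * delta_sigma, mu_p, sigma_p)
print(np.allclose(kl_residual, kl_generic))
```

Because the encoder only predicts the relative terms, the posterior moves along with the prior when the prior changes, and the KL is exactly zero when delta_mu is 0 and delta_sigma is 1.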