**LeViT** is a hybrid neural network for fast-inference image classification. LeViT is a stack of [transformer blocks](https://paperswithcode.com/method/transformer), with [pooling steps](https://paperswithcode.com/methods/category/pooling-operation) that reduce the resolution of the activation maps, as in classical [convolutional architectures](https://paperswithcode.com/methods/category/convolutional-neural-networks). This replaces the uniform structure of a Transformer with a pyramid with pooling, similar to the [LeNet](https://paperswithcode.com/method/lenet) architecture.
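
To make the effect of the pyramid concrete, the sketch below tracks how the activation-map resolution, and with it the number of pairwise token interactions in self-attention, shrinks from stage to stage. The starting resolution of 14, the three stages, the stride-2 pooling, and the function names are illustrative assumptions, not values taken from the description above.

```python
def pyramid_resolutions(start, num_stages, stride=2):
    """Side length of the square activation map at each stage.

    Assumes one pooling step between consecutive stages, each
    dividing the side length by `stride` and rounding up.
    """
    sizes = [start]
    for _ in range(num_stages - 1):
        sizes.append(-(-sizes[-1] // stride))  # ceiling division
    return sizes


def attention_pairs(side):
    """Number of token pairs that self-attention compares on a side x side map."""
    tokens = side * side
    return tokens * tokens


if __name__ == "__main__":
    # Hypothetical configuration: 14x14 tokens entering the first stage.
    for side in pyramid_resolutions(14, 3):
        print(f"{side}x{side} map -> {attention_pairs(side)} token pairs")
```

Because the pairwise cost grows with the square of the token count, each pooling step makes the following transformer blocks considerably cheaper than they would be in a uniform, single-resolution stack.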

In contrast to typical GANs, a U-Net GAN uses a segmentation network as the discriminator. This segmentation network predicts two classes, real and fake, for each region of the image, so the discriminator gives the generator region-specific feedback. This discriminator design also enables a [CutMix](https://paperswithcode.com/method/cutmix)-based consistency regularization on the two-dimensional output of the U-Net GAN discriminator, which further improves image synthesis quality.
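
One way to read the consistency regularization is sketched below: a binary mask mixes a real and a fake image, and the discriminator's per-pixel output on the mixed image is penalized for deviating from the same mix of its outputs on the two source images. This is a minimal pure-Python illustration under that assumption. The helper names, the rectangular mask, and the mean-squared penalty are hypothetical choices, and the per-pixel discriminator outputs are passed in as plain nested lists rather than produced by a real network.

```python
def box_mask(height, width, top, left, box_h, box_w):
    """Binary mask that is 0 inside the box and 1 everywhere else."""
    return [
        [0 if top <= i < top + box_h and left <= j < left + box_w else 1
         for j in range(width)]
        for i in range(height)
    ]


def cutmix(a, b, mask):
    """Per-pixel mix: take `a` where mask is 1 and `b` where mask is 0."""
    return [
        [m * x + (1 - m) * y for x, y, m in zip(row_a, row_b, row_m)]
        for row_a, row_b, row_m in zip(a, b, mask)
    ]


def consistency_loss(d_mixed, d_real, d_fake, mask):
    """Mean squared gap between D(mix(real, fake)) and mix(D(real), D(fake))."""
    target = cutmix(d_real, d_fake, mask)
    diffs = [
        (p - t) ** 2
        for row_p, row_t in zip(d_mixed, target)
        for p, t in zip(row_p, row_t)
    ]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    mask = box_mask(2, 2, top=0, left=0, box_h=1, box_w=1)
    d_real = [[1.0, 1.0], [1.0, 1.0]]   # hypothetical per-pixel "real" scores
    d_fake = [[0.0, 0.0], [0.0, 0.0]]   # hypothetical per-pixel "fake" scores
    d_mixed = [[1.0, 1.0], [1.0, 1.0]]  # discriminator missed the pasted fake patch
    print(consistency_loss(d_mixed, d_real, d_fake, mask))
```

The loss is zero only when the per-pixel prediction on the mixed image matches the mask exactly, which is what makes a two-dimensional discriminator output necessary for this kind of regularization.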

U-Net GAN

A U-Net Based Discriminator for Generative Adversarial Networks

LeViT

LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference

Train a convolutional neural network to generate the contents of an arbitrary image region conditioned on its surroundings.

Source	LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference
Year	2021
Data Source	CC BY-SA - https://paperswithcode.com