
What is: LeViT?

Source: LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference
Year: 2021
Data Source: CC BY-SA - https://paperswithcode.com

LeViT is a hybrid neural network for fast-inference image classification. It is a stack of transformer blocks, with pooling steps that reduce the resolution of the activation maps, as in classical convolutional architectures. This replaces the uniform structure of a Transformer with a pyramid that uses pooling, similar to the LeNet architecture.
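The pyramid idea can be illustrated with a shape-only sketch. It is not the LeViT implementation: the function name `pyramid_stages`, the starting grid of 14, the channel widths, and the stride of 2 are all assumptions chosen for illustration. The code only tracks how the activation map shrinks when a pooling step sits between stages of transformer blocks.

```python
from math import ceil


def pyramid_stages(grid, widths, stride=2):
    """Return (side, num_tokens, channels) for each stage of a pooled pyramid.

    grid:   side length of the square activation map entering the first stage
    widths: channel width of each stage (hypothetical values)
    stride: spatial reduction applied by the pooling step between stages
    """
    stages = []
    side = grid
    for i, channels in enumerate(widths):
        if i > 0:
            # Pooling step between consecutive stages shrinks the map.
            side = ceil(side / stride)
        stages.append((side, side * side, channels))
    return stages


# Illustrative numbers only; they are assumptions, not values from the text.
stages = pyramid_stages(grid=14, widths=[256, 384, 512])
for side, tokens, channels in stages:
    print(f"{side}x{side} map -> {tokens} tokens, {channels} channels")

# A uniform Transformer corresponds to no pooling between stages (stride=1).
uniform = pyramid_stages(grid=14, widths=[256, 256, 256], stride=1)
```

With these assumed values the token count falls from 196 to 49 to 16 across the three stages, while the uniform variant keeps 196 tokens throughout. Since self-attention cost grows with the square of the token count, later stages of the pyramid are much cheaper per block.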