**LV-ViT** is a type of [vision transformer](https://paperswithcode.com/method/vision-transformer) that uses token labeling as a training objective. Unlike the standard ViT training objective, which computes the classification loss only on an additional trainable class token, token labeling uses all the image patch tokens to compute the training loss densely. Specifically, token labeling reformulates image classification as multiple token-level recognition problems and assigns each patch token an individual, location-specific supervision signal generated by a machine annotator.
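The dense objective described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the soft-label cross-entropy form, the choice to add the dense term to a class-token loss, and the `aux_weight` value are all assumptions made for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - lse for v in logits]


def soft_cross_entropy(logits, target):
    """Cross-entropy between predicted logits and a (possibly soft) label distribution."""
    logp = log_softmax(logits)
    return -sum(t * lp for t, lp in zip(target, logp))


def token_labeling_loss(cls_logits, cls_target, patch_logits, patch_targets,
                        aux_weight=0.5):
    """Class-token loss plus the mean per-patch-token loss.

    patch_targets holds one location-specific label distribution per patch,
    standing in for the output of the machine annotator.
    aux_weight is a hypothetical weighting chosen for this sketch.
    """
    cls_loss = soft_cross_entropy(cls_logits, cls_target)
    token_losses = [soft_cross_entropy(l, t)
                    for l, t in zip(patch_logits, patch_targets)]
    dense_loss = sum(token_losses) / len(token_losses)
    return cls_loss + aux_weight * dense_loss


# Two classes, two patch tokens, each patch with its own label.
loss = token_labeling_loss(
    cls_logits=[0.0, 0.0], cls_target=[1.0, 0.0],
    patch_logits=[[0.0, 0.0], [0.0, 0.0]],
    patch_targets=[[1.0, 0.0], [0.0, 1.0]],
)
```

In this sketch every patch token contributes its own term to the loss, which is the sense in which the supervision is dense rather than tied to a single class token.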

A new activation function named NIPUNA: $f(x) = \max(g(x), x)$, where $g(x) = \frac{x}{1 + e^{-\beta x}}$
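As a rough illustration, the formula can be evaluated directly for a scalar input. The sketch below is plain Python, not a reference implementation; the function name `nipuna` and the default `beta=1.0` are assumptions made here.

```python
import math


def nipuna(x: float, beta: float = 1.0) -> float:
    """Evaluate f(x) = max(g(x), x) with g(x) = x / (1 + exp(-beta * x))."""
    z = beta * x
    # Compute the sigmoid 1 / (1 + exp(-z)) without overflowing for large |z|.
    if z >= 0:
        sig = 1.0 / (1.0 + math.exp(-z))
    else:
        ez = math.exp(z)
        sig = ez / (1.0 + ez)
    g = x * sig
    return max(g, x)


positive = nipuna(2.0)   # g(x) <= x here, so the max picks x itself
negative = nipuna(-1.0)  # g(x) > x here, so the max picks g(x)
```

Because the sigmoid factor lies strictly between 0 and 1, the function returns $x$ unchanged for $x \ge 0$ and the smooth value $g(x)$ for $x < 0$.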

Nipuna

0/1 Deep Neural Networks via Block Coordinate Descent

LV-ViT

All Tokens Matter: Token Labeling for Training Better Vision Transformers

**SCARLET** is a type of convolutional neural architecture learnt by the [SCARLET-NAS](https://paperswithcode.com/method/scarlet-nas) [neural architecture search](https://paperswithcode.com/method/neural-architecture-search) method. The three variants are SCARLET-A, SCARLET-B and SCARLET-C. The basic building block is MBConvs from [MobileNetV2](https://paperswithcode.com/method/mobilenetv2). Squeeze-and-excitation layers are also experimented with.
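To make the MBConv building block more concrete, here is a small sketch that counts the convolution weights in one such block, assuming the usual MobileNetV2 layout of a 1x1 expansion, a depthwise convolution, and a 1x1 projection. The function name and arguments are hypothetical. Batch-norm parameters, biases, and any squeeze-and-excitation layer are deliberately ignored, and the example numbers are illustrative rather than taken from the SCARLET variants.

```python
def mbconv_conv_weights(c_in: int, c_out: int, expand_ratio: int,
                        kernel_size: int) -> int:
    """Count convolution weights in one MBConv block (no BN, bias, or SE)."""
    hidden = c_in * expand_ratio
    # 1x1 pointwise expansion; assumed to be skipped when expand_ratio is 1,
    # as in common MobileNetV2 implementations.
    expand = c_in * hidden if expand_ratio != 1 else 0
    # Depthwise convolution: one kernel_size x kernel_size filter per channel.
    depthwise = kernel_size * kernel_size * hidden
    # 1x1 pointwise projection back down to c_out channels.
    project = hidden * c_out
    return expand + depthwise + project


# Illustrative block: 16 -> 24 channels, expansion 6, 3x3 depthwise kernel.
weights = mbconv_conv_weights(c_in=16, c_out=24, expand_ratio=6, kernel_size=3)
```

Varying `expand_ratio` and `kernel_size` in a sketch like this shows how the cost of a single block changes with its configuration.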

Source	All Tokens Matter: Token Labeling for Training Better Vision Transformers
Year	2021
Data Source	CC BY-SA - https://paperswithcode.com