
What is: Attention with Linear Biases?

Source: Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Year: 2021
Data Source: CC BY-SA - https://paperswithcode.com

ALiBi, or Attention with Linear Biases, is a positioning method that allows Transformer language models to consume, at inference time, sequences which are longer than the ones they were trained on.

ALiBi does this without using actual position embeddings. Instead, when computing the attention between a key and a query, ALiBi penalizes the attention score that the query can assign to the key based on how far apart they are: when the key and query are close, the penalty is very low, and when they are far apart, the penalty is high.
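This penalty can be sketched as an additive linear bias on the pre-softmax attention scores, scaled by a per-head slope. Below is a minimal PyTorch sketch, assuming a causal (decoder-style) setup; the function names `alibi_slopes` and `alibi_attention` are illustrative, not from the paper, and the slope formula follows the geometric sequence the paper describes for power-of-two head counts.

```python
import math
import torch

def alibi_slopes(num_heads: int) -> torch.Tensor:
    # Geometric sequence of head-specific slopes: for n heads, the ratio is
    # 2^(-8/n), giving e.g. 1/2, 1/4, ..., 1/256 for 8 heads.
    ratio = 2 ** (-8.0 / num_heads)
    return torch.tensor([ratio ** (i + 1) for i in range(num_heads)])

def alibi_attention(q, k, v, slopes):
    # q, k, v: (num_heads, seq_len, head_dim); slopes: (num_heads,)
    num_heads, seq_len, head_dim = q.shape
    scores = q @ k.transpose(-2, -1) / math.sqrt(head_dim)  # (heads, seq, seq)

    # Linear bias: penalty grows with the query-key distance, scaled per head.
    positions = torch.arange(seq_len)
    distance = (positions[None, :] - positions[:, None]).abs()  # |i - j|
    bias = -slopes[:, None, None] * distance[None, :, :]        # (heads, seq, seq)

    # Causal mask so each query only attends to itself and earlier keys.
    causal = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
    scores = (scores + bias).masked_fill(~causal, float("-inf"))

    return torch.softmax(scores, dim=-1) @ v
```

Because the bias depends only on the query-key distance and not on absolute positions, the same computation applies unchanged to sequences longer than those seen during training.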

This method was motivated by the simple reasoning that words that are close by matter much more than ones that are far away.

This method is as fast as the sinusoidal and absolute position embedding methods (the fastest positioning methods available). It outperforms those methods, as well as Rotary embeddings, when evaluating sequences longer than the ones the model was trained on (a setting known as extrapolation).