
What is: Residual Block?

Source: Deep Residual Learning for Image Recognition
Year: 2015
Data Source: CC BY-SA - https://paperswithcode.com

Residual Blocks are skip-connection blocks that learn residual functions with reference to the layer inputs, instead of learning unreferenced functions. They were introduced as part of the ResNet architecture.

Formally, denoting the desired underlying mapping as $\mathcal{H}(x)$, we let the stacked nonlinear layers fit another mapping $\mathcal{F}(x) := \mathcal{H}(x) - x$. The original mapping is then recast as $\mathcal{F}(x) + x$. Since $\mathcal{F}(x)$ acts as a residual, the block is called a 'residual block'.
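To make the $\mathcal{F}(x) + x$ formulation concrete, here is a minimal sketch of a basic residual block in PyTorch. The two-convolution structure follows the basic block of the ResNet paper, but the specific layer choices (channel count, BatchNorm placement) are illustrative, not a definitive reproduction:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block: output = F(x) + x, with F as two 3x3 conv layers."""

    def __init__(self, channels: int):
        super().__init__()
        # F(x): two stacked conv-BN layers form the residual function
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = self.relu(self.bn1(self.conv1(x)))  # first nonlinear layer
        residual = self.bn2(self.conv2(residual))      # completes F(x)
        return self.relu(residual + x)                 # F(x) + x, then ReLU
```

Because $\mathcal{F}(x)$ preserves the input shape here, the elementwise addition with $x$ is valid, and if the optimal mapping is near-identity the weights only need to push $\mathcal{F}(x)$ toward zero.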

The intuition is that it is easier to optimize the residual mapping than the original, unreferenced mapping. In the extreme case, if an identity mapping were optimal, it would be easier to push the residual to zero than to fit an identity mapping with a stack of nonlinear layers. Skip connections therefore let the network learn identity-like mappings more easily.

Note that in practice, Bottleneck Residual Blocks are used in deeper ResNets such as ResNet-50 and ResNet-101, since the bottleneck design is less computationally expensive.
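As a sketch of why the bottleneck variant is cheaper, the block below first shrinks the channel count with a 1x1 convolution, applies the expensive 3x3 convolution at the reduced width, then expands back with another 1x1 convolution: the 1x1-3x3-1x1 structure used in ResNet-50/101. The `bottleneck` width parameter name is hypothetical, and this is a minimal illustration rather than the exact library implementation:

```python
import torch
import torch.nn as nn

class BottleneckBlock(nn.Module):
    """Bottleneck residual block: 1x1 reduce -> 3x3 -> 1x1 restore, plus skip."""

    def __init__(self, channels: int, bottleneck: int):
        super().__init__()
        # 1x1 conv reduces channels so the 3x3 conv operates at lower width
        self.reduce = nn.Conv2d(channels, bottleneck, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(bottleneck)
        self.conv = nn.Conv2d(bottleneck, bottleneck, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(bottleneck)
        # 1x1 conv restores the channel count so the skip addition stays valid
        self.restore = nn.Conv2d(bottleneck, channels, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.bn1(self.reduce(x)))
        out = self.relu(self.bn2(self.conv(out)))
        out = self.bn3(self.restore(out))  # F(x), computed at reduced width
        return self.relu(out + x)          # same skip connection as before
```

With, say, `channels=256` and `bottleneck=64`, the 3x3 convolution runs on 64 channels instead of 256, which is where the computational savings come from.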