What is: Self-Calibrated Convolutions?

Liu et al. presented self-calibrated convolution as a means to enlarge the receptive field at each spatial location.

Self-calibrated convolution is used together with a standard convolution. The input feature $X$ is first divided into $X_{1}$ and $X_{2}$ along the channel dimension. The self-calibrated branch then applies average pooling to $X_{1}$ to reduce its spatial size and enlarge the receptive field:

\begin{align} T_{1} = AvgPool_{r}(X_{1}) \end{align}

where $r$ is both the pooling filter size and the stride. A convolution then models the channel relationship, and a bilinear interpolation operator $\text{Up}$ upsamples the feature map back to the resolution of $X_{1}$:

\begin{align} X'_{1} = \text{Up}(Conv_2(T_1)) \end{align}

Next, element-wise multiplication completes the self-calibration, where $\sigma$ is the sigmoid function:

\begin{align} Y'_{1} = Conv_3(X_1) \cdot \sigma(X_1 + X'_1) \end{align}

Finally, the output feature map is formed by concatenating the two branches along the channel dimension:

\begin{align} Y_{1} &= Conv_4(Y'_{1}) \\ Y_2 &= Conv_1(X_2) \\ Y &= [Y_1; Y_2] \end{align}

Self-calibrated convolution can thus enlarge the receptive field of a network and improve its adaptability. Liu et al. report strong results in image classification and in downstream tasks such as instance segmentation, object detection and keypoint detection.
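The steps above can be sketched in NumPy for a single feature map of shape (channels, height, width). This is a minimal illustration, not the authors' implementation. The function names, the 3x3 kernel size with zero padding, the equal channel split, the half-pixel convention used for bilinear upsampling, and the requirement that height and width be divisible by $r$ are all assumptions made here; batch normalization, biases and any other details of a full network layer are left out.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv3x3(x, w):
    """Stride-1, zero-padded 3x3 convolution (cross-correlation, as in deep learning).
    x: (C_in, H, W), w: (C_out, C_in, 3, 3). The 3x3 size is an assumption."""
    _, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((w.shape[0], h, wd))
    for i in range(3):
        for j in range(3):
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], xp[:, i:i + h, j:j + wd])
    return out

def avg_pool(x, r):
    """Average pooling with filter size r and stride r (H and W divisible by r)."""
    c, h, w = x.shape
    return x.reshape(c, h // r, r, w // r, r).mean(axis=(2, 4))

def upsample_bilinear(x, r):
    """Bilinear upsampling by a factor r (half-pixel convention, an assumption)."""
    _, h, w = x.shape
    ys = np.clip((np.arange(h * r) + 0.5) / r - 0.5, 0, h - 1)
    xs = np.clip((np.arange(w * r) + 0.5) / r - 0.5, 0, w - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy, wx = ys - y0, xs - x0
    top = x[:, y0][:, :, x0] * (1 - wx) + x[:, y0][:, :, x1] * wx
    bot = x[:, y1][:, :, x0] * (1 - wx) + x[:, y1][:, :, x1] * wx
    return top * (1 - wy)[:, None] + bot * wy[:, None]

def self_calibrated_conv(x, k1, k2, k3, k4, r=4):
    """x: (C, H, W) with even C; k1..k4: (C/2, C/2, 3, 3) kernels for Conv_1..Conv_4."""
    c = x.shape[0] // 2
    x1, x2 = x[:c], x[c:]                           # split along channels
    t1 = avg_pool(x1, r)                            # T_1 = AvgPool_r(X_1)
    x1_up = upsample_bilinear(conv3x3(t1, k2), r)   # X'_1 = Up(Conv_2(T_1))
    y1_cal = conv3x3(x1, k3) * sigmoid(x1 + x1_up)  # Y'_1 = Conv_3(X_1) * sigma(X_1 + X'_1)
    y1 = conv3x3(y1_cal, k4)                        # Y_1 = Conv_4(Y'_1)
    y2 = conv3x3(x2, k1)                            # Y_2 = Conv_1(X_2)
    return np.concatenate([y1, y2], axis=0)         # Y = [Y_1; Y_2]

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16))
k1, k2, k3, k4 = (0.1 * rng.standard_normal((4, 4, 3, 3)) for _ in range(4))
y = self_calibrated_conv(x, k1, k2, k3, k4, r=4)
print(y.shape)  # (8, 16, 16)
```

The output has the same shape as the input, so under these assumptions the operation can replace a standard convolution with matching channel counts. Each position of the pooled map $T_1$ summarizes an $r \times r$ region of $X_1$, which is why the gate $\sigma(X_1 + X'_1)$ carries context from a wider area than a single convolution on $X_1$ would see.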

Source	Improving Convolutional Networks With Self-Calibrated Convolutions
Year	2020
Data Source	CC BY-SA - https://paperswithcode.com