What is: Channel Attention Module?

Source: CBAM: Convolutional Block Attention Module
Year: 2018
Data Source: CC BY-SA - https://paperswithcode.com

A Channel Attention Module is a module for channel-based attention in convolutional neural networks. We produce a channel attention map by exploiting the inter-channel relationship of features. As each channel of a feature map is considered a feature detector, channel attention focuses on ‘what’ is meaningful given an input image. To compute the channel attention efficiently, we squeeze the spatial dimension of the input feature map.

We first aggregate spatial information of a feature map by using both average-pooling and max-pooling operations, generating two different spatial context descriptors: $\mathbf{F}^{c}_{avg}$ and $\mathbf{F}^{c}_{max}$, which denote average-pooled features and max-pooled features respectively.
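
For intuition, here is a minimal PyTorch sketch of this squeeze step (a sketch assuming a standard `N×C×H×W` tensor layout; the tensor shapes and variable names are illustrative, not taken from the paper):

```python
import torch
import torch.nn.functional as F

# Toy input feature map F: batch of 2, C = 64 channels, 32x32 spatial size.
feat = torch.randn(2, 64, 32, 32)

# Squeeze the spatial dimensions down to 1x1, giving two C x 1 x 1 descriptors.
f_avg = F.adaptive_avg_pool2d(feat, 1)  # average-pooled descriptor F^c_avg
f_max = F.adaptive_max_pool2d(feat, 1)  # max-pooled descriptor F^c_max

print(f_avg.shape, f_max.shape)  # both: torch.Size([2, 64, 1, 1])
```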

Both descriptors are then forwarded to a shared network to produce our channel attention map $\mathbf{M}_{c} \in \mathbb{R}^{C\times 1\times 1}$. Here $C$ is the number of channels. The shared network is composed of a multi-layer perceptron (MLP) with one hidden layer. To reduce parameter overhead, the hidden activation size is set to $\mathbb{R}^{C/r\times 1\times 1}$, where $r$ is the reduction ratio. After the shared network is applied to each descriptor, we merge the output feature vectors using element-wise summation. In short, the channel attention is computed as:

$$\mathbf{M}_{c}\left(\mathbf{F}\right) = \sigma\left(\text{MLP}\left(\text{AvgPool}\left(\mathbf{F}\right)\right)+\text{MLP}\left(\text{MaxPool}\left(\mathbf{F}\right)\right)\right)$$

$$\mathbf{M}_{c}\left(\mathbf{F}\right) = \sigma\left(\mathbf{W}_{1}\left(\mathbf{W}_{0}\left(\mathbf{F}^{c}_{avg}\right)\right)+\mathbf{W}_{1}\left(\mathbf{W}_{0}\left(\mathbf{F}^{c}_{max}\right)\right)\right)$$

where $\sigma$ denotes the sigmoid function, $\mathbf{W}_{0} \in \mathbb{R}^{C/r\times C}$, and $\mathbf{W}_{1} \in \mathbb{R}^{C\times C/r}$. Note that the MLP weights, $\mathbf{W}_{0}$ and $\mathbf{W}_{1}$, are shared for both inputs, and $\mathbf{W}_{0}$ is followed by the ReLU activation function.
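
Putting the pieces together, a minimal PyTorch sketch of the module could look as follows (the class name, the default reduction ratio, and the use of 1×1 convolutions in place of fully connected layers are illustrative choices, not details fixed by the text above):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelAttention(nn.Module):
    """Sketch of the channel attention map M_c described above."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Shared MLP: W_0 reduces C -> C/r, ReLU, then W_1 restores C/r -> C.
        # 1x1 convolutions act as fully connected layers on C x 1 x 1 inputs.
        self.w0 = nn.Conv2d(channels, channels // reduction, kernel_size=1, bias=False)
        self.w1 = nn.Conv2d(channels // reduction, channels, kernel_size=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Two C x 1 x 1 descriptors from average- and max-pooling.
        avg_desc = F.adaptive_avg_pool2d(x, 1)
        max_desc = F.adaptive_max_pool2d(x, 1)
        # The same weights (w0, w1) are applied to both descriptors.
        avg_out = self.w1(F.relu(self.w0(avg_desc)))
        max_out = self.w1(F.relu(self.w0(max_desc)))
        # Element-wise sum, then sigmoid -> attention map M_c in R^{C x 1 x 1}.
        return torch.sigmoid(avg_out + max_out)

# Usage: rescale the input feature map channel-wise with the attention map.
x = torch.randn(2, 64, 32, 32)
attention = ChannelAttention(channels=64)
refined = attention(x) * x  # broadcasts over the spatial dimensions
```

Dropping the max-pooling branch from this sketch recovers the Squeeze-and-Excitation formulation noted below.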

Note that the channel attention module with just average pooling is the same as the Squeeze-and-Excitation Module.