What is: Global second-order pooling convolutional networks?

A GSoP block consists of a squeeze module and an excitation module, and uses second-order pooling to model high-order statistics while gathering global information. In the squeeze module, the block first reduces the number of channels from $c$ to $c'$ ($c' < c$) using a $1 \times 1$ convolution, then computes a $c' \times c'$ covariance matrix over the reduced channels to capture their pairwise correlations. Next, row-wise normalization is applied to the covariance matrix. Each entry $(i, j)$ of the normalized covariance matrix explicitly relates channel $i$ to channel $j$.
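
The squeeze steps above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the function name `gsop_squeeze`, the weight matrix `w_reduce`, and the tensor sizes are hypothetical, and because the text does not say which row-wise normalization is used, scaling each row to unit L2 norm is an assumption made only for this example.

```python
import numpy as np

def gsop_squeeze(x, w_reduce, eps=1e-8):
    """Sketch of the squeeze module.

    x:        (c, h, w) input feature map
    w_reduce: (c_red, c) weights of the 1x1 convolution, with c_red < c
    Returns a (c_red, c_red) row-normalized covariance matrix.
    """
    c, h, w = x.shape
    # A 1x1 convolution is a per-pixel linear map over the channel axis.
    z = w_reduce @ x.reshape(c, h * w)            # (c_red, h*w)
    # Channel covariance, treating every spatial position as one sample.
    z_centered = z - z.mean(axis=1, keepdims=True)
    cov = z_centered @ z_centered.T / (h * w)     # (c_red, c_red)
    # ASSUMPTION: row-wise normalization = scale each row to unit L2 norm.
    row_norm = np.linalg.norm(cov, axis=1, keepdims=True)
    return cov / (row_norm + eps)

# Hypothetical sizes: c = 8 input channels reduced to c' = 4.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 5, 5))
w_reduce = rng.standard_normal((4, 8))
cov_n = gsop_squeeze(x, w_reduce)
```

Row $i$ of `cov_n` then describes how reduced channel $i$ co-varies with every other reduced channel, which is the structure the excitation module operates on.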

In the excitation module, a GSoP block applies a row-wise convolution to the normalized covariance matrix, which preserves its structural information and outputs a vector. A fully connected layer followed by a sigmoid function then produces a $c$-dimensional attention vector. Finally, the input features are multiplied channel-wise by the attention vector, as in an SE block. A GSoP block can be formulated as:

\begin{align} s = F_\text{gsop}(X, \theta) & = \sigma (W \text{RC}(\text{Cov}(\text{Conv}(X)))) \end{align}

\begin{align} Y & = s X \end{align}

Here, $\text{Conv}(\cdot)$ reduces the number of channels, $\text{Cov}(\cdot)$ computes the covariance matrix, $\text{RC}(\cdot)$ denotes row-wise convolution, $W$ is the weight matrix of the fully connected layer, and $\sigma$ is the sigmoid function.
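
The excitation step and the final rescaling can be sketched in the same spirit. This is a simplified illustration under stated assumptions rather than the published architecture: the names `gsop_excite`, `w_row`, and `w_fc` are hypothetical, and the row-wise convolution is modeled here as one kernel per row that spans the full row and yields a single scalar, which may differ from the layer configuration used in the paper.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def gsop_excite(x, cov_n, w_row, w_fc):
    """Sketch of the excitation module and channel rescaling.

    x:     (c, h, w) input feature map
    cov_n: (c_red, c_red) row-normalized covariance matrix
    w_row: (c_red, c_red) one kernel per row (ASSUMED form of row-wise convolution)
    w_fc:  (c, c_red) weights of the fully connected layer
    Returns the rescaled features Y and the attention vector s.
    """
    # Row-wise convolution: each row is combined with its own kernel,
    # so every row contributes one entry of the output vector.
    v = np.sum(w_row * cov_n, axis=1)     # (c_red,)
    # Fully connected layer plus sigmoid gives a c-dimensional attention vector.
    s = sigmoid(w_fc @ v)                 # (c,)
    # Y = s X: scale each input channel by its attention weight.
    y = s[:, None, None] * x
    return y, s

# Hypothetical sizes: c = 8 channels, c' = 4 reduced channels.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 5, 5))
cov_n = rng.standard_normal((4, 4))       # stand-in for a squeeze-module output
w_row = rng.standard_normal((4, 4))
w_fc = rng.standard_normal((8, 4))
y, s = gsop_excite(x, cov_n, w_row, w_fc)
```

Because every entry of `s` lies strictly between 0 and 1, the block can only attenuate channels relative to the input, which matches the gating behavior of an SE block.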

Source	Global Second-order Pooling Convolutional Networks
Year	2018
Data Source	CC BY-SA - https://paperswithcode.com