**DropConnect** generalizes [Dropout](https://paperswithcode.com/method/dropout) by randomly dropping the weights rather than the activations with probability $1-p$. DropConnect is similar to Dropout as it introduces dynamic sparsity within the model, but differs in that the sparsity is on the weights $W$, rather than the output vectors of a layer. In other words, the fully connected layer with DropConnect becomes a sparsely connected layer in which the connections are chosen at random during the training stage. Note that this is not equivalent to setting $W$ to be a fixed sparse matrix during training.

For a DropConnect layer, the output is given as:

$$ r = a \left(\left(M * W\right){v}\right)$$

Here $r$ is the output of a layer, $v$ is the input to a layer, $W$ are weight parameters, and $M$ is a binary matrix encoding the connection information where $M\_{ij} \sim \text{Bernoulli}\left(p\right)$. Each element of the mask $M$ is drawn independently for each example during training, essentially instantiating a different connectivity for each example seen. Additionally, the biases are also masked out during training.

**TDN**, or **Temporaral Difference Network**, is an action recognition model that aims to capture multi-scale temporal information. To fully capture temporal information over the entire video, the TDN is established with a two-level difference modeling paradigm. Specifically, for local motion modeling, temporal difference over consecutive frames is used to supply 2D CNNs with finer motion pattern, while for global motion modeling, temporal difference across segments is incorporated to capture long-range structure for motion feature excitation.

TDN: Temporal Difference Networks for Efficient Action Recognition

DropConnect

Regularization of Neural Networks using DropConnect

**Asynchronous Interaction Aggregation**, or **AIA**, is a network that leverages different interactions to boost action detection. There are two key designs in it: one is the Interaction Aggregation structure (IA) adopting a uniform paradigm to model and integrate multiple types of interaction; the other is the Asynchronous Memory Update algorithm (AMU) that enables us to achieve better performance by modeling very long-term interaction dynamically.

Source	Regularization of Neural Networks using DropConnect
Year	2000
Data Source	CC BY-SA - https://paperswithcode.com