
What is: Cosine Normalization?

Source: Cosine Normalization: Using Cosine Similarity Instead of Dot Product in Neural Networks
Year: 2017
Data Source: CC BY-SA, https://paperswithcode.com

Multi-layer neural networks traditionally use the dot product between the output vector of the previous layer and the incoming weight vector as the input to the activation function. The result of this dot product is unbounded. To bound the pre-activation and decrease its variance, Cosine Normalization uses cosine similarity or centered cosine similarity (the Pearson correlation coefficient) in place of the dot product.
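
The contrast is easy to check numerically. Below is a minimal NumPy sketch (the vectors are arbitrary illustrations, not from the paper): scaling the inputs scales the dot product without limit, while the cosine stays in [-1, 1].

```python
import numpy as np

w = np.array([2.0, -1.0, 0.5])   # incoming weight vector
x = np.array([4.0, 3.0, -2.0])   # input vector

dot = np.dot(w, x)                                   # unbounded: grows with vector scale
cos = dot / (np.linalg.norm(w) * np.linalg.norm(x))  # bounded: always in [-1, 1]
print(dot, cos)

# Scaling w by 10 scales the dot product 10x; the cosine is unchanged,
# because both the numerator and the norm of w scale by the same factor.
cos_scaled = np.dot(10 * w, x) / (np.linalg.norm(10 * w) * np.linalg.norm(x))
print(np.dot(10 * w, x), cos_scaled)
```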

Using cosine normalization, the output of a hidden unit is computed by:

$$o = f(net_{norm}) = f(\cos \theta) = f\left(\frac{\vec{w} \cdot \vec{x}}{\left|\vec{w}\right| \left|\vec{x}\right|}\right)$$

where $net_{norm}$ is the normalized pre-activation, $\vec{w}$ is the incoming weight vector, $\vec{x}$ is the input vector, $\cdot$ indicates the dot product, and $f$ is a nonlinear activation function. Cosine normalization bounds the pre-activation between -1 and 1.
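
As a concrete reference, here is a minimal sketch of a single cosine-normalized hidden unit in NumPy. The function name, the choice of np.tanh as the activation, and the eps guard against zero-norm vectors are illustrative assumptions, not part of the original formula:

```python
import numpy as np

def cosine_norm_unit(w, x, f=np.tanh, eps=1e-8):
    """Output of one hidden unit under cosine normalization.

    The unbounded dot product w . x is replaced by the cosine of the
    angle between w and x, so net_norm lies in [-1, 1] before the
    activation f is applied. eps (our addition) avoids division by
    zero when either vector has zero norm.
    """
    net_norm = np.dot(w, x) / (np.linalg.norm(w) * np.linalg.norm(x) + eps)
    return f(net_norm)

# The pre-activation stays in [-1, 1] even for large-magnitude vectors.
print(cosine_norm_unit(np.array([100.0, -50.0, 25.0]), np.array([3.0, 8.0, -1.0])))
```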