
What is: Ternary Weight Splitting?

Source: BinaryBERT: Pushing the Limit of BERT Quantization
Year: 2021
Data Source: CC BY-SA - https://paperswithcode.com

Ternary Weight Splitting is a ternarization approach used in BinaryBERT that exploits the flatness of the ternary loss landscape as an optimization proxy for the binary model. We first train a half-sized ternary BERT to convergence, and then split both the latent full-precision weights $\mathbf{w}^{t}$ and the quantized weights $\hat{\mathbf{w}}^{t}$ into their binary counterparts $\mathbf{w}_{1}^{b}, \mathbf{w}_{2}^{b}$ and $\hat{\mathbf{w}}_{1}^{b}, \hat{\mathbf{w}}_{2}^{b}$ via the TWS operator. To inherit the performance of the ternary model after splitting, the TWS operator requires splitting equivalency (i.e., the same output given the same input):

$$\mathbf{w}^{t}=\mathbf{w}_{1}^{b}+\mathbf{w}_{2}^{b}, \quad \hat{\mathbf{w}}^{t}=\hat{\mathbf{w}}_{1}^{b}+\hat{\mathbf{w}}_{2}^{b}$$

While the solution to the above equation is not unique, we constrain the latent full-precision weights after splitting, $\mathbf{w}_{1}^{b}$ and $\mathbf{w}_{2}^{b}$, to satisfy $\mathbf{w}^{t}=\mathbf{w}_{1}^{b}+\mathbf{w}_{2}^{b}$. See the paper for more details.
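The constraint on the latent weights can be illustrated with a minimal NumPy sketch. This is not the paper's exact TWS operator (which fixes the split via closed-form coefficients so that the quantized constraint $\hat{\mathbf{w}}^{t}=\hat{\mathbf{w}}_{1}^{b}+\hat{\mathbf{w}}_{2}^{b}$ also holds); the splitting coefficient `a` below is a hypothetical choice, showing that many splits satisfy the full-precision half of the equivalency:

```python
import numpy as np

rng = np.random.default_rng(0)
w_t = rng.standard_normal(8)   # latent full-precision weights of the ternary model

# Hypothetical split: scale each weight by `a` for the first binary branch
# and by (1 - a) for the second, so their sum reproduces w_t exactly.
a = 0.6
w1 = a * w_t
w2 = (1.0 - a) * w_t

# Splitting equivalency on the latent weights: w_t == w1 + w2
assert np.allclose(w1 + w2, w_t)
```

Because any `a` satisfies this latent constraint, the split is underdetermined; the paper's TWS operator resolves this freedom by additionally requiring the quantized (binarized) branches to sum to the ternary quantized weights.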