
What is: Kernel Activation Function?

Source: Kafnets: kernel-based non-parametric activation functions for neural networks
Year: 2019
Data Source: CC BY-SA - https://paperswithcode.com

A Kernel Activation Function is a non-parametric activation function defined as a one-dimensional kernel approximator:

f(s) = \sum_{i=1}^D \alpha_i \kappa(s, d_i)

where:

  1. The dictionary of kernel elements $d_1, \ldots, d_D$ is fixed by sampling the $x$-axis with a uniform step around 0.
  2. The user selects the kernel function (e.g., Gaussian, ReLU, Softplus) and the number of kernel elements $D$ as a hyper-parameter. A larger dictionary leads to more expressive activation functions and a larger number of trainable parameters.
  3. The linear coefficients $\alpha_i$ are adapted independently at every neuron via standard back-propagation.
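The expansion above can be sketched in a few lines of NumPy. This is a minimal illustration, not the reference implementation: the dictionary range, size, and the rule for the Gaussian bandwidth are assumptions chosen for the example.

```python
import numpy as np

D = 20                                  # dictionary size (hyper-parameter)
d = np.linspace(-3.0, 3.0, D)           # fixed dictionary: uniform grid around 0
gamma = 1.0 / (2 * (d[1] - d[0]) ** 2)  # assumed Gaussian bandwidth from the grid step
alpha = 0.1 * np.random.randn(D)        # trainable mixing coefficients (one set per neuron)

def kaf(s):
    """Elementwise f(s) = sum_i alpha_i * exp(-gamma * (s - d_i)^2)."""
    K = np.exp(-gamma * (s[..., None] - d) ** 2)  # kernel values, shape (..., D)
    return K @ alpha

out = kaf(np.array([-1.0, 0.0, 1.0]))   # activation applied to three pre-activations
```

In a network, `alpha` would be a trainable parameter per neuron, updated by back-propagation alongside the weights.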

In addition, the linear coefficients can be initialized using kernel ridge regression so that the activation behaves similarly to a known function at the beginning of the optimization process.
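This initialization amounts to solving a small regularized linear system on the dictionary points. The sketch below fits the coefficients so the KAF starts close to tanh; the target function, regularization value `eps`, and bandwidth rule are assumptions for illustration.

```python
import numpy as np

D = 20
d = np.linspace(-3.0, 3.0, D)           # fixed dictionary on a uniform grid
gamma = 1.0 / (2 * (d[1] - d[0]) ** 2)  # assumed Gaussian bandwidth

t = np.tanh(d)                              # target values on the dictionary points
K = np.exp(-gamma * (d[:, None] - d) ** 2)  # D x D Gram matrix of the Gaussian kernel
eps = 1e-4                                  # assumed ridge regularization
alpha = np.linalg.solve(K + eps * np.eye(D), t)  # (K + eps*I)^-1 t

# The initialized activation now approximates tanh inside the sampled range.
approx = np.exp(-gamma * (0.5 - d) ** 2) @ alpha  # close to tanh(0.5)
```

After this initialization, training proceeds normally: back-propagation moves `alpha` away from the tanh-like starting shape as needed.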