What is: Gumbel Cross Entropy?

Source: Long-tailed Instance Segmentation using Gumbel Optimized Loss
Year: 2000
Data Source: CC BY-SA - https://paperswithcode.com

The Gumbel activation function is defined using the cumulative Gumbel distribution and can be used to perform Gumbel regression. It is an alternative to the sigmoid and softmax activation functions and transforms the unnormalised output of a model into a probability. The Gumbel activation $\eta_{Gumbel}$ is defined as follows:

$$\eta_{Gumbel}(q_i) = \exp(-\exp(-q_i))$$
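To make the shape of this function concrete, the following is a minimal sketch in plain Python. The function name `gumbel_activation` and the overflow guard at `-700` are illustrative assumptions, not part of the source; the sigmoid is included only for comparison.

```python
import math


def gumbel_activation(q: float) -> float:
    """Cumulative Gumbel distribution applied to a logit q: exp(-exp(-q))."""
    # Assumption: guard against overflow in exp(-q) for very negative q.
    # For q <= -700 the result is 0.0 in double precision anyway.
    q = max(q, -700.0)
    return math.exp(-math.exp(-q))


def sigmoid(q: float) -> float:
    """Standard logistic sigmoid, shown only for comparison."""
    return 1.0 / (1.0 + math.exp(-q))


if __name__ == "__main__":
    for q in (-2.0, 0.0, 2.0):
        print(f"q={q:+.1f}  gumbel={gumbel_activation(q):.4f}  sigmoid={sigmoid(q):.4f}")
```

Unlike the sigmoid, which returns 0.5 at `q = 0`, the Gumbel activation returns `exp(-1)` (about 0.368) there, so the curve is asymmetric around zero.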

The Gumbel activation can be combined with the cross-entropy loss function to solve long-tailed classification problems. Gumbel Cross Entropy (GCE) is defined as follows:

$$GCE(\eta_{Gumbel}(q_i), y_i) = -\left[\, y_i \log(\eta_{Gumbel}(q_i)) + (1 - y_i) \log(1 - \eta_{Gumbel}(q_i)) \,\right]$$
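As a rough illustration of the loss, here is a minimal per-element sketch in plain Python. It assumes the standard binary cross-entropy sign convention (both terms negated). The names `gumbel_activation` and `gumbel_cross_entropy`, and the clamping constant `eps`, are hypothetical choices for this example rather than anything specified in the source.

```python
import math


def gumbel_activation(q: float) -> float:
    """Cumulative Gumbel distribution applied to a logit q: exp(-exp(-q))."""
    q = max(q, -700.0)  # assumption: avoid overflow in exp(-q)
    return math.exp(-math.exp(-q))


def gumbel_cross_entropy(q: float, y: float, eps: float = 1e-12) -> float:
    """Binary cross entropy between target y in {0, 1} and the Gumbel probability of q."""
    p = gumbel_activation(q)
    # Assumption: clamp the probability so that log() never receives 0.
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


if __name__ == "__main__":
    for q, y in [(3.0, 1.0), (-3.0, 1.0), (3.0, 0.0), (-3.0, 0.0)]:
        print(f"q={q:+.1f}  y={y:.0f}  loss={gumbel_cross_entropy(q, y):.4f}")
```

For a positive target, the loss reduces to `exp(-q)` because `log(exp(-exp(-q))) = -exp(-q)`, so at `q = 0` the loss is 1. A batched implementation in a deep learning framework would apply the same expression element-wise and then sum or average.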