What is: Mixture of Softmaxes?

Source: Breaking the Softmax Bottleneck: A High-Rank RNN Language Model
Year: 2017
Data Source: CC BY-SA - https://paperswithcode.com

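Mixture of Softmaxes (MoS) computes K different softmax distributions over the vocabulary and mixes them with context-dependent weights. The motivation is that the standard softmax suffers from a softmax bottleneck: its logits are a dot product between a context vector and the word embeddings, so the matrix of log-probabilities it can produce has a rank bounded by the embedding dimension, which constrains the conditional probabilities the model can express. Because MoS averages the K softmaxes in probability space rather than in logit space, the log of the mixture is not subject to the same low-rank constraint, and the conditional probability can be modeled more expressively.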
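As a rough illustration of the mixing step, here is a minimal NumPy sketch for a single context. It assumes the parameterization commonly associated with the source paper: each component applies its own tanh projection to the context vector, and the mixture weights come from a separate softmax over K scores. The names `mixture_of_softmaxes`, `W_prior`, `W_proj`, and `E` are hypothetical and chosen only for this example; the shapes and random inputs are placeholders, not values from the paper.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def mixture_of_softmaxes(g, W_prior, W_proj, E):
    """Next-token distribution for one context (illustrative sketch).

    g       : (d,)      context vector, e.g. an RNN hidden state
    W_prior : (K, d)    assumed weights producing the K mixture scores
    W_proj  : (K, d, d) assumed per-component projection matrices
    E       : (V, d)    output word embeddings shared by all components
    """
    pi = softmax(W_prior @ g)               # (K,) mixture weights, sum to 1
    h = np.tanh(W_proj @ g)                 # (K, d) one projected context per component
    components = softmax(h @ E.T, axis=-1)  # (K, V) one softmax per component
    return pi @ components                  # (V,) weighted average of probabilities


# Example usage with random placeholder parameters.
rng = np.random.default_rng(0)
d, K, V = 8, 3, 20
probs = mixture_of_softmaxes(
    rng.normal(size=d),
    rng.normal(size=(K, d)),
    rng.normal(size=(K, d, d)),
    rng.normal(size=(V, d)),
)
print(probs.shape, probs.sum())
```

The averaging happens after each softmax has been applied. Averaging the logits first would collapse back to a single softmax and would keep the same rank limit.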