
What is: MPNet?

Source: MPNet: Masked and Permuted Pre-training for Language Understanding
Year: 2020
Data Source: CC BY-SA - https://paperswithcode.com

MPNet is a pre-training method for language models that combines masked language modeling (MLM) and permuted language modeling (PLM) in one unified view. Through permuted language modeling, it takes the dependency among the predicted tokens into consideration, and thus avoids the issue of BERT's MLM, which neglects this dependency. At the same time, it takes the position information of all tokens as input, so the model sees the positions of the full sentence, which alleviates the position discrepancy of XLNet.
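For readers who just want to use the resulting model, a pre-trained MPNet encoder is available through the Hugging Face transformers library. The sketch below is a minimal usage example, assuming that library (and PyTorch) is installed and using the publicly available microsoft/mpnet-base checkpoint:

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Load the pre-trained MPNet encoder (assumes the `transformers` and
# `torch` packages are installed; "microsoft/mpnet-base" is the public
# checkpoint on the Hugging Face Hub).
tokenizer = AutoTokenizer.from_pretrained("microsoft/mpnet-base")
model = AutoModel.from_pretrained("microsoft/mpnet-base")

inputs = tokenizer("MPNet combines masked and permuted pre-training.",
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One contextual embedding per input token.
print(outputs.last_hidden_state.shape)  # (batch_size, seq_len, hidden_size)
```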

The training objective of MPNet is:

$$\mathbb{E}_{z \in \mathcal{Z}_n} \sum_{t=c+1}^{n} \log P\left(x_{z_t} \mid x_{z_{<t}}, M_{z_{>c}}; \theta\right)$$

Here $z$ is a permutation of the $n$ token positions, drawn from the set of permutations $\mathcal{Z}_n$, and $c$ is the number of non-predicted tokens. As can be seen, MPNet conditions on $x_{z_{<t}}$ (the tokens preceding the current predicted token $x_{z_t}$) rather than only on the non-predicted tokens $x_{z_{\le c}}$ as in MLM; compared with PLM, MPNet takes more information (i.e., the mask symbols $[M]$ at positions $z_{>c}$) as input. Although the objective looks simple, it is challenging to implement the model efficiently; for details, see the paper.
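To make the conditioning sets concrete, here is a minimal, self-contained sketch (the function name and token values are hypothetical; this illustrates the objective above, not the paper's actual implementation). It samples a permutation $z$, splits it at position $c$, and lists, for each prediction step $t > c$, the context $x_{z_{<t}}$ the model conditions on, alongside the mask symbols that carry position information for every predicted token:

```python
import random

def mpnet_inputs(tokens, c, seed=0):
    """Build MPNet-style inputs for one sequence (illustrative sketch).

    tokens: the original token sequence x_1..x_n
    c: number of non-predicted tokens; the remaining n - c are predicted
    """
    n = len(tokens)
    rng = random.Random(seed)
    z = list(range(n))
    rng.shuffle(z)            # a permutation z of the n positions

    non_pred = z[:c]          # positions z_{<=c}: visible content tokens
    pred = z[c:]              # positions z_{>c}: tokens to predict

    # Content stream: the non-predicted tokens plus a [M] symbol for every
    # predicted position, so position information of all n tokens is visible.
    content = [(tokens[i], i) for i in non_pred] + [("[M]", i) for i in pred]

    # Autoregressive factorization over the predicted part: step t conditions
    # on x_{z_<t} (non-predicted tokens plus previously predicted tokens) in
    # addition to the mask symbols above.
    steps = []
    for t in range(c, n):
        context = [tokens[i] for i in z[:t]]
        target = (tokens[z[t]], z[t])
        steps.append((context, target))
    return content, steps

content, steps = mpnet_inputs(["the", "cat", "sat", "on", "mat"], c=3)
print(content)  # tokens/positions the model can always see
for ctx, (tok, pos) in steps:
    print(f"predict {tok!r} at position {pos} given {ctx} + mask symbols")
```

Note how every predicted token contributes a $([M], \text{position})$ pair to the always-visible content stream: that is the extra information MPNet has over PLM, while the autoregressive context over $z_{<t}$ is the dependency information it has over MLM.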