
What is: PanGu-α?

Source: PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
Year: 2021
Data Source: CC BY-SA - https://paperswithcode.com

PanGu-α is an autoregressive language model (ALM) with up to 200 billion parameters, pretrained on a large corpus of text, mostly in Chinese. Its architecture is based on the Transformer, which has been widely used as the backbone of a variety of pretrained language models such as BERT and GPT. Unlike those models, PanGu-α adds a query layer on top of the Transformer layers that aims to explicitly induce the expected output.
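The idea behind the query layer can be illustrated with a minimal sketch: it is an attention step over the top Transformer layer's hidden states in which the query is built from an embedding of the position to be predicted, rather than from the hidden states themselves. The single-head NumPy code below is an illustrative simplification, not the actual PanGu-α implementation; all names (`query_layer`, `pos_emb`, the weight matrices) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def query_layer(hidden, pos_emb, Wq, Wk, Wv):
    """Sketch of a query layer: the attention query comes from an
    embedding of the next (to-be-predicted) position, while keys and
    values come from the top Transformer layer's hidden states."""
    q = pos_emb @ Wq                      # (1, d) query from position embedding
    k = hidden @ Wk                       # (seq, d) keys from hidden states
    v = hidden @ Wv                       # (seq, d) values from hidden states
    scores = (q @ k.T) / np.sqrt(k.shape[-1])
    attn = softmax(scores)                # (1, seq) attention weights
    return attn @ v                       # (1, d) vector used to predict the next token

# Toy dimensions and random weights, purely for demonstration
rng = np.random.default_rng(0)
d, seq = 8, 5
hidden = rng.standard_normal((seq, d))    # top-layer hidden states
pos_emb = rng.standard_normal((1, d))     # embedding of the next position
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
out = query_layer(hidden, pos_emb, Wq, Wk, Wv)
print(out.shape)
```

In the full model this step replaces the usual self-attention query at the final layer, so the network is queried directly for "what comes at position n+1" instead of re-encoding the last token.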