AICurious Logo

What is: GPipe?

SourceGPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
Year2000
Data SourceCC BY-SA - https://paperswithcode.com

GPipe is a distributed model parallel method for neural networks. With GPipe, each model can be specified as a sequence of layers, and consecutive groups of layers can be partitioned into cells. Each cell is then placed on a separate accelerator. Based on this partitioned setup, batch splitting is applied. A mini-batch of training examples is split into smaller micro-batches, then the execution of each set of micro-batches is pipelined over cells. Synchronous mini-batch gradient descent is applied for training, where gradients are accumulated across all micro-batches in a mini-batch and applied at the end of a mini-batch.