What is ZeRO-Infinity?

Source: ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Scale Deep Learning
Year: 2021
Data Source: CC BY-SA - https://paperswithcode.com

ZeRO-Infinity is a sharded data-parallel system that extends ZeRO with a heterogeneous memory access mechanism called the infinity offload engine. The engine lets ZeRO-Infinity support massive model sizes on limited GPU resources by using CPU and NVMe memory simultaneously. ZeRO-Infinity also introduces a GPU memory optimization technique called memory-centric tiling, which supports individual layers so large that they would otherwise not fit in GPU memory even one layer at a time.
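
The idea behind memory-centric tiling can be illustrated with a toy sketch: a large linear layer is split into smaller tiles that are processed one after another, so only one tile needs to be resident in fast memory at a time. The code below is a pure-Python illustration, not the ZeRO-Infinity implementation. It assumes a row-wise split of the weight matrix, and the names `tiled_linear`, `split_rows`, `fetch_tile`, and `slow_store` are hypothetical, with `fetch_tile` standing in for a load from CPU or NVMe memory.

```python
from typing import Callable, List

Matrix = List[List[float]]
Vector = List[float]


def matvec(w: Matrix, x: Vector) -> Vector:
    """Dense y = W @ x for a small in-memory matrix."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def split_rows(w: Matrix, num_tiles: int) -> List[Matrix]:
    """Split W row-wise into num_tiles contiguous, near-equal tiles."""
    base, extra = divmod(len(w), num_tiles)
    tiles: List[Matrix] = []
    start = 0
    for i in range(num_tiles):
        size = base + (1 if i < extra else 0)
        tiles.append(w[start:start + size])
        start += size
    return tiles


def tiled_linear(fetch_tile: Callable[[int], Matrix],
                 num_tiles: int,
                 x: Vector) -> Vector:
    """Compute y = W @ x one tile at a time.

    fetch_tile(i) is a hypothetical stand-in for loading tile i from
    slower memory (CPU or NVMe) into fast memory.
    """
    y: Vector = []
    for i in range(num_tiles):
        tile = fetch_tile(i)        # bring one tile into "fast" memory
        y.extend(matvec(tile, x))   # compute this tile's slice of the output
        del tile                    # drop the reference before the next fetch
    return y


# Toy usage: a 4x2 weight matrix split into 2 tiles of 2 rows each.
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
x = [1.0, -1.0]
slow_store = split_rows(W, num_tiles=2)  # pretend this lives off-GPU
y_tiled = tiled_linear(lambda i: slow_store[i], 2, x)
```

Because each output row depends on only one row of the weight matrix, concatenating the per-tile results gives the same output as the untiled product, while the peak working set is bounded by the size of a single tile rather than the whole layer.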