What is: WaveNet?

Source: WaveNet: A Generative Model for Raw Audio
Year: 2016
Data Source: CC BY-SA - https://paperswithcode.com

WaveNet is a generative model for raw audio based on the PixelCNN architecture. To capture the long-range temporal dependencies that raw audio generation requires, it is built from dilated causal convolutions, which give the network a very large receptive field.
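To make the link between dilation and receptive field concrete, here is a minimal NumPy sketch of a single-channel dilated causal convolution, together with a helper that counts how many input samples can influence one output sample of a stack of such layers. The function names, the kernel size of 2, and the doubling dilation schedule are illustrative assumptions, not details taken from the text above, and the sketch omits everything else a real network would need (channels, gating, residual connections).

```python
import numpy as np

def receptive_field(kernel_size, dilations):
    """Number of input samples that can influence one output sample
    after stacking one causal conv layer per entry in `dilations`."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

def causal_dilated_conv1d(x, weights, dilation):
    """Computes y[t] = sum_k weights[k] * x[t - k * dilation].

    Samples before t = 0 are treated as zeros, so y[t] never depends
    on any x[s] with s > t (this is what makes the filter causal).
    """
    x = np.asarray(x, dtype=float)
    kernel_size = len(weights)
    pad = (kernel_size - 1) * dilation
    x_padded = np.concatenate([np.zeros(pad), x])  # left padding only
    y = np.zeros(len(x))
    for k, w in enumerate(weights):
        start = pad - k * dilation  # shift k * dilation samples into the past
        y += w * x_padded[start:start + len(x)]
    return y

# Assumed schedule for illustration: dilation doubles at every layer.
dilations = [2 ** i for i in range(10)]  # 1, 2, 4, ..., 512
print(receptive_field(kernel_size=2, dilations=dilations))  # 1024
```

With ten layers the undilated equivalent (all dilations set to 1) would only see 11 samples, which is why dilation is the cheap way to reach a large receptive field.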

The joint probability of a waveform $\vec{x} = \{ x_1, \dots, x_T \}$ is factorised as a product of conditional probabilities as follows:

$$p\left(\vec{x}\right) = \prod_{t=1}^{T} p\left(x_t \mid x_1, \dots ,x_{t-1}\right)$$

Each audio sample $x_t$ is therefore conditioned on the samples at all previous timesteps.
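The factorisation can be illustrated with a small sketch that evaluates the log-likelihood of a sequence as a sum of conditional log-probabilities and generates a sequence one sample at a time. Everything model-specific here is a hypothetical stand-in: `toy_conditional` is a hand-written rule rather than a trained network, and the choice of four discrete amplitude levels is an assumption made only to keep the example tiny.

```python
import numpy as np

NUM_LEVELS = 4  # assumed number of discrete amplitude values (toy choice)

def toy_conditional(history):
    """Hypothetical stand-in for p(x_t | x_1, ..., x_{t-1}).

    Uniform when there is no history; otherwise it favours repeating
    the most recent sample. A real model would use the whole history.
    """
    scores = np.ones(NUM_LEVELS)
    if len(history) > 0:
        scores[history[-1]] += 2.0
    return scores / scores.sum()

def log_likelihood(x):
    """log p(x) = sum over t of log p(x_t | x_1, ..., x_{t-1})."""
    return sum(np.log(toy_conditional(x[:t])[x[t]]) for t in range(len(x)))

def sample(length, rng):
    """Draws x_1, ..., x_T in order, each conditioned on the samples so far."""
    x = []
    for _ in range(length):
        probs = toy_conditional(x)
        x.append(int(rng.choice(NUM_LEVELS, p=probs)))
    return x

rng = np.random.default_rng(0)
waveform = sample(16, rng)
print(waveform, log_likelihood(waveform))
```

Because each conditional is a proper distribution, the product defines a valid joint distribution over whole sequences. The sampling loop also shows the practical cost of this factorisation: generation is inherently sequential, since sample $x_t$ cannot be drawn before $x_1, \dots, x_{t-1}$ exist.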