**Expected Sarsa** is like [Q-learning](https://paperswithcode.com/method/q-learning) but instead of taking the maximum over next state-action pairs, we use the expected value, taking into account how likely each action is under the current policy.

$$Q\left(S\_{t}, A\_{t}\right) \leftarrow Q\left(S\_{t}, A\_{t}\right) + \alpha\left[R\_{t+1} + \gamma\sum\_{a}\pi\left(a\mid{S\_{t+1}}\right)Q\left(S\_{t+1}, a\right) - Q\left(S\_{t}, A\_{t}\right)\right] $$

Except for this change to the update rule, the algorithm follows the scheme of Q-learning. It is more computationally expensive than [Sarsa](https://paperswithcode.com/method/sarsa), since every update sums over all actions in $S\_{t+1}$, but it eliminates the variance due to the random selection of $A\_{t+1}$.
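The update rule above can be sketched in a few lines of Python. This is a minimal tabular illustration, not a reference implementation: it assumes the current policy $\pi$ is epsilon-greedy with respect to $Q$, and the names `epsilon_greedy_probs` and `expected_sarsa_update`, the list-of-lists table layout, and the example numbers are all hypothetical.

```python
def epsilon_greedy_probs(q_values, epsilon):
    """Return pi(a | s) for an epsilon-greedy policy (an assumed choice of policy)."""
    n = len(q_values)
    greedy = max(range(n), key=lambda a: q_values[a])
    probs = [epsilon / n] * n
    probs[greedy] += 1.0 - epsilon
    return probs


def expected_sarsa_update(Q, s, a, r, s_next, alpha, gamma, epsilon, terminal=False):
    """Apply one Expected Sarsa update to Q[s][a] in place and return the TD error."""
    if terminal:
        # No bootstrapping from a terminal state.
        expected_q = 0.0
    else:
        # Expectation of Q(s_next, .) under the current policy,
        # in place of the max used by Q-learning.
        probs = epsilon_greedy_probs(Q[s_next], epsilon)
        expected_q = sum(p * q for p, q in zip(probs, Q[s_next]))
    td_error = r + gamma * expected_q - Q[s][a]
    Q[s][a] += alpha * td_error
    return td_error


# Tiny example: two states, two actions (values chosen only for illustration).
Q = [[0.0, 0.0], [1.0, 3.0]]
td = expected_sarsa_update(Q, s=0, a=1, r=1.0, s_next=1,
                           alpha=0.5, gamma=0.9, epsilon=0.2)
```

With `epsilon=0` the policy is greedy, the expectation collapses to the maximum, and the update coincides with the Q-learning target.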

Source: Sutton and Barto, Reinforcement Learning, 2nd Edition

**Bridge-net** is an audio model block used in the [ClariNet](https://paperswithcode.com/method/clarinet) text-to-speech architecture. Bridge-net maps frame-level hidden representation to sample-level through several [convolution](https://paperswithcode.com/method/convolution) blocks and [transposed convolution](https://paperswithcode.com/method/transposed-convolution) layers interleaved with softsign non-linearities.
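As a rough illustration of how transposed convolutions take a frame-level sequence to a longer, sample-level one, the sketch below upsamples a single-channel sequence in pure Python. It is a toy under stated assumptions and not the ClariNet implementation: the kernels, strides, and function names are hypothetical, the hidden representation is reduced to one scalar per frame, and the convolution blocks are omitted.

```python
def softsign(x):
    """Softsign non-linearity: x / (1 + |x|)."""
    return x / (1.0 + abs(x))


def transposed_conv1d(x, kernel, stride):
    """Single-channel 1-D transposed convolution without padding.

    The output length is (len(x) - 1) * stride + len(kernel).
    """
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, xi in enumerate(x):
        for j, kj in enumerate(kernel):
            out[i * stride + j] += xi * kj
    return out


def upsample(frames, layers):
    """Apply each (kernel, stride) transposed-conv layer followed by softsign."""
    h = list(frames)
    for kernel, stride in layers:
        h = [softsign(v) for v in transposed_conv1d(h, kernel, stride)]
    return h


frames = [0.5, -1.0, 2.0]                     # three frame-level values
layers = [([1.0, 1.0], 2), ([1.0, 1.0], 2)]   # hypothetical kernels and strides
samples = upsample(frames, layers)            # two 2x layers give 4x upsampling
```

Here each layer doubles the sequence length, so three frames become twelve samples; a real model would use learned multi-channel kernels and a much larger overall upsampling factor.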

Bridge-net

ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech

Expected Sarsa

**AdamW** is a stochastic optimization method that modifies the typical implementation of weight decay in [Adam](https://paperswithcode.com/method/adam) by decoupling [weight decay](https://paperswithcode.com/method/weight-decay) from the gradient update. To see this, note that $L\_{2}$ regularization in Adam is usually implemented by adding a decay term to the gradient, where $w\_{t}$ is the rate of the weight decay at time $t$:

$$ g\_{t} = \nabla{f\left(\theta\_{t}\right)} + w\_{t}\theta\_{t}$$

while AdamW instead leaves the gradient unchanged and applies the weight decay term directly in the parameter update:

$$ \theta\_{t+1} = \theta\_{t} - \eta\left(\frac{\hat{m}\_{t}}{\sqrt{\hat{v}\_{t}} + \epsilon} + w\_{t}\theta\_{t}\right), \forall{t}$$
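The difference between the two forms can be made concrete with a small pure-Python sketch. It is illustrative only: the function name `adam_step`, the `decoupled` flag, and the default hyperparameters are assumptions, the moment estimates $\hat{m}\_{t}$ and $\hat{v}\_{t}$ follow the usual bias-corrected Adam definitions, and $\epsilon$ is added to the denominator after the square root, as in the common Adam formulation.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=1e-2, decoupled=True):
    """One Adam step on lists of floats; returns new (theta, m, v).

    decoupled=True  : AdamW, decay is added in the parameter update.
    decoupled=False : L2 form, decay is added to the gradient first.
    """
    new_theta, new_m, new_v = [], [], []
    for th, g, m_i, v_i in zip(theta, grad, m, v):
        if not decoupled:
            g = g + weight_decay * th            # g_t = grad f(theta_t) + w_t * theta_t
        m_i = beta1 * m_i + (1 - beta1) * g      # first-moment estimate
        v_i = beta2 * v_i + (1 - beta2) * g * g  # second-moment estimate
        m_hat = m_i / (1 - beta1 ** t)           # bias correction (t starts at 1)
        v_hat = v_i / (1 - beta2 ** t)
        step = m_hat / (math.sqrt(v_hat) + eps)
        if decoupled:
            step = step + weight_decay * th      # decay bypasses the adaptive scaling
        new_theta.append(th - lr * step)
        new_m.append(m_i)
        new_v.append(v_i)
    return new_theta, new_m, new_v


# With a zero loss gradient, only the decay term moves the parameter.
theta_w, _, _ = adam_step([1.0], [0.0], [0.0], [0.0], t=1,
                          lr=0.1, weight_decay=0.5, decoupled=True)
theta_l2, _, _ = adam_step([1.0], [0.0], [0.0], [0.0], t=1,
                           lr=0.1, weight_decay=0.5, decoupled=False)
```

In this example the decoupled step shrinks the parameter by exactly `lr * weight_decay * theta`, whereas in the $L\_{2}$ form the decay passes through the adaptive normalisation and the resulting step no longer scales with the decay rate in the same way.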

Year	2000
Data Source	CC BY-SA - https://paperswithcode.com