**Experience Replay** is a replay memory technique used in reinforcement learning where we store the agent’s experiences at each time-step, $e\_{t} = \left(s\_{t}, a\_{t}, r\_{t}, s\_{t+1}\right)$ in a data-set $D = e\_{1}, \cdots, e\_{N}$ , pooled over many episodes into a replay memory. We then usually sample the memory randomly for a minibatch of experience, and use this to learn off-policy, as with Deep Q-Networks. This tackles the problem of autocorrelation leading to unstable training, by making the problem more like a supervised learning problem.

Image Credit: [Hands-On Reinforcement Learning with Python, Sudharsan Ravichandiran](https://subscription.packtpub.com/book/big_data_and_business_intelligence/9781788836524)

**RMSProp** is an unpublished adaptive learning rate optimizer [proposed by Geoff Hinton](http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf). The motivation is that the magnitude of gradients can differ for different weights, and can change during learning, making it hard to choose a single global learning rate. RMSProp tackles this by keeping a moving average of the squared gradient and adjusting the weight updates by this magnitude. The gradient updates are performed as:

$$E\left[g^{2}\right]\_{t} = \gamma E\left[g^{2}\right]\_{t-1} + \left(1 - \gamma\right) g^{2}\_{t}$$

$$\theta\_{t+1} = \theta\_{t} - \frac{\eta}{\sqrt{E\left[g^{2}\right]\_{t} + \epsilon}}g\_{t}$$

Hinton suggests $\gamma=0.9$, with a good default for $\eta$ as $0.001$.

Image: [Alec Radford](https://twitter.com/alecrad)

RMSProp

Experience Replay

**BAGUA** is a communication framework whose design goal is to provide a system abstraction that is both flexible and modular to support state-of-the-art system relaxation techniques of distributed training. The abstraction goes beyond parameter server and Allreduce paradigms, and provides a collection of MPI-style collective operations to facilitate communications with different precision and centralization strategies.

Year	1993
Data Source	CC BY-SA - https://paperswithcode.com