**Soft Actor-Critic (Autotuned Temperature)** is a modification of the [SAC](https://paperswithcode.com/method/soft-actor-critic) reinforcement learning algorithm. [SAC](https://paperswithcode.com/method/sac) can be brittle with respect to the temperature hyperparameter. Unlike in conventional reinforcement learning, where the optimal policy is independent of the scaling of the reward function, in maximum entropy reinforcement learning the scaling factor has to be compensated for by the choice of a suitable temperature, and a sub-optimal temperature can drastically degrade performance. To resolve this issue, SAC with Autotuned Temperature uses an automatic gradient-based temperature tuning method that adjusts the expected entropy over the visited states to match a target value.
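The temperature update described above can be illustrated with a minimal sketch in plain Python. It assumes the temperature objective $J(\alpha) = \mathbb{E}[-\alpha(\log \pi(a|s) + \bar{\mathcal{H}})]$ and parameterises the temperature as $\alpha = \exp(\texttt{log\_alpha})$ to keep it positive. The function names, the learning rate, and the use of a plain gradient step instead of an adaptive optimiser are illustrative assumptions, not part of the description above.

```python
import math


def temperature_loss(log_alpha, log_probs, target_entropy):
    """Monte Carlo estimate of J(alpha) = E[-alpha * (log pi(a|s) + target_entropy)]."""
    alpha = math.exp(log_alpha)  # optimising log_alpha keeps alpha positive
    mean_log_prob = sum(log_probs) / len(log_probs)
    return -alpha * (mean_log_prob + target_entropy)


def temperature_step(log_alpha, log_probs, target_entropy, lr=0.01):
    """One gradient-descent step on the temperature loss (hypothetical helper).

    Because alpha = exp(log_alpha), the derivative of the loss with respect to
    log_alpha equals the loss value itself.
    """
    grad = temperature_loss(log_alpha, log_probs, target_entropy)
    return log_alpha - lr * grad
```

Here `log_probs` stands for the log-probabilities of actions sampled from the current policy at visited states. When the estimated entropy (the negative mean of `log_probs`) is below `target_entropy`, the step raises the temperature; when it is above, the step lowers it.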

**NoisyNet-A3C** is a modification of [A3C](https://paperswithcode.com/method/a3c) that utilises noisy linear layers for exploration in place of the entropy bonus used in standard A3C, in the same way that NoisyNet-DQN replaces the $\epsilon$-greedy exploration of the original [DQN](https://paperswithcode.com/method/dqn) formulation.
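A noisy linear layer can be sketched in plain Python as follows. This is a minimal illustration, assuming independent Gaussian noise on every weight and bias, so that the layer computes $y = (\mu^w + \sigma^w \odot \varepsilon^w)x + \mu^b + \sigma^b \odot \varepsilon^b$. The class name `NoisyLinear`, the initialisation values, and the `noisy` flag are assumptions made for this example; in a real agent $\mu$ and $\sigma$ would be trained by gradient descent, which is omitted here.

```python
import random


class NoisyLinear:
    """Linear layer whose weights and biases are perturbed by Gaussian noise."""

    def __init__(self, in_features, out_features, sigma_init=0.017, seed=0):
        self.in_features = in_features
        self.out_features = out_features
        self.rng = random.Random(seed)
        bound = (3.0 / in_features) ** 0.5  # assumed initialisation range for mu
        self.mu_w = [[self.rng.uniform(-bound, bound) for _ in range(in_features)]
                     for _ in range(out_features)]
        self.mu_b = [self.rng.uniform(-bound, bound) for _ in range(out_features)]
        # Noise scales; learnable in practice, held fixed in this sketch.
        self.sigma_w = [[sigma_init] * in_features for _ in range(out_features)]
        self.sigma_b = [sigma_init] * out_features
        self.resample_noise()

    def resample_noise(self):
        """Draw fresh standard-normal noise for every weight and bias."""
        self.eps_w = [[self.rng.gauss(0.0, 1.0) for _ in range(self.in_features)]
                      for _ in range(self.out_features)]
        self.eps_b = [self.rng.gauss(0.0, 1.0) for _ in range(self.out_features)]

    def forward(self, x, noisy=True):
        """Return the layer output for input vector x, with or without noise."""
        out = []
        for j in range(self.out_features):
            total = self.mu_b[j] + (self.sigma_b[j] * self.eps_b[j] if noisy else 0.0)
            for i, x_i in enumerate(x):
                w = self.mu_w[j][i]
                if noisy:
                    w += self.sigma_w[j][i] * self.eps_w[j][i]
                total += w * x_i
            out.append(total)
        return out
```

Calling `resample_noise()` before a rollout changes the effective weights, so the same input can yield different outputs; this perturbation is what drives exploration. Passing `noisy=False` gives the deterministic output based on the mean parameters only.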

NoisyNet-A3C

Noisy Networks for Exploration

Soft Actor-Critic (Autotuned Temperature)

Soft Actor-Critic Algorithms and Applications

Source	Soft Actor-Critic Algorithms and Applications
Year	2018
Data Source	CC BY-SA - https://paperswithcode.com