What is: Kalman Optimization for Value Approximation?

Source: Kalman meets Bellman: Improving Policy Evaluation through Value Tracking
Year: 2020
Data Source: CC BY-SA - https://paperswithcode.com

Kalman Optimization for Value Approximation, or KOVA, is a general framework for handling uncertainty when approximating value functions in deep reinforcement learning (RL). KOVA minimizes a regularized objective function that accounts for two sources of uncertainty: uncertainty in the approximator's parameters and noise in the observed returns. It remains feasible with non-linear function approximators such as deep neural networks (DNNs), and it can estimate values in both on-policy and off-policy settings. It can be used as the policy evaluation component of policy optimization algorithms.
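
To make the idea concrete, here is a minimal sketch of a generic extended Kalman filter (EKF) step applied to value-function parameters. It is an illustration under stated assumptions, not necessarily the exact KOVA update from the paper: the parameters are assumed to follow a random walk, the target returns are assumed to be noisy observations of the predicted values, and both noise covariances are assumed isotropic. The names `ekf_value_update`, `q_var`, and `r_var` are hypothetical. A linear value function is used in the demo so that the Jacobian is simply the feature matrix; with a DNN, the Jacobian of the network outputs with respect to its weights would take its place.

```python
import numpy as np


def ekf_value_update(theta, P, jac, v_pred, targets, q_var=1e-4, r_var=1.0):
    """One EKF step on value-function parameters (illustrative sketch).

    theta   : (d,)   current parameter estimate
    P       : (d, d) current parameter covariance (parameter uncertainty)
    jac     : (n, d) Jacobian of the n value predictions w.r.t. theta
    v_pred  : (n,)   value predictions at the current theta
    targets : (n,)   noisy target returns
    q_var   : assumed variance of the random-walk parameter noise
    r_var   : assumed variance of the return (observation) noise
    """
    d = theta.shape[0]
    n = targets.shape[0]

    # Predict: parameters follow a random walk, so uncertainty grows.
    P_pred = P + q_var * np.eye(d)

    # Innovation covariance: parameter uncertainty projected into
    # value space, plus the noise on the observed returns.
    S = jac @ P_pred @ jac.T + r_var * np.eye(n)

    # Kalman gain K = P_pred @ jac.T @ inv(S); S and P_pred are symmetric,
    # so a linear solve followed by a transpose gives the same result.
    K = np.linalg.solve(S, jac @ P_pred).T

    # Correct: move theta toward the targets, shrink the covariance.
    theta_new = theta + K @ (targets - v_pred)
    P_new = P_pred - K @ S @ K.T
    return theta_new, P_new


# Demo on synthetic data: a linear value function v(s) = features(s) @ theta.
rng = np.random.default_rng(0)
true_theta = np.array([1.0, -2.0, 0.5])  # hypothetical ground-truth weights
theta = np.zeros(3)
P = np.eye(3)

for _ in range(200):
    feats = rng.normal(size=(8, 3))  # batch of 8 state feature vectors
    targets = feats @ true_theta + rng.normal(scale=0.1, size=8)
    theta, P = ekf_value_update(
        theta, P, feats, feats @ theta, targets, r_var=0.01
    )

print("estimated theta:", theta)
print("trace of P:", np.trace(P))
```

In this sketch the covariance `P` plays the role of the regularizer: it controls how far a single noisy batch of returns is allowed to move the parameters away from their previous estimate, while `r_var` controls how much the returns themselves are trusted.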