Jayanth's Blog

Equivalence of the Forward and Backward Views in the TD($\lambda$) Method (Incomplete Blog)

This blog post explains why the forward view and the backward view of the Temporal Difference ($\lambda$) method are equivalent. The same equivalence holds for generalized advantage estimation.

2 min read · May 2, 2022

2022 · ReinforcementLearning · rl-posts
Monte-Carlo Algorithm with Reward Reshaping for Mountain Car gym environment

This blog post explains the on-policy every-visit Monte Carlo algorithm and its implementation in the *MountainCar-v0* OpenAI Gym environment.

13 min read · April 25, 2022

2022 · ReinforcementLearning · rl-posts
Contraction Property of the Bellman Operator with contraction factor $\gamma < 1$

This blog post shows that the Bellman operator used in value iteration is a contraction with respect to the $\ell_{\infty}$-norm, with contraction factor $\gamma<1$.

5 min read · February 20, 2022

2022 · ReinforcementLearning · rl-posts