Uses OpenAI Gym with Q-Learning, Value iteration, REINFORCE Policy Gradient with Continuous State Space and Continuous Action Space

RajatBhageria/Reinforcement-Learning

Problem 1:
ValueIteration.py implements value iteration for the maze. The Q-values are somewhat larger in magnitude than
expected, but the renderings produced by value_plot.py show that the resulting policies are correct.
This problem uses evaluation.py for evaluation.

The saved QValues.npy file was generated by value iteration.
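
As a rough illustration of what value iteration over Q-values looks like, here is a minimal sketch. It is not the code in ValueIteration.py: the function name, the array layout (P as states x actions x next states, R as states x actions), and the 3-state chain are all assumptions made up for this example.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Hypothetical helper. P: (S, A, S) transition probabilities, R: (S, A) expected rewards."""
    n_states, n_actions, _ = P.shape
    Q = np.zeros((n_states, n_actions))
    for _ in range(max_iters):
        V = Q.max(axis=1)              # greedy state values under the current Q
        Q_new = R + gamma * (P @ V)    # Bellman optimality backup for every (s, a)
        converged = np.abs(Q_new - Q).max() < tol
        Q = Q_new
        if converged:
            break
    return Q

# Made-up 3-state chain: action 0 = stay, action 1 = move right, state 2 is absorbing.
P = np.zeros((3, 2, 3))
P[0, 0, 0] = 1.0
P[0, 1, 1] = 1.0
P[1, 0, 1] = 1.0
P[1, 1, 2] = 1.0
P[2, :, 2] = 1.0
R = np.zeros((3, 2))
R[1, 1] = 1.0  # reward for stepping into the goal state

Q = value_iteration(P, R)
policy = Q.argmax(axis=1)  # greedy policy read off the Q-values
```

The greedy policy depends only on which action has the largest Q-value in each state, so a policy can be correct even when the Q-value magnitudes are off.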

Problem 2:
qLearning.py contains the full Q-learning solution for the maze. The policies are close to the value iteration
policies but not perfect. The RMSE is large because the Q-values from ValueIteration.py were much larger in
magnitude than expected (even though the value iteration policies were correct).
This script uses evaluation.py for evaluation.
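
The sketch below illustrates how a large RMSE can coexist with matching policies. It is not taken from evaluation.py: the helper names and the numbers are invented, and the factor of 10 is just an arbitrary example of a magnitude mismatch.

```python
import numpy as np

def rmse(q_a, q_b):
    """Root-mean-square error between two Q-tables of the same shape."""
    return float(np.sqrt(np.mean((q_a - q_b) ** 2)))

def policy_agreement(q_a, q_b):
    """Fraction of states in which both Q-tables select the same greedy action."""
    return float(np.mean(q_a.argmax(axis=1) == q_b.argmax(axis=1)))

# Made-up Q-table (3 states, 2 actions) and a copy with inflated magnitudes.
q_small = np.array([[0.81, 0.9], [0.9, 1.0], [0.0, 0.0]])
q_large = 10.0 * q_small

error = rmse(q_small, q_large)                   # large, driven purely by scale
agreement = policy_agreement(q_small, q_large)   # greedy actions are unchanged
```

Scaling every Q-value by the same positive factor leaves the argmax in each state untouched, so policy agreement can be a more informative check than RMSE when the reference magnitudes are in doubt.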

Problem 3:
acrobat.py contains the REINFORCE algorithm, REINFORCEAcrobat(), and the Q-learning algorithm, qLearningAcrobat().
qLearningAcrobat() is evaluated with evaluationAcrobat.py, and REINFORCE is evaluated with evaluationREINFORCE.py. To run Q-learning,
call qLearningAcrobat() in the if __name__ == "__main__" block of the file; to run REINFORCE, call REINFORCEAcrobat() there instead.
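
For readers unfamiliar with REINFORCE, here is a minimal sketch of its gradient estimate for one episode. It is not the implementation in acrobat.py: the linear softmax policy, the function names, and the parameter shapes are assumptions chosen to keep the example small.

```python
import numpy as np

def discounted_returns(rewards, gamma=0.99):
    """Return-to-go G_t = r_t + gamma * r_{t+1} + ... for each step of one episode."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

def softmax(logits):
    shifted = logits - logits.max()  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def reinforce_gradient(theta, states, actions, rewards, gamma=0.99):
    """Hypothetical helper. theta: (n_features, n_actions) weights of a linear softmax policy.

    Returns sum_t G_t * grad_theta log pi(a_t | s_t), the REINFORCE estimate
    of the gradient of expected return (to be used for gradient ascent).
    """
    grad = np.zeros_like(theta)
    returns = discounted_returns(rewards, gamma)
    for s, a, g in zip(states, actions, returns):
        probs = softmax(s @ theta)
        dlog = -np.outer(s, probs)   # -s * pi(b | s) for every action b
        dlog[:, a] += s              # +s for the action actually taken
        grad += g * dlog
    return grad

# One-step made-up episode: two features, two actions.
theta = np.zeros((2, 2))
grad = reinforce_gradient(theta, [np.array([1.0, 0.0])], [1], [2.0])
theta = theta + 0.1 * grad  # one gradient-ascent step with an arbitrary step size
```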

(d): We cannot use policy iteration or value iteration here because we only observe a sequence of (state, action, reward)
triplets; we do not have the full transition and reward matrices that those methods require.
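
By contrast, a model-free update such as Q-learning needs only sampled transitions. The sketch below shows a generic tabular update; the function name, hyperparameters, and the two hand-written transitions are assumptions for illustration and are not taken from this repository.

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, done, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step from a single sampled transition (s, a, r, s_next)."""
    target = r if done else r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])  # move Q(s, a) toward the bootstrapped target
    return Q

# Made-up experience: no transition or reward matrix is ever consulted.
transitions = [
    (0, 1, 0.0, 1, False),  # state 0, action 1 -> state 1, reward 0
    (1, 1, 1.0, 2, True),   # state 1, action 1 -> terminal state 2, reward 1
]
Q = np.zeros((3, 2))
for _ in range(200):
    for s, a, r, s_next, done in transitions:
        q_learning_update(Q, s, a, r, s_next, done, alpha=0.5, gamma=0.9)
```

A value iteration backup would instead sum over every possible next state weighted by its transition probability, which is exactly the information that is unavailable here.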

Problem 4:
car.py contains qLearningMountain() for Q-learning and REINFORCECar() for REINFORCE. To run Q-learning,
call qLearningMountain() in the if __name__ == "__main__" block of the file; to run REINFORCE, call REINFORCECar() there instead.
qLearningMountain() is evaluated with evaluationCar.py, and REINFORCE is evaluated with evaluationREINFORCE.py.
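
Tabular Q-learning on a continuous state space generally needs some way to map observations to table indices. One common option is uniform binning, sketched below. Whether car.py does this is not stated here, so treat it as an assumption: the function name and the 20 x 20 grid are invented, and the bounds are the standard Gym mountain-car limits for position and velocity.

```python
import numpy as np

# Assumed observation bounds: position in [-1.2, 0.6], velocity in [-0.07, 0.07].
LOW = np.array([-1.2, -0.07])
HIGH = np.array([0.6, 0.07])

def discretize(obs, low=LOW, high=HIGH, bins=(20, 20)):
    """Map a continuous observation to a tuple of bin indices for a Q-table."""
    obs = np.asarray(obs, dtype=float)
    bins = np.asarray(bins)
    ratio = (obs - low) / (high - low)        # scale each dimension to [0, 1]
    idx = (ratio * bins).astype(int)          # uniform bins per dimension
    idx = np.clip(idx, 0, bins - 1)           # keep boundary values inside the table
    return tuple(int(i) for i in idx)

# A Q-table indexed by (position bin, velocity bin, action); 3 actions is an assumption.
q_table = np.zeros((20, 20, 3))
state = discretize([-0.5, 0.0])
best_action = int(q_table[state].argmax())
```

Finer grids represent the value function more accurately but need more episodes to visit every cell, so the bin count is a trade-off rather than a fixed choice.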
