Meta-RL

TensorFlow implementation of the Meta-RL A3C algorithm from Learning to Reinforcement Learn. For more information, as well as explanations of each of the experiments, see my corresponding Medium post. The A3C code is built on a previous implementation available here.

Contains IPython notebooks for:

A3C-Meta-Bandit - The set of bandit tasks described in the paper, including independent, dependent, and restless bandits.
A3C-Meta-Context - Rainbow bandit task that uses randomized colors to indicate the reward-giving arm in each episode.
A3C-Meta-Grid - Rainbow Gridworld task; a variation of gridworld in which goal colors are randomized each episode and must be learned "on the fly."
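
To make the bandit tasks concrete, here is a minimal sketch of a two-armed Bernoulli bandit whose arm probabilities are resampled each episode. It is not the code from the notebooks: the class `TwoArmedBandit`, the function `meta_rl_input`, and the mode names are hypothetical, and the sketch assumes that "dependent" means the two arm probabilities sum to 1 and that the agent's recurrent input is built from the previous action, the previous reward, and the timestep. The restless variant is omitted.

```python
import numpy as np


class TwoArmedBandit:
    """Hypothetical two-armed Bernoulli bandit (illustrative, not the notebook's class)."""

    def __init__(self, mode="dependent", seed=None):
        if mode not in ("independent", "dependent"):
            raise ValueError("mode must be 'independent' or 'dependent'")
        self.mode = mode
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        """Sample new arm reward probabilities for a fresh episode."""
        if self.mode == "dependent":
            # Assumption: dependent arms share one parameter, so they sum to 1.
            p = self.rng.uniform()
            self.probs = np.array([p, 1.0 - p])
        else:
            # Assumption: independent arms are each drawn uniformly from [0, 1).
            self.probs = self.rng.uniform(size=2)
        return self.probs

    def pull(self, arm):
        """Return a reward of 1.0 with the chosen arm's probability, else 0.0."""
        return float(self.rng.uniform() < self.probs[arm])


def meta_rl_input(prev_action, prev_reward, timestep, n_arms=2):
    """Concatenate one-hot previous action, previous reward, and timestep."""
    one_hot = np.zeros(n_arms)
    one_hot[prev_action] = 1.0
    return np.concatenate([one_hot, [prev_reward, float(timestep)]])


if __name__ == "__main__":
    env = TwoArmedBandit("dependent", seed=0)
    action, reward = 0, 0.0
    for t in range(5):
        x = meta_rl_input(action, reward, t)   # what a recurrent agent would see
        action = int(env.rng.integers(2))      # random policy as a stand-in
        reward = env.pull(action)
```

Because the arm probabilities change at every `reset()`, a fixed policy cannot do well across episodes; the agent has to use the action and reward history carried in its recurrent state to identify the better arm within each episode.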

Gaopeng-Bai/Meta-RL
