TicTacML

#Dependency

#Reading Material

Sutton's book
Karpathy's blog
David Silver's lecture on Policy Gradient

#Current Version

Right now it supports training a Policy Agent against a Random Agent and then beating it. As long as drawing the game was not penalized, the Policy Agent didn't learn to beat the Random Agent. However, I then introduced a term that gives a negative penalty when the game is drawn.
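The draw penalty described above can be sketched roughly as follows. This is a minimal illustration, not the code in this repo: the function names, the reward values (`+1`, `-1`), the default `draw_penalty` of `-0.5`, and the discount factor `gamma` are all assumptions made for the example.

```python
def terminal_reward(outcome, draw_penalty=-0.5):
    """Map a finished game to a scalar reward for the policy agent.

    outcome: "win", "loss", or "draw", seen from the policy agent's side.
    draw_penalty: hypothetical value; 0.0 corresponds to not penalizing draws.
    """
    rewards = {"win": 1.0, "loss": -1.0, "draw": draw_penalty}
    return rewards[outcome]


def discounted_returns(num_moves, final_reward, gamma=0.99):
    """Spread the terminal reward back over the agent's moves.

    The last move gets the full reward; earlier moves get it
    discounted by gamma once per step away from the end.
    """
    return [final_reward * gamma ** (num_moves - 1 - t) for t in range(num_moves)]
```

With `draw_penalty=0.0` a draw produces a zero return for every move, so a policy-gradient update gets no signal from drawn games. A negative value pushes the agent away from the moves that led to the draw.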

I then had the agents play against each other. With the draw penalty retained, the number of draws eventually went to zero and both players won an equal number of games. So I decided to see what happens if I remove the penalty for draws, and then they drew all the games. Interesting, eh?
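To count wins and draws as in the experiment above, a tally loop along these lines would do. This is a sketch under stated assumptions rather than the contents of `self_learn.py`: the board encoding (`1`, `-1`, `0`), the agent signature `agent(board, legal_moves)`, and every function name here are hypothetical, and a random mover stands in for the trained agents.

```python
import random
from collections import Counter

# All eight winning lines on a 3x3 board stored as a flat list of 9 cells.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 1 or -1 if that player has three in a row, else 0."""
    for a, b, c in WIN_LINES:
        if board[a] != 0 and board[a] == board[b] == board[c]:
            return board[a]
    return 0


def play_game(agent_x, agent_o):
    """Play one game; return 1 (X wins), -1 (O wins), or 0 (draw)."""
    board = [0] * 9
    agents = {1: agent_x, -1: agent_o}
    player = 1
    for _ in range(9):
        legal = [i for i, cell in enumerate(board) if cell == 0]
        move = agents[player](board, legal)
        board[move] = player
        if winner(board) != 0:
            return player
        player = -player
    return 0


def tally(agent_x, agent_o, num_games):
    """Count outcomes over num_games games."""
    return Counter(play_game(agent_x, agent_o) for _ in range(num_games))


def random_move(board, legal):
    """Placeholder agent: pick any empty cell uniformly at random."""
    return random.choice(legal)


if __name__ == "__main__":
    counts = tally(random_move, random_move, 1000)
    print("X wins:", counts[1], "O wins:", counts[-1], "draws:", counts[0])
```

Swapping `random_move` for two trained policies and watching `counts[0]` over training is one way to see the draw count shrink or grow as the penalty changes.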

