Tic-tac-toe, or noughts and crosses in British English, is an ancient game that every seven-year-old learns to play, and learns not to lose, in nine moves or fewer. Why spend time playing and studying something so trivial? Well, I recently got interested in reinforcement learning, which I hope to use for nastier problems later, and tic-tac-toe is such a classic example that even Sutton & Barto [3] don't hesitate to put it in their book's introduction.
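The claim that a careful player never loses can be checked by brute force. The following is a minimal standalone sketch, not the code from this repository: the board encoding (a 9-character string of 'X', 'O' and spaces) and the names LINES, winner and minimax are assumptions chosen for illustration. It searches the whole game tree and reports the value of the empty board from X's point of view.

```python
from functools import lru_cache

# The eight winning lines on a 3x3 board whose cells are indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, otherwise None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def minimax(board, player):
    """Game value for X with `player` to move: +1 X wins, 0 draw, -1 O wins."""
    won = winner(board)
    if won is not None:
        return 1 if won == 'X' else -1
    if ' ' not in board:
        return 0  # full board and no winner: a draw
    other = 'O' if player == 'X' else 'X'
    # Try every empty cell and evaluate the resulting position recursively.
    values = [minimax(board[:i] + player + board[i + 1:], other)
              for i, cell in enumerate(board) if cell == ' ']
    # X picks the best value for X, O picks the worst value for X.
    return max(values) if player == 'X' else min(values)


if __name__ == "__main__":
    print(minimax(' ' * 9, 'X'))  # prints 0: perfect play from both sides is a draw
```

A value of 0 for the empty board means neither side can force a win, which is why the game makes a small but complete test bed for learning methods.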
We use Poetry on Python 3.8, and each method can be run using:
make requirements
make minimax
make run_menace
make run_rl
[1] Michie, D. (1963), "Experiments on the mechanization of game-learning Part I. Characterization of the model and its parameters", https://people.csail.mit.edu/brooks/idocs/matchbox.pdf
[2] Michie, D. (1961), "Trial and error", Penguin Science Survey.
[3] Sutton, R. S. & Barto, A. G. (2018), "Reinforcement Learning: An Introduction", The MIT Press.
[4] Mitchell, M. (2019), "Artificial Intelligence: A Guide for Thinking Humans", Pelican Books.
[5] Scroggs, M. (2015), "MENACE: Machine Educable Noughts And Crosses Engine".
[6] Luce, R. D. & Raiffa, H. (1957), "Games and decisions: Introduction and critical survey", New York: Wiley.