RL-based Maze Solver

Open Projects 2021

fig1 Input Maze Animated Solution

fig2 Input Maze Solution

fig3 Random Maze Solution

Abstract

The aim of this project is to solve a given 2D maze (given as an image input) using RL (Q-Learning) algorithm. The program first processes the image input of the maze and converts it into a matrix. The Q-Learning algorithm then uses this matrix to generate a q-matrix, which is finally used to get the shortest path. The program can also generate random mazes and then solve it.

Motivation

Reinforcement learning in robotics has been a challenging domain in the field of AI for the past few years. The ability to equip a robot with a powerful enough tool to allow an autonomous discovery of an optimal behavior through trial-and-error interactions with its environment has paved the path to many deep research projects.

Our project inspires us to solve many problems, like Autonomous Mobile Robot Obstacle Prevention using Q-learning.

fig4 Image Processing

Software Aspect of the Design

Tech Stack

Python - It is an interpreted high-level object-oriented programming language designed by Guido van Rossum. It emphasizes code readability with significant use of indentation. It has tons of third-party open-source libraries (which can be installed using its own package manager pip) to assist all types of programs.
Jupyter Notebook - It is a product developed under Project Jupyter (spun off from IPython in 2014 by Fernando Perez) and is used to create documents containing live code, equations, visualizations and narrative text. It is used in data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, etc.
NumPy - It is a Python library providing fundamental package for scientific computing. It is used for working with arrays in the domain of linear algebra, fourier transform, matrices, etc. Numpy arrays are 50x faster than Python lists.
Matplotlib - It is a Python library for creating static, animated, and interactive visualizations and plots and was introduced by John D. Hunter in 2002.
Python Imaging Library (PIL) - It is a Python library providing support for opening, manipulating, and saving many different image file formats. It supports Python version 1.5.2 to 2.7. A subsequent fork of the PIL repository named Pillow added Python 3.x support.

Working

Terminologies
- Episode - In a given episode the Agent solves the maze by making some random moves (exploration) and some moves by looking up at the q-matrix (exploitation).
- Iteration - Every move in an episode is called as an iteration.
- State - Every coordinate point of the Maze can be defined as a unique state.
- Q-Matrix - A Q-matrix is defined as a numpy matrix of size (no. of different moves available, i.e. 4) × (No. of states) with each entry as Zero defined initially.
- Bellman Equation - It is used to update the Q-Matrix.
fig5 Bellman Equation
Image Processing

fig6 Image Processing Workflow

Click Here for detailed explanation

Click here to access the Image Processing Code
Maze Solving

fig7 Maze Solving Workflow

Click here to access the Maze Solving Code

Cost Structure

Software (Components)	Cost
Python	Open-Source/None
Jupyter Notebook	Open-Source/None
NumPy	Open-Source/None
Matplotlib	Open-Source/None
PIL	Open-Source/None

Applications

This project can be used to develop a navigation system with the ability to learn to adapt to unknown environments. Such game-playing AIs are designed in a way that their solutions are relevant in many practical applications :

Industrial Applications - Mobile robots have been increasingly used over the last two decades in various industries to move goods from one place to another with optimal movement policy.
Restaurants of the Future (ROTF) - ROTF is one of the greatest physical application of the project. Since such a RL based implementation in a mobile robot will help us to give and take orders from the customers, in a faster and more efficient way.
Navigation - Maze solving can further be extended for autonomous navigation in an occupancy grid to get to the nearest landmark like an EV charging station or a petrol pump.

Limitations

At the current state, our project has a few limitations :

Before running, the program requires total number of episodes and number of random episodes to be given as a manual input. These can be developed as a function of the maze size to get a higher degree of automation.
The reduceMatrix(), trim() and sharpen() functions also require manual passing of parameters.
The Image Processing part of our project is not strong enough to process very low-resolution maze images.
The simple reinforcement learning algorithm would collapse when dealing with complex mazes.

Future Improvements

In future, we plan to :

generate an animated maze solution.
implement Multi-Objective target search, wherein the agent of Q-Learning must visit intermediate flag positions before going to the end of the maze.
extend our project to 3-D Mazes.

Team Members

Mentors

References

Python :
Graph Theory :
- Progamiz
- Youtube/freeCodeCamp.org
Dijkstra Algorithm :
Bellman Ford's Algorithm :
- Programiz
- Youtube/Abdul Bari
Reinforcement Learning :
- Youtube/DeepMind
- samyzaf.com
Q-Learning :
- Youtube/deeplizard
- Wikipedia
Matplotlib:
- GeeksforGeeks
Image Processing of the Maze :
- stackoverflow

Name		Name	Last commit message	Last commit date
Latest commit History 106 Commits
Images and Videos		Images and Videos
Poster and Report		Poster and Report
src		src
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

RL-based Maze Solver

Abstract

Motivation

Software Aspect of the Design

Tech Stack

Working

Terminologies

Image Processing

Maze Solving

Cost Structure

Applications

Limitations

Future Improvements

Team Members

Mentors

References

About

Releases

Packages

Languages

tushdon2/RL-based-Maze-Solver

Folders and files

Latest commit

History

Repository files navigation

RL-based Maze Solver

Abstract

Motivation

Software Aspect of the Design

Tech Stack

Working

Terminologies

Image Processing

Maze Solving

Cost Structure

Applications

Limitations

Future Improvements

Team Members

Mentors

References

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages