Deep Reinforcement Learning for motion planning

This repository is an extension of: drl_grasping. Please read the README of the original repository first; it is required to understand this README. We have deliberately chosen to focus this README only on the changes that we implemented.

We added a few new environments:

Newly added Environments (click to expand)
  1. InverseKinematics
  2. InverseKinematicsWithObstacles
  3. InverseKinematicsWithMovingObstacles
  4. InverseKinematicsWithManyMovingObstacles
  5. ReachWithObstacles

For a detailed explanation of each task, see the Environments section below. The naming of environments 3 and 4 is a bit misleading: MovingObstacle refers to the fact that the obstacles are randomly spawned at the beginning of each episode, but they stay at the same position during the whole episode.

The following animations show some results using the Panda robotic arm.

Animations: evaluation of a trained policy on the InverseKinematics task and on the InverseKinematicsWithRandomObstacles task.

Disclaimer: The following instructions are based on the original repository and were adjusted to cover the extensions we provide.

Instructions

Disclaimer: These instructions were only tested on Ubuntu 22.04. If you get stuck, please feel free to contact us.

Requirements

  • OS: Any system that supports Docker should work (Linux, Windows, macOS). Only Ubuntu 22.04 was tested.
  • GPU: CUDA is required.

Dependencies

# Docker
curl https://get.docker.com | sh \
  && sudo systemctl --now enable docker
# Nvidia Docker
distribution=$(. /etc/os-release; echo $ID$VERSION_ID) \
  && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
  && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker

Docker Instructions

  1. Clone this repository:
git clone https://github.com/Nils-ChristianIseke/deepRLIK.git
  2. Get the docker image:
docker pull slin25/rl_motion_planning

Now you can start developing inside the container:

  1. Start the container: open a terminal and execute:

    cd <path_to_the_cloned_repo>/docker
    sudo ./run.bash slin25/rl_motion_planning /bin/bash

Now you are inside the running container, where you can start the training.

  2. Optional (if you want to develop with VS Code): connect to the container as described in the VS Code Remote Containers section below.
  3. Inside the container, cd to the ROS package drl_grasping:

    cd /root/drl_grasping/drl_grasping/src/drl_grasping

Now you are at the root of the ROS package.

  4. To start a training, execute:
   ros2 run drl_grasping ex_train.bash

This starts a training run, which by default uses a pretrained agent (TQC). If you want to see the simulation, you need to uncomment the line model.env.render("human") in train.py (deepRLIK/scripts/train.py) before starting the training.
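The following is a minimal sketch (not the repository's actual train.py) of where that render call sits in a Stable-Baselines3-style training script; everything except model.env.render("human") is illustrative.

    # Sketch only: the repo's train.py continues from a pretrained TQC agent,
    # here a fresh agent is created to keep the example self-contained.
    from sb3_contrib import TQC

    def train(env, total_timesteps: int):
        model = TQC("MlpPolicy", env, verbose=1)
        model.env.render("human")  # uncomment the equivalent line in deepRLIK/scripts/train.py to show the simulation
        model.learn(total_timesteps=total_timesteps)
        return model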

VS CODE Remote Containers

One convenient way to edit the code (e.g. changing the reward function or adding new tasks) is to connect VS Code to the container:

  1. Install VS Code
  2. Install the VS Code Extension Remote Containers
  3. To connect to a running container, follow the steps provided here

Training of Agent

Take a look at the Training of Agent section of the original repository.

If you want to see what is going on, uncomment model.env.render("human") in train.py (deepRLIK/scripts/train.py) to force the simulation to start.

Continue Training on pretrained agents

We also provide some pretrained agents; they can be found in the training directories. If you want to use one, change the TRAINED_AGENT variable in ex_train.bash so that it points to the corresponding .zip file.

Environments

Take a look at the Environments section of the original repository.

We added the following environments to the original repository:

New Environments (Work of this project) (click to expand)
  • InverseKinematics
    Description: The agent's goal is to calculate the joint angles of the robotic arm that are necessary to reach a random goal point, i.e. the agent shall learn the inverse kinematic model of the arm.
    Environment: The robotic arm and a randomly spawned goal point.
    Observation: Positions of the goal point and of the end effector of the robotic arm (see the sketch after this list for an illustrative observation/action layout).
    Action: The joint angles of the robotic arm.

  • InverseKinematicsWithObstacles
    Description: The agent's goal is to calculate the joint angles of the robotic arm that are necessary to reach a random goal point while avoiding collisions with an obstacle.
    Environment: The robotic arm, a randomly spawned goal point, and an obstacle.
    Observation: Positions of the goal point and of the end effector, plus the position and orientation of the obstacle.
    Action: The joint angles of the robotic arm.

  • InverseKinematicsWithMovingObstacles
    Description: Same as InverseKinematicsWithObstacles, but the obstacle is spawned at a random position at the beginning of each episode (see the naming note above).
    Environment: The robotic arm, a randomly spawned goal point, and an obstacle.
    Observation: Positions of the goal point and of the end effector, plus the position and orientation of the obstacle.
    Action: The joint angles of the robotic arm.

  • InverseKinematicsWithManyMovingObstacles
    Description: The agent's goal is to calculate the joint angles of the robotic arm that are necessary to reach a random goal point while avoiding collisions with several obstacles.
    Environment: The robotic arm, a randomly spawned goal point, and a number of obstacles.
    Observation: Positions of the goal point and of the end effector, plus the positions and orientations of the obstacles.
    Action: The joint angles of the robotic arm.

  • Reach task (extension of the original Reach task)

    • ReachWithObstacles
      Description: The agent's goal is to output goal positions that move the robotic arm to a random goal point while avoiding collisions with an obstacle; the inverse kinematics is computed via MoveIt!.
      Environment: The robotic arm, a randomly spawned goal point, and an obstacle.
      Observation: Positions of the goal point and of the end effector, plus the position and orientation of the obstacle.
      Action: The goal point.
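The following is an illustrative sketch (not the repository's code) of how the observation and action spaces of, e.g., InverseKinematicsWithObstacles could be composed, assuming Cartesian positions and a quaternion for the obstacle orientation; shapes and bounds are placeholders.

    import numpy as np
    from gym import spaces

    # goal xyz (3) + end-effector xyz (3) + obstacle xyz and quaternion (7) = 13 values
    observation_space = spaces.Box(low=-np.inf, high=np.inf, shape=(13,), dtype=np.float32)

    # The Panda arm has 7 revolute joints; the real task uses the arm's joint limits as bounds.
    action_space = spaces.Box(low=-np.pi, high=np.pi, shape=(7,), dtype=np.float32)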

Inside the definition of each task class, several variables can be set, e.g. for the InverseKinematicsWithMovingObstacles task. Especially important are the object- and obstacle-related variables. For the newly implemented tasks, these variables (e.g. _object_enable, _object_type, _object_dimension_volume, obstacle_type, etc.) define the properties of the goal point and the obstacle (where it is spawned, what it looks like, etc.). The default values restrict the possible spawning volume of object and obstacle to a small region, which keeps the observation space small and speeds up training. For a more general solution, the spawning volume of both should cover the whole workspace of the robot.
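As a rough illustration, class-level settings of this kind could look as follows; the variable names are the ones mentioned above, but the values, the spawn-volume attributes, and the base class are placeholders rather than the repository's actual defaults.

    class InverseKinematicsWithMovingObstacles:  # the real class derives from the drl_grasping task base classes
        # goal-point ("object") settings
        _object_enable = True
        _object_type = "sphere"
        _object_dimension_volume = (0.05, 0.05, 0.05)  # size of the goal marker
        # obstacle settings
        obstacle_type = "box"
        # restricting the spawn volumes keeps the observation space small (faster training);
        # for a more general solution they should cover the robot's whole workspace
        _object_spawn_volume = ((0.3, -0.2, 0.2), (0.5, 0.2, 0.4))    # placeholder name and values
        _obstacle_spawn_volume = ((0.3, -0.2, 0.2), (0.5, 0.2, 0.4))  # placeholder name and values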

Adding new environments and training the agent (click to expand) To implement a new task / environment, the following steps are necessary:
  1. In the directory /envs/tasks, add your task (e.g. inversekinematics.py inside the inversekinematics directory).
  2. Register your task as a gym environment inside /envs/tasks/__init__.py, e.g. by adding register(id='IK-Gazebo-v0', ..., kwargs={..., 'task_cls': InverseKinematics, ...}); see the sketch after this list.
  3. Add the hyperparameters for your task under /hyperparams (e.g. add IK-Gazebo-v0 with its arguments to tqc.yml).
  4. Adjust the arguments of examples/ex_train.bash (e.g. change ENV_ID to "IK-Gazebo-v0" and ALGO to "tqc").
  5. Uncomment model.env.render("human") in /scripts/train.py if you want to see the simulation of the environment.
  6. Start the training by executing ros2 run drl_grasping ex_train.bash in the running container.
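For step 2, the registration could look roughly like this; the entry point, import path, and extra kwargs are placeholders, only the id and task_cls follow the example above.

    from gym.envs.registration import register
    from drl_grasping.envs.tasks.inversekinematics import InverseKinematics  # hypothetical import path

    register(
        id="IK-Gazebo-v0",
        entry_point="drl_grasping.envs:GazeboEnv",  # placeholder entry point
        kwargs={
            "task_cls": InverseKinematics,
            # further task/simulation kwargs (world, robot model, rates, ...) go here
        },
    )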

Future Work

From the authors' point of view, future work could focus on:

  • Enlarging the spawning volume of obstacle and goal point to the whole workspace
  • Adding moving obstacles and goal points
  • Adding obstacles of complex shape
  • Comparing the RL approach to path planning with classic path-planning approaches
  • Making the task more complex by sensing the obstacle space via a camera (as is done in the grasp task), instead of getting the positions of the obstacles via the Gazebo API
  • Auto-tuning hyperparameters
