
hychen-naza/SSA-RL

Safe and Sample efficient Reinforcement Learning for Clustered Dynamic Uncertain Environments

Introduction

We provide code to evaluate the safety and sample efficiency of our proposed RL framework.

For safety, we use the Safe Set Algorithm (SSA).
For efficiency, you can choose among the following strategies:
1. Adapting SSA
2. Exploration (PSN, RND, or none)
3. Learning from SSA
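
As a rough illustration of what a safe controller of this kind does, the sketch below modifies a reference action as little as possible so that it satisfies one linear safety constraint `a . u <= b`. This is a generic minimal-change projection under stated assumptions, not the repository's SSA implementation: the function name `project_to_halfspace`, the single-constraint form, and the example numbers are all hypothetical.

```python
def project_to_halfspace(u_ref, a, b):
    """Return (u_safe, triggered) for the constraint a . u <= b.

    u_safe is the action closest to u_ref in Euclidean distance that
    satisfies the constraint. triggered is True only when u_ref had to
    be changed. Names and the constraint form are illustrative only.
    """
    # Amount by which the reference action violates the constraint.
    violation = sum(ai * ui for ai, ui in zip(a, u_ref)) - b
    if violation <= 0.0:
        # The reference action is already safe, so keep it unchanged.
        return list(u_ref), False

    norm_sq = sum(ai * ai for ai in a)
    if norm_sq == 0.0:
        # With a == 0 the constraint reads 0 <= b, which no action can fix.
        raise ValueError("constraint is infeasible for every action")

    # Move u_ref along -a just far enough to reach the constraint boundary.
    scale = violation / norm_sq
    return [ui - scale * ai for ui, ai in zip(u_ref, a)], True


# Hypothetical example: limit the first action component to 0.5.
safe_action, triggered = project_to_halfspace([1.0, 0.0], a=[1.0, 0.0], b=0.5)
unchanged_action, not_triggered = project_to_halfspace([0.2, 0.3], a=[1.0, 0.0], b=0.5)
```

The `triggered` flag plays the same role as the red indicator mentioned in this README: it marks the steps on which the safe controller overrode the learned policy.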

The video below shows the result: the agent is trained to drive to the goal while avoiding dynamic obstacles. Red indicates that SSA is triggered.

Please cite our paper as:

@article{chen2021safe,
  title={Safe and sample-efficient reinforcement learning for clustered dynamic environments},
  author={Chen, Hongyi and Liu, Changliu},
  journal={IEEE Control Systems Letters},
  volume={6},
  pages={1928--1933},
  year={2021},
  publisher={IEEE}
}

Install

conda create -n safe-rl
conda activate safe-rl
conda install python=3.7.9
pip install tensorflow==2.2.1
pip install future
pip install keras
pip install matplotlib
pip install gym
pip install cvxopt

Usage

python train.py --display {none, turtle} --explore {none, psn, rnd} --no-qp --no-ssa-buffer
python train.py --display {none, turtle} --explore {none, psn, rnd} --qp --no-ssa-buffer
python train.py --display {none, turtle} --explore {none, psn, rnd} --no-qp --ssa-buffer
  • --display can be either none or turtle (visualization).
  • --explore specifies the exploration strategy that the robot uses.
  • --no-qp means that we use vanilla SSA.
  • --qp means that we use adapted SSA.
  • --no-ssa-buffer means that we use the default learning.
  • --ssa-buffer means that we use the safe learning from SSA demonstrations.

You can also test other safe controllers (CBF, Shield) by uncommenting lines 108-109 and 155-157.

Acknowledgments

Part of the simulation environment code comes from the course CS 7638: Artificial Intelligence for Robotics at Georgia Tech. We have permission from the lecturer, Jay Summet, to use this code for research.
