simple-DVMPC-implemetation

An unofficial implementation of Deep Value Model Predictive Control (DVMPC), based on the papers by F. Farshidian and N. Karnchanachari.

Algorithm

  • Unlike the original paper, this implementation uses the Cross-Entropy Method (CEM) and Model Predictive Path Integral (MPPI) control for the MPC optimization (see the sketch below).
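For orientation, here is a minimal sketch of a CEM planner over a learned value function. The names `dynamics` and `value_net`, and all horizon/sample counts, are illustrative assumptions, not this repo's actual API:

```python
import numpy as np

def cem_plan(state, dynamics, value_net, horizon=10, n_samples=64,
             n_elites=8, n_iters=5, action_dim=2):
    """Receding-horizon CEM: sample action sequences, roll them out through
    an (assumed) one-step dynamics model, score each rollout with the learned
    value function, and refit a Gaussian to the elite sequences."""
    mean = np.zeros((horizon, action_dim))
    std = np.ones((horizon, action_dim))
    for _ in range(n_iters):
        # Sample candidate action sequences from the current Gaussian.
        actions = mean + std * np.random.randn(n_samples, horizon, action_dim)
        scores = np.empty(n_samples)
        for i in range(n_samples):
            s = state
            for t in range(horizon):
                s = dynamics(s, actions[i, t])  # assumed transition model
            scores[i] = value_net(s)  # terminal value, assumed "higher is better"
        # Refit the sampling distribution to the n_elites best sequences.
        elites = actions[np.argsort(scores)[-n_elites:]]
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-6
    return mean[0]  # execute only the first action of the plan
```

MPPI differs mainly in the update step: instead of keeping an elite set, it averages all sampled sequences with softmax weights proportional to the exponential of each score.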

Environment

2D navigation environment

  • The environment is a 2D world with walls and a goal.
  • Start points: (-14, -10), (-14, 0), (-14, 10)
  • Goal point: (12, 0)

"screenshot"

Usage

Train

  • ensemble : if True, train the ensemble model
  • seed : random seed (default: 1234)
  • render : if True, visualize the agent in the environment
  • The default parameters are defined in the params/value_net_cem.json and params/ensemble_net_mppi.json files.
```bash
# train the single deep value MPC
python3 examples/train_deep_value_mpc.py

# train the ensemble deep value MPC
python3 examples/train_deep_value_mpc.py --params_dir params/ensemble_net_mppi.json --ensemble
```
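For illustration, a parameter file could be inspected like this. The key names below are assumptions based on the flags listed above; check params/value_net_cem.json for the real schema:

```python
import json

# Load the default training parameters (key names assumed for illustration).
with open("params/value_net_cem.json") as f:
    params = json.load(f)

# Flags documented above: seed defaults to 1234; ensemble and render toggle
# ensemble training and on-screen visualization (assumed keys).
seed = params.get("seed", 1234)
ensemble = params.get("ensemble", False)
render = params.get("render", False)
print(f"seed={seed}, ensemble={ensemble}, render={render}")
```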

Test

  • The default load directory is defined in params/value_net_cem.json (or the corresponding parameter file).
```bash
python3 examples/test_deep_value_mpc.py --params_dir params/value_net_cem.json
python3 examples/test_deep_value_mpc.py --params_dir params/ensemble_net_mppi.json --ensemble
```

Visualize

```bash
python3 postprocess/visualize_value_net.py --model_weights runs/value_net/mppi_dense/value_net_028
python3 postprocess/visualize_ensemble_value_net.py --model_weights runs/ensemble_value_net/mppi_dense/value_net_033
python3 postprocess/write_reward_plot.py --log_list runs/value_net/cem_dense/logs/20221030_222322.csv runs/ensemble_value_net/mppi_dense/logs/20221031_005341.csv
```

"value map"

Reward plot

"cumulative reward plot"

References