Locally fitted dynamics for linear-quadratic-Gaussian (LQG) controllers of a simulated 3-link robot arm. Roughly inspired by the Guided Policy Search method for trajectory optimization.
The overall process is to:
- Generate random controls for each timestep of the trajectory.
- Execute the controls a number of times (with some noise) to sample the dynamics around the trajectory.
- Fit a locally linear dynamics model at each timestep from the sampled states and actions.
- Use the local models to propose new state-feedback gains and improve the LQG controllers.
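
The steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the project's code: a noisy double integrator stands in for the 3-link arm, and every name (`rollout`, `u_nom`, `A_fit`, the cost weights `Q` and `R`, the noise levels) is hypothetical. It runs a single iteration and leaves out the step-size (KL) constraint that Guided Policy Search uses to keep the new controller close to the sampled trajectories.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the arm simulator: a double integrator.
A_true = np.array([[1.0, 0.1], [0.0, 1.0]])
B_true = np.array([[0.0], [0.1]])
dx, du, T, N = 2, 1, 20, 50          # state dim, control dim, horizon, samples
x0 = np.array([1.0, 0.0])
Q, R = np.eye(dx), 0.1 * np.eye(du)  # assumed quadratic cost weights

def rollout(policy, x_init, noise_std=0.0):
    """Run T steps with u = policy(t, x); return states (T+1, dx), controls (T, du)."""
    xs, us = [x_init], []
    for t in range(T):
        u = policy(t, xs[-1])
        us.append(u)
        xs.append(A_true @ xs[-1] + B_true @ u + noise_std * rng.standard_normal(dx))
    return np.array(xs), np.array(us)

def cost(xs, us):
    """Sum of 0.5 x'Qx over all T+1 states plus 0.5 u'Ru over all T controls."""
    return 0.5 * (np.einsum("ti,ij,tj->", xs, Q, xs) + np.einsum("ti,ij,tj->", us, R, us))

# 1. Random nominal controls for each timestep.
u_nom = rng.standard_normal((T, du))

# 2. Execute them N times with control, initial-state, and process noise.
#    (Initial-state noise is needed so the fit at t = 0 is well conditioned.)
X, U = np.empty((N, T + 1, dx)), np.empty((N, T, du))
for i in range(N):
    noisy = lambda t, x: u_nom[t] + 0.5 * rng.standard_normal(du)
    X[i], U[i] = rollout(noisy, x0 + 0.2 * rng.standard_normal(dx), noise_std=0.01)

# 3. Fit x_{t+1} ~ A_t x_t + B_t u_t + c_t by least squares at each timestep.
A_fit, B_fit, c_fit = [], [], []
for t in range(T):
    Z = np.hstack([X[:, t], U[:, t], np.ones((N, 1))])   # (N, dx + du + 1)
    W, *_ = np.linalg.lstsq(Z, X[:, t + 1], rcond=None)  # (dx + du + 1, dx)
    A_fit.append(W[:dx].T)
    B_fit.append(W[dx:dx + du].T)
    c_fit.append(W[-1])

# 4. Backward pass on the fitted models: gains for u_t = K_t x_t + k_t.
P, p = Q.copy(), np.zeros(dx)        # value function 0.5 x'Px + p'x at the final step
K, k = [None] * T, [None] * T
for t in reversed(range(T)):
    A, B, c = A_fit[t], B_fit[t], c_fit[t]
    H = R + B.T @ P @ B
    K[t] = -np.linalg.solve(H, B.T @ P @ A)
    k[t] = -np.linalg.solve(H, B.T @ (P @ c + p))
    F, f = A + B @ K[t], B @ k[t] + c                     # closed-loop dynamics
    p = K[t].T @ R @ k[t] + F.T @ (P @ f + p)
    P = Q + K[t].T @ R @ K[t] + F.T @ P @ F

# Compare the random controls with the new feedback controller (noise-free).
cost_nom = cost(*rollout(lambda t, x: u_nom[t], x0))
cost_lqg = cost(*rollout(lambda t, x: K[t] @ x + k[t], x0))
print(f"nominal cost: {cost_nom:.2f}, feedback cost: {cost_lqg:.2f}")
```

In a full loop, the new controller would be executed again with noise, the models refit around the resulting trajectories, and the backward pass repeated once per iteration.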
The image on the left is a rendering of the robot scene; the plot on the right shows the trajectories from each iteration.