Hello Omnisafe team, thank you very much for your contribution.
While studying the P3O algorithm, I noticed that the `_loss_pi_cost` function does not apply clipping, whereas the cost loss in the P3O paper (Penalized Proximal Policy Optimization for Safe Reinforcement Learning) does.
You must be a very meticulous person! In fact, this is a trick we discovered while debugging the algorithm, which makes P3O more suitable for high-dimensional complex environments. Have you tried removing the clip? Do you have any experimental data? If it performs well without it, we will modify this implementation later.
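For anyone comparing the two variants discussed here, the following is a minimal sketch in plain Python, not the OmniSafe code. It assumes the cost surrogate follows the usual PPO pattern (pessimistic `max` over the raw and clipped ratio terms, since cost is minimized) and that the penalty is a ReLU over the surrogate plus a constraint-violation term. The function names, `cost_gap`, `kappa`, and `eps` are hypothetical names chosen for illustration.

```python
def clip(x, lo, hi):
    """Clamp x to the interval [lo, hi]."""
    return max(lo, min(x, hi))


def cost_surrogate_unclipped(ratios, adv_c):
    """Mean of ratio * cost advantage, with no clipping on the ratio."""
    return sum(r * a for r, a in zip(ratios, adv_c)) / len(ratios)


def cost_surrogate_clipped(ratios, adv_c, eps=0.2):
    """PPO-style pessimistic surrogate for a quantity that is minimized.

    Assumption: for each sample, take the larger of the raw term and the
    term computed with the ratio clipped to [1 - eps, 1 + eps].
    """
    terms = [
        max(r * a, clip(r, 1.0 - eps, 1.0 + eps) * a)
        for r, a in zip(ratios, adv_c)
    ]
    return sum(terms) / len(terms)


def penalty(surrogate, cost_gap, kappa):
    """ReLU penalty: active only when surrogate + cost_gap is positive.

    cost_gap stands in for the constraint-violation term (assumed here to
    be something like the measured cost minus the cost limit).
    """
    return kappa * max(0.0, surrogate + cost_gap)


# Toy batch of two samples: probability ratios and cost advantages.
ratios = [1.5, 0.5]
adv_c = [-1.0, 1.0]

unclipped = cost_surrogate_unclipped(ratios, adv_c)
clipped = cost_surrogate_clipped(ratios, adv_c, eps=0.2)

print("unclipped surrogate:", unclipped)
print("clipped surrogate:  ", clipped)
print("penalty (unclipped):", penalty(unclipped, cost_gap=0.3, kappa=20.0))
print("penalty (clipped):  ", penalty(clipped, cost_gap=0.3, kappa=20.0))
```

In this sketch the clipped surrogate is never smaller than the unclipped one, so the two variants can disagree on whether the penalty is active for the same batch. Whether that helps or hurts in a given environment is an empirical question, which is what the maintainers are asking about above.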