
Added support for minibatch in PPO process_fn #1168

Merged · 2 commits · Jul 20, 2024

Conversation

@jvasso (Contributor) commented Jul 6, 2024

Closes #1164

In PPOPolicy, the method process_fn() now computes logp_old in minibatches instead of all at once.
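For context, a minimal sketch of the idea (illustrative, not the exact merged diff; it assumes tianshou's Batch.split helper and the max_batchsize parameter inherited from the A2C base class):

```python
import torch

def process_fn(self, batch, buffer, indices):
    # ... advantage / return computation unchanged ...
    logp_old = []
    with torch.no_grad():
        # Evaluate the old policy in chunks rather than in one forward pass
        # over the whole buffer, which can exhaust GPU memory on large buffers.
        for minibatch in batch.split(self.max_batchsize, shuffle=False, merge_last=True):
            logp_old.append(self(minibatch).dist.log_prob(minibatch.act))
    batch.logp_old = torch.cat(logp_old, dim=0).flatten()
    return batch
```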

@MischaPanch (Collaborator)

This looks innocent, but tests fail. I'll have a look at what goes wrong and will finalize this.

Sorry for the late reaction; my official work moved me in another direction for a few months, but now I'm back to devoting time to tianshou :)

@MischaPanch (Collaborator)

Looks like just a typo, _batch -> batch

@MischaPanch (Collaborator)

Ok, I had a closer look. The current handling is a bit confusing: there is self.max_batchsize, which is used implicitly through a method defined in the superclass. So the fix is to split according to this parameter; pushing the fix now.
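For reference, the superclass pattern being matched looks roughly like this (a sketch of the A2C base class's minibatched value estimation; names follow tianshou's public API, not the literal source):

```python
v_s = []
with torch.no_grad():
    # The A2C base class already chunks critic evaluation by self.max_batchsize,
    # so the PPO fix splits the logp_old computation by the same parameter.
    for minibatch in batch.split(self.max_batchsize, shuffle=False, merge_last=True):
        v_s.append(self.critic(minibatch.obs))
batch.v_s = torch.cat(v_s, dim=0).flatten()
```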

@MischaPanch enabled auto-merge (squash) on Jul 20, 2024 at 08:03
@MischaPanch merged commit 324e3d2 into thu-ml:master on Jul 20, 2024
0 of 4 checks passed
Merging this pull request closes: No minibatch for computation of logp_old in PPOPolicy