I have searched through the issue tracker for duplicates
I have mentioned version numbers, operating system and environment, where applicable: the version of Tianshou that I'm using is 1.0.0.
I have noticed that in the implementation of the PPOPolicy, the computation of the old log probabilities `logp_old` is performed without minibatching:
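A minimal sketch of the kind of all-at-once computation being described (not the actual Tianshou source; the `policy(batch).dist` convention and the `batch.act` field are assumptions):

```python
import torch

def compute_logp_old_all_at_once(policy, batch):
    # Single forward pass over the entire collected batch: memory use
    # scales with the buffer size and cannot be bounded by batch_size.
    with torch.no_grad():
        return policy(batch).dist.log_prob(batch.act)
```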
This makes the algorithm unusable when the collected batch is too large, with no way to control the memory usage via `batch_size`.
I simply suggest adding support for computing it in minibatches:
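Something along these lines could work (a sketch only, assuming the batch supports Tianshou's `Batch.split()` generator and the same `.dist` / `.act` conventions as above; `minibatch_size` is a hypothetical parameter):

```python
import torch

def compute_logp_old_minibatched(policy, batch, minibatch_size):
    # Run the forward pass chunk by chunk so peak memory stays bounded,
    # then concatenate the per-chunk log-probabilities.
    logp_old = []
    with torch.no_grad():
        for mb in batch.split(minibatch_size, shuffle=False, merge_last=True):
            logp_old.append(policy(mb).dist.log_prob(mb.act))
    return torch.cat(logp_old, dim=0)
```

With `shuffle=False` the chunks keep their original order, so the concatenated result lines up with the transitions in the batch; `merge_last=True` folds a small trailing chunk into the previous one instead of emitting it separately.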
Closes #1164
In PPOPolicy, the method `process_fn()` now computes `logp_old` in minibatches instead of all at once.
---------
Co-authored-by: Michael Panchenko <[email protected]>