I want to implement an MCTS planning algorithm, which makes frequent calls to UPSequentialSimulator._get_applicable_actions(). At each step of an MCTS rollout, the set of applicable actions must be generated so that an action can be sampled. When the space of grounded actions is large, _get_applicable_actions() becomes expensive to compute. With 100 grounded actions (arising from 3 lifted actions), I find that it takes up to 1 second to perform a single depth-100 MCTS rollout, with that time dominated by calls to _get_applicable_actions().
I fixed this by adding the @cache decorator from functools to UPSequentialSimulator._get_applicable_actions(), which yielded an 80x speedup for the MCTS rollouts. Does this use of caching violate any assumptions made by the UPSequentialSimulator? Are there other intended ways I can speed up calls to _get_applicable_actions()?
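For reference, here is a toy model of the approach described above. The class and method names mirror the unified-planning API but this is a hypothetical stand-in, not the library's code; the sketch returns a list for simplicity, while the real method returns an iterator:

```python
from functools import cache

class State:
    """Stand-in for unified-planning's State (no __eq__/__hash__ defined)."""
    def __init__(self, values):
        self.values = values

class Simulator:
    """Stand-in for UPSequentialSimulator with the decorator applied."""
    def __init__(self, actions):
        self.actions = actions

    @cache  # keyed on (self, state); both use identity-based hashing here
    def _get_applicable_actions(self, state):
        # In the real simulator this is an expensive scan over all
        # grounded actions; here it's a trivial filter.
        return [a for a in self.actions if self._is_applicable(state, a)]

    def _is_applicable(self, state, action):
        return action <= state.values

sim = Simulator(actions=[1, 2, 3])
s = State(2)
assert sim._get_applicable_actions(s) == [1, 2]
# Second call with the same State object is a cache hit:
assert sim._get_applicable_actions(s) is sim._get_applicable_actions(s)
```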
Hi @dhrchan! The _get_applicable_actions method takes only the state parameter, which is an immutable object, so the caching doesn't violate any assumptions.
However, the State class doesn't implement the __eq__ method, so the cache will only hit for the very same object.
So is it correct that you are calling the _get_applicable_actions method several times for the same State object in a single MCTS rollout?
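The identity-based caching behavior described above can be demonstrated with a minimal stand-in class (hypothetical names, not the library's own State):

```python
from functools import cache

calls = 0

class State:
    """Stand-in for unified-planning's State: no __eq__/__hash__, so
    functools.cache falls back to object identity."""
    def __init__(self, values):
        self.values = values

@cache
def applicable(state):
    global calls
    calls += 1  # count how often the "expensive" computation actually runs
    return state.values

s1, s2 = State(frozenset({"p"})), State(frozenset({"p"}))
applicable(s1)
applicable(s2)   # equal contents but a different object -> cache miss
assert calls == 2
applicable(s1)   # the very same object -> cache hit
assert calls == 2
```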
Thank you for your reply! I suppose that if there are cycles in my MCTS rollout, repeated states won't be the same State object, and will thus cause cache misses. I think I can fix this by implementing __eq__ and __hash__ methods for State, as #555 pointed out.
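A sketch of that fix, using a hypothetical stand-in class: once __eq__ and __hash__ are defined over the state's contents, distinct objects representing the same state hit the same cache entry.

```python
from functools import cache

calls = 0

class State:
    """Hypothetical State with value-based equality, as suggested above."""
    def __init__(self, values):
        self.values = values  # must itself be hashable, e.g. a frozenset

    def __eq__(self, other):
        return isinstance(other, State) and self.values == other.values

    def __hash__(self):
        return hash(self.values)

@cache
def applicable(state):
    global calls
    calls += 1  # count real (non-cached) evaluations
    return state.values

applicable(State(frozenset({"p"})))
applicable(State(frozenset({"p"})))  # different object, equal value -> cache hit
assert calls == 1
```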
I also found another issue with using the caching decorator: since _get_applicable_actions returns a generator, it can't be cached the way a list can with functools.cache. After the first call to _get_applicable_actions, every subsequent call returns the same, partially consumed generator object, resumed from wherever the previous caller stopped, eventually yielding an empty iterator. So the 80x speedup I observed is not actually attainable this way.
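The generator problem is easy to reproduce in isolation:

```python
from functools import cache

@cache
def gen():
    # Caching a generator function caches the generator *object*,
    # not the values it yields.
    yield from range(3)

first = list(gen())   # consumes the cached generator entirely
second = list(gen())  # same exhausted generator object -> empty
assert first == [0, 1, 2]
assert second == []
```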
To solve the problem with the generator, the caching decorator can instead be applied to the _is_applicable method, since the computation time of _get_applicable_actions should be dominated by calls to _is_applicable.
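A minimal sketch of that alternative, with hypothetical module-level names standing in for the simulator's methods: only the per-(state, action) check is cached, so each call still returns a fresh generator.

```python
from functools import cache

ACTIONS = range(100)  # stand-in for the set of grounded actions
calls = 0

@cache
def _is_applicable(state, action):
    global calls
    calls += 1  # count real precondition evaluations
    return action % 2 == state  # placeholder for the real precondition check

def _get_applicable_actions(state):
    # A fresh generator on every call; only the inner check is cached.
    return (a for a in ACTIONS if _is_applicable(state, a))

first = list(_get_applicable_actions(0))
second = list(_get_applicable_actions(0))  # all cache hits, full results again
assert first == second
assert calls == 100  # _is_applicable ran once per (state, action) pair
```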