I want to implement an MCTS planning algorithm, which makes frequent calls to UPSequentialSimulator._get_applicable_actions(). At each step of an MCTS rollout, the set of applicable actions must be generated so that an action can be sampled. When the space of grounded actions is large, _get_applicable_actions() becomes expensive to compute. With 100 grounded actions (arising from 3 lifted actions), I find that it takes up to 1 second to perform a single depth-100 MCTS rollout, with that time dominated by calls to _get_applicable_actions().
I fixed this by adding the @cache decorator from functools to UPSequentialSimulator._get_applicable_actions(), which yielded an 80x speedup for the MCTS rollouts. Does this use of caching violate any assumptions made by the UPSequentialSimulator? Are there other intended ways I can speed up calls to _get_applicable_actions()?
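For reference, here is a toy model of the approach described above. The class and method names mirror the unified-planning API but this is a hypothetical stand-in, not the library's code; the sketch returns a list for simplicity, while the real method returns an iterator:

```python
from functools import cache

class State:
    """Stand-in for unified-planning's State (no __eq__/__hash__ defined)."""
    def __init__(self, values):
        self.values = values

class Simulator:
    """Stand-in for UPSequentialSimulator with the decorator applied."""
    def __init__(self, actions):
        self.actions = actions

    @cache  # keyed on (self, state); both use identity-based hashing here
    def _get_applicable_actions(self, state):
        # In the real simulator this is an expensive scan over all
        # grounded actions; here it's a trivial filter.
        return [a for a in self.actions if self._is_applicable(state, a)]

    def _is_applicable(self, state, action):
        return action <= state.values

sim = Simulator(actions=[1, 2, 3])
s = State(2)
assert sim._get_applicable_actions(s) == [1, 2]
# Second call with the same State object is a cache hit:
assert sim._get_applicable_actions(s) is sim._get_applicable_actions(s)
```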
Hi @dhrchan! The _get_applicable_actions method takes only the state parameter, which is an immutable object, so the caching doesn't violate any assumptions.
However, the State class doesn't implement the __eq__ method, so the cache will only hit for the very same object.
So is it correct that you are calling the _get_applicable_actions method several times for the same State object in a single MCTS rollout?
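The identity-based caching behavior described above can be demonstrated with a minimal stand-in class (hypothetical names, not the library's own State):

```python
from functools import cache

calls = 0

class State:
    """Stand-in for unified-planning's State: no __eq__/__hash__, so
    functools.cache falls back to object identity."""
    def __init__(self, values):
        self.values = values

@cache
def applicable(state):
    global calls
    calls += 1  # count how often the "expensive" computation actually runs
    return state.values

s1, s2 = State(frozenset({"p"})), State(frozenset({"p"}))
applicable(s1)
applicable(s2)   # equal contents but a different object -> cache miss
assert calls == 2
applicable(s1)   # the very same object -> cache hit
assert calls == 2
```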
Thank you for your reply! I suppose that if there are cycles in my MCTS rollout, repeated states won't be the same State object, and will thus cause cache misses. I think I can fix this by implementing __eq__ and __hash__ methods for State, as #555 pointed out.
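A sketch of that fix, using a hypothetical stand-in class: once __eq__ and __hash__ are defined over the state's contents, distinct objects representing the same state hit the same cache entry.

```python
from functools import cache

calls = 0

class State:
    """Hypothetical State with value-based equality, as suggested above."""
    def __init__(self, values):
        self.values = values  # must itself be hashable, e.g. a frozenset

    def __eq__(self, other):
        return isinstance(other, State) and self.values == other.values

    def __hash__(self):
        return hash(self.values)

@cache
def applicable(state):
    global calls
    calls += 1  # count real (non-cached) evaluations
    return state.values

applicable(State(frozenset({"p"})))
applicable(State(frozenset({"p"})))  # different object, equal value -> cache hit
assert calls == 1
```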
I also found another issue with using the caching decorator: since _get_applicable_actions returns a generator, it can't be cached the way a list can with functools.cache. After the first call to _get_applicable_actions, every subsequent call returns the same, partially consumed generator object, resumed from wherever the previous caller stopped, eventually yielding an empty iterator. So the 80x speedup I observed is not actually attainable this way.
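The generator problem is easy to reproduce in isolation:

```python
from functools import cache

@cache
def gen():
    # Caching a generator function caches the generator *object*,
    # not the values it yields.
    yield from range(3)

first = list(gen())   # consumes the cached generator entirely
second = list(gen())  # same exhausted generator object -> empty
assert first == [0, 1, 2]
assert second == []
```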
To solve the problem with the generator, the caching decorator can instead be applied to the _is_applicable method, since the computation time of _get_applicable_actions should be dominated by calls to _is_applicable.
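A minimal sketch of that alternative, with hypothetical module-level names standing in for the simulator's methods: only the per-(state, action) check is cached, so each call still returns a fresh generator.

```python
from functools import cache

ACTIONS = range(100)  # stand-in for the set of grounded actions
calls = 0

@cache
def _is_applicable(state, action):
    global calls
    calls += 1  # count real precondition evaluations
    return action % 2 == state  # placeholder for the real precondition check

def _get_applicable_actions(state):
    # A fresh generator on every call; only the inner check is cached.
    return (a for a in ACTIONS if _is_applicable(state, a))

first = list(_get_applicable_actions(0))
second = list(_get_applicable_actions(0))  # all cache hits, full results again
assert first == second
assert calls == 100  # _is_applicable ran once per (state, action) pair
```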