Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simulated action effects break grounded action comparison #585

Open
fteicht opened this issue Mar 20, 2024 · 2 comments
Open

Simulated action effects break grounded action comparison #585

fteicht opened this issue Mar 20, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@fteicht
Copy link
Member

fteicht commented Mar 20, 2024

Hello,

This bug report originally comes from issue #567 but digging further into the issue reveals that the potential bug is different from what I originally thought. Thus this new and more accurate bug report.

So I suspect there is a bug in UP when computing the hash value and equality test of grounded actions with simulated effects. At the moment it is preventing the up-skdecide engine from working properly, especially with the scikit-decide's RL engine which allows UP problems to be solved by using Reinforcement Learning. Indeed, this engine relies on grounded actions.

Below is a minimal code example that you can run and that shows that UP simulated action effects prevent grounded actions from being properly compared:

from unified_planning.shortcuts import *
from unified_planning.environment import get_environment
from unified_planning.engines.compilers.grounder import GrounderHelper
from unified_planning.engines.sequential_simulator import UPSequentialSimulator


Location = UserType("Location")
Robot = UserType("Robot")

at = Fluent("at", Location, robot=Robot)
battery_charge = Fluent("battery_charge", IntType(0, 100), robot=Robot)

move = InstantaneousAction("move", robot=Robot, l_from=Location, l_to=Location)
robot = move.parameter("robot")
l_from = move.parameter("l_from")
l_to = move.parameter("l_to")
move.add_precondition(Equals(at(robot), l_from))
move.add_precondition(GE(battery_charge(robot), 10))
move.add_precondition(Not(Equals(l_from, l_to)))
move.add_effect(at(robot), l_to)


def fun(problem, state, actual_params):
    value = state.get_value(battery_charge(actual_params.get(robot))).constant_value()
    return [Int(value - 10)]


# COMMENT IN THE FOLLOWING LINE AND THE LAST ASSERT WILL WORK
move.set_simulated_effect(SimulatedEffect([battery_charge(robot)], fun))

l1 = Object("l1", Location)
l2 = Object("l2", Location)
r1 = Object("r1", Robot)

problem = Problem("robot_with_simulated_effects")
problem.add_fluent(at)
problem.add_fluent(battery_charge)
problem.add_action(move)
problem.add_object(l1)
problem.add_object(l2)
problem.add_object(r1)

problem.set_initial_value(at(r1), l1)
problem.set_initial_value(battery_charge(r1), 100)

simulator = UPSequentialSimulator(problem)

all_actions = set(
    [(a[0], a[1]) for a in GrounderHelper(problem).get_grounded_actions()]
)

app_actions = set(
    [a for a in simulator.get_applicable_actions(simulator.get_initial_state())]
)

# FOLLOWING ASSERT ALWAYS WORKS WITH OR WITHOUT SIMULATED EFFECT
assert app_actions.issubset(all_actions)

all_grounded_actions = set(
    [a[2] for a in GrounderHelper(problem).get_grounded_actions()]
)

app_grounded_actions = set(
    [
        simulator._ground_action(a[0], a[1])
        for a in simulator.get_applicable_actions(simulator.get_initial_state())
    ]
)

# FOLLOWING ASSERT ONLY WORKS WITHOUT SIMULATED EFFECT
assert app_grounded_actions.issubset(all_grounded_actions)

If you comment in the line move.set_simulated_effect(SimulatedEffect([battery_charge(robot)], fun)) you will see that the last assert runs fine. It shows that the issue is related to using UP simulated action effects.

Do you confirm this is a bug, and if so, would it be please possible to correct it as soon as you can so that I can push working up-skdecide RL-based engine in theAIPlan4EU project?

Thank you in advance for looking into this!

@alvalentini
Copy link
Member

Hi @fteicht!
The problem is that the python function associated to the grounded simulated effect generated by the GrounderHelper (used to compute the all_grounded_actions) is not the same as the one generated internally by the simulator. It is not clear to me how to implement an equality that is able to understand that two different python functions are semantically equivalent.

A way that I see to fix your code is to use the same GrounderHelper, in this way:

from unified_planning.shortcuts import *
from unified_planning.environment import get_environment
from unified_planning.engines.compilers.grounder import GrounderHelper
from unified_planning.engines.sequential_simulator import UPSequentialSimulator


Location = UserType("Location")
Robot = UserType("Robot")

at = Fluent("at", Location, robot=Robot)
battery_charge = Fluent("battery_charge", IntType(0, 100), robot=Robot)

move = InstantaneousAction("move", robot=Robot, l_from=Location, l_to=Location)
robot = move.parameter("robot")
l_from = move.parameter("l_from")
l_to = move.parameter("l_to")
move.add_precondition(Equals(at(robot), l_from))
move.add_precondition(GE(battery_charge(robot), 10))
move.add_precondition(Not(Equals(l_from, l_to)))
move.add_effect(at(robot), l_to)


def fun(problem, state, actual_params):
    value = state.get_value(battery_charge(actual_params.get(robot))).constant_value()
    return [Int(value - 10)]


# COMMENT IN THE FOLLOWING LINE AND THE LAST ASSERT WILL WORK
move.set_simulated_effect(SimulatedEffect([battery_charge(robot)], fun))

l1 = Object("l1", Location)
l2 = Object("l2", Location)
r1 = Object("r1", Robot)

problem = Problem("robot_with_simulated_effects")
problem.add_fluent(at)
problem.add_fluent(battery_charge)
problem.add_action(move)
problem.add_object(l1)
problem.add_object(l2)
problem.add_object(r1)

problem.set_initial_value(at(r1), l1)
problem.set_initial_value(battery_charge(r1), 100)

simulator = UPSequentialSimulator(problem)
grounder = GrounderHelper(problem)

all_actions = set(
    [(a[0], a[1]) for a in grounder.get_grounded_actions()]
)

app_actions = set(
    [a for a in simulator.get_applicable_actions(simulator.get_initial_state())]
)

# FOLLOWING ASSERT ALWAYS WORKS WITH OR WITHOUT SIMULATED EFFECT
assert app_actions.issubset(all_actions)

all_grounded_actions = set(
    [a[2] for a in grounder.get_grounded_actions()]
)

app_grounded_actions = set(
    [
        grounder.ground_action(a[0], a[1])
        for a in simulator.get_applicable_actions(simulator.get_initial_state())
    ]
)

# FOLLOWING ASSERT ONLY WORKS WITHOUT SIMULATED EFFECT
assert app_grounded_actions.issubset(all_grounded_actions)

@fteicht
Copy link
Member Author

fteicht commented Apr 10, 2024

Thanks a lot @alvalentini for your support!
I made your suggested changes and now the scikit-decide's unified planning wrapper works perfectly with simulated action effects.

I have updated the up-skdecide repository as well to reflect this change (please note that it requires the latest commit of scikit-decide to work, but not the latest PyPi package until we generate a new tagged version of scikit-decide).

Concerning this issue, I believe it only concerns advanced usage of the library. I am happy to close the issue, but maybe you might want to write a warning somewhere in the document to avoid advanced users to fall in the same trap?

Thanks very much again!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants