Multi-Agent RL #648

mplemay · 2022-07-08T18:10:45Z

Is your feature request related to a problem? Please describe.
First off, I would like to thank you for building and maintaining an amazing project! One feature, I would be interested in adding/contributing is functionality/pathway to building RL agents with the Agents.jl framework which was mentioned in passing by another individual here.

Describe the solution you'd like
For casual users of Agents.jl it can be daunting to Reinforcement Learning agents. If it is within the scope of the project, I think it would be valuable to provide some guidance or tools for creating a RL Agents. Depending on the desired scope, there are many ways to go about this.

Describe alternatives you've considered
Below is a short list of options I have considered. If it makes sense to explore any of these options, it may be worth a more in-depth study.

Option 1: Integration with ReinforcementLearning.jl - the dominant RL (and more importantly multi-agent RL) framework for Julia
- Option 1.1: Provide examples, tutorials, etc.
  - Pros
    - Relatively low overhead, low effort, etc.
  - Cons
    - Partially reduces the barrier to building RL agents.
    - May require some slight modifications to facilitate the process.
- - Option 1.2: Implement/provide defaults for the multi-agent interface
  - Pros
    - Simplifies the process of building an RL agent
  - Cons
    - Adds an additional dependency (requires effort to maintain compatability, upstream bug fixes, etc.)
    - Limited control over the RL integration process.
Option 2: Building a RL framework (ex: ReinforcementLearningAgents.jl) within Agents.jl/JuliaDynamics ecosystem
- Pros
  - The framework could be tightly coupled with Agents.jl
  - Agnostic to upstream bug fixes, changes, etc. (i.e. a higher level of control)
  - Having a all-in-one framework could increase the Agents.jl popularity
  - Could be extended to evolutionary algos, etc.
- Cons
  - Significant level of work required build
  - More maintenance
  - Code duplication/fragmentation of the Julia Multi-agent RL ecosystem

Please let me know if there is anything I could do to provide more clarification or insight.

The text was updated successfully, but these errors were encountered:

Datseris · 2022-07-09T09:07:39Z

Thanks! Personally I know nothing about Reinforcement learning.

Please let me know if there is anything I could do

Well, the best you can do is make an integration example like we do with DiffEq, BlackBox, etc., and then open a PR in Agents.jl docs!

For casual users of Agents.jl it can be daunting to Reinforcement Learning agents.

This statement was done without any reasoning as to why this would be hard. The integration examples with DiffEq, BlackBox, and practically any other package, was as straight forward as they could be. What is so special here?

Datseris · 2022-07-09T09:13:26Z

Hm, after looking at the wikipedia page for Reinforcement Learning https://en.wikipedia.org/wiki/Reinforcement_learning it seems to me that the process described there can already be done with Agents.jl without making any new code...? What is missing that you couldn't do out of the box in Agents.jl? Can you be specific about the actual difficulties you encountered while trying to model Reinforcement Learning in Agents.jl?

(Agents.jl also support multi-agent environments)

mplemay · 2022-07-09T15:50:33Z

For casual users of Agents.jl it can be daunting to Reinforcement Learning agents.

This statement was done without any reasoning as to why this would be hard. The integration examples with DiffEq, BlackBox, and practically any other package, was as straight forward as they could be. What is so special here?

That is a fair criticism; therefore, I will add some context. Before writing this issue, I started working on integrating my Agents.jl model with ReinforcementLearning.jl. While doing so, I found that it required intermediate to advanced familiarity with Agents.jl. Moreover, once a reference implementation is written, a decent chunk of the code could be re-used by others. My hope is that by adding RL functionality to Agents.jl it would 1) reduce code duplication and 2) lower the barrier to entry so that other could more easily take advantage of the great tools you built for multi-agent RL.

Hm, after looking at the wikipedia page for Reinforcement Learning https://en.wikipedia.org/wiki/Reinforcement_learning it seems to me that the process described there can already be done with Agents.jl without making any new code...? What is missing that you couldn't do out of the box in Agents.jl? Can you be specific about the actual difficulties you encountered while trying to model Reinforcement Learning in Agents.jl?

As mentioned above, Agents.jl is for has all of the features for the RL use case. My concern revolves around implementing models that leverage Agents.jl. As an applied researcher, my job function boils down to two processes: 1) building simulations for the problems I am trying to solve and 2) trying out a bunch of different methods to solve a problem (and if time allows extending that approach with novel ideas). Agents.jl does an excellent job at simplifying the first process. As for the second process, it can be time consuming and non-trivial to "glue" different agent behavior libraries together (ex: POMDPs.jl, ReinforcementLearning.jl, etc.). My hope is that by reducing the barrier for multi-agent problem solvers (RL is a subset of this category), researcher would be able to more easily take advantage of the Agents.jl functionality.

mplemay · 2022-07-09T16:01:40Z

Below is an example of the methods you would need to implement in order to interface with ReinforcementLearning.jl. Methods such as is_terminated, reset!, players, and current_player could be pre-defined plus helper functions, examples for getting an agents state, etc. could be useful for implementing the model specific funcitons.

Note: Once I finish current project, I will try to a generalized implementation, possibly with an example usage.

mutable struct ABMEnv{T} <: AbstractEnv
    abm::ABM
    reward::T
end

ABMEnv(abm::ABM) = ABMEnv(abm, 0.0)

function RLBase.state(env::ABMEnv, ::Observation{Int}, player::Nothing) end

function RLBase.state(env::ABMEnv, ::Observation{Int}, player::Int) end

function RLBase.reward(env::ABMEnv, player::Int)
    return env.reward
end

function (env::ABMEnv)(action, player::Int) end

function RLBase.action_space(env::ABMEnv, player::Nothing) end

function RLBase.action_space(env::ABMEnv, player::Int) end

function RLBase.state_space(env::ABMEnv, ::Observation{Int}, player::Nothing) end

function RLBase.state_space(env::ABMEnv, ::Observation{Int}, player::Int) end

function RLBase.is_terminated(env::ABMEnv) end

function RLBase.is_terminated(env::ABMEnv, player::Int) end

function RLBase.players(env::ABMEnv) end

function RLBase.current_player(env::ABMEnv) end

function RLBase.reset!(env::ABMEnv) end

function RLBase.NumAgentStyle(::ABMEnv)
    return MultiAgent(<num agents>)
end 

RLBase.DynamicStyle(::ABMEnv) = SEQUENTIAL
RLBase.ActionStyle(::ABMEnv) = MINIMAL_ACTION_SET
RLBase.InformationStyle(::ABMEnv) = IMPERFECT_INFORMATION
RLBase.StateStyle(::ABMEnv) = Observation{Int}()
RLBase.RewardStyle(::ABMEnv) = TERMINAL_REWARD
RLBase.UtilityStyle(::ABMEnv) = IDENTICAL_UTILITY
RLBase.ChanceStyle(::ABMEnv) = DETERMINISTIC

Datseris · 2022-07-09T22:26:58Z

I am not sure we are on the same page yet unfortunately. Let's first take a step back, because I am not working at your field and therefore I don't know why you need ReinforcementLearning. In the DiffEq/BlackBox integration examples it is easy for me to understand why an interplay with a different package is required: Agents.jl can't solve ODEs, or minimize functions.

For the case of ReinforcementLearning.jl, what are the specific things you need it for?

The reason I am asking is, because of my ignorance, so far it seems to me that ReinforcementLearning.jl is an alternative way to do agent based simulations. So I am trying to figure out why, or how, one would combine two alternative ways to do the same thing.

Methods such as is_terminated, reset!, players, and current_player could be pre-defined plus helper functions

That's another point I think we are not on the same page. So far it seems to me that the problem is that all of these methods depend on the specific scientific problem you are trying to simulate. How would you define them generally? Isn't action_space something that returns a collection of all the possible "actions" an agent can take? You can't define this generally within Agents.jl, because, in stark contrast to ReinforcementLearning.jl as far as I can see, in Agents.jl a user defines a generic function that simply does all the actions. There is simply so much more flexibility and one doesn't have to learn or extent 50 different types that represent possible actions. The downside of course is that you can't "list" or "extract" a specific list of actions, which would be necessary to define an action_space return value.

Datseris · 2022-07-09T22:29:28Z

In any case, to be clear: if you can come up with a low-dependency way to establish this interfacing you need between these two packages, it is welcomed in Agents.jl as a submodule. I don't have to understand your field to welcome such an addition :D

findmyway · 2022-07-26T08:27:43Z

A typical example is here https://github.com/Farama-Foundation/MAgent

And I believe Agents.jl is quite flexible to create such environments.

simsurace · 2023-11-01T12:37:21Z

The reason I am asking is, because of my ignorance, so far it seems to me that ReinforcementLearning.jl is an alternative way to do agent based simulations. So I am trying to figure out why, or how, one would combine two alternative ways to do the same thing.

Reinforcement learning is about how agents learn to take actions such as to maximize their reward. So it's not about studying how agents with a fixed set of rules behave, but to set some goals and to let the agent figure out how to achieve them. Which is kind of intriguing because you could tie the reward signal to agents to some macroscopic behavior and then let the agents learn to behave microscopically in such a way to get the macroscopic behavior right.

gregid · 2023-11-10T13:00:12Z

For reference on the type of problems that could utilize such approach (multi agent reinforcement learning) in python world there is:

https://github.com/Farama-Foundation/PettingZoo
https://pettingzoo.farama.org/
https://github.com/Farama-Foundation/Gymnasium (single agent RL)

@Datseris, taking pettingzoo as a reference, what's your view on Agents.jl suitability for such use case?

Datseris · 2023-11-10T16:04:10Z

I don't have any background in reinforcement learning so I am not qualified to answer this at the moment. I would have to go through the repositories in detail and unfortunately I do not have the time capacity for doing this at the moment. User @findmyway claimed that Agents.jl is suitable for such tasks, perhaps they can expand on this.

Tortar · 2024-03-19T22:28:38Z

I know a decent amount of reinforcement learning and I think @simsurace gives a good example on how to use reinforcement learning in the ABM world, I really think the way to go is to interface with ReinforcementLearning.jl as @mplemay suggested. One way would be to use one of the standard models such as WolfSheep and use reinforcement learning for setting global goals for sheeps and wolves. It would be very cool. If someone has the time to work on this, I will take the time to review the integration, because I don't really have time to work on that at the moment myself

Tortar · 2024-03-19T22:32:48Z

e.g. this is a good paper in my opinion for more details: https://www.nature.com/articles/s41598-020-68447-8

Datseris · 2024-10-12T15:42:32Z

@Tortar shall we close this issue? Since ReinforcementLearning.jl exists, and is better suited for the simulation scenario in discussion, is there any point in leaving this open? This isn't an integration example nor request for one. In fact, I don't even know if there is a request in this Issue anymore.

Tortar · 2024-10-12T18:00:50Z

there is surely a lot that can be done by integration e.g. there was one gsoc this year in mesa for a RL integration for that framework: https://github.com/harshmahesheka/mesa_rl

Tortar · 2024-10-12T18:06:24Z

and they added those examples to their examples library: https://github.com/projectmesa/mesa-examples/tree/main/rl

manentai · 2024-11-05T19:59:20Z

TBH, it would be great to have an example, but while I know my fair share about ABM, I am a newbie with RL, so not sure I can bring anything to the discussion more than "another user is interested"!

Datseris added the discussion label Jul 9, 2022

Tortar added example-integration A new example or showcase of integration with the Julia ecosystem enhancement New feature or request labels Mar 19, 2024

Tortar pinned this issue Mar 24, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Multi-Agent RL #648

Multi-Agent RL #648

mplemay commented Jul 8, 2022

Datseris commented Jul 9, 2022

Datseris commented Jul 9, 2022

mplemay commented Jul 9, 2022

mplemay commented Jul 9, 2022 •

edited

Loading

Datseris commented Jul 9, 2022 •

edited

Loading

Datseris commented Jul 9, 2022

findmyway commented Jul 26, 2022

simsurace commented Nov 1, 2023

gregid commented Nov 10, 2023

Datseris commented Nov 10, 2023

Tortar commented Mar 19, 2024 •

edited

Loading

Tortar commented Mar 19, 2024

Datseris commented Oct 12, 2024

Tortar commented Oct 12, 2024 •

edited

Loading

Tortar commented Oct 12, 2024

manentai commented Nov 5, 2024

Multi-Agent RL #648

Multi-Agent RL #648

Comments

mplemay commented Jul 8, 2022

Datseris commented Jul 9, 2022

Datseris commented Jul 9, 2022

mplemay commented Jul 9, 2022

mplemay commented Jul 9, 2022 • edited Loading

Datseris commented Jul 9, 2022 • edited Loading

Datseris commented Jul 9, 2022

findmyway commented Jul 26, 2022

simsurace commented Nov 1, 2023

gregid commented Nov 10, 2023

Datseris commented Nov 10, 2023

Tortar commented Mar 19, 2024 • edited Loading

Tortar commented Mar 19, 2024

Datseris commented Oct 12, 2024

Tortar commented Oct 12, 2024 • edited Loading

Tortar commented Oct 12, 2024

manentai commented Nov 5, 2024

mplemay commented Jul 9, 2022 •

edited

Loading

Datseris commented Jul 9, 2022 •

edited

Loading

Tortar commented Mar 19, 2024 •

edited

Loading

Tortar commented Oct 12, 2024 •

edited

Loading