Hi @HYB777. This is a Ray configuration issue: as long as you configure Ray on a multi-node cluster, call ray.init appropriately, and use the RayVecEnv, things should work out.
That being said, I haven't personally tested on a multi-node cluster yet.
Since we're not Ray developers, I think this question is outside the scope of support from the tianshou team. However, if you encounter tianshou-specific issues on the cluster, feel free to let us know!
Ray has a large community and a lot of documentation; I suggest you start there. If you want to contribute a multi-node running example, I'm happy to review a PR.
If you want to run RayVecEnv in a cluster, you have to set up multiple Ray workers and connect all of them to the IP address of the Ray head node. This is done using the ray.init command. Here is an example that gets the IP address from every worker node that is connected to a Ray cluster. If this runs on your multi-node server, you will be able to do the same with RayVecEnv.
```python
import socket
import time
from collections import Counter

import ray


@ray.remote
def f():
    time.sleep(0.001)
    return socket.gethostbyname(socket.gethostname())


def main(address: str):
    ray.init(address=address)  # This needs to be replaced with your IP address
    futures = [f.remote() for _ in range(10000)]
    ip_addresses = ray.get(futures)
    for ip_address, num_tasks in Counter(ip_addresses).items():
        print("    {} tasks on {}".format(num_tasks, ip_address))


if __name__ == "__main__":
    main("auto")  # "auto" connects to a running cluster; or pass the head node's address
```
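Once that works, the same `ray.init` call can precede tianshou's Ray-backed vector environment. Below is a minimal, untested sketch: it assumes the class is exposed as `tianshou.env.RayVectorEnv` (referred to as RayVecEnv in this thread), that gymnasium's CartPole-v1 is available, that `reset()` follows the gymnasium-style `(obs, info)` return convention, and that a cluster has already been started with `ray start` on the head and worker nodes.

```python
import gymnasium as gym
import ray
from tianshou.env import RayVectorEnv  # "RayVecEnv" in this thread


def make_env():
    # Each factory call runs inside a Ray actor, possibly on another node.
    return gym.make("CartPole-v1")


if __name__ == "__main__":
    # Connect to the already-running cluster (head started with
    # `ray start --head`, workers with `ray start --address=<head-ip>:6379`)
    # instead of spinning up a fresh local Ray instance.
    ray.init(address="auto")

    # 16 environment copies; Ray schedules the backing actors across
    # all nodes connected to the cluster.
    envs = RayVectorEnv([make_env for _ in range(16)])
    obs, info = envs.reset()
    print(obs.shape)  # one row per environment copy
    envs.close()
```

Training code on top of this is unchanged from the single-node case; only the `ray.init` call differs.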
How to use RayVecEnv in a cluster? I want to run my RL code with multi-node training. I'm new to Ray; are there some demo scripts?