Options available

Manual running

Summary: This approach has the clear benefit of separating the creation of the scheduler, the workers and the client.
It is also likely the only approach available out of the box when considering other schedulers (e.g. Ray). The downside of this approach is that it does not enable dynamic scaling of the cluster.
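For reference, a minimal sketch of what the manual approach can look like, assuming the standard dask-scheduler / dask-worker CLI entry points shipped with distributed and the default scheduler port 8786 (the <scheduler-host> placeholder is illustrative):

# On the scheduler node (shell):
#   dask-scheduler                            # listens on tcp://<scheduler-host>:8786 by default
# On each worker node (shell):
#   dask-worker tcp://<scheduler-host>:8786   # point every worker at the scheduler

# In the client program, connect to the externally managed scheduler:
from dask.distributed import Client

client = Client("tcp://<scheduler-host>:8786")
print(client.scheduler_info())  # confirm that the workers have registered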
Using dask-mpi (mpirun/mpiexec)
Not yet tested.
pip install dask-mpi
mpiexec -n 4 python dask_computation.py --mpi
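For context, dask-mpi has rank 0 run the scheduler, one rank run the client code and the remaining ranks run workers. A rough sketch of what dask_computation.py could contain; only the filename and the --mpi flag come from the command above, the body below is an assumed example and is untested, as noted:

# dask_computation.py -- assumed example body (untested)
from dask_mpi import initialize
from dask.distributed import Client
import dask.array as da

initialize()       # rank 0 becomes the scheduler, the remaining ranks (bar the client rank) become workers
client = Client()  # connects this rank to the scheduler started by initialize()

x = da.random.random((10_000, 10_000), chunks=(1_000, 1_000))
print(x.mean().compute())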
Summary: I didn't bother with this approach, since other schedulers (i.e. those other than Dask) are unlikely to provide such mechanisms.
Using dask-jobqueue
pip install dask-jobqueue
from dask_jobqueue import PBSCluster

total_memory = 60        # Total memory per worker job/node in GB
num_workers = 8          # Number of workers per worker job/node
num_nodes = 2
walltime = '00:10:00'    # Maximum runtime of the workers

# set cores and processes to be the same for single-threaded workers
cluster = PBSCluster(
    protocol="tcp",               # unencrypted
    cores=num_workers,            # Number of cores per worker job == processes by default
    processes=num_workers,        # Number of processes per worker job/node
    memory=f"{total_memory} GB",  # Amount of memory total per worker job i.e. node
    walltime=walltime,            # Maximum runtime of the worker job
)
cluster.scale(jobs=num_nodes) # Scale the number of worker jobs to the number of nodes
# Parameters
# ----------
# n : int
# Target number of workers
# jobs : int
# Target number of jobs
# memory : str
# Target amount of memory
# cores : int
# Target number of cores
or
cluster.scale(<number_of_workers>)
or
cluster.adapt(minimum=1, maximum=10) # Automatically scale the number of workers based on the workload
# Parameters
# ----------
# minimum : int
# Minimum number of workers to keep around for the cluster
# maximum : int
# Maximum number of workers to keep around for the cluster
# minimum_memory : str
# Minimum amount of memory for the cluster
# maximum_memory : str
# Maximum amount of memory for the cluster
# minimum_jobs : int
# Minimum number of jobs
# maximum_jobs : int
# Maximum number of jobs
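Once the cluster has been scaled (or set to adapt), a client can be attached to it in the usual way. A brief sketch, assuming the cluster object created above:

from dask.distributed import Client
import dask.array as da

client = Client(cluster)  # the scheduler runs in this process; the workers are the PBS jobs

x = da.random.random((10_000, 10_000), chunks=(1_000, 1_000))
print(x.sum().compute())  # work is spread across the jobqueue workers

client.close()
cluster.close()           # shuts down the workers and their PBS jobs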
Summary: This approach provides the greatest flexibility in terms of a dynamic cluster that can scale to requirements. However, it does mean the scheduler, workers and client are managed centrally, which makes it less than ideal for sharing a cluster amongst independent programs. Another downside is that other schedulers (e.g. Ray) don't necessarily allow managing the cluster setup in this same way.
Testing these approaches with a script
https://gist.github.com/cpelley/ff537e9dd5fb97ee681fa7207575330b