PAL: A Variability-Aware Policy for Cluster Scheduling in GPU-Rich HPC Systems

This repository contains the source code for the paper "PAL: A Variability-Aware Policy for Cluster Scheduling in GPU-Rich HPC Systems". The code builds on Blox ("Blox: A Modular Toolkit for Deep Learning Schedulers") with the PAL and PM-First placement policies added.

Clone this repository

git clone https://github.com/hal-uw/blox-pal.git
cd blox-pal
git checkout artifact

Install dependencies

The following steps create a conda virtual environment with the necessary dependencies installed.

To install conda, follow the steps in Conda Installation below. The pip command below installs the necessary Python packages such as grpcio, grpcio-tools, numpy, pandas, and seaborn.

conda create -n envpal python=3.8
conda activate envpal
python3 -m pip install -r requirements.txt
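
A quick way to sanity-check the environment is to import the packages listed above; a minimal sketch (grpc is the module provided by grpcio):

import grpc, numpy, pandas, seaborn
# Print a few versions to confirm the installs resolved.
print("grpcio:", grpc.__version__)
print("numpy:", numpy.__version__, "| pandas:", pandas.__version__)
print("seaborn:", seaborn.__version__)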

Before running simulations, we need to build the gRPC stubs. For a first-time build, create a directory for the stubs:

cd blox/deployment
mkdir grpc_stubs
make rpc
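
If make is unavailable, the stubs can also be generated directly with grpcio-tools. The following is a minimal sketch of the Makefile's apparent intent, assuming the .proto files live in blox/deployment (run it from that directory):

from grpc_tools import protoc  # provided by grpcio-tools in requirements.txt
import glob
# Compile every .proto here into grpc_stubs/: *_pb2.py holds the message
# classes and *_pb2_grpc.py holds the service stubs.
for proto_file in glob.glob("*.proto"):
    protoc.main([
        "grpc_tools.protoc",
        "-I.",                           # proto include path
        "--python_out=grpc_stubs",
        "--grpc_python_out=grpc_stubs",
        proto_file,
    ])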

Artifact Components

  • A1: Sia-Philly Trace: Baseline Simulations
  • A2: Sia-Philly Trace: Varying Locality Penalty
  • A3: Synergy Trace: Varying Job Load and Schedulers

Mapping Artifact Components to Paper Elements

Artifact Component   Contributions Supported   Related Paper Elements
A1                   C1                        Section V.B, Figure 10
A2                   C2                        Section V.B, Figure 12
A3                   C3                        Section V.C, Figures 13, 15, and 16

A1: Sia Simulations

Evaluation

We evaluate the performance of our placement policies relative to baselines using the Sia-Philly workload traces. Section IV.B of the paper provides details about the workload and cluster configuration, and results are presented in Section V.B.

Performance

  • PM-First improves average JCT by 40% geomean (min 5%, max 59%) compared to baseline Packed-Sticky placement.
  • PAL improves average JCT by 43% geomean (min 21%, max 59%) compared to baseline Packed-Sticky placement. (A sketch of the geomean aggregation follows this list.)
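
For context, the geomean figures aggregate per-trace improvements multiplicatively rather than as a plain average. A minimal sketch of that aggregation, using illustrative placeholder values rather than the paper's measurements:

import math
# Hypothetical per-trace average-JCT improvements vs. Packed-Sticky.
improvements = [0.05, 0.35, 0.48, 0.59]
# Geometric mean of the per-trace ratios, reported back as an improvement.
ratios = [1 + x for x in improvements]
geomean = math.prod(ratios) ** (1 / len(ratios)) - 1
print(f"geomean improvement: {geomean:.0%}")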

Computational Time

  • The expected computational time of this artifact on a CPU node is around 180 minutes. This can be reduced by running fewer workload traces (see the execution instructions below).
  • Artifact setup is expected to take 20 minutes or less, and the artifact analysis script takes 3 minutes.

Hardware Requirements

Simulations require a CPU machine. All experiments were run on an x86_64 machine (Intel E5-2630 v3 8-core CPU at 2.40 GHz, Haswell with EM64T, 8x 16 GB DDR4 memory) running Ubuntu 18.04.6. The code has also been tested on a Mac M1 with Darwin Kernel Version 23.4.0.

Software Requirements

Simulations use gRPC to communicate, while analysis scripts use matplotlib, seaborn, and other Python libraries to plot several collected metrics. We recommend that users create a virtual environment to install these dependencies.

Before running simulations, ensure the gRPC stubs have been built (see Install dependencies above):

cd blox/deployment
make rpc

Execution

We provide a run script that launches all simulations and produces relevant output logs for each workload trace in the Sia-Philly trace set. To reduce computational time, edit line 7 of the script to run fewer traces.

conda activate envpal
./run_sia_sim_baseline.sh

To reproduce Figure 10 in the paper, run the following plotting script:

python plot_sia_sim_baseline.py

A2: Sia Locality-Sweep Simulation

Evaluation

We analyze how different inter-node locality penalty values affect our policies for the Sia-Philly workloads. As the locality penalty increases, the best-performing baseline (Packed-Sticky) improves its average JCT and approaches PM-First's and PAL's performance. PM-First's average JCT improvement over Packed-Sticky decreases from 30% to 9% as the locality penalty increases from 1.0 to 3.0. PAL outperforms both PM-First and Packed-Sticky, with its benefit over Packed-Sticky decreasing only from 30% to 20% geomean.
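
Conceptually, the locality penalty models the slowdown a job incurs when its GPUs span multiple nodes. The sketch below is our illustration of such a model, not Blox's or PAL's actual code:

# Illustrative throughput model under an inter-node locality penalty.
def effective_throughput(base_throughput, nodes_spanned, locality_penalty):
    if nodes_spanned <= 1:
        return base_throughput                  # fully local: no slowdown
    return base_throughput / locality_penalty  # spans nodes: pays the penalty
# At penalty 1.0 remote placement is free; at 3.0 a multi-node job runs at a
# third of its local speed, so consolidated placements pay off more.
print(effective_throughput(100.0, 2, 3.0))  # -> 33.3...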

Execution

We provide a run script that varies the inter-node locality penalty from 1.0 to 3.0 in steps of 0.5 and runs simulations at each value to produce relevant output logs.

conda activate envpal
./run_sia_sim_losweep.sh
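
The sweep script is roughly equivalent to the loop below; the simulator entry point and flag name are assumptions for illustration, not the script's actual interface:

import subprocess
# Sweep the inter-node locality penalty from 1.0 to 3.0 in steps of 0.5,
# launching one simulation per value (hypothetical command line).
for penalty in [1.0, 1.5, 2.0, 2.5, 3.0]:
    subprocess.run(
        ["python", "sia_simulator.py", f"--locality-penalty={penalty}"],
        check=True,
    )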

To reproduce Figure 12 in the paper, run the following plotting script:

python plot_sia_sim_losweep.py

A3: Varying Cluster Contention and Scheduling Policies

Evaluation

We vary the job arrival rate, and in turn the contention level on the cluster, and measure JCTs to evaluate the performance of our policies. We use the Synergy workload and simulate a 256-GPU cluster. These experiments are also run with three different scheduling policies: FIFO, LAS, and SRTF.

Execution

We provide a run script run_synergy_baseline.sh to vary job load from 8 to 14 jobs/hour and run experiments for all three schedulers.

conda activate envpal
./run_synergy_sim.sh

Lines 441-447 of simulator_synergy.py specify the job load and scheduling policies to run as arrays. These can be modified to reduce the number of experiments, and consequently the execution time for A3.
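
Those arrays plausibly have a shape like the following; the names and values are assumptions, so consult lines 441-447 of simulator_synergy.py for the real ones:

# Hypothetical experiment grid for A3.
job_loads = [8, 10, 12, 14]           # average arrival rate, jobs/hour
schedulers = ["Fifo", "Las", "Srtf"]  # scheduling policies to simulate
# Trimming either list cuts the number of (load, scheduler) combinations,
# and with it A3's total runtime.
experiments = [(load, sched) for load in job_loads for sched in schedulers]
print(len(experiments), "simulator runs")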

To reproduce Figure 13 in the paper, run the following plotting script:

python plot_synergy_fifo.py

To reproduce Figures 15 and 16 in the paper, run the following plotting script:

python plot_synergy_las.py
python plot_synergy_srtf.py

Conda Installation

(1) Get the latest version of Anaconda. The code has been tested with Anaconda3 23.3.1.

wget https://repo.anaconda.com/archive/Anaconda3-2023.03-1-Linux-x86_64.sh

(2) Run the install script to install Anaconda

chmod +x Anaconda3-2023.03-1-Linux-x86_64.sh
./Anaconda3-2023.03-1-Linux-x86_64.sh
source ~/.bashrc
conda --version
