Update README.md for clearer instructions and fix some minor errors. #10

Open · wants to merge 4 commits into main
30 changes: 21 additions & 9 deletions Readme.md
@@ -42,6 +42,27 @@ For an example users should look at `las_scheduler.py` which implements Least At

### Running Blox

#### Installation
Blox uses gRPC for communication and Matplotlib to plot the collected metrics.
We suggest creating a virtual environment before installing the dependencies.
```
pip install grpcio
pip install matplotlib
pip install pandas==1.3.0
pip install grpcio-tools

pip install protobuf==4.21.1
# Generate the gRPC stubs
cd blox/deployment
mkdir grpc_stubs
make
```
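
After installing, a quick sanity check (a minimal sketch, not part of the Blox repository) confirms that the pinned versions resolved correctly:
```
# Post-install sanity check: verify the key dependencies import cleanly
# and report their versions.
import grpc
import matplotlib
import pandas
import google.protobuf

print("grpcio:", grpc.__version__)
print("matplotlib:", matplotlib.__version__)
print("pandas:", pandas.__version__)             # expected: 1.3.0
print("protobuf:", google.protobuf.__version__)  # expected: 4.21.1
```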

#### Prepare the trace

Taking the Philly trace as an example, download the trace from [here](https://github.com/msr-fiddle/philly-traces/blob/master/trace-data.tar.gz) and unpack it.

Unpacking it yields a file named `job_cluster_log`, which will be used in the following examples.
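
If you prefer to script this step, the sketch below downloads and unpacks the archive with the Python standard library. The raw-download URL is an assumption (GitHub's raw variant of the linked page); if the archive is stored with Git LFS, download it manually from the link above instead.
```
# Minimal sketch: fetch and unpack the Philly trace archive.
# The URL is assumed to be the raw variant of the blob link above.
import tarfile
import urllib.request

TRACE_URL = "https://github.com/msr-fiddle/philly-traces/raw/master/trace-data.tar.gz"
urllib.request.urlretrieve(TRACE_URL, "trace-data.tar.gz")

# Extract into the current directory so the examples below can find
# the unpacked trace file.
with tarfile.open("trace-data.tar.gz", "r:gz") as tar:
    tar.extractall(".")
```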

Blox has two modes of operation: running against a real cluster workload and running in simulation.

##### Simulation Mode
@@ -87,15 +108,6 @@ PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python python node_manager.py --ipaddr ip

### Details for reproducing results for artifacts
These are instructions for reproducing artifacts for Blox.
#### Installation
Blox uses gRPC for communication and Matplotlib to plot the collected metrics.
We suggest creating a virtual environment before installing the dependencies.
```
pip install grpcio
pip install matplotlib
pip install pandas==1.3.0
pip install grpcio-tools
```

###### Running Blox Code
To perform a simulation:
2 changes: 1 addition & 1 deletion placement/bebop.py
@@ -136,7 +136,7 @@ def _consolidated_placement(
# found a node with more GPUs than needed
if min_more_GPUs > len(free_gpus[node]):
min_more_GPUs = len(free_gpus[node])
node_with_min_moRE_gpUs = node
node_with_min_more_GPUs = node
if node_with_min_more_GPUs is not None:
# only extracting the GPUs we need
return (free_gpus[node_with_min_more_GPUs][:numGPUs_needed], True)
2 changes: 1 addition & 1 deletion placement/consolidated.py
@@ -108,7 +108,7 @@ def _consolidated_placement(
# found a node with more GPUs than needed
if min_more_GPUs > len(free_gpus[node]):
min_more_GPUs = len(free_gpus[node])
node_with_min_moRE_gpUs = node
node_with_min_more_GPUs = node
if node_with_min_more_GPUs is not None:
# only extracting the GPUs we need
return (free_gpus[node_with_min_more_GPUs][:numGPUs_needed], True)
2 changes: 1 addition & 1 deletion placement/consolidated_placement.py
@@ -106,7 +106,7 @@ def _consolidated_placement(
# found a node with more GPUs than needed
if min_more_GPUs > len(free_gpus[node]):
min_more_GPUs = len(free_gpus[node])
node_with_min_moRE_gpUs = node
node_with_min_more_GPUs = node
if node_with_min_more_GPUs is not None:
# only extracting the GPUs we need
return (free_gpus[node_with_min_more_GPUs][:numGPUs_needed], True)
2 changes: 1 addition & 1 deletion placement/first-gpu.py
@@ -249,7 +249,7 @@ def _consolidated_placement(
# found a node with more GPUs than needed
if min_more_GPUs > len(free_gpus[node]):
min_more_GPUs = len(free_gpus[node])
node_with_min_moRE_gpUs = node
node_with_min_more_GPUs = node
if node_with_min_more_GPUs is not None:
# only extracting the GPUs we need
return (free_gpus[node_with_min_more_GPUs][:numGPUs_needed], True)
2 changes: 1 addition & 1 deletion placement/placement.py
@@ -239,7 +239,7 @@ def _consolidated_placement(
# found a node with more GPUs than needed
if min_more_GPUs > len(free_gpus[node]):
min_more_GPUs = len(free_gpus[node])
node_with_min_moRE_gpUs = node
node_with_min_more_GPUs = node
if node_with_min_more_GPUs is not None:
# only extracting the GPUs we need
return (free_gpus[node_with_min_more_GPUs][:numGPUs_needed], True)
2 changes: 1 addition & 1 deletion schedulers/scheduler_policy.py
@@ -48,7 +48,7 @@ def schedule(
gpu_df: Contains GPU dataframe.

Returns:
"order_job" : Mandatory key, list of dicts of jobs in the
"order_job" : Mandatory key, list of dicts of jobs
in the order they are supposed to run.
"run_all_jobs": Some scheduler will only output the jobs to
run which will fit on the GPU or expecting
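
To make the documented return shape concrete, here is a hypothetical sketch of a FIFO-style policy that fills only the mandatory `order_job` key. The `job_dict` parameter and the `submit_time` field are assumptions for illustration, not Blox's actual schema.
```
# Hypothetical FIFO-style policy illustrating the return contract above.
# `job_dict` (job id -> job metadata) and the "submit_time" field are
# assumed for illustration; "order_job" is the mandatory return key.
def schedule(job_dict, gpu_df):
    ordered_jobs = sorted(job_dict.values(), key=lambda job: job["submit_time"])
    return {"order_job": ordered_jobs}
```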