UMLAUT (Universal Machine Learning Analysis UTility)

A modular suite for benchmarking all stages of Machine Learning pipelines. To find bottlenecks in such pipelines and compare different ML tools, this framework can calculate and visualize several metrics in the data preparation, model training, model validation and inference stages.

Installation & Setup

Clone the current repository with the following command:

git clone git@github.com:hpides/End-to-end-ML-System-Benchmark.git

To use the package in a Python project, include it in the requirements.txt file.
That can be done with a path reference to the local repository. Include the following line in your requirements file, with your local path.

-e <PATH_TO_REPOSITORY>/umlaut/

Or, through pip:

pip install -e <PATH_TO_REPOSITORY>/umlaut/

Docker setup

Alternatively, you can run UMLAUT alone or UMLAUT + Daphne in a Docker container. The container definitions are in /containers.

Only Umlaut

Run

sudo docker build -t umlaut containers/only_umlaut

to build a container and start it by running

bash containers/only_umlaut/start.sh

The container installs only this repository.

Umlaut + Daphne

Run

sudo docker build -t umlaut_cpu containers/umlaut_daphne

to build a container and start it by running

bash containers/umlaut_daphne/start.sh

The container builds the newest Daphne version from source, which might take a while. Alternatively, you can uncomment the corresponding lines in the Dockerfile to download a Daphne binary instead.

Umlaut + Daphne + CUDA

Run

sudo docker build -t umlaut_cuda containers/umlaut_daphne_cuda

to build a container and start it by running

bash containers/umlaut_daphne_cuda/start.sh

The container builds the dnn-ops branch of Daphne.

System Integration

Upon installation, UMLAUT can be imported in any Python pipeline. The complete example pipeline can be found in ./pipelines/github_example/main.py. To import UMLAUT, use the following import statement in the Python script.

import umlaut

To initialize a benchmark, create an instance of the Benchmark class. It takes two string parameters: db_file and description (optional). The metrics are listed in a dictionary that will then be used by the benchmark.

import time
import numpy as np
from umlaut import Benchmark, BenchmarkSupervisor, MemoryMetric, CPUMetric
 
bm = Benchmark('sample_db_file.db', description="Database for the Github sample measurements")

bloat_metrics = {
    "memory": MemoryMetric('bloat memory', interval=0.1),
    "cpu": CPUMetric('bloat cpu', interval=0.1)
}

To benchmark a method, we attach a decorator (BenchmarkSupervisor) providing the metrics and the benchmark class. After the completion of the method, the Benchmark needs to be closed.

@BenchmarkSupervisor(bloat_metrics.values(), bm)
def bloat():
    a = []
    for i in range(1, 2):
        a.append(np.random.randn(*([10] * i)))
        time.sleep(5)
    print(a)

def main():
    bloat()
    bm.close()

if __name__ == "__main__":
    main()
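
The decorator pattern described above can be illustrated with a minimal, self-contained sketch. It is not UMLAUT's implementation: SampledMetric and supervise are hypothetical names, and the sampling strategy (one background thread per metric, one reading every interval seconds) is an assumption made for illustration only.

```python
import threading
import time
from functools import wraps


class SampledMetric:
    """Hypothetical stand-in for an interval-based metric (not UMLAUT's class)."""

    def __init__(self, description, probe, interval=0.1):
        self.description = description
        self.probe = probe        # zero-argument callable returning one reading
        self.interval = interval  # seconds between two readings
        self.samples = []         # list of (timestamp, reading) tuples

    def run(self, stop_event):
        # Take a reading, then wait `interval` seconds or until asked to stop.
        while True:
            self.samples.append((time.perf_counter(), self.probe()))
            if stop_event.wait(self.interval):
                break


def supervise(metrics):
    """Decorator factory: sample every metric while the wrapped function runs."""
    metrics = list(metrics)

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            stop_event = threading.Event()
            threads = [
                threading.Thread(target=m.run, args=(stop_event,), daemon=True)
                for m in metrics
            ]
            for thread in threads:
                thread.start()
            try:
                return func(*args, **kwargs)
            finally:
                # Stop sampling even if the wrapped function raised.
                stop_event.set()
                for thread in threads:
                    thread.join()
        return wrapper
    return decorator


# Example probe: the number of live threads, read every 10 ms.
thread_count = SampledMetric("active threads", threading.active_count, interval=0.01)


@supervise([thread_count])
def work():
    time.sleep(0.05)
    return "done"


result = work()
print(result, len(thread_count.samples), "samples collected")
```

Sampling in a separate thread keeps the measured function unchanged, at the cost of a small overhead from the sampling threads themselves.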

End-to-End benchmarking with UMLAUT

You can run your custom pipeline by providing the path to your Python file. Flags select the kinds of measurements to collect.

python pipelines/custom_pipeline/run_script.py --cmd "your command" -folder "path/to/your/script" -g -gm -gt -gp -t -c -m

Command Line Interface

Measurements are accessed through UMLAUT's CLI tool. It can be invoked from a bash terminal with the following command.

umlaut-cli <db_file>

To read through the measurements from the sample_db_file.db database, we insert the db_file name in the command.

cd pipelines/github_example
umlaut-cli sample_db_file.db

For detailed descriptions of all available arguments and flags, call the help command for umlaut-cli.

umlaut-cli --help
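
If you want to look inside a measurement file without the CLI, the sketch below lists its tables and row counts. It assumes that the .db file is a SQLite database, which this README does not state; list_tables is a hypothetical helper, and no particular table or column names are assumed.

```python
import sqlite3


def list_tables(db_path):
    """Return {table_name: row_count} for every user table in a SQLite file."""
    # Opening read-only through a URI avoids creating an empty file on a typo.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()


# Example call (assumes the file exists and is SQLite):
# print(list_tables("sample_db_file.db"))
```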

Metrics

UMLAUT collects measurements of the following metrics:

Time spent
Memory usage
GPU Memory usage
GPU utilization
GPU power consumption
Loss (single run and multiple runs)
Influence of batch size and #epochs
Influence of learning rate
Time to Accuracy (single run and multiple runs)
Power usage
Multiclass Confusion Matrix
Standard metrics such as accuracy, F1, TP/TN, etc.
Latency
Throughput
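
For readers unfamiliar with some of the metrics listed above, the following sketch shows the textbook definitions of the standard classification metrics, computed from confusion counts, and of mean latency and throughput. The function names are hypothetical and the code is independent of UMLAUT; how UMLAUT itself computes these values is not specified here.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1 from binary confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def latency_and_throughput(n_items, elapsed_seconds):
    """Mean per-item latency (s/item) and throughput (items/s) for one run."""
    if n_items <= 0 or elapsed_seconds <= 0:
        raise ValueError("n_items and elapsed_seconds must be positive")
    return elapsed_seconds / n_items, n_items / elapsed_seconds


metrics = classification_metrics(tp=8, tn=5, fp=2, fn=5)
latency, throughput = latency_and_throughput(n_items=200, elapsed_seconds=4.0)
print(metrics)
print(latency, throughput)
```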

Visualization

Through the CLI tool, the measurements for each of the metrics can be visualized. For each pipeline, users can generate plots for one or more metrics.
Measurements of the same metric from multiple pipelines can be shown on a single plot. Examples of using the CLI tool for visualization are shown below. To reproduce the following plots, use ./github_example/hello_world.db.

Selecting single pipeline to visualize

umlaut-cli hello_world.db -p plotly

Run umlaut-cli and use plotly as plotting backend.

Select a UUID using space and the arrow keys.

Selecting measurements for a single pipeline

Select one or more metrics using space and the arrow keys.

Select one or more methods

Select one or more descriptions using space and the arrow keys. The description of a measurement is usually the method name.

Results for CPU and Memory Usage [single pipeline]

Selecting measurements for multiple pipelines

umlaut-cli hello_world.db -p plotly

We run umlaut-cli again with plotly as the plotting backend. This time we select multiple UUIDs using space and the arrow keys.

Example Pipelines

In the pipelines folder, there are several examples of the following pipelines where UMLAUT is integrated.

So2Sat Earth Observation [description] [umlaut pipeline]
Backblaze Hard Drive Anomaly Prediction [description] [umlaut pipeline]
Stock Market Prediction [description] [umlaut pipeline]
MNIST Digit Recognition [description] [umlaut pipeline]
Meta Benchmarking Pipeline for initial testing:
By running the provided sh files, a set of operations (sleeping, sorting, matrix multiplication) can be run to test Umlaut on your own system. Furthermore the provided python file can be run for customized testing with the following arguments:

-t / --time to activate runtime measurements
-m / --memory to activate memory measurements
-mf / --memoryfreq to specify the interval for memory measurements
-c / --cpu to activate cpu measurements
-cf / --cpufreq to specify the interval for cpu measurements
-o / --order to specify which operations to run ("sleep", "sort", "mult", "vw", in any order and as often as desired)
-r / --repeat to specify how often the set of operations should be repeated
-g / --gpu to activate gpu utilization measurement
-gm / --gpumemory to activate gpu memory measurement
-gt / --gputime to activate gpu time measurement. There might be slight differences between the cpu time and gpu time for code executed on a gpu.
-gp / --gpupower to activate gpu power consumption measurement.
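
To make the argument list above concrete, here is a sketch of an argparse parser that accepts the same flags. It is not the parser shipped with the meta benchmarking script: build_parser is a hypothetical name, and the value types and defaults (float intervals defaulting to 0.1, an integer repeat count defaulting to 1) are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags listed above."""
    parser = argparse.ArgumentParser(description="Meta benchmarking sketch")
    # Boolean switches: present means the measurement is activated.
    parser.add_argument("-t", "--time", action="store_true")
    parser.add_argument("-m", "--memory", action="store_true")
    parser.add_argument("-c", "--cpu", action="store_true")
    parser.add_argument("-g", "--gpu", action="store_true")
    parser.add_argument("-gm", "--gpumemory", action="store_true")
    parser.add_argument("-gt", "--gputime", action="store_true")
    parser.add_argument("-gp", "--gpupower", action="store_true")
    # Sampling intervals; type and default are assumptions.
    parser.add_argument("-mf", "--memoryfreq", type=float, default=0.1)
    parser.add_argument("-cf", "--cpufreq", type=float, default=0.1)
    # Operations may be given in any order and repeated as often as desired.
    parser.add_argument("-o", "--order", nargs="+",
                        choices=["sleep", "sort", "mult", "vw"], default=[])
    parser.add_argument("-r", "--repeat", type=int, default=1)
    return parser


args = build_parser().parse_args(
    ["-t", "-m", "-mf", "0.5", "-o", "sleep", "sort", "sleep", "-r", "3", "-gm"]
)
print(args)
```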

Umlaut should have a memory overhead of ~130 MB, a CPU usage of 10-20% when idle, and close to no time overhead. When sorting, both the mean and the maximum memory usage should fall within 1000-1100 MB. During matrix multiplication, mean CPU usage should be ~90%.

Documentation

https://hpides.github.io/End-to-end-ML-System-Benchmark/
