Spår: Cluster Trace Generator

This command-line tool generates cluster traces in a controllable manner, based on Alibaba's cluster trace.


Installation

It is recommended to install the tool with pip3.

pip3 install spar

Caution: The tool might not work with newer versions of Python (3.8+) and SciPy (1.8+).
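
A quick way to see whether an environment falls inside the range named in the caution is sketched below. It is not part of spar; the helper names `parse_version` and `looks_supported` are made up for this example, and treating 3.8 and 1.8 as exclusive upper bounds is an assumption drawn from the wording above.

```python
import sys

# Limits taken from the caution above. Treating them as exclusive upper
# bounds is an assumption of this sketch, not something spar enforces.
PYTHON_LIMIT = (3, 8)
SCIPY_LIMIT = (1, 8)


def parse_version(text):
    """Turn a string such as '1.7.3' into (1, 7); only major.minor is compared."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def looks_supported(python_version, scipy_version):
    """Return True when both versions are below the limits in the caution."""
    return (tuple(python_version) < PYTHON_LIMIT
            and parse_version(scipy_version) < SCIPY_LIMIT)


if __name__ == "__main__":
    try:
        import scipy
    except ImportError:
        print("scipy is not installed")
    else:
        ok = looks_supported(sys.version_info[:2], scipy.__version__)
        print("environment looks supported" if ok else "environment may be too new")
```

If the check reports a too-new environment, the tool may still run; the caution only says it might not.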

Usage

Usage: spar [OPTIONS] OUTPUT_DIR

  By default, we output an hour-long trace from the original Alibaba
  trace to the OUTPUT_DIR. But you could provide several parameters
  and we would transform the trace as follows.
  1. Up- or down-sample trace according to load-factor. For up-sampling,
  we replace the dependencies with synthesized ones.
  2. Adjust resource heterogeneity according to heter-factor.
  3. Rescale resource request and usage according to machine-conf.

  Examples:

  Generate an hour-long trace.
  $ spar <OUTPUT_DIR>

  Generate an hour-long trace with 2x jobs.
  $ spar <OUTPUT_DIR> --load-factor 2

  Generate a half-hour-long trace.
  $ spar <OUTPUT_DIR> --duration 0.5

  Generate an hour-long trace with the resource request and usage deviating
  from the average 1.5x the original.
  $ spar <OUTPUT_DIR> --heter-factor 1.5

  Generate an hour-long trace for clusters with 24 cores and 50 unit of memory.
  $ spar <OUTPUT_DIR> --machine-conf (24, 50)

Options:
  --trace-dir PATH                The location of Alibaba trace.
  --load-factor FLOAT             A factor adjusting the average load (i.e., #
                                  jobs/hour) of the output trace.
  --duration FLOAT RANGE          The duration (in hours) of the trace.
  --heter-factor FLOAT            A factor adjusting the heterogeneity
                                  (defined as the ratio: value/average) of the
                                  output trace.
  --machine-conf <INTEGER INTEGER>...
                                  An integer pair indicating the (CPU, memory)
                                  of each server. Default: (96, 100) as in
                                  Alibaba cluster.
  --help                          Show this message and exit.
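
For scripted experiments, the options above can be assembled into a command line from Python. The following is a minimal sketch, not an API provided by spar: `build_spar_command` is a hypothetical helper, and passing the `--machine-conf` pair as two separate integers is an assumption based on the `<INTEGER INTEGER>` metavar in the help text.

```python
import subprocess


def build_spar_command(output_dir, load_factor=None, duration=None,
                       heter_factor=None, machine_conf=None, trace_dir=None):
    """Assemble an argument list for the spar CLI from the documented options."""
    cmd = ["spar", str(output_dir)]
    if trace_dir is not None:
        cmd += ["--trace-dir", str(trace_dir)]
    if load_factor is not None:
        cmd += ["--load-factor", str(load_factor)]
    if duration is not None:
        cmd += ["--duration", str(duration)]
    if heter_factor is not None:
        cmd += ["--heter-factor", str(heter_factor)]
    if machine_conf is not None:
        cpu, memory = machine_conf
        # Assumption: the pair is passed as two separate integer arguments.
        cmd += ["--machine-conf", str(cpu), str(memory)]
    return cmd


if __name__ == "__main__":
    cmd = build_spar_command("trace_out", load_factor=2, duration=0.5)
    print(" ".join(cmd))
    # Uncomment to actually run; requires spar to be installed and on PATH.
    # subprocess.run(cmd, check=True)
```

Building the command as a list avoids shell quoting issues and makes it easy to sweep over several load or heterogeneity factors in a loop.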

Please refer to our wiki for the detailed format of input and output.
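
The `--heter-factor` and `--machine-conf` transformations described in the usage text can be read as simple arithmetic. The sketch below shows one plausible reading only and is not taken from spar's source: it assumes that heterogeneity is adjusted by scaling each value's deviation from the mean, and that rescaling is proportional to the default (96, 100) machine. The function names are hypothetical.

```python
from statistics import mean

# Default (CPU, memory) per server, as stated in the help text.
DEFAULT_MACHINE = (96, 100)


def adjust_heterogeneity(values, heter_factor):
    """Scale each value's deviation from the mean by heter_factor.

    Assumption: this is one interpretation of "deviating from the average
    1.5x the original"; spar's actual method may differ.
    """
    avg = mean(values)
    return [avg + heter_factor * (v - avg) for v in values]


def rescale_to_machine(cpu, memory, machine_conf):
    """Rescale a (cpu, memory) pair in proportion to the target machine.

    Assumption: linear scaling relative to DEFAULT_MACHINE.
    """
    return (cpu * machine_conf[0] / DEFAULT_MACHINE[0],
            memory * machine_conf[1] / DEFAULT_MACHINE[1])


if __name__ == "__main__":
    print(adjust_heterogeneity([2, 4, 6], 1.5))
    print(rescale_to_machine(48, 50, (24, 50)))
```

Under this reading the mean is preserved while the spread grows or shrinks with the factor; a large factor could push small values below zero, which a real generator would need to handle.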

Publication

For more details, please refer to the following paper.

Huangshi Tian, Yunchuan Zheng, and Wei Wang. "Characterizing and Synthesizing Task Dependencies of Data-Parallel Jobs in Alibaba Cloud." In SoCC. 2019.

Contributing

Any form of contribution is welcome! If you find a bug, create an issue; if you extend a feature, send a pull request.

Acknowledgement

@SimonZYC has significantly contributed to this project.