Spark Cyclone is an Apache Spark plugin that accelerates Spark workloads using the SX-Aurora TSUBASA "Vector Engine" (VE). The plugin enables Spark users to accelerate their existing jobs with minimal or no effort, by generating optimized C++ code and executing it on the VE.
Spark Cyclone currently offers three pathways to accelerate Spark on the VE:
- Spark SQL: The plugin leverages Spark SQL's extensibility to rewrite SQL queries on the fly and execute dynamically generated C++ code on the VE, with no user code changes necessary (see the first sketch after this list).
- RDD: For more direct control, the plugin's VERDD API provides Scala macros that transpile ordinary Scala code into C++, so that common RDD operations such as map() can be executed on the VE (see the second sketch after this list).
- MLlib: CycloneML is a fork of Spark MLlib that uses Spark Cyclone to accelerate many of its ML algorithms on either the VE or the CPU.
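To illustrate the Spark SQL pathway, the sketch below is an ordinary Spark job with no Cyclone-specific calls; when it is submitted with the plugin enabled (see the configuration further down), supported query fragments are rewritten into C++ and executed on the VE. The input path, table, and column names are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object SampleSqlJob {
  def main(args: Array[String]): Unit = {
    // Plain Spark SQL: no Cyclone-specific APIs are used anywhere here.
    // With the Spark Cyclone plugin enabled at submit time, qualifying
    // parts of this query are compiled to C++ and run on the VE.
    val spark = SparkSession.builder.appName("SampleSqlJob").getOrCreate()

    // Hypothetical input path and schema, purely for illustration.
    val trades = spark.read.parquet("/path/to/trades.parquet")
    trades.createOrReplaceTempView("trades")

    spark.sql(
      "SELECT symbol, AVG(price) AS avg_price FROM trades GROUP BY symbol"
    ).show()

    spark.stop()
  }
}
```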
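For the RDD pathway, the following is only a sketch: the package, `veParallelize`, and `vemap` are assumed names used here for illustration, not confirmed API; please consult the VERDD API documentation for the actual entry points.

```scala
import org.apache.spark.{SparkConf, SparkContext}
// Assumed import; the actual VERDD package name may differ.
import io.sparkcyclone.rdd._

object VeRddSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("VeRddSketch"))

    // `veParallelize` and `vemap` are illustrative names for the VERDD
    // macros that transpile the Scala lambda into C++ so the map runs
    // on the VE; the exact names and return types may differ.
    val doubled = sc.veParallelize(1 to 1000).vemap(x => x * 2)
    doubled.collect().take(5).foreach(println)

    sc.stop()
  }
}
```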
Integrating the Spark Cyclone plugin into an existing Spark job is very straightforward. The following is the minimum set of flags that need to be added to an existing Spark job configuration:
$ $SPARK_HOME/bin/spark-submit \
    --name YourSparkJobName \
    --master yarn \
    --deploy-mode cluster \
    --num-executors=8 --executor-cores=1 --executor-memory=8G \
    --jars /path/to/spark-cyclone-sql-plugin.jar \
    --conf spark.executor.extraClassPath=/path/to/spark-cyclone-sql-plugin.jar \
    --conf spark.plugins=io.sparkcyclone.plugin.AuroraSqlPlugin \
    --conf spark.executor.resource.ve.amount=1 \
    --conf spark.resources.discoveryPlugin=io.sparkcyclone.plugin.DiscoverVectorEnginesPlugin \
    --conf spark.cyclone.kernel.directory=/path/to/kernel/directory \
    YourSparkJob.py

The flags serve the following purposes:
- --num-executors / --executor-cores / --executor-memory: allocate one executor per VE core
- --jars: ships the Spark Cyclone plugin JAR with the job
- spark.executor.extraClassPath: adds the Spark Cyclone libraries to the executor classpath
- spark.plugins: specifies the plugin's main class
- spark.executor.resource.ve.amount: specifies the number of VEs to use
- spark.resources.discoveryPlugin: specifies the class used to discover VE resources
- spark.cyclone.kernel.directory: specifies a directory where the plugin builds and caches C++ kernels
Please refer to the Plugin Configuration guide for an overview of the configuration options available to Spark Cyclone.
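Where submit-time flags are inconvenient, the same settings can also be applied through Spark's standard SparkConf API. The sketch below is a minimal example using the same keys and placeholder paths as the command above; note that executor-side settings such as spark.executor.extraClassPath must be in place before executors launch, so spark-submit flags remain the most reliable route.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

object ConfiguredJob {
  def main(args: Array[String]): Unit = {
    // A minimal sketch: the same plugin settings as the spark-submit
    // example, applied via SparkConf. Paths are placeholders.
    val conf = new SparkConf()
      .setAppName("YourSparkJobName")
      .set("spark.plugins", "io.sparkcyclone.plugin.AuroraSqlPlugin")
      .set("spark.executor.extraClassPath",
           "/path/to/spark-cyclone-sql-plugin.jar")
      .set("spark.executor.resource.ve.amount", "1")
      .set("spark.resources.discoveryPlugin",
           "io.sparkcyclone.plugin.DiscoverVectorEnginesPlugin")
      .set("spark.cyclone.kernel.directory", "/path/to/kernel/directory")

    val spark = SparkSession.builder.config(conf).getOrCreate()
    // ... your job logic here ...
    spark.stop()
  }
}
```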
While parts of the codebase can be developed on a standard x86 machine running Linux or macOS, building and testing the plugin requires a system with VEs properly installed and set up. Please refer to the VE Documentation for more information.
The following guides contain all the necessary setup and installation steps:
In particular, the system should have the following software ready after setup:
- VEOS, the set of daemons and commands providing operating system functionality to VE programs
- AVEO, the offloading framework for running code on the VE
- NCC, NEC's C compiler for building code targeting the VE
The following pages cover all aspects of Spark Cyclone development:
- Usage:
- Development:
- External Dependencies:
Spark Cyclone is licensed under the Apache License, Version 2.0.
For additional information, please see the LICENSE and NOTICE files.