An array comprehension is a monolithic array construction that is as expressive as basic SQL: its group-by syntax lets many array computations be captured in declarative form.
SAC translates array comprehensions to Scala code that calls Spark RDD operations, whose functional arguments use Scala's Parallel Collections library for multicore parallelism.
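To make the group-by idea concrete, here is a hypothetical sketch of the semantics using plain Scala collections rather than SAC syntax or Spark RDDs (the function name and comprehension form are illustrative, not SAC's actual surface syntax):

```scala
// Hypothetical sketch (not SAC syntax): the semantics of a group-by
// array computation, roughly "select (i, sum(v)) from a group by i"
// in SQL terms, expressed with plain Scala collections.
def groupBySum(a: Seq[(Int, Double)]): Map[Int, Double] =
  a.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).sum }

// SAC would translate such a comprehension into analogous Spark RDD
// operations (e.g., a reduceByKey/groupByKey pipeline), with the
// functional arguments parallelized via Scala's Parallel Collections.
val result = groupBySum(Seq((0, 1.0), (0, 2.0), (1, 3.0)))
```

Here `groupBySum(Seq((0, 1.0), (0, 2.0), (1, 3.0)))` yields `Map(0 -> 3.0, 1 -> 3.0)`.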
The SAC benchmarks were evaluated on SDSC Comet. The SBATCH shell script used to run the benchmarks on Comet is tests/spark/comet.run. The log files generated by the scripts, which contain the run times, are run*.log in the same directory.
The cluster should support Slurm Workload Manager, Hadoop 2.*, and myhadoop.
To compile SAC, run mvn install in the top-level directory.
Steps to run the scripts on Comet (or on any Slurm-managed cluster):
- Install Scala 2.12.
- Install Spark 3.0 on Hadoop 2.7.
- Change SCALA_HOME and SPARK_HOME in the SBATCH scripts to point to your installations.
- Execute the scripts using sbatch, e.g., sbatch comet.run.
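The SCALA_HOME and SPARK_HOME edits in the third step might look like the following hypothetical excerpt (the exact directives and variable placement in comet.run may differ; all paths are illustrative):

```shell
#!/bin/bash
# Hypothetical SBATCH header; adjust partition, node count, and time
# limits to your cluster's policies.
#SBATCH --partition=compute
#SBATCH --nodes=2
#SBATCH --time=00:30:00

# Point these at your local installations (illustrative paths):
export SCALA_HOME=/opt/scala-2.12.15
export SPARK_HOME=/opt/spark-3.0.1-bin-hadoop2.7
export PATH=$SCALA_HOME/bin:$SPARK_HOME/bin:$PATH
```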