This repository contains SGX-PySpark demo.
First, please shoot us ([email protected]) an email providing your Docker ID to get access to the Docker image: sconecuratedimages/www2019:sgx-pyspark
Try it out by executing:
git clone https://github.com/doflink/sgx-pyspark-demo && cd sgx-pyspark-demo
docker run -it --rm -v `pwd`:/fspf --privileged -p 8080:8080 -p 6868:6868 -p 28778:28778 sconecuratedimages/www2019:sgx-pyspark sh
Go to the demo directory:
cd /fspf/
Create a file system protection file (meta file) to store all the metadata required for checking the consistency of files. Then add encrypted regions and encrypt the input PySpark codes and data. Store the encryption key and the tag of the fspf to the file keytag:
./fspf.sh
Export environment variables
export SCONE_FSPF_KEY=$(cat input/keytag | awk '{print $11}')
export SCONE_FSPF_TAG=$(cat input/keytag | awk '{print $9}')
export SCONE_FSPF=/fspf/encrypted-files/volume.fspf
Now, run the wordcount application with SGX-PySpark.
/spark/bin/spark-submit encrypted-files/enc-wordcount.py input/encrypted-sensitive-input.txt spark://$(hostname -f):7077 &> output.txt &
Try to dump memory of the application to steal secrets
./memory-dump.sh
Send email to [email protected] or [email protected]