Performance Comparison on Spark
Guanghui.Zhu edited this page Jan 31, 2018 · 1 revision
DGST outperforms the state-of-the-art ERa algorithm, achieving a speedup of about 3 times on both the DNA and English text datasets.
We first compare the performance of DGST with that of ERa on the DNA dataset. We extract strings of different lengths from the Pine genome (total length of 12 Gbp, i.e. giga base pairs). The performance comparison is shown below. On average, DGST achieves a 3 times speedup over ERa.
We also compare the performance of DGST with that of ERa on the English text dataset. We extract strings of different lengths from Wikipedia (total length of 10 G characters). The performance comparison is shown below. On average, DGST achieves a 2.6 times speedup over ERa.
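For readers who want to reproduce an "average speedup" figure from their own runs, here is a minimal sketch of one common way to compute it: divide the baseline running time by the candidate running time for each input length, then take the arithmetic mean of those ratios. This page does not state which averaging method was used, so the arithmetic mean is an assumption. The function names `speedup` and `average_speedup` are hypothetical, and the timings are placeholders, not measurements from DGST or ERa.

```python
from statistics import mean


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    if baseline_seconds <= 0 or candidate_seconds <= 0:
        raise ValueError("running times must be positive")
    return baseline_seconds / candidate_seconds


def average_speedup(timings) -> float:
    """Arithmetic mean of the per-input speedups.

    `timings` is a sequence of (baseline_seconds, candidate_seconds) pairs,
    one pair per input string length.
    """
    ratios = [speedup(baseline, candidate) for baseline, candidate in timings]
    if not ratios:
        raise ValueError("at least one timing pair is required")
    return mean(ratios)


# Placeholder timings only (assumed values, NOT results from this page).
placeholder_timings = [(100.0, 50.0), (300.0, 100.0)]
print(average_speedup(placeholder_timings))
```

If the per-length speedups vary widely, a geometric mean may summarize ratios more fairly than an arithmetic mean; either choice should be stated alongside the reported number.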