support caching for previously computed datasets #291

levinas · 2015-02-12T03:02:13Z

Dan has this idea that the service should check whether it has already seen an input and, if so, just return the previously computed output. This would reduce the computational load.

We could use file size, file MD5, assembly method, and arast version as the key for deciding whether a precomputed result can be served. Not sure whether shock computes MD5 by default.

It would mean providing a --force option for rerunning the assembly.
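
A rough sketch of how the lookup could work, assuming file size, file MD5, assembly method, and arast version are enough to identify a job. The names `file_md5`, `cache_key`, and `get_or_run`, and the plain dict used as the cache, are hypothetical and not part of arast. The MD5 is computed locally here because it is unclear whether shock supplies one.

```python
import hashlib
import json
import os


def file_md5(path, chunk_size=1 << 20):
    """Compute a file's MD5 in chunks so large read files are not loaded into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cache_key(paths, method, version):
    """Build a deterministic key from file size, file MD5, assembly method, and version.

    File order is preserved, so the same files in a different order give a different key.
    """
    files = [(os.path.getsize(p), file_md5(p)) for p in paths]
    payload = json.dumps(
        {"files": files, "method": method, "version": version}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def get_or_run(cache, paths, method, version, run, force=False):
    """Return the cached result unless force is set; otherwise run the job and store it.

    `run` is a hypothetical callable standing in for the actual assembly job.
    """
    key = cache_key(paths, method, version)
    if not force and key in cache:
        return cache[key]
    result = run(paths, method)
    cache[key] = result
    return result
```

Here `force=True` plays the role of the --force option: the lookup is skipped, the job is rerun, and the stored result is overwritten.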

levinas · 2015-02-12T03:05:15Z

We may need to look into data caching as well. --data is great, but we need something equivalent for jobs invoked from the narrative. I'm seeing some 25 GB SRA reads being pulled from shock over and over again. Maybe the handle ID could be another key for such caching.
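
One way the data side might look, as a minimal sketch assuming the handle ID uniquely identifies the content. `fetch_cached` and the `download` callable are hypothetical; `download(handle_id, path)` stands in for whatever actually pulls the file from shock.

```python
import hashlib
import os


def fetch_cached(handle_id, cache_dir, download):
    """Return a local path for handle_id, downloading only on a cache miss."""
    os.makedirs(cache_dir, exist_ok=True)
    # Hash the handle ID so it is always safe to use as a file name.
    name = hashlib.sha256(handle_id.encode("utf-8")).hexdigest()
    target = os.path.join(cache_dir, name)
    if os.path.exists(target):
        return target
    # Download to a temporary name first so an interrupted transfer
    # is never mistaken for a complete cached file.
    partial = target + ".part"
    download(handle_id, partial)
    os.replace(partial, target)
    return target
```

This only helps if the handle always points at the same bytes; if a handle can be reassigned, the key would also need the file MD5 or size.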

levinas added the enhancement label Feb 12, 2015
