Cloud computing and the DNA data race

Given the accumulation of DNA sequence data sets at ever-faster rates, what are the key factors you should consider when using distributed and multicore computing systems for analysis?

Figure 1: Map-shuffle-scan framework used by Crossbow.
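
Since the figure itself is not reproduced here, the following is a minimal sketch of what a map-shuffle-scan pipeline can look like, assuming the three stages are (1) map: align each read to a reference, (2) shuffle: bin and sort the alignments by genomic region, and (3) scan: walk each bin in order and report positions where the reads disagree with the reference. It is a toy illustration and not Crossbow's code: the brute-force aligner, the majority-vote caller, and every name in it (`BIN_SIZE`, `map_step`, `shuffle_step`, `scan_step`, `call_variants`) are hypothetical, and a real system would distribute each stage across many machines.

```python
from collections import Counter, defaultdict

BIN_SIZE = 4  # hypothetical width of one genomic partition


def map_step(read, chrom, ref, max_mismatches=1):
    """Map: place the read at its best position and emit one record per base."""
    best_pos, best_mm = None, max_mismatches + 1
    for pos in range(len(ref) - len(read) + 1):
        mm = sum(a != b for a, b in zip(read, ref[pos:pos + len(read)]))
        if mm < best_mm:
            best_pos, best_mm = pos, mm
    if best_pos is None:
        return  # no placement within the mismatch budget; drop the read
    for offset, base in enumerate(read):
        pos = best_pos + offset
        # Key = genomic bin, value = (position, observed base).
        yield (chrom, pos // BIN_SIZE), (pos, base)


def shuffle_step(pairs):
    """Shuffle: group records by bin, then sort bins and their records."""
    bins = defaultdict(list)
    for key, value in pairs:
        bins[key].append(value)
    return [(key, sorted(bins[key])) for key in sorted(bins)]


def scan_step(key, values, ref, min_depth=2):
    """Scan: walk one bin in position order and report majority-vote mismatches."""
    chrom, _ = key
    pileup = defaultdict(Counter)
    for pos, base in values:
        pileup[pos][base] += 1
    for pos in sorted(pileup):
        base, depth = pileup[pos].most_common(1)[0]
        if depth >= min_depth and base != ref[pos]:
            yield (chrom, pos, ref[pos], base)


def call_variants(reads, chrom, ref):
    """Run the three stages in sequence on a single machine."""
    pairs = [kv for read in reads for kv in map_step(read, chrom, ref)]
    calls = []
    for key, values in shuffle_step(pairs):
        calls.extend(scan_step(key, values, ref))
    return calls
```

Because each read is mapped independently and each bin is scanned independently, the map and scan stages can in principle run in parallel; only the shuffle needs to move data between workers.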


The authors were supported in part by US National Science Foundation grant IIS-0844494 and by US National Institutes of Health grant R01-LM006845.

Schatz, M., Langmead, B. & Salzberg, S. Cloud computing and the DNA data race. Nat Biotechnol 28, 691–693 (2010).

