Cloud computing and the DNA data race

Journal name: Nature Biotechnology
Volume: 28
Pages: 691–693
Year published: 2010
DOI: 10.1038/nbt0710-691

Given the accumulation of DNA sequence data sets at ever-faster rates, what are the key factors you should consider when using distributed and multicore computing systems for analysis?

Author information

Affiliations

  1. Michael C. Schatz and Steven L. Salzberg are at the Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, USA.

  2. Ben Langmead is at the Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA.

Competing financial interests

The authors declare no competing financial interests.
