  • Analysis

Performance comparison of whole-genome sequencing platforms

A Corrigendum to this article was published on 07 June 2012

This article has been updated


Abstract

Whole-genome sequencing is becoming commonplace, but the accuracy and completeness of variant calling by the most widely used platforms from Illumina and Complete Genomics have not been reported. Here we sequenced the genome of an individual with both technologies to a high average coverage of 76×, and compared their performance with respect to sequence coverage and calling of single-nucleotide variants (SNVs), insertions and deletions (indels). Although 88.1% of the 3.7 million unique SNVs were concordant between platforms, there were tens of thousands of platform-specific calls located in genes and other genomic regions. In contrast, 26.5% of indels were concordant between platforms. Target enrichment validated 92.7% of the concordant SNVs, whereas validation by genotyping array revealed a sensitivity of 99.3%. The validation experiments also suggested that >60% of the platform-specific variants were indeed present in the genome. Our results have important implications for understanding the accuracy and completeness of the genome sequencing platforms.
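To make the concordance figures concrete, the following is a minimal sketch of how a between-platform concordance fraction could be computed from two call sets. It is an illustration, not the authors' pipeline. It assumes that "unique" variants means the union of both call sets, that concordance is the shared fraction of that union, and that two calls match only when chromosome, position, reference allele and alternate allele are identical. The names `Variant`, `Concordance` and `concordance` are hypothetical, and the toy calls are invented for the example.

```python
from typing import NamedTuple, Set, Tuple

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


class Concordance(NamedTuple):
    shared: int                  # calls made by both platforms
    only_a: int                  # calls specific to platform A
    only_b: int                  # calls specific to platform B
    fraction_concordant: float   # shared / union (assumed definition)


def concordance(calls_a: Set[Variant], calls_b: Set[Variant]) -> Concordance:
    """Compare two call sets by exact match on the variant key."""
    shared = calls_a & calls_b
    union = calls_a | calls_b
    fraction = len(shared) / len(union) if union else 0.0
    return Concordance(
        shared=len(shared),
        only_a=len(calls_a - calls_b),
        only_b=len(calls_b - calls_a),
        fraction_concordant=fraction,
    )


# Toy call sets (invented data, not taken from the study).
platform_a = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"), ("chr2", 40, "G", "A")}
platform_b = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"), ("chr3", 7, "T", "C")}

result = concordance(platform_a, platform_b)
print(result)
```

Exact matching on the full key is a strict choice. It suits SNVs reasonably well, but indels can often be represented at more than one position or with different alleles, so a real comparison would likely need normalization before matching.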

Figure 1: Genome coverage at different read depths.
Figure 2: SNV detection and intersection.
Figure 3: SNV association with different genomic elements.
Figure 4: Indel detection and intersection.

Accession codes


Sequence Read Archive: SRA045736

Change history

  • 07 June 2012

    In the version of this article initially published, the accession code to obtain raw sequence data was given as SRA045736.2; the correct code is SRA045736. The error has been corrected in the HTML and PDF versions of the article.



Acknowledgements

This work is supported by the Stanford Department of Genetics and the US National Institutes of Health.

Author information

Authors and Affiliations



Contributions

H.Y.K.L. and M.J.C. did the analysis. G.N. and L.H. assisted in the analysis. Rui C. did DNA sequencing. Rong C. did the disease-association study. Rui C. and M.O'H. did the validation experiments. H.Y.K.L., F.E.D., E.A.A., M.B.G., A.J.B., H.P.J. and M.S. coordinated the analysis and revised the manuscript. H.Y.K.L., M.J.C. and M.S. wrote the manuscript.

Corresponding author

Correspondence to Michael Snyder.

Ethics declarations

Competing interests

M.S. is a scientific advisory board member for Genapsys, Inc.; a scientific advisory board member and cofounder of Personalis, Inc.; and a consultant for Illumina.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1 and 2 (PDF 1129 kb)

Supplementary Table 1

Disease association of all platform-specific SNPs. (XLSX 40 kb)

About this article

Cite this article

Lam, H., Clark, M., Chen, R. et al. Performance comparison of whole-genome sequencing platforms. Nat Biotechnol 30, 78–82 (2012).
