Abstract
Whole exome sequencing by high-throughput sequencing of target-enriched genomic DNA (exome-seq) has become common in basic and translational research as a means of interrogating the interpretable part of the human genome at relatively low cost. We present a comparison of three major commercial exome sequencing platforms from Agilent, Illumina and Nimblegen applied to the same human blood sample. Our results suggest that the Nimblegen platform, which is the only one to use high-density overlapping baits, covers fewer genomic regions than the other platforms but requires the least amount of sequencing to sensitively detect small variants. Agilent and Illumina are able to detect a greater total number of variants with additional sequencing. Illumina captures untranslated regions, which are not targeted by the Nimblegen and Agilent platforms. We also compare exome sequencing and whole genome sequencing (WGS) of the same sample, demonstrating that exome sequencing can detect additional small variants missed by WGS.
Acknowledgements
We thank P. LaCroute for assistance with data processing and analysis, and A. Boyle and Y. Cheng for consulting on data analysis and display methods. We thank representatives from Agilent, Illumina and Nimblegen for their support and feedback as we performed these tests. We also thank the Hewlett Packard Foundation and Lucile Packard Foundation for Children's Health for support in creation of our disease/trait SNP database. This work was supported by the US National Institutes of Health grant no. HG002357.
Author information
Contributions
M.S. and R.C. conceived and planned the study. R.C. performed the experiments. G.E. provided sequencing services. M.J.C. conducted the data analysis. R.C. and M.S. both contributed to the data analysis and discussion. H.Y.K.L. and M.J.C. analyzed the whole genome data. K.J.K., R.C. and A.J.B. created the disease/trait SNP database and analyzed our data against it. M.J.C., R.C. and M.S. prepared the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Supplementary information
Supplementary Text and Figures
Supplementary Tables 1–5 and Supplementary Figures 1–3 (PDF 1279 kb)
Supplementary Data 1.
SNVs detected by exome-seq with three platforms. (ZIP 5966 kb)
Supplementary Data 2.
Indels detected by exome-seq with three platforms. (ZIP 761 kb)
About this article
Cite this article
Clark, M., Chen, R., Lam, H. et al. Performance comparison of exome DNA sequencing technologies. Nat Biotechnol 29, 908–914 (2011). https://doi.org/10.1038/nbt.1975
DOI: https://doi.org/10.1038/nbt.1975