Review

The expanding scope of DNA sequencing

Abstract

In just seven years, next-generation technologies have reduced the cost and increased the speed of DNA sequencing by four orders of magnitude, and experiments requiring many millions of sequencing reads are now routine. In research, sequencing is being applied not only to assemble genomes and to investigate the genetic basis of human disease, but also to explore myriad phenomena in organismic and cellular biology. In the clinic, the utility of sequence data is being intensively evaluated in diverse contexts, including reproductive medicine, oncology and infectious disease. A recurrent theme in the development of new sequencing applications is the creative 'recombination' of existing experimental building blocks. However, there remain many potentially high-impact applications of next-generation DNA sequencing that are not yet fully realized.

Acknowledgements

We thank L. Solomon and L. Gaffney of the Broad Institute for assistance with the design and preparation of figures; B. Wong and S. Arbesman for input on figure design; A.P. Aiden for valuable comments; and members of the Shendure lab and of the Laboratory at Large for discussions.

Author information

Affiliations

  1. Department of Genome Sciences, University of Washington, Seattle, Washington, USA.

    • Jay Shendure

  2. Laboratory at Large, School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts, USA.

    • Erez Lieberman Aiden

  3. Broad Institute of Harvard and MIT, Harvard University, Cambridge, Massachusetts, USA.

    • Erez Lieberman Aiden

  4. Harvard Society of Fellows, Harvard University, Cambridge, Massachusetts, USA.

    • Erez Lieberman Aiden

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to Jay Shendure or Erez Lieberman Aiden.