The increased adoption of DNA sequencing in genetic association studies is uncovering a wide range of population genetic variation, including rare genetic variants. Although this rarity limits the statistical power of associating individual rare variants with phenotypes, this Review discusses the diverse methods for leveraging the collective effects of rare variants in order to uncover important roles in complex traits, particularly human diseases.
Applications of next-generation sequencing
The power of high-throughput DNA sequencing technologies is being harnessed by researchers to address an increasingly diverse range of biological problems. The scale and efficiency of sequencing that can now be achieved are providing unprecedented progress in areas from the analysis of genomes themselves to how proteins interact with nucleic acids. This series highlights the breadth of next-generation sequencing applications and the importance of the insights that are being gained through these methods.
The interactions between tumours and the immune system are highly complex. This article discusses methods — primarily computational tools — for characterizing diverse aspects of cancer–immune cell interactions, including antigen presentation, T cell repertoires and heterogeneity in cell types and cell states. The Review particularly highlights the insights from single-cell data from both sequencing technologies and in situ imaging of tissues.
High-throughput sequencing technology is enabling the structures of RNA to be determined on an unprecedented scale, providing insights into the relationship between the structures adopted by RNAs and the functions they perform in the cell.
Various genomics-related fields are increasingly taking advantage of long-read sequencing and long-range mapping technologies, but making sense of the data requires new analysis strategies. This Review discusses bioinformatics tools that have been devised to handle the numerous characteristic features of these long-range data types, with applications in genome assembly, genetic variant detection, haplotype phasing, transcriptomics and epigenomics.
Despite the remarkable throughput of next-generation sequencing technologies, standard techniques are limited by the difficulty in distinguishing sequencing errors from genuine low-frequency DNA variants within heterogeneous cellular or molecular populations. This Review discusses sequencing methodologies and bioinformatic strategies that have been devised for the reliable detection of rare mutations and describes various important applications in diverse fields including cancer, ageing and metagenomics.
Although cancer genome sequencing is becoming routine in cancer research, cancer transcriptome profiling through methods such as RNA sequencing (RNA-seq) provides information not only on mutations but also on their functional cellular consequences. This Review discusses how technical and analytical advances in cancer transcriptomics have provided various clinically valuable insights into gene expression signatures, driver gene prioritization, cancer microenvironments, immuno-oncology and prognostic biomarkers.
Next-generation sequencing has the potential to support public health surveillance systems to improve the early detection of emerging infectious diseases. This Review delineates the role of genomics in rapid outbreak response and the challenges that need to be tackled for genomics-informed pathogen surveillance to become a global reality.
Ancient genomes can inform our understanding of the history of human adaptation through the direct tracking of changes in genetic variant frequency across different geographical locations and time periods. The authors review recent ancient DNA analyses of human, archaic hominin, pathogen, and domesticated animal and plant genomes, as well as the insights gained regarding past human evolution and behaviour.
Technical errors can hamper the interpretation of next-generation sequencing (NGS) data, which poses a major challenge for the clinical application of this technology. This Review discusses how reference standards circumvent this issue by calibrating NGS measurements and evaluating diagnostic performance of NGS-based genetic tests.
Genetic variation of the human Y chromosome plays a key part in studies of human evolution, population history, genealogy, forensics and male medical genetics. This Review outlines how next-generation sequencing has contributed to recent progress in these fields.
Next-generation sequencing technologies have enabled the comprehensive characterization of human and mouse genomes, including at the transcriptional level. This article reviews the degree of conservation of human and mouse transcriptomes, along with the challenges of identifying when the mouse is a suitable model of human physiology.
A genome sequence is only useful once the information encoded in it can be deciphered. In this Review, Mudge and Harrow describe the latest approaches to higher eukaryote gene annotation, including making the best use of complex transcriptome data sets, integrating evidence for functionality and extending annotations to encompass regulatory features.
Mutation is the source of genetic diversity on which natural selection acts; understanding the rates of mutation is therefore crucial for understanding evolutionary trajectories. In this Opinion article, the authors discuss how emerging experimental mutation-rate data from genome-wide sequencing studies, combined with population-genetic theory, can provide unifying explanations for the diversity in mutation rates between species and across genomic locations.
Precision medicine is a strategy for tailoring clinical decision making to the underlying genetic causes of disease. This Review describes how, despite the straightforward overall principles of precision medicine, adopting it responsibly into clinical practice will require many technical and conceptual hurdles to be overcome. Such challenges include optimized sequencing strategies, clinically focused bioinformatics pipelines and reliable metrics for the disease causality of genetic variants.
Marine vertebrates are key contributors to global biodiversity and human food supply. In this Review, the authors discuss how comparative genomics studies in marine vertebrates have provided insight into major evolutionary transitions between the land and sea, as well as intra-species adaptation to diverse types of aquatic environments. They also highlight applications in species management and conservation.
Advances in DNA sequencing technologies have led to vast increases in the diversity of sequencing-based applications and in the amount of data generated. This Review discusses the current state-of-the-art technologies in both short-read and long-read DNA sequencing, their underlying mechanisms, relative strengths and limitations, and emerging applications.
RNA sequencing (RNA-seq) is a powerful approach for comprehensive analyses of transcriptomes. This Review describes the widespread potential applications of RNA-seq in clinical medicine, such as detecting disease-associated mutations and gene expression disruptions, as well as characteristic non-coding RNAs, circulating extracellular RNAs or pathogen RNAs. The authors also highlight the challenges in adopting RNA-seq routinely into clinical practice.
Genomic analyses of cancer genomes have largely focused on mutations in protein-coding regions, but the functional importance of alterations to non-coding regions is becoming increasingly appreciated through whole-genome sequencing. This Review discusses our current understanding of non-coding sequence variants in cancer — both somatic mutations and germline variants, and their interplay — including their identification, computational and experimental evidence for functional impact, and their diverse mechanisms of action for dysregulating coding genes and non-coding RNAs.
Single-cell genome sequencing can provide detailed insights into the composition of single genomes that are not readily apparent when studying bulk cell populations. This Review discusses the considerable technical challenges of amplifying and interrogating genomes from single cells, emerging innovative solutions and various applications in microbiology and human disease, in particular in cancer.
The phenotypic heterogeneity of intellectual disability (ID) disorders has hampered studies of the underlying genetics, but major progress has been achieved by recent applications of next-generation sequencing. This Review discusses our latest understanding of ID genetics, including the identification of de novo and inherited mutations of various types, strategies for assigning disease causality to the mutations, emerging pathological mechanisms and future research directions.
The wealth of existing and emerging DNA-sequencing data provides an opportunity for a comprehensive understanding of human genetic variation, including the discovery of disease-causing variants. This Review describes how the limitations of current reference-genome assemblies confound the characterization of genetic variation and how this can be mitigated by important advances in algorithms and sequencing technology that facilitate the de novo assembly of genomes.
Combining experimental evolution with next-generation sequencing, the evolve and resequence (E&R) approach is a powerful method for dissecting the genomic changes underlying the adaptation of populations of laboratory organisms or molecules. This Review describes the E&R results from diverse systems and discusses the extent to which various features, including population genetics, experimental setups and reproduction modes, account for the distinct observed outcomes.
Sequencing genomes of ancient specimens, including human ancestors, can provide rich insights into evolutionary histories. However, ancient DNA samples are frequently degraded, damaged and contaminated with ancient and modern DNA from various sources. This Review describes the methodological and bioinformatic advances that allow these challenges to be overcome in order to process and sequence ancient samples for genome reconstruction, as well as recent progress in characterizing ancient epigenomes.
High-throughput DNA sequencing technologies are providing an ever-expanding wealth of genome sequence data, including detailed information on human genetic variation. However, such data typically lack haplotype information (that is, the cis-connectivity of variants along individual chromosomes). This Review describes diverse recent experimental methods by which genetic variants can be resolved into haplotypes, accompanying computational methods and important applications of these methods in genomics and biomedical science.
Various small molecules, including numerous anticancer agents, act by targeting DNA or protein components of chromatin. This Review describes how various complementary technologies use high-throughput sequencing to delineate drug responses, from identifying the genomic binding sites of drugs or their targets, to the ensuing changes to chromatin states and gene expression. These insights should facilitate the rational use of these therapies.
The resolution of epigenomic profiling has been vastly augmented with the adoption of new approaches to interrogate varied features of the epigenome. This Review describes these techniques and outlines the ways in which these genome-wide tools can be used to examine the epigenome.
This Review describes how whole-genome sequencing of pooled DNA from many individuals (Pool-seq) is an economical alternative to sequencing the genomes of individuals separately. The authors outline the strengths and pitfalls of Pool-seq, and provide example applications across diverse species and biological questions.
Forward genetic screens have a long history of uncovering the genetic mutations underlying phenotypes of interest. This Review describes how next-generation sequencing technology can be integrated into forward genetic screens not only to enhance their efficiency but also to allow them to be carried out using expanded repertoires of species, populations and experimental strategies.
Single-cell sequencing of uncultivated microbial species is rapidly providing a wealth of new information. Here, the authors provide an update on recent progress in capturing novel genomes, large-scale environmental studies and research relating to human health, as well as recent methodological improvements and remaining technical challenges.
Whole-genome assemblies of humans and non-human primates are yielding data on the evolutionary origins of the human genome, as well as insights into genetic similarities and differences between species used as models for disease-related research. This Review discusses current knowledge and opportunities for comparative primate genomics created by recent advances in genome sequencing technologies.
There continues to be active debate about the timings, locations and details of various events in human population history. This Review describes how whole-genome sequencing of modern and ancient humans has complemented more traditional methods to provide valuable historical insights.
Ribosome profiling is a recently developed technique that uses deep sequencing to study translation in vivo. This approach has provided new insights into the identities and amounts of proteins produced by cells, as well as into the mechanism of protein synthesis itself.
Next-generation sequencing for variant identification is now becoming widespread, although pipelines have not yet been optimized. In this Perspective article, the authors discuss ways to minimize erroneous variant calls, in particular, by using replicates.
Bacterial whole-genome sequencing is showing promise in clinical applications. Here, the authors present their opinions on what the main bioinformatic challenges are in transferring bacterial whole-genome sequencing to medical diagnostics.
Technologies that are based on next-generation sequencing are increasingly being used to study individual cells. The authors discuss the application of this approach to single-cell genomics and transcriptomics, and explore the implications for both basic research and medicine.
Next-generation sequencing is now poised for the discovery of genetic variants involved in common and rare diseases. Here, the authors present considerations for the workflow of these studies in order to identify true associations of disease and mutation.
This Review discusses the considerations for designing cancer genome-sequencing studies to fulfil different study aims, such as detecting recurrent mutations or assessing clonal evolution. For example, the cohort type and depth of sequencing can influence the downstream analysis.
Clinical sequencing tests that focus on genes linked to specific diseases or phenotypes are being used increasingly widely. This article discusses how disease-targeting tests retain several advantages despite moves towards the clinical application of whole-genome or exome sequencing.
We asked five experts their opinions on issues that arise from new clinical tests that are based on next-generation sequencing. Crucial gaps in infrastructure need to be addressed for the results of these tests to be optimally handled.
This Review sets out the emerging potential of next-generation sequencing in the context of clinical microbiology. Using bacterial genome sequencing as an example, the authors discuss the options and challenges for species identification, testing for virulence and drug resistance and monitoring outbreaks.
There are many different methods and tools available for the analysis of next-generation sequencing data. The challenges of applying these analysis tools in a transparent and reproducible manner are presented, and a way forward for analysing these data in life sciences research is discussed.
Recent family-based genomic studies are providing a window into the incidence of new mutations in human genomes. This Review discusses our understanding of various types of de novo mutation, including the determinants and consequences of their occurrence rates, and the challenges both for their detection and for linking them to disease pathogenesis.
A growing understanding of the relationship between the microbiome and human health is made possible by advances in sequencing technologies and computational tools. These studies highlight how the composition and function of the microbiome vary across individuals and anatomical sites, over time, and also in disease.
Protein–RNA interactions are central to the regulation of gene expression. Emerging technologies for pinpointing these interactions, both in large complexes and between individual proteins and RNA, are discussed. Methods for analysing these data are also considered.
Studies of the composition, dynamics and function of the human microbiome have taken off in the past two years thanks to the development of new sequencing technologies and advanced algorithms. This article provides a guide to the experimental and analytical best practices in this flourishing field.
Repeat sequences in DNA remain one of the most challenging aspects of next-generation sequencing data analysis and interpretation. This Review explains the problems and current strategies for handling repeats; ignoring repeats risks missing important biological information.
Exome sequencing is a powerful approach for accelerating the discovery of the genes underlying Mendelian disorders and, increasingly, of genes underlying complex traits. This Review describes the experimental and analytical options for applying exome sequencing and the key challenges in using this approach.
Advances in sequencing technologies, assembly algorithms and computing power are making it feasible to assemble the entire transcriptome from short RNA reads. The article reviews the transcriptome assembly strategies, their advantages and limitations and how to apply them effectively.
Next-generation sequencing has now been used to produce the first ancient hominin genome sequences and is also being used to sequence modern humans from many different populations. Together with SNP genotyping, these data are transforming views of human history.
The recent surge in sequencing output has uncovered a wealth of genetic variation, but interpretation of these data remains a challenge. This Review discusses computational and experimental methods for estimating the deleteriousness and functional significance of genetic variants to better identify those that are potentially causal for disease.
Structural variation in the genome can influence disease, complex traits and evolution, but comprehensive characterization of variants is challenging. This Review compares current methods — particularly microarray platforms and sequencing-based computational analysis — and considers future research strategies.
Recent developments in methods for RNA sequencing have led to an increased understanding of transcriptomes — both qualitative and quantitative. Ongoing developments include advances in direct RNA sequencing and approaches that allow RNA quantification from very small amounts of cellular materials.
The amount of genome-scale data on covalent histone modification patterns is rapidly increasing. This Review brings together current knowledge on how modification 'signatures' relate to the structure and function of chromatin, from regulatory elements and gene structure to organization in the nucleus.
This article reviews the increasing range of genome-scale methods that are being used to analyse eukaryotic DNA replication. Studies in different species and of replication timing or origin location have yielded varying degrees of success; technical hurdles remain, but important biological insights have been gained.
Cancer is fundamentally a disease of the genome and so high-throughput sequencing technologies offer great potential for improving our understanding of the biology and treatment of cancer. Experimental strategies, computational approaches and cancer-specific considerations for detecting different types of genomic alterations are discussed.
Recent technological advances are allowing genome-wide analysis of the function of individual alleles in terms of expression levels, histone modifications and DNA methylation. Approaches that discriminate between alleles offer great potential for improving our understanding of cis-regulatory variation.
A huge range of genome-scale data sets — including genomic, epigenomic and transcriptomic information — are now available, and it is widely acknowledged that combining several data sets can provide important biological insights. However, there are practical, conceptual and computational challenges to data integration.
Genome-wide association studies have explained only a small fraction of the genetic basis of complex diseases. This Review argues that rare variants could have a substantial effect on genetic predisposition to common disease, and the authors outline discovery strategies based on whole-genome sequencing for identifying these genetic risk factors.
Mapping DNA methylation is vital for understanding the importance of this epigenetic mark in health and disease. Recent years have seen rapid progress in the development of techniques for genome-scale methylation profiling; this Review introduces and evaluates the available methods.
Until recently, large-scale transcriptome studies in bacteria and archaea were limited by technical challenges, and there was a perception that microbial transcription was relatively simple compared with eukaryotic transcription. Now, prokaryotic transcriptomics is revealing unexpected aspects of transcriptional control, genome organization and non-coding RNAs.
There is an increasing demand for next-generation sequencing technologies that rapidly deliver high volumes of accurate genome information at a low cost. This Review provides a guide to the features of the different platforms, and describes the recent advances in this fast-moving area.
mRNA repertoires can be diversified by many mechanisms, including alternative splicing and alternative polyadenylation. Technological advances are now allowing genome-wide insights into the extent of RNA processing, the actions of RNA-binding proteins and how regulation at the RNA level helps to control biological systems.
Recent transcriptomic studies have revealed that diverse small RNAs are transcribed from the regions around gene promoters. This Review considers questions prompted by the discovery of these transcripts; for example, what is their origin and are they functional?
Coupling next-generation sequencing to chromatin immunoprecipitation has transformed the resolution and genomic coverage of DNA-binding protein and nucleosome mapping studies. However, successful ChIP–seq requires careful consideration of the experimental and analytical approaches; this Review evaluates the current strategies and challenges.
Genome-wide maps of transcription factor binding are prompting the re-examination of traditional concepts of transcriptional regulation. Current challenges centre on understanding which binding events are functional, how transcription factors cooperate and how to integrate the genomic and chromatin context into models of gene regulation.
The development of high-throughput DNA sequencing methods provides a new method for mapping and quantifying transcriptomes — RNA sequencing (RNA-Seq). This article explains how RNA-Seq works, the challenges it faces and how it is changing our view of eukaryotic transcriptomes.