The genomes of contemporary humans contain considerable information about the history of our species. Although the general contours of human evolutionary history have been defined with increasing resolution throughout the past several decades, the continuing deluge of massive sequencing data sets presents new opportunities and challenges for understanding that history. Here, we review the signatures that demographic history imparts on patterns of DNA sequence variation, the statistical methods that have been developed to leverage information contained in genome-scale data sets, and the insights gleaned from these studies. We also discuss the importance of using exploratory analyses to assess data quality, the strengths and limitations of commonly used population genomics methods, and factors that confound population genomics inferences.

  Figure 1: Identifying demographically informative genomic regions.

    Functional and comparative genomics data can be leveraged to identify putatively neutral regions in a principled way. This schematic shows various functional and comparative genomics data, as well as sequences that are structurally complex (segmental duplications (SegDups)) or subject to adaptive evolution (human accelerated regions (HARs)). ChIP–seq, chromatin immunoprecipitation followed by sequencing; DHS, DNase I-hypersensitive sites; H3K27ac, histone H3 acetylated at lysine 27; Txn, transcription.

  Figure 2: Inferring population demographic history.

    A simple three-population model with changes in population size and asymmetric gene flow is shown. a | Demographic model. b–f | Schematics for the output of various methodological tools discussed in the text are illustrated. Principal components analysis (PCA) qualitatively illustrates population structure and admixture of populations 2 and 3 by the spread of individuals along PC2 (part b). TreeMix illustrates different migration rates between populations 2 and 3 (part c). The difference in arrowhead size indicates asymmetrical migration rates. dadi (diffusion approximations for demographic inference) shows a contour plot of the likelihood surface of the effective population size of populations 2 and 3, as well as profile likelihoods of effective size for each population (part d). STRUCTURE provides estimates of the proportion of each individual's genome derived from populations 1, 2 and 3 (part e). Chromosomal painting shows the specific tracts of sequences inherited from ancestors in each population (part f). Ind., individual; mi,j, migration rate from population i to population j; Ne.i, effective population size of population i; Pop., population.

  Figure 3: The effect of demographic perturbations on gene genealogies and the SFS.

    Four simple population demographic models are shown: constant model, bottleneck model, expansion model and structured population model. Below each model schematic, we show average gene genealogies for five sampled lineages, obtained by coalescent simulations, and a stylized site frequency spectrum (SFS; plotted on a logarithmic scale) generated from each model. The SFS from the constant-sized population model is shown in red on each subsequent plot to facilitate comparison among models. Demographic events influence the shape and structure of the genealogies, which in turn influence patterns of genetic variation, such as the SFS. Many popular methods leverage the SFS for inferring population demographic history. The double-ended arrow indicates bidirectional migration.
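
    To make the SFS in Figure 3 concrete, the following minimal sketch tabulates an unfolded SFS from a small haplotype matrix and computes the expectation under the constant-size neutral model, where the expected number of sites with i derived copies is theta / i. The function names, the toy matrix of five haplotypes and the value of theta are illustrative assumptions, not taken from the article or from any of the tools it reviews.

```python
import numpy as np


def site_frequency_spectrum(haplotypes):
    """Unfolded SFS from an (n_haplotypes, n_sites) 0/1 matrix (1 = derived allele).

    Entry i-1 of the result is the number of sites at which exactly i of the
    n sampled haplotypes carry the derived allele, for i = 1 .. n-1.
    """
    haplotypes = np.asarray(haplotypes).astype(int)
    n = haplotypes.shape[0]
    derived_counts = haplotypes.sum(axis=0)
    # Tabulate sites by derived-allele count (0 .. n), then drop the
    # non-segregating classes 0 (absent) and n (fixed in the sample).
    counts = np.bincount(derived_counts, minlength=n + 1)
    return counts[1:n]


def expected_neutral_sfs(n, theta):
    """Expected SFS for n lineages under the constant-size neutral model: theta / i."""
    i = np.arange(1, n)
    return theta / i


# Hypothetical toy data: five sampled haplotypes (rows) at six sites (columns).
haps = np.array([
    [1, 0, 0, 1, 0, 1],
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 1],
])

observed = site_frequency_spectrum(haps)
expected = expected_neutral_sfs(n=5, theta=12.0)  # theta chosen arbitrarily
print(observed)  # [3 1 0 0]
print(expected)  # [12.  6.  4.  3.]
```

    Comparing an observed spectrum with this constant-size baseline is one simple way to see the distortions sketched in the figure. An excess of singletons relative to theta / i, for example, is the pattern usually associated with population expansion.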


  Department of Genome Sciences, University of Washington, 3720 15th Avenue NE, box 355065, Seattle, Washington 98195–5065, USA.

    Joshua G. Schraiber
    Joshua M. Akey

J.M.A. is a paid consultant of Glenview Capital. J.G.S. declares no competing interests.

  Joshua G. Schraiber

    Joshua G. Schraiber is a postdoctoral fellow in the Department of Genome Sciences at the University of Washington, Seattle, USA. He is interested in developing methods that leverage large data sets to investigate aspects of both macro- and microevolution. His recent work includes the development of methods to infer selection and demography from ancient DNA, the study of the importance of rapid adaptation on geological timescales, and the development of theoretical models for the evolution of molecular phenotypes within and between species.

  Joshua M. Akey

    Joshua M. Akey is a professor in the Department of Genome Sciences at the University of Washington, Seattle, USA. He is broadly interested in understanding the evolutionary forces that shape patterns of genomic variation within and between species and the genetic architecture of high-dimensional molecular phenotypes. His research group is currently pursuing the answers to various questions in yeast and human evolutionary genomics.

