Article | Published:

A southern African origin and cryptic structure in the highly mobile plains zebra


The plains zebra (Equus quagga) is an ecologically important species of the African savannah. It is also one of the most numerous and widely distributed ungulates, and six subspecies have been described based on morphological variation. However, the within-species evolutionary processes have been difficult to resolve due to its high mobility and a lack of consensus regarding the population structure. We obtained genome-wide DNA polymorphism data from more than 167,000 loci for 59 plains zebras from across the species range, encompassing all recognized extant subspecies, as well as three mountain zebras (Equus zebra) and three Grevy’s zebras (Equus grevyi). Surprisingly, the population genetic structure does not mirror the morphology-based subspecies delineation, underlining the dangers of basing management units exclusively on morphological variation. We use demographic modelling to provide insights into the past phylogeography of the species. The results identify a southern African location as the most likely source region from which all extant populations expanded around 370,000 years ago. We show evidence for inclusion of the extinct and phenotypically divergent quagga (Equus quagga quagga) in the plains zebra variation and reveal that it was less divergent from the other subspecies than the northernmost (Ugandan) extant population.


Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.




The authors thank A. al-Cher for laboratory work in connection with this study. The work was funded by research grants from the Danish Council for Independent Research | Natural Sciences, the Lundbeck Foundation and the Villum Foundation.

Author information

C.-E.T.P. and R.H. designed and performed the experiments, analysed the data and wrote the paper. A.A. developed the analytical tools and analysed the data. H.R.S., P.D.E., E.A.J., L.O. and L.C. analysed the data and wrote the paper.

Competing interests

The authors declare no competing financial interests.

Correspondence to Casper-Emil T. Pedersen or Rasmus Heller.

Supplementary information

Supplementary Information

Supplementary notes, figures, tables and references.

Life Sciences Reporting Summary

Fig. 1: Sampling areas for the identified plains zebra populations.
Fig. 2: Estimating effective migration surfaces and directionality index.
Fig. 3: Admixture proportions using NGSadmix.
Fig. 4: Principal component analysis plot showing clustering in the plains zebras using the 60 individuals in dataset 2.
Fig. 5: Maximum likelihood tree obtained using TreeMix without migration edges for the 67 individuals in dataset 1.
Fig. 6: Stairway plots for the seven plains zebra populations with at least three samples.