Vervet monkeys are among the most widely distributed nonhuman primates, show considerable phenotypic diversity, and have long been an important biomedical model for a variety of human diseases and in vaccine research. Using whole-genome sequencing data from 163 vervets sampled from across Africa and the Caribbean, we find high diversity within and between taxa and clear evidence that taxonomic divergence was reticulate rather than following a simple branching pattern. A scan for diversifying selection across taxa identifies strong and highly polygenic selection signals affecting viral processes. Furthermore, selection scores are elevated in genes whose human orthologs interact with HIV and in genes that show a response to experimental simian immunodeficiency virus (SIV) infection in vervet monkeys but not in rhesus macaques, suggesting that part of the signal reflects taxon-specific adaptation to SIV.







Acknowledgements

Samples were collected through the UCLA Systems Biology Sample Repository funded by US National Institutes of Health grants R01RR016300 and R01OD010980 to N.B.F. For permits allowing us to collect samples, we thank the Gambia Department of Parks & Wildlife Management; the Botswana Ministry of Environment, Wildlife and Tourism; the Ghana Wildlife Division, Forestry Commission; the Zambia Wildlife Authority; the Ethiopian Wildlife Conservation Authority; the Ministry of Forestry & the Environment, Department of Environmental Affairs, South Africa; the Department of Economic Development and Environmental Affairs, Eastern Cape; the Department of Tourism, Environmental and Economic Affairs, Free State; Ezemvelo KZN Wildlife, KwaZulu-Natal; and the Department of Economic Development, Environment and Tourism, Limpopo. We also thank G. Redmond and the St. Kitts Biomedical Research Foundation for facilitating sample collection in St. Kitts and Nevis. We thank J. Brenchley, K. Reimann (R24OD010976), and J. Baulu and the Barbados Primate Research Center and Wildlife Reserve for providing samples of Tanzanian origin and Barbadian vervets. For help with sample collection and processing, we thank J. Danzy-Cramer, Y. Jung, O. Morton and J. Freimer. We thank J. Kamm for discussion, Ü. Seren, J. Wasserscheid and N. Juretic for IT support, R. Halai for help with figure design, and F. Gao for assistance with microarray data analysis. H.S. has been supported by a travel grant from the Austrian Ministry of Science and Research. C.A. is supported by R01 AI119346 from the National Institute of Allergy and Infectious Diseases (NIAID). We acknowledge the support of the National Institute of Neurological Disorders and Stroke (NINDS) Informatics Center for Neurogenetics and Neurogenomics (P30 NS062691).

Author information

Author notes

    • Hannes Svardal & Richard K Wilson

    Present addresses: Department of Genetics, University of Cambridge, Cambridge, UK (H.S.) and Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, Ohio, USA (R.K.W.).


Affiliations

  1. Gregor Mendel Institute, Austrian Academy of Sciences, Vienna Biocenter (VBC), Vienna, Austria.

    • Hannes Svardal & Magnus Nordborg

  2. Center for Neurobehavioral Genetics, University of California, Los Angeles, Los Angeles, California, USA.

    • Anna J Jasinska, Giovanni Coppola, Vasily Ramensky & Nelson B Freimer

  3. Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland.

    • Anna J Jasinska

  4. Center for Vaccine Research, University of Pittsburgh, Pittsburgh, Pennsylvania, USA.

    • Cristian Apetrei

  5. Department of Microbiology and Molecular Genetics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA.

    • Cristian Apetrei

  6. Department of Neurology, University of California, Los Angeles, Los Angeles, California, USA.

    • Giovanni Coppola

  7. State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China.

    • Yu Huang

  8. Department of Anthropology, Boston University, Boston, Massachusetts, USA.

    • Christopher A Schmitt

  9. Institut Pasteur, Unité HIPER, Paris, France.

    • Beatrice Jacquelin & Michaela Müller-Trutwin

  10. Moscow Institute of Physics and Technology, Dolgoprudny, Russian Federation.

    • Vasily Ramensky

  11. Medical Research Council (MRC), The Gambia Unit, The Gambia.

    • Martin Antonio

  12. Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, USA.

    • George Weinstock & Richard K Wilson

  13. Department of Genetics, University of the Free State, Bloemfontein, South Africa.

    • J Paul Grobler & Trudy R Turner

  14. Department of Human Genetics, McGill University, Montreal, Quebec, Canada.

    • Ken Dewar

  15. Department of Anthropology, University of Wisconsin–Milwaukee, Milwaukee, Wisconsin, USA.

    • Trudy R Turner

  16. McDonnell Genome Institute, Washington University in St. Louis, St. Louis, Missouri, USA.

    • Wesley C Warren




Contributions

N.B.F., T.R.T., M.N., A.J.J., K.D., W.C.W. and R.K.W. conceived the study. M.N. and H.S. designed the analysis strategy. H.S. analyzed the data and prepared tables and figures. C.A. contributed the SIVagm sequence analysis. G.C. contributed the WGCNA analysis. B.J. and M.M.-T. provided expertise on the expression data analysis and SIV. Y.H. and V.R. provided bioinformatic support. C.A.S., J.P.G., M.A. and T.R.T. collected samples and obtained permits. N.B.F., G.W., R.K.W., K.D. and W.C.W. oversaw sequencing. M.N., H.S., N.B.F. and A.J.J. wrote the manuscript. All authors read and approved the manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Magnus Nordborg.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Note, Supplementary Figures 1–36 and Supplementary Tables 1–4

  2. Life Sciences Reporting Summary

Excel files

  1. Supplementary Data 1

    Table of sample IDs, taxonomic group attribution (cf. Fig. 1a), taxonomic classification from the Integrated Taxonomic Information System (http://www.itis.gov/; last accessed June 2016), collection site, country, coordinates, actual fold coverage, percentage of mapped reads (including all scaffolds), SRA Sample ID and BioProject accession number.

  2. Supplementary Data 2

    Results for all D-statistic (ABBA-BABA test) comparisons that are consistent with the UPGMA clustering tree of pairwise differences. z scores were obtained through block jackknifing. Samples were grouped by country. Figure 2e and Supplementary Figures 14 and 36 show a subset of the data. See Supplementary Table 4 for IDs of the samples used in the single-sample analysis.

  3. Supplementary Data 3

    Average and maximum of XP-CLR root-mean-square average selection scores for each gene. Details on how these scores were obtained are given in the Online Methods.

  4. Supplementary Data 4

    Significance P values for enrichment of selection scores in gene ontology categories using the R package TopGO with a Kolmogorov–Smirnov test and the weight01 algorithm for all categories with P < 0.1.

  5. Supplementary Data 5

    Significance P values for sumstat enrichment of selection scores in NCBI HIV-1–human interaction gene categories (Online Methods).

  6. Supplementary Data 6

    Significance P values for enrichment of selection scores in gene expression categories (Online Methods).

  7. Supplementary Data 7

    Significant GO enrichments for WGCNA modules significantly enriched in high selection scores that only show a short-term response in vervet (mainly day 6 after infection), i.e., for genes from the green, blue and magenta modules with asterisks in Supplementary Figure 35. The R package TopGO with Fisher's exact test and the weight01 algorithm was used.

  8. Supplementary Data 8

    Significant GO enrichments for WGCNA modules significantly enriched in high selection scores that show a long-term response in vervet (day 115 after infection), i.e., for genes from the yellow and tan modules with asterisks in Supplementary Figure 35. The R package TopGO with Fisher's exact test and the weight01 algorithm was used.
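
Illustrative sketch (not part of the published supplement)

Supplementary Data 2 reports D statistics (ABBA-BABA test) with z scores obtained through block jackknifing. As a rough illustration only, and not the authors' pipeline, the standard ratio D = Σ(ABBA − BABA) / Σ(ABBA + BABA) and a delete-one-block jackknife could be sketched as below. The function names, the assumption of roughly equal-sized genomic blocks (so that an unweighted jackknife variance is used) and the toy counts are all hypothetical.

```python
import math


def d_statistic(abba, baba):
    """D = sum(ABBA - BABA) / sum(ABBA + BABA) over per-block site counts."""
    numerator = sum(a - b for a, b in zip(abba, baba))
    denominator = sum(a + b for a, b in zip(abba, baba))
    if denominator == 0:
        raise ValueError("no informative ABBA/BABA sites")
    return numerator / denominator


def block_jackknife_z(abba, baba):
    """Return (D, standard error, z score) from a delete-one-block jackknife.

    Assumption for this sketch: blocks are of similar size, so the
    unweighted jackknife variance formula is used.
    """
    n = len(abba)
    if n != len(baba) or n < 2:
        raise ValueError("need at least two blocks of paired counts")

    d_full = d_statistic(abba, baba)

    # Recompute D with each block left out in turn.
    leave_one_out = [
        d_statistic(abba[:i] + abba[i + 1:], baba[:i] + baba[i + 1:])
        for i in range(n)
    ]
    mean_loo = sum(leave_one_out) / n

    # Jackknife variance: (n - 1) / n * sum of squared deviations.
    variance = (n - 1) / n * sum((d - mean_loo) ** 2 for d in leave_one_out)
    std_err = math.sqrt(variance)
    z = d_full / std_err if std_err > 0 else float("nan")
    return d_full, std_err, z


if __name__ == "__main__":
    # Hypothetical per-block counts, for demonstration only.
    toy_abba = [12.0, 15.0, 9.0, 14.0, 11.0]
    toy_baba = [8.0, 10.0, 9.5, 7.0, 9.0]
    d, se, z = block_jackknife_z(toy_abba, toy_baba)
    print(f"D = {d:.4f}, SE = {se:.4f}, z = {z:.2f}")
```

In this sketch a large positive or negative z score indicates an excess of ABBA or BABA sites, respectively. Real analyses typically weight blocks by the number of informative sites, which this simplified version does not do.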