This article has been updated


The 17q21.31 inversion polymorphism exists as either direct (H1) or inverted (H2) haplotypes with differential predispositions to disease and selection. We investigated its genetic diversity in 2,700 individuals, with an emphasis on African populations. We characterize eight structural haplotypes due to complex rearrangements that range in size from 1.08 to 1.49 Mb and provide evidence for a 30-kb H1-H2 double recombination event. We show that recurrent partial duplications of the KANSL1 gene have occurred on both the H1 and H2 haplotypes and have risen to high frequency in European populations. We identify a likely ancestral H2 haplotype (H2′) lacking these duplications that is enriched among African hunter-gatherer groups yet essentially absent from West African populations. Whereas H1 and H2 segmental duplications arose independently and before human migration out of Africa, they have reached high frequencies recently among Europeans, either because of extraordinary genetic drift or selective sweeps.

Change history

  • 11 July 2012

    In the version of this article initially published online, the final sentence in the second paragraph of the Results section incorrectly referred to the H2-specific duplication as CNP205. The correct designation for the H2-specific duplication is CNP155. Also, in the legend to Figure 5, the phrases describing panels c and d were inadvertently switched. These errors have been corrected for the print, PDF and HTML versions of this article.


Primary accessions

Sequence Read Archive



We thank J. Akey, M. Dennis and B. Dumont for helpful discussions and C. Alkan for computational assistance. We thank Z. Jiang for his initial work on the H1-H2 alignments. We are grateful to T. Brown for assistance with manuscript preparation, to C. Lee for technical assistance and to the anonymous reviewers of this paper who provided insightful comments. We thank the 1000 Genomes Project Consortium for access to unpublished sequence data for the 17q21.31 locus. K.M.S. was supported by a Ruth L. Kirschstein National Research Service Award (NRSA) training grant to the University of Washington (T32HG00035) and an individual NRSA Fellowship (F32GM097807). C.D.C. was supported by an individual NRSA Fellowship (F32HG006070). P.H.S. was supported by a Natural Sciences and Engineering Research Council of Canada Fellowship. J.M.K. was supported by a Ruth L. Kirschstein NRSA training grant to Stanford University (T32HG000044). This work was supported by the US National Institutes of Health (grants HG002385 and HG004120 to E.E.E.). E.E.E. is an Investigator of the Howard Hughes Medical Institute.

Author information

Author notes

    • Jeffrey M Kidd
    • Michael P Donnelly

    Present addresses: Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, USA (J.M.K.), Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA (J.M.K.) and Department of Human Genetics, University of Chicago, Chicago, Illinois, USA (M.P.D.).

    • Karyn Meltz Steinberg
    • Francesca Antonacci

    These authors contributed equally to this work.


  1. Department of Genome Sciences, University of Washington, Seattle, Washington, USA.

    • Karyn Meltz Steinberg
    • Francesca Antonacci
    • Peter H Sudmant
    • Jeffrey M Kidd
    • Catarina D Campbell
    • Laura Vives
    • Maika Malig
    • Evan E Eichler
  2. Department of Genetics and Biology, University of Pennsylvania, Philadelphia, Pennsylvania, USA.

    • Laura Scheinfeldt
    • William Beggs
    • Sarah A Tishkoff
  3. Department of Molecular Biology, Institute of Endemic Diseases, University of Khartoum, Khartoum, Sudan.

    • Muntaser Ibrahim
  4. Department of Biochemistry, Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania.

    • Godfrey Lema
    • Thomas B Nyambo
  5. Kenya Medical Research Institute, Center for Biotechnology Research and Development, Nairobi, Kenya.

    • Sabah A Omar
  6. Unité Mixte de Recherche (UMR) 208, Institut de Recherche pour le Développement (IRD)–Muséum National d'Histoire Naturelle (MNHN), Musée de l'Homme, Paris, France.

    • Jean-Marie Bodo
  7. Ministère de la Recherche Scientifique et de l'Innovation, Yaoundé, Cameroon.

    • Alain Froment
  8. Department of Genetics, Yale University, New Haven, Connecticut, USA.

    • Michael P Donnelly
    • Kenneth K Kidd
  9. Howard Hughes Medical Institute, University of Washington, Seattle, Washington, USA.

    • Evan E Eichler



K.M.S., F.A. and E.E.E. designed the study. K.M.S. performed aCGH, genotyping and sequence analysis. F.A. performed FISH experiments and fosmid shotgun sequencing library construction. P.H.S. performed read depth–based copy-number analysis. J.M.K. performed sequence analysis on the double recombination region. C.D.C. performed aCGH analysis. L.V. and M.M. performed whole-genome shotgun sequencing library construction and PCR genotyping. L.S. and W.B. performed PCR genotyping and SNP array genotyping. M.I., G.L., T.B.N., S.A.O., J.-M.B. and A.F. contributed to African sample collection. M.P.D. and K.K.K. contributed to H2 Diversity Panel sample collection and genotyping. S.A.T. contributed to African sample collection and SNP array data. K.M.S., F.A., J.M.K., S.A.T. and E.E.E. contributed to data interpretation. K.M.S., F.A. and E.E.E. wrote the manuscript.

Competing interests

E.E.E. is on the scientific advisory boards for Pacific Biosciences, Inc., SynapDx Corp and DNAnexus, Inc.

Corresponding author

Correspondence to Evan E Eichler.

Supplementary information

PDF files

  1.

    Supplementary Text and Figures

    Supplementary Note, Supplementary Figures 1–4 and Supplementary Tables 1 and 3–11

Excel files

  1.

    Supplementary Table 2

    Individual-level 17q21.31 haplotype results
