Structural haplotypes and recent evolution of the human 17q21.31 region


Structurally complex genomic regions are not yet well understood. One such locus, human chromosome 17q21.31, contains a megabase-long inversion polymorphism [1], many uncharacterized copy-number variations (CNVs) and markers that associate with female fertility [1], female meiotic recombination [1–3] and neurological disease [4,5]. Additionally, the inverted H2 form of 17q21.31 seems to be positively selected in Europeans [1]. We developed a population genetics approach to analyze complex genome structures and identified nine segregating structural forms of 17q21.31. Both the H1 and H2 forms of the 17q21.31 inversion polymorphism contain independently derived, partial duplications of the KANSL1 gene; these duplications, which produce novel KANSL1 transcripts, have both recently risen to high allele frequencies (26% and 19%) in Europeans. An older H2 form lacking such a duplication is present at low frequency in European and central African hunter-gatherer populations. We further show that complex genome structures can be analyzed by imputation from SNPs.
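
To make the idea of imputing a structural form from SNPs concrete, here is a minimal sketch. It is not the authors' method: it swaps in a simple nearest-haplotype vote, whereas practical imputation tools use statistical haplotype models. The reference panel, the allele strings, the labels (`H1`, `H1-dup`, `H2`, `H2-dup`) and the function names are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical reference panel: each entry pairs the SNP alleles on one phased
# haplotype with the structural form carried by that haplotype.
# These values are invented for illustration; they are not data from the study.
REFERENCE_PANEL = [
    ("AACGT", "H1"),
    ("AACGA", "H1"),
    ("AACCT", "H1-dup"),
    ("GGTCA", "H2"),
    ("GGTCT", "H2-dup"),
    ("GGTCT", "H2-dup"),
]


def hamming(a, b):
    """Count the positions at which two equal-length allele strings differ."""
    if len(a) != len(b):
        raise ValueError("haplotypes must have the same number of SNPs")
    return sum(x != y for x, y in zip(a, b))


def impute_structural_form(target, panel=REFERENCE_PANEL):
    """Return the most common label among the panel haplotypes closest to target."""
    distances = [(hamming(target, snps), label) for snps, label in panel]
    best = min(d for d, _ in distances)
    votes = Counter(label for d, label in distances if d == best)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    for hap in ("AACGT", "GGTCT", "GGTCA"):
        print(hap, impute_structural_form(hap))
```

The sketch only works because, in this toy panel, each structural form sits on a distinguishable SNP background. Real data would need a probabilistic model that accounts for recombination, genotyping error and unphased genotypes.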

Figure 1: Inference of complex CNV and SNP haplotypes at the 17q21.31 locus.
Figure 2: Structural forms of the human 17q21.31 locus and their population frequencies.
Figure 3: Structural forms of 17q21.31 segregate on specific SNP haplotype backgrounds.

Accession codes

Primary accessions

Sequence Read Archive

Referenced accessions

NCBI Reference Sequence


  1. Stefansson, H. et al. A common inversion under selection in Europeans. Nat. Genet. 37, 129–137 (2005).

  2. Chowdhury, R., Bois, P.R., Feingold, E., Sherman, S.L. & Cheung, V.G. Genetic analysis of variation in human meiotic recombination. PLoS Genet. 5, e1000648 (2009).

  3. Fledel-Alon, A. et al. Variation in human recombination rates and its genetic determinants. PLoS ONE 6, e20321 (2011).

  4. Skipper, L. et al. Linkage disequilibrium and association of MAPT H1 in Parkinson disease. Am. J. Hum. Genet. 75, 669–677 (2004).

  5. Simón-Sánchez, J. et al. Genome-wide association study reveals genetic risk underlying Parkinson's disease. Nat. Genet. 41, 1308–1312 (2009).

  6. McCarroll, S.A. et al. Integrated detection and population-genetic analysis of SNPs and copy number variation. Nat. Genet. 40, 1166–1174 (2008).

  7. Conrad, D.F. et al. Origins and functional impact of copy number variation in the human genome. Nature 464, 704–712 (2010).

  8. McCarroll, S.A. et al. Deletion polymorphism upstream of IRGM associated with altered IRGM expression and Crohn's disease. Nat. Genet. 40, 1107–1112 (2008).

  9. Willer, C.J. et al. Six new loci associated with body mass index highlight a neuronal influence on body weight regulation. Nat. Genet. 41, 25–34 (2009).

  10. Handsaker, R.E., Korn, J.M., Nemesh, J. & McCarroll, S.A. Discovery and genotyping of genome structural polymorphism by sequencing on a population scale. Nat. Genet. 43, 269–276 (2011).

  11. Quinlan, A.R. & Hall, I.M. Characterizing complex structural variation in germline and somatic genomes. Trends Genet. 28, 43–53 (2012).

  12. International HapMap Consortium. A haplotype map of the human genome. Nature 437, 1299–1320 (2005).

  13. International HapMap 3 Consortium. Integrating common and rare genetic variation in diverse human populations. Nature 467, 52–58 (2010).

  14. 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073 (2010).

  15. Mills, R.E. et al. Mapping copy number variation by population-scale genome sequencing. Nature 470, 59–65 (2011).

  16. Zody, M.C. et al. Evolutionary toggling of the MAPT 17q21.31 inversion region. Nat. Genet. 40, 1076–1083 (2008).

  17. McCarroll, S.A. Copy-number analysis goes more than skin deep. Nat. Genet. 40, 5–6 (2008).

  18. Hindson, B.J. et al. High-throughput droplet digital PCR system for absolute quantitation of DNA copy number. Anal. Chem. 83, 8604–8610 (2011).

  19. Tishkoff, S.A. et al. Convergent adaptation of human lactase persistence in Africa and Europe. Nat. Genet. 39, 31–40 (2007).

  20. Genovese, G. et al. Association of trypanolytic ApoL1 variants with kidney disease in African Americans. Science 329, 841–845 (2010).

  21. Yu, L., Song, Y. & Wharton, R.P. E(nos)/CG4699 required for nanos function in the female germ line of Drosophila. Genesis 48, 161–170 (2010).

  22. Smith, E.R. et al. A human protein complex homologous to the Drosophila MSL complex is responsible for the majority of histone H4 acetylation at lysine 16. Mol. Cell. Biol. 25, 9175–9188 (2005).

  23. Li, X., Wu, L., Corsa, C.A., Kunkel, S. & Dou, Y. Two mammalian MOF complexes regulate transcription activation by distinct mechanisms. Mol. Cell 36, 290–301 (2009).

  24. Sharp, A.J. et al. Segmental duplications and copy-number variation in the human genome. Am. J. Hum. Genet. 77, 78–88 (2005).

  25. Redon, R. et al. Global variation in copy number in the human genome. Nature 444, 444–454 (2006).

  26. Browning, S.R. Missing data imputation and haplotype phase inference for genome-wide association studies. Hum. Genet. 124, 439–450 (2008).

  27. Li, Y., Willer, C., Sanna, S. & Abecasis, G. Genotype imputation. Annu. Rev. Genomics Hum. Genet. 10, 387–406 (2009).

  28. Marchini, J. & Howie, B. Genotype imputation for genome-wide association studies. Nat. Rev. Genet. 11, 499–511 (2010).

  29. Browning, B.L. & Browning, S.R. A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals. Am. J. Hum. Genet. 84, 210–223 (2009).

  30. Steinberg, K.M. et al. Structural diversity and African origin of the 17q21.31 inversion polymorphism. Nat. Genet. published online: doi:10.1038/ng.2335 (1 July 2012).

  31. Montgomery, S.B. et al. Transcriptome genetics using second generation sequencing in a Caucasian population. Nature 464, 773–777 (2010).


J. Korn provided an early version of software for visualizing haplotype diversity. N. Rohland and T. Mullen contributed expertise on laboratory experiments. We thank N. Patterson, D. Reich, D. Altshuler, E. Lander, B. Browning, J. Korn, J. Gray, C. Patil, G. Genovese, A. Sekar and S. Grossman for helpful conversations and/or comments on the manuscript. This work was supported by a Smith Family Award for Excellence in Biomedical Research to S.A.M., by the National Human Genome Research Institute (U01HG005208) and by startup resources from the Harvard Medical School Department of Genetics.

Author information

S.A.M., L.M.B. and R.E.H. conceived the strategy for population genetics dissection of structurally complex loci. L.M.B. performed all laboratory experiments and multiple computational analyses, including the estimation of haplotype frequencies, delineation of CNV regions and alignment of next-generation sequence data. R.E.H. performed computational analyses of the 1000 Genomes Project data, including finding breakpoint-spanning reads for CNVs and integrated analyses of SNP-CNV haplotypes. M.C.Z. performed analyses of sequence data to determine large-scale structures, estimate coalescence and mutation dates and reconstruct the evolutionary history of the locus. R.E.H. and L.M.B. developed the imputation strategy. S.A.M., L.M.B., R.E.H. and M.C.Z. wrote the manuscript.

Corresponding author

Correspondence to Steven A McCarroll.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Note, Supplementary Tables 1–17 and Supplementary Figures 1–8 (PDF 3406 kb)

About this article

Cite this article

Boettger, L., Handsaker, R., Zody, M. et al. Structural haplotypes and recent evolution of the human 17q21.31 region. Nat Genet 44, 881–885 (2012).
