  • Technical Report

Large multiallelic copy number variations in humans

Abstract

Thousands of genomic segments appear to be present in widely varying copy numbers in different human genomes. We developed ways to use increasingly abundant whole-genome sequence data to identify the copy numbers, alleles and haplotypes present at most large multiallelic CNVs (mCNVs). We analyzed 849 genomes sequenced by the 1000 Genomes Project to identify most large (>5-kb) mCNVs, including 3,878 duplications, of which 1,356 appear to have 3 or more segregating alleles. We find that mCNVs give rise to most human variation in gene dosage—seven times the combined contribution of deletions and biallelic duplications—and that this variation in gene dosage generates abundant variation in gene expression. We describe 'runaway duplication haplotypes' in which genes, including HPR and ORM1, have mutated to high copy number on specific haplotypes. We also describe partially successful initial strategies for analyzing mCNVs via imputation and provide an initial data resource to support such analyses.

Figure 1: Ascertainment of multiallelic copy number variations (mCNVs) across the human genome.
Figure 2: Determination of the copy number levels and alleles present at mCNV loci.
Figure 3: Critical evaluation of copy number genotypes by ddPCR.
Figure 4: Relationship of gene copy number (in genomic DNA) to gene expression (in mRNA) for mCNVs.
Figure 5: Relationship between the imputability of mCNVs and the features of each mCNV locus.
Figure 6: Haplotypes with runaway copy number.

Acknowledgements

We thank D. Skvortsov, M. Thornton, N. Klitgord and B. Zhang for contributions to ddPCR assay design and validation. We also thank members of the 1000 Genomes Project for helpful conversations about analysis methods. This work was supported by a grant from the National Human Genome Research Institute (NHGRI; R01 HG006855). An additional grant from NHGRI (U01 HG006510) is supporting follow-on work to develop these methods into production-ready software that can be used by any research laboratory.

Author information

Contributions

R.E.H. and S.A.M. designed the study. R.E.H. devised the computational approaches, performed the analysis and wrote the Genome STRiP software. V.V.D. performed the ddPCR experiments and initial data analyses. J.R.B. designed assays for the ddPCR experiments and provided technical guidance and materials. G.G. contributed to the statistical analyses of gene dosage and dispersed duplications. S.K. helped automate and refine the algorithms and produce the public software release. L.M.B. contributed to the analysis of the HPR locus. S.A.M. and R.E.H. interpreted the data and wrote the manuscript.

Corresponding author

Correspondence to Steven A McCarroll.

Ethics declarations

Competing interests

J.R.B. is an employee of Bio-Rad Laboratories, Inc.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–11, Supplementary Tables 1–13 and Supplementary Note. (PDF 14119 kb)

About this article

Cite this article

Handsaker, R., Van Doren, V., Berman, J. et al. Large multiallelic copy number variations in humans. Nat Genet 47, 296–303 (2015). https://doi.org/10.1038/ng.3200

  • DOI: https://doi.org/10.1038/ng.3200
