Mapping and sequencing of structural variation from eight human genomes

Abstract

Genetic variation among individual humans occurs on many different scales, ranging from gross alterations in the human karyotype to single nucleotide changes. Here we explore variation on an intermediate scale—particularly insertions, deletions and inversions affecting from a few thousand to a few million base pairs. We employed a clone-based method to interrogate this intermediate structural variation in eight individuals of diverse geographic ancestry. Our analysis provides a comprehensive overview of the normal pattern of structural variation present in these genomes, refining the location of 1,695 structural variants. We find that 50% were seen in more than one individual and that nearly half lay outside regions of the genome previously described as structurally variant. We discover 525 new insertion sequences that are not present in the human reference genome and show that many of these are variable in copy number between individuals. Complete sequencing of 261 structural variants reveals considerable locus complexity and provides insights into the different mutational processes that have shaped the human genome. These data provide the first high-resolution sequence map of human structural variation—a standard for genotyping platforms and a prelude to future individual genome sequencing projects.
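As an illustration of the clone-based approach summarized above, the following is a minimal sketch of how the two end sequences of a fosmid clone, once placed on the reference genome, might be sorted into concordant and discordant classes. It is not the authors' pipeline. The size window (32 to 48 kb, chosen around the roughly 40 kb insert typical of fosmids), the `EndPair` record and the `classify` function are all assumptions introduced for this example, and the strand test is deliberately simplified.

```python
from dataclasses import dataclass

# Assumed size window for a fosmid insert, for illustration only.
# These values are not taken from the article.
MIN_INSERT = 32_000
MAX_INSERT = 48_000


@dataclass(frozen=True)
class EndPair:
    """Reference placements of the two end sequences of one clone (hypothetical record)."""
    chrom_a: str
    pos_a: int
    strand_a: str  # '+' or '-'
    chrom_b: str
    pos_b: int
    strand_b: str  # '+' or '-'


def classify(pair: EndPair,
             min_insert: int = MIN_INSERT,
             max_insert: int = MAX_INSERT) -> str:
    """Label a clone as concordant or as a candidate structural variant."""
    # Ends on different chromosomes cannot come from one contiguous insert.
    if pair.chrom_a != pair.chrom_b:
        return "interchromosomal"
    # Simplification: concordant ends map to opposite strands, so two ends
    # on the same strand suggest an inversion breakpoint inside the clone.
    # Everted pairs on opposite strands are not distinguished here.
    if pair.strand_a == pair.strand_b:
        return "inversion_candidate"
    span = abs(pair.pos_b - pair.pos_a)
    # Ends too far apart on the reference: the clone lacks reference sequence.
    if span > max_insert:
        return "deletion_candidate"
    # Ends too close on the reference: the clone carries extra sequence.
    if span < min_insert:
        return "insertion_candidate"
    return "concordant"
```

For example, `classify(EndPair("chr1", 100_000, "+", "chr1", 190_000, "-"))` returns `"deletion_candidate"`, because the ends lie 90 kb apart on the reference while the assumed insert is at most 48 kb. In practice a candidate site would need support from several independent clones before being reported.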

Figure 1: Map of structural variation in the human genome.
Figure 2: Frequency distribution.
Figure 3: Discovery of novel human sequences that are CNV.
Figure 4: Sequence resolution of human structural variation.
Figure 5: Regions of enriched SNP density.

Acknowledgements

We thank the staff from the University of Washington Genome Center and the Washington University Genome Sequencing Center for technical assistance. J.M.K. is supported by a National Science Foundation Graduate Research Fellowship. G.M.C. is supported by a Merck, Jane Coffin Childs Memorial Fund Postdoctoral Fellowship. This work was supported by National Institutes of Health grants HG004120 to E.E.E., D.A.N. and M.V.O., and 3 U54 HG002043 to M.V.O. E.E.E. is an Investigator of the Howard Hughes Medical Institute.

Author Contributions J.M.K., G.M.C., M.V.O., D.A.N. and E.E.E. contributed to the writing of this paper. The study was coordinated by L.B., M.V.O., R.K., D.R.S., J.M.K. and E.E.E. A.B., D.R.S., D.Sa., E.G., H.M.E., K.M., N.T., R.D., W.F.D. and W.T. performed library construction and end sequencing. E.H., H.S.H., K.A.P., M.V.O., R.K., R.K.W., T.G. and W.G. performed clone insert validation and sequencing. C.A., D.A.N., E.T., J.D.S., J.S., L.C., M.D., M.M., M.W., T.L.N. and Z.C. provided technical and analytical support. D.A.P., D.A.A., J.M.Ko. and S.A.M. contributed variation data. G.M.C., J.M.K., L.B., N.A.Y., N.S. and P.T. designed and analysed array CGH experiments. G.M.C. and T.Z. performed the genotype analysis. F.A. performed FISH experiments. B.T. and D.S. performed optical mapping experiments. E.E.E., J.M.K. and L.C. analysed sequenced clones. J.C.M. and N.H. identified SNPs and indels.

Author information

Corresponding author

Correspondence to Evan E. Eichler.

Ethics declarations

Competing interests

Daniel A. Peiffer is currently an employee of Illumina, Inc.; Kevin McKernan and Robert David are currently employed by Applied Biosystems, a manufacturer of DNA-sequencing reagents and instruments; and Laurakay Bruhn, Nick Sampas, Peter Tsang and N. Alice Yamada are employees of Agilent Technologies, Inc.

Supplementary information

Supplementary Information

The file contains extensive Supplementary Information with Supplementary Figures S1-S2, S4-S9. Supplementary Figures S3 and S10 are included in separate files. (PDF 9427 kb)

Supplementary Figure S2

The file contains Supplementary Figure S2 with end-sequence mapping of fosmids against the human genome. All discordant fosmids mapping to the human genome are displayed individually for each library using the following color scheme: ABC7=green, ABC8=forestgreen, ABC10=blue, ABC13=cyan, G248=black, ABC9=purple, ABC11=red, ABC12=orange, and ABC14=hotpink. The end-sequence placements are mapped in the context of gaps within the assembly (purple) and segmental duplications (grey bars). (PDF 4463 kb)

Supplementary Figure S3

The file contains Supplementary Figure S3 with end-sequence mapping of fosmids against the human genome. All discordant fosmids mapping to the human genome are displayed individually for each library using the following color scheme: ABC7=green, ABC8=forestgreen, ABC10=blue, ABC13=cyan, G248=black, ABC9=purple, ABC11=red, ABC12=orange, and ABC14=hotpink. The end-sequence placements are mapped in the context of gaps within the assembly (purple) and segmental duplications (grey bars). (PDF 6227 kb)

Supplementary Figure S4

The file contains Supplementary Figure S4 with end-sequence mapping of fosmids against the human genome. All discordant fosmids mapping to the human genome are displayed individually for each library using the following color scheme: ABC7=green, ABC8=forestgreen, ABC10=blue, ABC13=cyan, G248=black, ABC9=purple, ABC11=red, ABC12=orange, and ABC14=hotpink. The end-sequence placements are mapped in the context of gaps within the assembly (purple) and segmental duplications (grey bars). (PDF 3489 kb)

Supplementary Figure S10

The file contains Supplementary Figure S10 with sequenced Structural Variation and Gene Structure. A graphical representation for sequenced sites (n=266) of structural variation (miropeats view) is provided. Each alignment compares the human reference genome (top) with the sequenced structure of the fosmid clone. (PDF 3571 kb)

Supplementary Table S1

The file contains Supplementary Table S1 showing concordant vs. discordant clone placement summary statistics. (XLS 22 kb)

Supplementary Table S2

The file contains Supplementary Table S2 showing one-end anchored (OEA) clone statistics. (XLS 14 kb)

Supplementary Table S3

The file contains Supplementary Table S3 with all ESP predicted sites of insertions and deletions with associated experimental validation (see Supplementary Material Section 12 for description of column headers). (XLS 5072 kb)

Supplementary Table S4

The file contains Supplementary Table S4 with ESP predicted sites of insertion and deletion loci (non-redundant) across the fosmid libraries (See Supplementary Material Section 12 for description of column headers) (XLS 4106 kb)

Supplementary Table S5

The file contains Supplementary Table S5 with genotyping results for a subset of ESP deletion variants based on analysis of genotypes from the Illumina Human1M BeadChip. (XLS 38 kb)

Supplementary Table S6

The file contains Supplementary Table S6 with ESP predicted inversion breakpoints (XLS 308 kb)

Supplementary Table S7

The file contains Supplementary Table S7 with merged inversion loci (non-redundant). (XLS 63 kb)

Supplementary Table S8

The file contains Supplementary Table S8 with large insertions of novel sequence confirmed by optical mapping. (XLS 16 kb)

Supplementary Table S9

The file contains Supplementary Table S9 with GenBank accession IDs of sequenced clones. (XLS 73 kb)

Supplementary Table S10

The file contains Supplementary Table S10 with sequenced structural variants that affect exons of genes. (XLS 26 kb)

Supplementary Table S11

The file contains Supplementary Table S11 with summary statistics of fosmid end sequences. (XLS 17 kb)

Supplementary Table S12

The file contains Supplementary Table S12 with genotypes based on custom GoldenGate Assay and qPCR. (XLS 78 kb)

Cite this article

Kidd, J., Cooper, G., Donahue, W. et al. Mapping and sequencing of structural variation from eight human genomes. Nature 453, 56–64 (2008). https://doi.org/10.1038/nature06862

  • DOI: https://doi.org/10.1038/nature06862
