Genetic variation among individual humans occurs on many different scales, ranging from gross alterations in the human karyotype to single nucleotide changes. Here we explore variation on an intermediate scale—particularly insertions, deletions and inversions affecting from a few thousand to a few million base pairs. We employed a clone-based method to interrogate this intermediate structural variation in eight individuals of diverse geographic ancestry. Our analysis provides a comprehensive overview of the normal pattern of structural variation present in these genomes, refining the location of 1,695 structural variants. We find that 50% were seen in more than one individual and that nearly half lay outside regions of the genome previously described as structurally variant. We discover 525 new insertion sequences that are not present in the human reference genome and show that many of these are variable in copy number between individuals. Complete sequencing of 261 structural variants reveals considerable locus complexity and provides insights into the different mutational processes that have shaped the human genome. These data provide the first high-resolution sequence map of human structural variation—a standard for genotyping platforms and a prelude to future individual genome sequencing projects.
We thank the staff from the University of Washington Genome Center and the Washington University Genome Sequencing Center for technical assistance. J.M.K. is supported by a National Science Foundation Graduate Research Fellowship. G.M.C. is supported by a Merck, Jane Coffin Childs Memorial Fund Postdoctoral Fellowship. This work was supported by National Institutes of Health grants HG004120 to E.E.E., D.A.N. and M.V.O., and 3 U54 HG002043 to M.V.O. E.E.E. is an Investigator of the Howard Hughes Medical Institute.
Author Contributions J.M.K., G.M.C., M.V.O., D.A.N. and E.E.E. contributed to the writing of this paper. The study was coordinated by L.B., M.V.O., R.K., D.R.S., J.M.K. and E.E.E. A.B., D.R.S., D.Sa., E.G., H.M.E., K.M., N.T., R.D., W.F.D. and W.T. performed library construction and end sequencing. E.H., H.S.H., K.A.P., M.V.O., R.K., R.K.W., T.G. and W.G. performed clone insert validation and sequencing. C.A., D.A.N., E.T., J.D.S., J.S., L.C., M.D., M.M., M.W., T.L.N. and Z.C. provided technical and analytical support. D.A.P., D.A.A., J.M.Ko. and S.A.M. contributed variation data. G.M.C., J.M.K., L.B., N.A.Y., N.S. and P.T. designed and analysed array CGH experiments. G.M.C. and T.Z. performed the genotype analysis. F.A. performed FISH experiments. B.T. and D.S. performed optical mapping experiments. E.E.E., J.M.K. and L.C. analysed sequenced clones. J.C.M. and N.H. identified SNPs and indels.
The file contains Supplementary Table S1 showing concordant vs. discordant clone placement summary statistics.
The file contains Supplementary Table S2 showing one-end anchored (OEA) clone statistics.
The file contains Supplementary Table S3 with all ESP predicted sites of insertions and deletions with associated experimental validation (see Supplementary Material Section 12 for a description of column headers).
The file contains Supplementary Table S4 with ESP predicted sites of insertion and deletion loci (non-redundant) across the fosmid libraries (see Supplementary Material Section 12 for a description of column headers).
The file contains Supplementary Table S5 with genotyping results for a subset of ESP deletion variants based on analysis of genotypes from the Illumina Human1M BeadChip.
The file contains Supplementary Table S6 with ESP predicted inversion breakpoints.
The file contains Supplementary Table S7 with merged inversion loci (non-redundant).
The file contains Supplementary Table S8 with large insertions of novel sequence confirmed by optical mapping.
The file contains Supplementary Table S9 with GenBank accession IDs of sequenced clones.
The file contains Supplementary Table S10 with sequenced structural variants that affect exons of genes.
The file contains Supplementary Table S11 with summary statistics of fosmid end sequences.
The file contains Supplementary Table S12 with genotypes based on custom GoldenGate Assay and qPCR.