The impact of structural variation on human gene expression


Structural variants (SVs) are an important source of human genetic diversity, but their contribution to traits, disease and gene regulation remains unclear. We mapped cis expression quantitative trait loci (eQTLs) in 13 tissues via joint analysis of SVs, single-nucleotide variants (SNVs) and short insertion/deletion (indel) variants from deep whole-genome sequencing (WGS). We estimated that SVs are causal at 3.5–6.8% of eQTLs—a substantially higher fraction than prior estimates—and that expression-altering SVs have larger effect sizes than do SNVs and indels. We identified 789 putative causal SVs predicted to directly alter gene expression: most (88.3%) were noncoding variants enriched at enhancers and other regulatory elements, and 52 were linked to genome-wide association study loci. We observed a notable abundance of rare high-impact SVs associated with aberrant expression of nearby genes. These results suggest that comprehensive WGS-based SV analyses will increase the power of common- and rare-variant association studies.

Figure 1: Structural-variation call set.
Figure 2: eQTL effect-size distributions and heritability partitioning with linear mixed models.
Figure 3: Feature enrichment of SV-eQTLs.
Figure 4: Candidate SV-eQTLs at GWAS loci.
Figure 5: Gene expression outliers are associated with rare SVs.




The authors thank R.E. Handsaker for advice on Genome STRiP, H.J. Abel for helpful statistical discussions and R.M. Layer for software contributions. This work was supported by the NIH (MH101810) (D.F.C.), the NIH/NHGRI (1UM1HG008853) (I.M.H.), a Burroughs Wellcome Fund Career Award (I.M.H.), a Mr. and Mrs. Spencer T. Olin Fellowship for Women in Graduate Study (A.J.S.), a Lucille P. Markey Biomedical Research Stanford Graduate Fellowship (J.R.D.), the Stanford Genome Training Program (SGTP; NIH/NHGRI T32HG000044) (J.R.D.), a Hewlett-Packard Stanford Graduate Fellowship (E.K.T.), and a doctoral scholarship from the Natural Science and Engineering Council of Canada (E.K.T.). The Genotype-Tissue Expression (GTEx) Project was supported by the Common Fund of the Office of the Director of the National Institutes of Health. Additional funds were provided by the NCI, NHGRI, NHLBI, NIDA, NIMH and NINDS. Donors were enrolled at Biospecimen Source Sites funded by NCI/SAIC-Frederick, Inc. (SAIC-F) subcontracts to the National Disease Research Interchange (10XS170), Roswell Park Cancer Institute (10XS171) and Science Care, Inc. (X10S172). The Laboratory, Data Analysis, and Coordinating Center (LDACC) was funded through a contract (HHSN268201000029C) to The Broad Institute, Inc. Biorepository operations were funded through an SAIC-F subcontract to the Van Andel Institute (10ST1035). Additional data repository and project management were provided by SAIC-F (HHSN261200800001E). The Brain Bank was supported by supplements to University of Miami grants DA006227 & DA033684 and to contract N01MH000028. Statistical Methods development grants were made to the University of Geneva (MH090941 and MH101814), the University of Chicago (MH090951, MH090937, MH101820 and MH101825), the University of North Carolina—Chapel Hill (MH090936 and MH101819), Harvard University (MH090948), Stanford University (MH101782), Washington University at St. Louis (MH101810) and the University of Pennsylvania (MH101822).

Author information

C.C., A.B., S.B.M., D.F.C. and I.M.H. designed the experiments. C.C. and A.J.S. performed SV discovery and genotyping. C.C. performed common eQTL mapping, causality analyses, LD tagging and candidate GWAS analyses. J.R.D., E.K.T., X.L., Y.K. and F.N.D. identified gene expression outliers. C.C. and A.J.S. analyzed rare SVs. L.G. and I.M.H. designed SVScore annotation. D.F.C. and T.H. performed microarray-based CNV detection. C.C., D.F.C. and I.M.H. wrote the manuscript.

Corresponding authors

Correspondence to Donald F Conrad or Ira M Hall.

Ethics declarations

Competing interests

D.F.C. is a paid consultant of PierianDx. The authors declare no other competing financial interests.

Additional information

A full list of members and affiliations appears in the Supplementary Note.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–27, Supplementary Tables 1, 3, 4 and 6–9, and Supplementary Note. (PDF 7904 kb)

Supplementary Table 2

Excel file of all SV-only and joint eQTLs, along with causality scores. (XLSX 7084 kb)

Supplementary Table 5

Excel file of all SV-eQTL GWAS hits. (XLSX 94 kb)


About this article


Cite this article

Chiang, C., Scott, A., Davis, J. et al. The impact of structural variation on human gene expression. Nat Genet 49, 692–699 (2017).
