Heritability and genomics of gene expression in peripheral blood

Abstract

We assessed gene expression profiles in 2,752 twins, using a classic twin design to quantify expression heritability and quantitative trait loci (eQTLs) in peripheral blood. The most highly heritable genes (777) were grouped into distinct expression clusters, enriched in gene-poor regions, associated with specific gene function or ontology classes, and strongly associated with disease designation. The design enabled a comparison of twin-based heritability to estimates based on dizygotic identity-by-descent sharing and distant genetic relatedness. Consideration of sampling variation suggests that previous heritability estimates have been upwardly biased. Genotyping of 2,494 twins enabled powerful identification of eQTLs, which we further examined in a replication set of 1,895 unrelated subjects. A large number of non-redundant local eQTLs (6,756) met replication criteria, whereas a relatively small number of distant eQTLs (165) met quality control and replication standards. Our results provide a new resource toward understanding the genetic control of transcription.

Figure 1: Transcriptome-wide estimates of heritability based on 2,752 twins.
Figure 2: Gene density and other predictors of heritability, using 2,616 paired twins and 18,392 genes.
Figure 3: Apparent heritability and local IBD effects versus true underlying distributions.
Figure 4: Comparison and replication of eQTL results.
Figure 5: Properties of distant eQTLs.

References

  1. Hindorff, L.A. et al. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc. Natl. Acad. Sci. USA 106, 9362–9367 (2009).

  2. Maurano, M.T. et al. Systematic localization of common disease-associated variation in regulatory DNA. Science 337, 1190–1195 (2012).

  3. Hardy, J. Psychiatric genetics: are we there yet? JAMA Psychiatry 70, 569–570 (2013).

  4. Majewski, J. & Pastinen, T. The study of eQTL variations by RNA-seq: from SNPs to phenotypes. Trends Genet. 27, 72–79 (2011).

  5. Cookson, W., Liang, L., Abecasis, G., Moffatt, M. & Lathrop, M. Mapping complex disease traits with global gene expression. Nat. Rev. Genet. 10, 184–194 (2009).

  6. Nicolae, D.L. et al. Trait-associated SNPs are more likely to be eQTLs: annotation to enhance discovery from GWAS. PLoS Genet. 6, e1000888 (2010).

  7. Stranger, B.E. et al. Patterns of cis regulatory variation in diverse human populations. PLoS Genet. 8, e1002639 (2012).

  8. Grundberg, E. et al. Mapping cis- and trans-regulatory effects across multiple tissues in twins. Nat. Genet. 44, 1084–1089 (2012).

  9. Lango Allen, H. et al. Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature 467, 832–838 (2010).

  10. Emilsson, V. et al. Genetics of gene expression and its effect on disease. Nature 452, 423–428 (2008).

  11. de Jong, S. et al. Expression QTL analysis of top loci from GWAS meta-analysis highlights additional schizophrenia candidate genes. Eur. J. Hum. Genet. 20, 1004–1008 (2012).

  12. Fransen, K. et al. Analysis of SNPs with an effect on gene expression identifies UBE2L3 and BCL3 as potential new risk genes for Crohn's disease. Hum. Mol. Genet. 19, 3482–3488 (2010).

  13. Luo, R. et al. Genome-wide transcriptome profiling reveals the functional impact of rare de novo and recurrent CNVs in autism spectrum disorders. Am. J. Hum. Genet. 91, 38–55 (2012).

  14. Speliotes, E.K. et al. Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nat. Genet. 42, 937–948 (2010).

  15. Zeller, T. et al. Genetics and beyond—the transcriptome of human monocytes and disease susceptibility. PLoS ONE 5, e10693 (2010).

  16. Gamazon, E.R., Huang, R.S., Cox, N.J. & Dolan, M.E. Chemotherapeutic drug susceptibility associated SNPs are enriched in expression quantitative trait loci. Proc. Natl. Acad. Sci. USA 107, 9287–9292 (2010).

  17. Thurman, R.E. et al. The accessible chromatin landscape of the human genome. Nature 489, 75–82 (2012).

  18. Degner, J.F. et al. DNase I sensitivity QTLs are a major determinant of human expression variation. Nature 482, 390–394 (2012).

  19. Xia, K. et al. seeQTL: a searchable database for human eQTLs. Bioinformatics 28, 451–452 (2012).

  20. Fehrmann, R.S. et al. Trans-eQTLs reveal that independent genetic variants associated with a complex phenotype converge on intermediate genes, with a major role for the HLA. PLoS Genet. 7, e1002197 (2011).

  21. Min, J.L. et al. The use of genome-wide eQTL associations in lymphoblastoid cell lines to identify novel genetic pathways involved in complex traits. PLoS ONE 6, e22070 (2011).

  22. Grundberg, E. et al. Population genomics in a disease targeted primary cell model. Genome Res. 19, 1942–1952 (2009).

  23. Gibbs, J.R. et al. Abundant quantitative trait loci exist for DNA methylation and gene expression in human brain. PLoS Genet. 6, e1000952 (2010).

  24. Leek, J.T. et al. Tackling the widespread and critical impact of batch effects in high-throughput data. Nat. Rev. Genet. 11, 733–739 (2010).

  25. Akey, J.M., Biswas, S., Leek, J.T. & Storey, J.D. On the design and analysis of gene expression studies in human populations. Nat. Genet. 39, 807–808 (2007).

  26. Innocenti, F. et al. Identification, replication, and functional fine-mapping of expression quantitative trait loci in primary human liver tissue. PLoS Genet. 7, e1002078 (2011).

  27. Fairfax, B.P. et al. Genetics of gene expression in primary immune cells identifies cell type–specific master regulators and roles of HLA alleles. Nat. Genet. 44, 502–510 (2012).

  28. Flutre, T., Wen, X., Pritchard, J. & Stephens, M. A statistical framework for joint eQTL analysis in multiple tissues. PLoS Genet. 9, e1003486 (2013).

  29. Westra, H.J. et al. Systematic identification of trans eQTLs as putative drivers of known disease associations. Nat. Genet. 45, 1238–1243 (2013).

  30. Powell, J.E. et al. Genetic control of gene expression in whole blood and lymphoblastoid cell lines is largely independent. Genome Res. 22, 456–466 (2012).

  31. Choy, E. et al. Genetic analysis of human traits in vitro: drug response and gene expression in lymphoblastoid cell lines. PLoS Genet. 4, e1000287 (2008).

  32. van Dongen, J., Slagboom, P.E., Draisma, H.H., Martin, N.G. & Boomsma, D.I. The continuing value of twin studies in the omics era. Nat. Rev. Genet. 13, 640–653 (2012).

  33. Flicek, P. et al. Ensembl 2013. Nucleic Acids Res. 41, D48–D55 (2013).

  34. Rossin, E.J. et al. Proteins encoded in genomic regions associated with immune-mediated disease physically interact and suggest underlying biology. PLoS Genet. 7, e1001273 (2011).

  35. Huang, W. et al. DAVID Bioinformatics Resources: expanded annotation database and novel algorithms to better extract biology from large gene lists. Nucleic Acids Res. 35, W169–W175 (2007).

  36. Grossman, S.R. et al. A composite of multiple signals distinguishes causal variants in regions of positive selection. Science 327, 883–886 (2010).

  37. Nickel, G.C., Tefft, D. & Adams, M.D. Human PAML browser: a database of positive selection on human genes using phylogenetic methods. Nucleic Acids Res. 36, D800–D808 (2008).

  38. Nielsen, R. et al. A scan for positively selected genes in the genomes of humans and chimpanzees. PLoS Biol. 3, e170 (2005).

  39. Voight, B.F., Kudaravalli, S., Wen, X. & Pritchard, J.K. A map of recent positive selection in the human genome. PLoS Biol. 4, e72 (2006).

  40. Andrés, A.M. et al. Targets of balancing selection in the human genome. Mol. Biol. Evol. 26, 2755–2764 (2009).

  41. Grossman, S.R. et al. Identifying recent adaptations in large-scale genomic data. Cell 152, 703–713 (2013).

  42. Lindblad-Toh, K. et al. A high-resolution map of human evolutionary constraint using 29 mammals. Nature 478, 476–482 (2011).

  43. Sivakumaran, S. et al. Abundant pleiotropy in human complex diseases and traits. Am. J. Hum. Genet. 89, 607–618 (2011).

  44. McKusick, V.A. Mendelian Inheritance in Man and its online version, OMIM. Am. J. Hum. Genet. 80, 588–604 (2007).

  45. Visscher, P.M., Hill, W.G. & Wray, N.R. Heritability in the genomics era—concepts and misconceptions. Nat. Rev. Genet. 9, 255–266 (2008).

  46. Yang, J., Lee, S.H., Goddard, M.E. & Visscher, P.M. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011).

  47. Powell, J.E. et al. Congruence of additive and non-additive effects on gene expression estimated from pedigree and SNP data. PLoS Genet. 9, e1003502 (2013).

  48. Stranger, B.E. et al. Relative impact of nucleotide and copy number variation on gene expression phenotypes. Science 315, 848–853 (2007).

  49. Montgomery, S.B. et al. Transcriptome genetics using second generation sequencing in a Caucasian population. Nature 464, 773–777 (2010).

  50. Pickrell, J.K. et al. Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature 464, 768–772 (2010).

  51. Price, A.L. et al. Effects of cis and trans genetic ancestry on gene expression in African Americans. PLoS Genet. 4, e1000294 (2008).

  52. Spielman, R.S. et al. Common genetic variants account for differences in gene expression among ethnic groups. Nat. Genet. 39, 226–231 (2007).

  53. Gatti, D.M., Barry, W.T., Nobel, A.B., Rusyn, I. & Wright, F.A. Heading down the wrong pathway: on the influence of correlation within gene sets. BMC Genomics 11, 574 (2010).

  54. McLaren, W. et al. Deriving the consequences of genomic variants with the Ensembl API and SNP Effect Predictor. Bioinformatics 26, 2069–2070 (2010).

  55. Sun, W., Ibrahim, J.G. & Zou, F. Genomewide multiple-loci mapping in experimental crosses by iterative adaptive penalized regression. Genetics 185, 349–359 (2010).

  56. Marfil, V. et al. Interaction between Hhex and SOX13 modulates Wnt/TCF activity. J. Biol. Chem. 285, 5726–5737 (2010).

  57. Betancur, C. Etiological heterogeneity in autism spectrum disorders: more than 100 genetic and genomic disorders and still counting. Brain Res. 1380, 42–77 (2011).

  58. Chiurazzi, P., Schwartz, C.E., Gecz, J. & Neri, G. XLMR genes: update 2007. Eur. J. Hum. Genet. 16, 422–434 (2008).

  59. Inlow, J.K. & Restifo, L.L. Molecular and comparative genetics of mental retardation. Genetics 166, 835–881 (2004).

  60. Cooper, G.M. et al. A copy number variation morbidity map of developmental delay. Nat. Genet. 43, 838–846 (2011).

  61. Sullivan, P.F., Daly, M.J. & O'Donovan, M. Genetic architectures of psychiatric disorders: the emerging picture and its implications. Nat. Rev. Genet. 13, 537–551 (2012).

  62. Wise, A.L., Gyi, L. & Manolio, T.A. eXclusion: toward integrating the X chromosome in genome-wide association analyses. Am. J. Hum. Genet. 92, 643–647 (2013).

  63. Xavier, R.J. & Rioux, J.D. Genome-wide association studies: a new window into immune-mediated diseases. Nat. Rev. Immunol. 8, 631–643 (2008).

  64. Hurst, L.D., Pal, C. & Lercher, M.J. The evolutionary dynamics of eukaryotic gene order. Nat. Rev. Genet. 5, 299–310 (2004).

  65. Osborne, C.S. et al. Active genes dynamically colocalize to shared sites of ongoing transcription. Nat. Genet. 36, 1065–1071 (2004).

  66. Sproul, D., Gilbert, N. & Bickmore, W.A. The role of chromatin structure in regulating the expression of clustered genes. Nat. Rev. Genet. 6, 775–781 (2005).

  67. Hentges, K.E., Pollock, D.D., Liu, B. & Justice, M.J. Regional variation in the density of essential genes in mice. PLoS Genet. 3, e72 (2007).

  68. Cai, J.J., Macpherson, J.M., Sella, G. & Petrov, D.A. Pervasive hitchhiking at coding and regulatory sites in humans. PLoS Genet. 5, e1000336 (2009).

  69. Davidson, S., Starkey, A. & MacKenzie, A. Evidence of uneven selective pressure on different subsets of the conserved human genome; implications for the significance of intronic and intergenic DNA. BMC Genomics 10, 614 (2009).

  70. Willemsen, G. et al. The Netherlands Twin Register biobank: a resource for genetic epidemiological studies. Twin Res. Hum. Genet. 13, 231–245 (2010).

  71. Penninx, B.W. et al. The Netherlands Study of Depression and Anxiety (NESDA): rationales, objectives and methods. Int. J. Methods Psychiatr. Res. 17, 121–140 (2008).

  72. Boomsma, D.I. et al. Netherlands Twin Register: from twins to twin families. Twin Res. Hum. Genet. 9, 849–857 (2006).

  73. Boomsma, D.I. et al. Genome-wide association of major depression: description of samples for the GAIN major depressive disorder study: NTR and NESDA Biobank Projects. Eur. J. Hum. Genet. 16, 335–342 (2008).

  74. Sullivan, P.F. et al. Genomewide association for major depressive disorder: a possible role for the presynaptic protein piccolo. Mol. Psychiatry 14, 359–375 (2009).

  75. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S.L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).

  76. Altshuler, D.M. et al. Integrating common and rare genetic variation in diverse human populations. Nature 467, 52–58 (2010).

  77. 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073 (2010).

  78. Schadt, E.E., Woo, S. & Hao, K. Bayesian method to predict individual SNP genotypes from gene expression data. Nat. Genet. 44, 603–608 (2012).

  79. Leek, J.T. & Storey, J.D. Capturing heterogeneity in gene expression studies by surrogate variable analysis. PLoS Genet. 3, 1724–1735 (2007).

  80. Falconer, D.S. & Mackay, T.F.C. Introduction to Quantitative Genetics (Longman Group, Ltd., London, 1996).

  81. Neale, M.C. & Cardon, L.R. Methodology for the Study of Twins and Families (Kluwer Academic Publisher Group, Dordrecht, The Netherlands, 1992).

  82. Wang, X., Guo, X., He, M. & Zhang, H. Statistical inference in mixed models and analysis of twin and family data. Biometrics 67, 987–995 (2011).

  83. Visscher, P.M. et al. Assumption-free estimation of heritability from genome-wide identity-by-descent sharing between full siblings. PLoS Genet. 2, e41 (2006).

  84. Wright, F.A. The phenotypic difference discards sib-pair QTL linkage information. Am. J. Hum. Genet. 60, 740–742 (1997).

  85. Abecasis, G.R., Cherny, S.S., Cookson, W.O. & Cardon, L.R. Merlin—rapid analysis of dense genetic maps using sparse gene flow trees. Nat. Genet. 30, 97–101 (2002).

  86. Barry, W.T., Nobel, A.B. & Wright, F.A. A statistical framework for testing functional categories in microarray data. Ann. Appl. Stat. 2, 286–315 (2008).

  87. Shabalin, A.A. Matrix eQTL: ultra fast eQTL analysis via large matrix operations. Bioinformatics 28, 1353–1358 (2012).

Acknowledgements

The work described in this paper was funded by the US National Institute of Mental Health (RC2 MH089951, principal investigator P.F.S.) as part of the American Recovery and Reinvestment Act of 2009. Transport, extraction and preparation of the NTR samples were carried out under a supplement to the NIMH Center for Collaborative Genomics Research on Mental Disorders (U24 MH068457, principal investigator J.T.). We thank T. Lehner (National Institute of Mental Health) for his support. Additional analytic support was provided by grants R01 MH090936, R01 GM074175 and P42 ES005948 and by a Gillings Innovations Award. The Netherlands Study of Depression and Anxiety (NESDA) and the Netherlands Twin Register (NTR) were funded by the Netherlands Organization for Scientific Research (MagW/ZonMW; grants 904-61-090, 985-10-002, 904-61-193, 480-04-004, 400-05-717 and 912-100-20; Spinozapremie 56-464-14192; and Geestkracht program grant 10-000-1002), the Center for Medical Systems Biology (CMSB2; NWO Genomics), Biobanking and Biomolecular Resources Research Infrastructure (BBMRI-NL), the VU University EMGO+ Institute for Health and Care Research and the Neuroscience Campus Amsterdam, NBIC/BioAssist/RK (2008.024), the European Science Foundation (EU/QLRT-2001-01254), the European Community's Seventh Framework Programme (FP7/2007-2013), ENGAGE (HEALTH-F4-2007-201413) and the European Research Council (ERC; 230374).

Author information

Contributions

Study design and writing: F.A.W., P.F.S., A.I.B., F.Z., W.S., B.W.J.H.P. and D.I.B. Analysis: F.A.W., P.F.S., F.Z., W.S., K.X., V.M., R.J., W.C., Y.-H.Z., A.A., G.C., T.-H.C., P.G., M.J.H., J.J.H., S.H., M.K., J.K., C.M.M., A.Q., A.S., J.-Y.T., Q.W., Wei Wang, Weibo Wang, G.W., J.H.S., E.J.d.G. and Z.Y. Genomic assays: A.I.B., D.D., J.T., A.Q. and Q.W. Phenotype collection: G.v.G. and J.M.V. Project management: L.T. Database design and management: S.B. and C.B.

Corresponding authors

Correspondence to Fred A Wright or Patrick F Sullivan.

Ethics declarations

Competing interests

P.F.S. was on the scientific advisory board of Expression Analysis. The other authors report no conflicts of interest.

Integrated supplementary information

Supplementary Figure 1 Effects of covariates and mean expression on heritability and shared environmental effects (n = 2,752).

(a) Adjusted R2 for all covariates (not including hybridization plate) for predicting expression levels for each of 43,628 transcripts. (b) Heritability estimates (h2) from the ACE model applied to all transcripts, before and after covariate correction. Negative values are implausible, reflecting sampling variation, but the entire range is shown for illustration, and the estimates are unbiased. The correction generally strengthens the evidence for the most highly significant transcripts. (c) Shared environment (twinship) effect estimates c2, before and after covariate correction. (d) Covariate-corrected observed versus expected right-tailed P values for h2 show a large number of significant transcripts. (e) The same plot for positive c2 shows that none reaches transcriptome-wide significance, nor did any transcript show significant negative c2 evidence (data not shown).
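
The h2 and c2 quantities in this caption can be illustrated with a minimal sketch. It uses Falconer's moment-based formulas on twin-pair correlations, which is a simpler stand-in and not the ACE model fit used in the study. The function names `pearson` and `falconer_estimates` and the pair-list input format are assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def falconer_estimates(mz_pairs, dz_pairs):
    """Falconer-style estimates for one transcript (hypothetical helper).

    mz_pairs, dz_pairs: lists of (twin1_value, twin2_value) tuples.
    Returns (h2, c2, e2); the three values always sum to 1.
    """
    r_mz = pearson([p[0] for p in mz_pairs], [p[1] for p in mz_pairs])
    r_dz = pearson([p[0] for p in dz_pairs], [p[1] for p in dz_pairs])
    h2 = 2.0 * (r_mz - r_dz)   # additive genetic share
    c2 = 2.0 * r_dz - r_mz     # shared-environment share
    e2 = 1.0 - r_mz            # unique environment plus noise
    return h2, c2, e2
```

Because nothing constrains these estimates to lie between 0 and 1, a transcript with a chance excess of DZ over MZ correlation yields a negative h2, which is the sampling-variation behavior noted in panel (b).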

Supplementary Figure 2 Mean heritability estimates.

(a) Mean heritability as a function of gene proximity to 3,931 NHGRI GWAS catalog SNPs with GWAS P < 5.0 × 10^−8. For each of the NHGRI catalog SNPs, the closest gene was recorded, then the second closest, and so on, and each gene was designated according to whether it was the kth-closest gene to at least one SNP in the catalog. The mean heritability as a function of these ranks shows that genes with higher proximity rank tend to have higher h2. A simple rank correlation of the two axis values gives P = 0.017. For each k, the number of genes included is 1,600. (b) Mean heritability as a function of the number of SNPs that are closest to the gene. For each gene, the number of significant NHGRI catalog SNPs for which the gene is closest was recorded, and the mean heritability was displayed for that group. Using all genes, the rank correlation of h2 versus the number of closest GWAS SNPs gives P = 8.2 × 10^−16. For each point, the number of genes included in the calculation is shown.
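
The kth-closest designation in panel (a) can be sketched as follows. This is an illustrative simplification and not the authors' code: it assumes a single chromosome, represents each gene by one coordinate, and breaks distance ties by gene name. The function name `kth_closest_genes` and its inputs are hypothetical.

```python
def kth_closest_genes(snp_positions, gene_positions, max_rank):
    """Map each rank k to the genes that are k-th closest to at least one SNP.

    snp_positions: list of SNP coordinates on one chromosome (assumption).
    gene_positions: dict of gene name -> single coordinate (assumption).
    max_rank: largest proximity rank to record.
    """
    ranks = {k: set() for k in range(1, max_rank + 1)}
    for snp in snp_positions:
        # Order genes by distance to this SNP; ties are broken by name.
        ordered = sorted(
            gene_positions,
            key=lambda gene: (abs(gene_positions[gene] - snp), gene),
        )
        for k, gene in enumerate(ordered[:max_rank], start=1):
            ranks[k].add(gene)
    return ranks
```

The mean h2 of the genes in each `ranks[k]` set would then be plotted against k. A gene can appear under several ranks, since it may be closest to one SNP and third closest to another.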

Supplementary Figure 3 Contribution of components to expression variation for 9,060 genes with h2 > 0.1 (whether or not declared expressed).

(a) Ratio of r2 (variation explained) by best local SNP to overall (twin-based) h2, with medians and means of ratios, as well as the proportion of h2 explained by the best local SNP. (b–d) Analogous plots and values for the best distant SNP (b), local GCTA estimation (c) and local IBD estimation (d) using DZ twins. Analyses using only expressed genes are similar throughout, due to the restriction h2 > 0.1.

Supplementary Figure 4 Effects of sample size on the reliability of h2 estimates in twin-based designs and comparison with additive-only variance components, applied to 18,392 genes.

(a) From the shrunken 'true' estimate of the h2 distribution in NTR, the distribution of estimated h2 was determined, using the twin proportions of the Brisbane Systems Genetics Study (BSGS) and assuming no family effects. These hypothetical h2 estimates from NTR (using the shrunken h2 distribution and computed error variation applicable to the BSGS sample size and analysis approach) very closely match the published BSGS report. BSGS values were obtained by digitizing the plot from the authors' report using WebPlotDigitizer (http://arohatgi.info/WebPlotDigitizer/app/). (b) Using the shrunken 'true' estimate of the h2 distribution from NTR and the standard error of a twin-based design and ACE model (with MZ versus DZ proportions the same as in NTR), the rank correlation of estimated h2 versus true h2 for the transcriptome is shown as a function of total sample size.

Supplementary Figure 5 Comparison of twin-based heritability to alternate sources of heritability information, using the 18,392 best h2 set of unique genes.

(a) Best local SNP r2 versus twin-based h2. The proportion of variance in twin-based h2 explained by the best local SNP in a linear regression model is listed. Each gene was classified as 'local' or 'distant' according to the smaller of the eQTL P values (regardless of genome-wide significance). (b) Local GCTA r2 versus twin-based h2. (c) DZ local IBD analysis versus twin-based h2. The local IBD analysis is less powerful, as it is based on roughly half of the data. (d) A model using the three predictors in a–c produces only a slight improvement in prediction.

Supplementary Figure 6 Reproducibility of GODOT eQTLs.

(a) Local eQTL –log10 (q) values for NESDA versus NTR. (b) Inset to highlight less significant local eQTLs. (c) Distant eQTL –log10 (q) values for NESDA versus NTR. The values shown are interchromosomal eQTLs, i.e., the SNP and target gene are on different chromosomes, ensuring that they are truly distant. (d) Inset to highlight less significant distant eQTLs.

Supplementary Figure 7 Characteristics of local eQTLs.

The 6,864 SNPs involved in the 6,941 local eQTLs were annotated using Variant Effect Predictor (version 2.8) of Ensembl (see the Supplementary Note). (a) The proportion of local eQTLs located in regulatory regions. The number on the top of each bar is the P value indicating over-representation (red) and under-representation (blue) of the eQTLs located in regulatory regions. (b) The proportion of replicating local eQTLs.

Supplementary Figure 8 P values of SNPs in NTR and NESDA for genes declared significant in Westra et al.

(a) Local eQTL replication P values in NTR. (b) Local eQTL replication P values in NESDA. (c) Distant eQTL replication P values in NTR. (d) Distant eQTL replication P values in NESDA.

Supplementary Figure 9 Properties of replicating distant eQTLs.

The 304 SNPs involved in the 348 distant eQTLs were annotated using Variant Effect Predictor (version 2.8) of Ensembl. Fourteen (26%) of 53 SNPs annotated as intergenic variants were replicated in NESDA, which is significantly lower than the overall replication rate of 47%. There was no significant enrichment or deficiency of replication in other categories.

Supplementary Figure 10 Distant eSNPs are more likely to be local eQTLs.

For each of the 304 SNPs of distant eQTLs, we assessed its association with local genes (< 1 Mb away), and we then grouped these 304 SNPs by their minimum local eQTL P values. The value above each black bar shows the number of SNPs belonging to each significance grouping. For comparison, we randomly selected 10,000 SNPs from the remaining SNPs, matched to the 304 SNPs by minor allele frequency and imputation quality R2, and then grouped them analogously by their minimum local eQTL P values.

Supplementary Figure 11 The distribution of 348 distant eQTLs.

A total of 304 SNPs were involved in the 348 distant eQTLs. The SNPs were clustered by genomic position so that a SNP was assigned to a cluster if its distance to any SNP already in the cluster was smaller than 1 Mb. The SNPs grouped into 203 clusters, 160 of which contained only a single SNP. The 43 clusters with more than one SNP spanned 2 kb to 2 Mb, with a median size of 89 kb. The plot shows the number of eQTLs belonging to each cluster. Eleven clusters with more than five eQTLs are highlighted in the inset table. The number of NESDA eQTLs (q value > 0.01) associated with each cluster is also shown.
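
The clustering rule in this caption amounts to single-linkage grouping along each chromosome. A minimal sketch follows, assuming clusters never span chromosomes and that SNPs are given as (chromosome, position) tuples. The function name `cluster_snps` and the `max_gap` parameter are hypothetical and not taken from the study.

```python
from collections import defaultdict


def cluster_snps(snps, max_gap=1_000_000):
    """Group SNPs so that each one lies within max_gap of a cluster member.

    snps: iterable of (chromosome, position) tuples.
    Returns a list of clusters, each a position-sorted list of tuples.
    """
    by_chrom = defaultdict(list)
    for chrom, pos in snps:
        by_chrom[chrom].append(pos)

    clusters = []
    for chrom in sorted(by_chrom):
        positions = sorted(by_chrom[chrom])
        current = [positions[0]]
        for pos in positions[1:]:
            # With sorted positions, the nearest cluster member is the last
            # one added, so checking that single gap is sufficient.
            if pos - current[-1] < max_gap:
                current.append(pos)
            else:
                clusters.append([(chrom, p) for p in current])
                current = [pos]
        clusters.append([(chrom, p) for p in current])
    return clusters
```

Because membership is chained through neighboring SNPs, a cluster can span more than 1 Mb end to end, which is consistent with the reported maximum span of 2 Mb.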

Supplementary Figure 12 Distant eQTLs and their associated genes on eight chromosomes.

(a) A distant eQTL on chromosome 19 is associated with the expression of 12 distant genes and a local gene, MYO1F. The network plot shows the partial correlation graph of these 13 genes, where an edge indicates a nonzero partial correlation. The partial correlations were estimated by the penalized estimation method cited in the Supplementary Note. (b) A distant eQTL on chromosome 20 is associated with the expression of six distant genes and a local gene, SMOX. Using the likelihood ratio test approach described in the Supplementary Note and independent genotype and gene expression data from NESDA (to avoid winner's curse), we found that causal relations of the form eSNP → SMOX → distant eQTL gene are significantly more likely than other possible relations for these six distant eQTL genes. (c) Distant eQTLs on six other chromosomes.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–12, Supplementary Tables 1–7 and Supplementary Note (PDF 5924 kb)

Cite this article

Wright, F., Sullivan, P., Brooks, A. et al. Heritability and genomics of gene expression in peripheral blood. Nat Genet 46, 430–437 (2014). https://doi.org/10.1038/ng.2951
