Gene-gene and gene-environment interactions detected by transcriptome sequence analysis in twins

Abstract

Understanding the genetic architecture of gene expression is an intermediate step in understanding the genetic architecture of complex diseases. RNA sequencing technologies have improved the quantification of gene expression and allow measurement of allele-specific expression (ASE). ASE is hypothesized to result from the direct effect of cis regulatory variants, but a proper estimation of the causes of ASE has not been performed thus far. In this study, we take advantage of a sample of twins to measure the relative contributions of genetic and environmental effects to ASE, and we find substantial effects from gene × gene (G×G) and gene × environment (G×E) interactions. We propose a model where ASE requires genetic variability in cis, a difference in the sequence of both alleles, but where the magnitude of the ASE effect depends on trans genetic and environmental factors that interact with the cis genetic variants.
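
As background for the variance decomposition described above, the textbook twin-design arithmetic can be sketched as follows. This is the classical Falconer approximation and is not the model fitted in this study; it assumes purely additive effects with no G×G or G×E terms, which is the assumption the abstract argues against. The function name and the input correlations are hypothetical and chosen only for illustration.

```python
def falconer_components(r_mz: float, r_dz: float) -> dict:
    """Classical twin-design estimates from MZ and DZ trait correlations.

    Assumes additive genetic effects only (no interaction terms).
    """
    h2 = 2.0 * (r_mz - r_dz)  # additive genetic share of variance
    c2 = 2.0 * r_dz - r_mz    # shared-environment share
    e2 = 1.0 - r_mz           # unique environment plus measurement error
    return {"h2": h2, "c2": c2, "e2": e2}


# Hypothetical correlations, not values from the paper.
components = falconer_components(r_mz=0.6, r_dz=0.4)
print(components)
```

When interactions are present, these three shares no longer describe the variance fully, which is why a model with explicit interaction components is needed.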


Figure 1: ASE correlation among twin pairs for different categories of genetic similarity.
Figure 2: Variance components for ASE for the model with cis × trans interaction.
Figure 3: G×E examples discovered using analysis of discordant monozygotic twins.


Acknowledgements

We thank the twins for their voluntary contribution to this project. This work has been funded by European Union Framework Programme 7 grant EuroBATS (259749), which also supports A.A.B., A.B., M.N.D., D.G., A.V. and T.D.S. A.A.B. is also supported by a grant from the South-Eastern Norway Health Authority (2011060). R.D. is supported by the Wellcome Trust (098051). The Louis-Jeantet Foundation, the Swiss National Science Foundation, the European Research Council (ERC) and the US National Institutes of Health/National Institute of Mental Health GTEx grant support E.T.D. T.D.S. is a National Institute of Health Research (NIHR) senior investigator and the holder of an ERC Advanced Principal Investigator award. J.B.R. and H.F.Z. are supported by the Canadian Institutes of Health Research, Fonds de Recherche Santé du Québec and the Quebec Consortium for Drug Discovery. Most computations were performed at the Vital-IT center for high-performance computing of the Swiss Institute of Bioinformatics (SIB; http://www.vital-it.ch/). The TwinsUK study was funded by the Wellcome Trust, European Community Framework Programme 7 (2007–2013), and the NIHR Clinical Research Facility at Guy's and St Thomas' National Health Service (NHS) Foundation Trust and the NIHR Biomedical Research Centre based at Guy's and St Thomas' NHS Foundation Trust and King's College London. SNP genotyping was performed by the Wellcome Trust Sanger Institute and National Eye Institute via US National Institutes of Health/Center for Inherited Disease Research (CIDR) funding.

Author information

Contributions

A.B., R.D., T.D.S. and E.T.D. conceived the study. A.B., A.A.B., A.V. and M.N.D. analyzed the data. T.L. and K.S.S. contributed experimental and technical support as well as discussion. D.G. contributed to sample collection. H.F.Z. and J.B.R. contributed technical support and analyzed data. A.B. prepared the manuscript, with contributions from A.A.B. and E.T.D. All authors read and approved the manuscript.

Corresponding authors

Correspondence to Alfonso Buil or Emmanouil T Dermitzakis.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 ASE models.

(a) Basic model of ASE: genetic variants in cis or epigenetic modifications cause a difference in the expression of the haplotypes. (b) G×G effects on ASE: the effect on ASE of the genetic variants in cis depends on the genotype of a locus in trans. (c) G×E effects on ASE: the magnitude of the ASE caused by genetic variants in cis depends on the effect of an environmental factor.

Supplementary Figure 2 One site versus multiple sites per gene.

Comparison of the variance components estimates using all the ASE sites or only one ASE site per gene.

Supplementary Figure 3 Coverage effect on the variance components estimates.

Comparison of the variance components estimates depending on the read coverage threshold used to filter the ASE sites to be analyzed.

Supplementary Figure 4 G×E examples discovered using discordant MZ twins analysis.

MZ twin pairs show a different ASE effect on some genes depending on the genotype of specific SNPs. Every MZ pair is represented by a vertical line whose extremes represent the ASE ratio (reference allele counts/total counts) of each sister. MZ pairs that are heterozygous at the associated SNP show a much greater difference in ASE. Since MZ twins are genetically identical, this association between ASE difference and a SNP reflects the interaction of the SNP with an unknown environment. (a) ASE in the ACSL1 gene shows G×E interaction with SNP r334710824 in fat. (b) ASE in the ADIPOQ gene shows G×E interaction with SNP rs4686817 in fat. (c) ASE in the EBI3 gene shows G×E interaction with SNP rs67782188 in LCLs.
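
The comparison in this caption can be sketched in a few lines: compute the ASE ratio (reference allele counts/total counts) for each twin, take the absolute within-pair difference, and average it separately for pairs that are heterozygous or homozygous at the candidate SNP. This is a minimal illustration, not the authors' pipeline; the function names and the read counts are hypothetical, and no statistical test is included.

```python
from statistics import mean


def ase_ratio(ref_count: int, alt_count: int) -> float:
    """ASE ratio as defined in the caption: reference counts / total counts."""
    total = ref_count + alt_count
    if total == 0:
        raise ValueError("no reads cover this site")
    return ref_count / total


def mean_discordance_by_genotype(pairs):
    """Mean absolute within-pair ASE difference, split by SNP genotype.

    pairs: iterable of (is_heterozygous, (ref1, alt1), (ref2, alt2)),
    one entry per MZ pair.
    """
    diffs = {True: [], False: []}
    for is_het, twin1, twin2 in pairs:
        diffs[is_het].append(abs(ase_ratio(*twin1) - ase_ratio(*twin2)))
    return {("het" if k else "hom"): mean(v) for k, v in diffs.items() if v}


# Hypothetical read counts for four MZ pairs (not data from the study).
toy_pairs = [
    (True, (80, 20), (50, 50)),
    (True, (30, 70), (60, 40)),
    (False, (52, 48), (50, 50)),
    (False, (45, 55), (47, 53)),
]
discordance = mean_discordance_by_genotype(toy_pairs)
print(discordance)
```

Because the two twins in each pair carry the same genotype, a larger mean difference in the heterozygous group points to a non-genetic factor whose effect depends on that genotype.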

Supplementary Figure 5 High replication of ASE among tissues.

Histogram of the P values of the significant associations in one tissue analyzed in the other tissues.

Supplementary Figure 6 Relationship of ASE ratio variability and count depth.

Variability in the ASE ratio (counts in the reference allele/total counts) among technical replicates depending on coverage. We used five samples that were sequenced between two and seven times in different labs. The red line is the mean ratio difference.

Supplementary Figure 7 Replication rate depending on count depth.

We called significant ASE sites in one experiment and calculated the pi1 of these sites in the other experiments.

Supplementary Figure 8 Comparison of ASE ratios in two LCL samples measured in two different labs using two different technologies: RNA-seq and mmPCR.

Scatter plots show the correlation between ASE ratios for sites that are statistically significant using RNA-seq (19 sites for FDR = 0.05 and 28 sites for FDR = 0.1). Histograms show the P values of these sites when analyzed using mmPCR. From these P values, we calculated pi1, a measure of the proportion of significant tests that, in this case, can be interpreted as the replication rate.
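
A pi1 value of this kind is commonly obtained as 1 − pi0, where pi0 is the estimated proportion of true null tests. The sketch below uses Storey's single-threshold estimator, pi0 = #{p > λ} / (m(1 − λ)); whether the authors used this exact estimator or this value of λ is not stated here, so both are assumptions. The function name and the P values are hypothetical.

```python
def estimate_pi1(p_values, lam=0.5):
    """Estimate pi1 = 1 - pi0 with Storey's single-lambda estimator.

    P values above `lam` are assumed to come mostly from true nulls,
    which are uniform on [0, 1].
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    m = len(p_values)
    if m == 0:
        raise ValueError("at least one P value is required")
    n_above = sum(1 for p in p_values if p > lam)
    pi0 = min(1.0, n_above / (m * (1.0 - lam)))  # cap at 1 for small samples
    return 1.0 - pi0


# Hypothetical replication P values (not data from the study).
replication_p = [0.001, 0.004, 0.01, 0.02, 0.03, 0.2, 0.6, 0.9]
pi1 = estimate_pi1(replication_p)
print(pi1)
```

With only a few dozen sites, as in this figure, such an estimate is noisy and is best read as an approximate replication rate.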

Supplementary Figure 9 Relationship between the mean and variance in the ASE ratio for each site.

For each site, we calculated the mean and variance in the ASE ratio and plotted these two quantities. The correlation is negligible (r = 0.02).

Supplementary Figure 10 Correlation between coverage (total read counts) and our ASE measure for each site in fat.

Correlation is around zero, and it is not biased for positive or negative values. Other measures of ASE such as the difference between the two alleles show a clear positive correlation between coverage and ASE and can cause the false inference of interactions.

Supplementary Figure 11 G×E results separating the twin pairs into those that do not show ASE and those in which at least one twin shows ASE.

There are cases of homozygous pairs with ASE and heterozygous pairs without ASE; in all cases, the difference in ASE is larger for the heterozygous pairs.

Supplementary Figure 12 Examples of ASE sites with significant ASE not showing G×E interaction.

Examples of ASE sites with significant ASE not showing G×E interaction in the ACSL1 and ADIPOQ genes in fat and in the EBI3 gene in LCLs.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–12 and Supplementary Tables 1–5. (PDF 2833 kb)

Supplementary Data Set

Data for the comparison between the RNA-seq and mmPCR techniques. (CSV 5 kb)

Cite this article

Buil, A., Brown, A., Lappalainen, T. et al. Gene-gene and gene-environment interactions detected by transcriptome sequence analysis in twins. Nat Genet 47, 88–91 (2015). https://doi.org/10.1038/ng.3162
