Understanding the genetic architecture of gene expression is an intermediate step in understanding the genetic architecture of complex diseases. RNA sequencing technologies have improved the quantification of gene expression and allow measurement of allele-specific expression (ASE). ASE is hypothesized to result from the direct effect of cis regulatory variants, but a proper estimation of the causes of ASE has not been performed thus far. In this study, we take advantage of a sample of twins to measure the relative contributions of genetic and environmental effects to ASE, and we find substantial effects from gene × gene (G×G) and gene × environment (G×E) interactions. We propose a model where ASE requires genetic variability in cis, that is, a difference in sequence between the two alleles, but where the magnitude of the ASE effect depends on trans genetic and environmental factors that interact with the cis genetic variants.
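The logic of using twins to separate genetic from environmental contributions can be illustrated with Falconer's classical approximation, which compares trait correlations within monozygotic (MZ) and dizygotic (DZ) pairs. This is only a minimal sketch of the general twin design, not the variance components model fitted in the study; the function name and the correlation values are hypothetical.

```python
def falconer_components(r_mz: float, r_dz: float) -> dict:
    """Falconer's approximation from twin correlations (illustrative only).

    r_mz, r_dz: correlation of the trait (here, an ASE ratio) within
    MZ and DZ twin pairs. Assumes purely additive genetic effects and
    equal shared environments for both kinds of pairs.
    """
    h2 = 2.0 * (r_mz - r_dz)  # additive genetic share
    c2 = 2.0 * r_dz - r_mz    # shared-environment share
    e2 = 1.0 - r_mz           # unique environment plus measurement error
    return {"h2": h2, "c2": c2, "e2": e2}


# Hypothetical correlations, chosen only to show the arithmetic.
components = falconer_components(r_mz=0.5, r_dz=0.3)
for name, value in components.items():
    print(f"{name} = {value:.2f}")
```

Because the approximation assumes additivity, an MZ correlation well above twice the DZ correlation is one classical hint of non-additive genetic effects, which is the kind of signal an interaction model would try to account for.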
We thank the twins for their voluntary contribution to this project. This work has been funded by European Union Framework Programme 7 grant EuroBATS (259749), which also supports A.A.B., A.B., M.N.D., D.G., A.V. and T.D.S. A.A.B. is also supported by a grant from the South-Eastern Norway Health Authority (2011060). R.D. is supported by the Wellcome Trust (098051). The Louis-Jeantet Foundation, the Swiss National Science Foundation, the European Research Council (ERC) and the US National Institutes of Health/National Institute of Mental Health GTEx grant support E.T.D. T.D.S. is a National Institute of Health Research (NIHR) senior investigator and the holder of an ERC Advanced Principal Investigator award. J.B.R. and H.F.Z. are supported by the Canadian Institutes of Health Research, Fonds de Recherche Santé du Québec and the Quebec Consortium for Drug Discovery. Most computations were performed at the Vital-IT center for high-performance computing of the Swiss Institute of Bioinformatics (SIB; http://www.vital-it.ch/). The TwinsUK study was funded by the Wellcome Trust, European Community Framework Programme 7 (2007–2013), and the NIHR Clinical Research Facility at Guy's and St Thomas' National Health Service (NHS) Foundation Trust and the NIHR Biomedical Research Centre based at Guy's and St Thomas' NHS Foundation Trust and King's College London. SNP genotyping was performed by the Wellcome Trust Sanger Institute and National Eye Institute via US National Institutes of Health/Center for Inherited Disease Research (CIDR) funding.
The authors declare no competing financial interests.
Integrated supplementary information
(a) Basic model of ASE: genetic variants in cis or epigenetic modifications cause a difference in the expression of the haplotypes. (b) G×G effects on ASE: the effect on ASE of the genetic variants in cis depends on the genotype of a locus in trans. (c) G×E effects on ASE: the magnitude of the ASE caused by genetic variants in cis depends on the effect of an environmental factor.
Comparison of the variance components estimates using all the ASE sites or only one ASE site per gene.
Comparison of the variance components estimates depending on the read coverage threshold used to filter the ASE sites to be analyzed.
MZ twin pairs show a different ASE effect on some genes depending on the genotype of specific SNPs. Every MZ pair is represented by a vertical line whose extremes represent the ASE ratio (reference allele counts/total counts) of each sister. MZ pairs that are heterozygous at the associated SNP show a much greater difference in ASE. Since MZ twins are genetically identical, this association between ASE difference and a SNP reflects the interaction of the SNP with an unknown environment. (a) ASE in the ACSL1 gene shows G×E interaction with SNP r334710824 in fat. (b) ASE in the ADIPOQ gene shows G×E interaction with SNP rs4686817 in fat. (c) ASE in the EBI3 gene shows G×E interaction with SNP rs67782188 in LCLs.
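The quantities in this caption can be sketched in a few lines: the ASE ratio of each twin (reference allele counts/total counts), the within-pair difference, and a comparison of that difference between pairs that are heterozygous and homozygous at the candidate SNP. The read counts, genotype labels and helper names below are hypothetical, and the sketch computes only a descriptive summary, not the association test used in the study.

```python
from statistics import mean


def ase_ratio(ref_count: int, alt_count: int) -> float:
    """ASE ratio as defined in the caption: reference counts / total counts."""
    total = ref_count + alt_count
    if total == 0:
        raise ValueError("no reads cover this site")
    return ref_count / total


def within_pair_difference(twin1, twin2) -> float:
    """Absolute difference in ASE ratio between the two twins of an MZ pair."""
    return abs(ase_ratio(*twin1) - ase_ratio(*twin2))


# Hypothetical MZ pairs: genotype class at the candidate SNP, then
# (ref, alt) read counts at the ASE site for twin 1 and twin 2.
pairs = [
    ("het", (30, 30), (48, 12)),
    ("het", (20, 60), (40, 40)),
    ("hom", (33, 27), (30, 30)),
    ("hom", (42, 18), (45, 15)),
]

diff_by_genotype = {"het": [], "hom": []}
for genotype, twin1, twin2 in pairs:
    diff_by_genotype[genotype].append(within_pair_difference(twin1, twin2))

# Mean within-pair ASE difference per genotype class.
mean_diff = {g: mean(d) for g, d in diff_by_genotype.items()}
print(mean_diff)
```

Since the two twins of an MZ pair carry the same genotype, a larger within-pair difference in one genotype class cannot be explained by genetic differences within the pair, which is why the caption reads it as an interaction with an unmeasured environment.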
Histogram of the P values of the significant associations in one tissue analyzed in the other tissues.
Variability in the ASE ratio (counts in the reference allele/total counts) among technical replicates depending on coverage. We used five samples that were sequenced between two and seven times in different labs. The red line is the mean ratio difference.
We called significant ASE sites in one experiment and calculated the pi1 of these sites in the other experiments.
Supplementary Figure 8 Comparison of ASE ratios in two LCL samples measured in two different labs using two different technologies: RNA-seq and mmPCR.
Scatter plots show the correlation between ASE ratios for sites that are statistically significant using RNA-seq (19 sites for FDR = 0.05 and 28 sites for FDR = 0.1). Histograms show the P values of these sites when analyzed using mmPCR. From these P values, we calculated pi1, an estimate of the proportion of tests that are truly non-null, which in this case can be interpreted as the replication rate.
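As an illustration of how a pi1 value can be obtained from a set of replication P values, the sketch below uses a Storey-type estimator with a fixed threshold: under the null, P values are uniform, so the density of P values above a threshold `lam` estimates the null proportion pi0, and pi1 = 1 − pi0. The caption does not say which estimator or tuning was used, so the fixed `lam = 0.5`, the function name and the P values are all assumptions.

```python
def pi1_estimate(p_values, lam=0.5):
    """Storey-type estimate with a fixed threshold (an assumed choice).

    pi0 = #{p > lam} / (m * (1 - lam)), capped at 1; pi1 = 1 - pi0.
    """
    if not p_values:
        raise ValueError("need at least one P value")
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must be in [0, 1)")
    m = len(p_values)
    n_above = sum(1 for p in p_values if p > lam)
    pi0 = min(1.0, n_above / (m * (1.0 - lam)))
    return 1.0 - pi0


# Hypothetical P values of discovery sites re-tested in a second experiment.
replication_p = [0.001, 0.004, 0.01, 0.02, 0.03, 0.2, 0.45, 0.6, 0.8, 0.95]
pi1 = pi1_estimate(replication_p)
print(f"pi1 = {pi1:.2f}")
```

With only a few dozen sites, as in this figure, such an estimate is noisy; implementations that smooth over a grid of thresholds behave differently from this fixed-threshold version.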
For each site, we calculated the mean and variance in the ASE ratio and plotted these two quantities. The correlation is negligible (r = 0.02).
Supplementary Figure 10 Correlation between coverage (total read counts) and our ASE measure for each site in fat.
Correlation is around zero, and it is not biased for positive or negative values. Other measures of ASE such as the difference between the two alleles show a clear positive correlation between coverage and ASE and can cause the false inference of interactions.
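The contrast drawn here can be made concrete with a small arithmetic example. Assuming a ratio-type measure like the ASE ratio used in the other captions (the exact measure is not defined on this page), sites with identical allelic imbalance give the same value at any coverage, whereas the raw difference in read counts between the two alleles grows with coverage. The counts below are hypothetical.

```python
# Hypothetical sites with the same allelic imbalance (60% reference reads)
# at increasing coverage: (ref_count, alt_count).
sites = [(12, 8), (60, 40), (300, 200)]

coverages = [ref + alt for ref, alt in sites]
ratios = [ref / (ref + alt) for ref, alt in sites]  # coverage-independent
count_diffs = [ref - alt for ref, alt in sites]     # scales with coverage

for cov, ratio, diff in zip(coverages, ratios, count_diffs):
    print(f"coverage={cov:4d}  ratio={ratio:.2f}  count difference={diff}")
```

A measure that scales with coverage would make highly expressed sites look more imbalanced, so any factor that changes expression level could masquerade as an interaction effect on ASE.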
Supplementary Figure 11 G×E results separating the twin pairs into those that do not show ASE and those in which at least one twin shows ASE.
There are cases of homozygous pairs with ASE and heterozygous pairs without ASE; in all cases, the difference in ASE is larger for the heterozygous pairs.
Examples of ASE sites with significant ASE not showing G×E interaction in the ACSL1 and ADIPOQ genes in fat and in the EBI3 gene in LCLs.
About this article
Cite this article
Buil, A., Brown, A., Lappalainen, T. et al. Gene-gene and gene-environment interactions detected by transcriptome sequence analysis in twins. Nat Genet 47, 88–91 (2015). https://doi.org/10.1038/ng.3162