Transcriptional risk scores link GWAS to eQTLs and predict complications in Crohn's disease

Abstract

Gene expression profiling can be used to uncover the mechanisms by which loci identified through genome-wide association studies (GWAS) contribute to pathology [1,2]. Given that most GWAS hits are in putative regulatory regions and transcript abundance is physiologically closer to the phenotype of interest [2], we hypothesized that summation of risk-allele-associated gene expression, namely a transcriptional risk score (TRS), should provide accurate estimates of disease risk. We integrate summary-level GWAS and expression quantitative trait locus (eQTL) data with RNA-seq data from the RISK study, an inception cohort of pediatric Crohn's disease [3,4]. We show that TRSs based on genes regulated by variants linked to inflammatory bowel disease (IBD) not only outperform genetic risk scores (GRSs) in distinguishing Crohn's disease from healthy samples, but also serve to identify patients who in time will progress to complicated disease. Our dissection of eQTL effects may be used to distinguish genes whose association with disease is through promotion versus protection, thereby linking statistical association to biological mechanism. The TRS approach constitutes a potential strategy for personalized medicine that enhances inference from static genotypic risk assessment.
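
The abstract defines the TRS as a summation of risk-allele-associated gene expression. A minimal Python sketch of that idea follows. It assumes each gene's expression is z-scored across samples and then multiplied by +1 or -1 according to whether the risk allele is associated with higher or lower expression. The function names, the gene labels, and this particular standardization are illustrative assumptions, not the authors' pipeline.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize expression values to mean 0 and s.d. 1 across samples."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # A gene with no variation carries no information; score it as 0.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def transcriptional_risk_score(expression, risk_direction):
    """Sum polarized, standardized expression over genes for each sample.

    expression:     dict gene -> list of per-sample expression values
    risk_direction: dict gene -> +1 if the risk allele is associated with
                    higher expression, -1 if with lower (assumed input)
    Returns a list with one score per sample.
    """
    genes = [g for g in expression if g in risk_direction]
    if not genes:
        return []
    n_samples = len(expression[genes[0]])
    scores = [0.0] * n_samples
    for gene in genes:
        for i, z in enumerate(zscore(expression[gene])):
            scores[i] += risk_direction[gene] * z
    return scores


# Hypothetical toy data: three samples, two genes with opposite polarity.
toy_expression = {"GENE_A": [1.0, 2.0, 3.0], "GENE_B": [3.0, 2.0, 1.0]}
toy_direction = {"GENE_A": +1, "GENE_B": -1}
toy_scores = transcriptional_risk_score(toy_expression, toy_direction)
```

In this toy case the third sample has high expression of the gene whose risk allele raises expression and low expression of the gene whose risk allele lowers it, so it receives the highest score.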

Figure 1: Transcriptional risk scores integrate GWAS and eQTL results to measure individual risk of disease based on transcript abundance.
Figure 2: Transcriptional risk scores based on ileal gene expression at diagnosis distinguish status and course of Crohn's disease.
Figure 3: Gene expression polarized according to predicted direction of risk uncovers two divergent mechanisms of association with disease.
Figure 4: Incoherent genes show similar patterns in stimulated immune cells and are more weakly associated with IBD according to GWAS.

References

  1. Fairfax, B.P. & Knight, J.C. Genetics of gene expression in immunity to infection. Curr. Opin. Immunol. 30, 63–71 (2014).

  2. Gibson, G., Powell, J.E. & Marigorta, U.M. Expression quantitative trait locus analysis for translational medicine. Genome Med. 7, 60 (2015).

  3. Haberman, Y. et al. Pediatric Crohn disease patients exhibit specific ileal transcriptome and microbiome signature. J. Clin. Invest. 124, 3617–3633 (2014).

  4. Kugathasan, S. et al. Prediction of complicated disease course for children newly diagnosed with Crohn's disease: a multicentre inception cohort study. Lancet 389, 1710–1718 (2017).

  5. Witte, J.S., Visscher, P.M. & Wray, N.R. The contribution of genetic variants to disease depends on the ruler. Nat. Rev. Genet. 15, 765–776 (2014).

  6. Wray, N.R., Yang, J., Goddard, M.E. & Visscher, P.M. The genetic interpretation of area under the ROC curve in genomic profiling. PLoS Genet. 6, e1000864 (2010).

  7. Wray, N.R., Goddard, M.E. & Visscher, P.M. Prediction of individual genetic risk to disease from genome-wide association studies. Genome Res. 17, 1520–1528 (2007).

  8. Chatterjee, N., Shi, J. & García-Closas, M. Developing and evaluating polygenic risk prediction models for stratified disease prevention. Nat. Rev. Genet. 17, 392–406 (2016).

  9. Walters, T.D. et al. Increased effectiveness of early therapy with anti–tumor necrosis factor-α vs an immunomodulator in children with Crohn's disease. Gastroenterology 146, 383–391 (2014).

  10. Liu, J.Z. et al. Association analyses identify 38 susceptibility loci for inflammatory bowel disease and highlight shared genetic risk across populations. Nat. Genet. 47, 979–986 (2015).

  11. Westra, H.J. et al. Systematic identification of trans eQTLs as putative drivers of known disease associations. Nat. Genet. 45, 1238–1243 (2013).

  12. Kabakchiev, B. & Silverberg, M.S. Expression quantitative trait loci analysis identifies associations between genotype and gene expression in human intestine. Gastroenterology 144, 1488–1496 (2013).

  13. GTEx Consortium. The Genotype-Tissue Expression (GTEx) pilot analysis: multitissue gene regulation in humans. Science 348, 648–660 (2015).

  14. Di Narzo, A.F. et al. Blood and intestine eQTLs from an anti-TNF-resistant Crohn's disease cohort inform IBD genetic association loci. Clin. Transl. Gastroenterol. 7, e177 (2016).

  15. Giambartolomei, C. et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 10, e1004383 (2014).

  16. Hormozdiari, F. et al. Colocalization of GWAS and eQTL signals detects target genes. Am. J. Hum. Genet. 99, 1245–1260 (2016).

  17. Zhu, Z. et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48, 481–487 (2016).

  18. Lee, J.C. et al. Genome-wide association study identifies distinct genetic contributions to prognosis and susceptibility in Crohn's disease. Nat. Genet. 49, 262–268 (2017).

  19. Ning, K. et al. Improved integrative framework combining association data with gene expression features to prioritize Crohn's disease genes. Hum. Mol. Genet. 24, 4147–4157 (2015).

  20. Singh, T. et al. Characterization of expression quantitative trait loci in the human colon. Inflamm. Bowel Dis. 21, 251–256 (2015).

  21. Albert, F.W. & Kruglyak, L. The role of regulatory variation in complex traits and disease. Nat. Rev. Genet. 16, 197–212 (2015).

  22. Gibson, G. & Weir, B. The quantitative genetics of transcription. Trends Genet. 21, 616–623 (2005).

  23. de Souza, H.S. & Fiocchi, C. Immunopathogenesis of IBD: current state of the art. Nat. Rev. Gastroenterol. Hepatol. 13, 13–27 (2016).

  24. McGovern, D.P., Kugathasan, S. & Cho, J.H. Genetics of inflammatory bowel diseases. Gastroenterology 149, 1163–1176 (2015).

  25. Nabekura, T. et al. Costimulatory molecule DNAM-1 is essential for optimal differentiation of memory natural killer cells during mouse cytomegalovirus infection. Immunity 40, 225–234 (2014).

  26. Martinet, L. & Smyth, M.J. Balancing natural killer cell activation through paired receptors. Nat. Rev. Immunol. 15, 243–254 (2015).

  27. Petrillo, M.G. et al. GITR+ regulatory T cells in the treatment of autoimmune diseases. Autoimmun. Rev. 14, 117–126 (2015).

  28. Reikvam, D.H. et al. Increase of regulatory T cells in ileal mucosa of untreated pediatric Crohn's disease patients. Scand. J. Gastroenterol. 46, 550–560 (2011).

  29. Ye, C.J. et al. Intersection of population variation and autoimmunity genetics in human T cell activation. Science 345, 1254665 (2014).

  30. Wiley, S.E. et al. The outer mitochondrial membrane protein mitoNEET contains a novel redox-active 2Fe-2S cluster. J. Biol. Chem. 282, 23745–23749 (2007).

  31. Novak, E.A. & Mollen, K.P. Mitochondrial dysfunction in inflammatory bowel disease. Front. Cell Dev. Biol. 3, 62 (2015).

  32. Levine, A. et al. Pediatric modification of the Montreal classification for inflammatory bowel disease: the Paris classification. Inflamm. Bowel Dis. 17, 1314–1321 (2011).

  33. Satsangi, J., Silverberg, M.S., Vermeire, S. & Colombel, J.F. The Montreal classification of inflammatory bowel disease: controversies, consensus, and implications. Gut 55, 749–753 (2006).

  34. Cleynen, I. et al. Inherited determinants of Crohn's disease and ulcerative colitis phenotypes: a genetic association study. Lancet 387, 156–167 (2016).

  35. Kim, D. et al. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 14, R36 (2013).

  36. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

  37. Anders, S., Pyl, P.T. & Huber, W. HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169 (2015).

  38. Robinson, M.D., McCarthy, D.J. & Smyth, G.K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).

  39. Mortazavi, A., Williams, B.A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods 5, 621–628 (2008).

  40. Onengut-Gumuscu, S. et al. Fine mapping of type 1 diabetes susceptibility loci and evidence for colocalization of causal variants with lymphoid gene enhancers. Nat. Genet. 47, 381–386 (2015).

  41. Leek, J.T., Johnson, W.E., Parker, H.S., Jaffe, A.E. & Storey, J.D. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics 28, 882–883 (2012).

  42. Mecham, B.H., Nelson, P.S. & Storey, J.D. Supervised normalization of microarrays. Bioinformatics 26, 1308–1315 (2010).

  43. Zhou, X. & Stephens, M. Genome-wide efficient mixed-model analysis for association studies. Nat. Genet. 44, 821–824 (2012).

  44. Chun, S. et al. Limited statistical evidence for shared genetic effects of eQTLs and autoimmune-disease-associated loci in three major immune-cell types. Nat. Genet. 49, 600–605 (2017).

  45. Lee, M.N. et al. Common genetic variants modulate pathogen-sensing responses in human dendritic cells. Science 343, 1246980 (2014).

Acknowledgements

We are grateful to B. Zeng, D. Arafat, H. Somineni, S. Venkateswaran, and colleagues from the Gibson and Kugathasan laboratories for their support and helpful comments. We also would like to thank I. Mendizabal, J. Lachance and K. Jordan for comments on the manuscript. This research was supported by Project 3 (G.G., PI) of the NIH program project “Statistical and Quantitative Genetics” grant P01-GM0996568 (B. Weir, University of Washington, Director) as well as research grants from the Crohn's and Colitis Foundation of America (CCFA), New York, to the individual study institutions participating in the RISK study.

Author information

Contributions

U.M.M. and G.G. conceived the theoretical framework for the TRSs. L.A.D., J.S.H. and S.K. participated in the conception and design of the RISK study. K.M., J.P., T.D.W., A.G., J.D.N., W.V.C., J.R.R., D.R.M., R.K., M.B.H., S.S.B., M.C.S., R.N.B., J.F.M., M.C.D., B.J.A., M.-O.K. and J.C. recruited subjects, collected the data, and worked on its curation and analysis. U.M.M. performed the TRS analyses. U.M.M. and G.G. interpreted the results and drafted the manuscript, while L.A.D., J.S.H. and S.K. assisted with results interpretation and writing.

Corresponding author

Correspondence to Greg Gibson.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Replicability of blood eQTL effects in ileal biopsies from the RISK study.

eQTLs detected in the vicinity of SNPs associated with IBD tend to show concordant effect size and direction in blood and ileum. The effects of 136 eQTLs available in ileum are shown (Supplementary Table 2). The x axis shows the β values for the eQTLs detected in peripheral blood from the Blood eQTL browser; the y axis shows the β values in the eQTL mapping study with ileal biopsies from the RISK study (see “Mapping study in the RISK cohort to build the ileal TRS” in the Online Methods). The dashed best-fitting least-squares regression line corresponds to Spearman r = 0.54 (P = 2 × 10^−11). Values in the corners indicate the percentage of loci in each quadrant, showing that 70% are concordant in direction of effect in the two tissues (P = 1.7 × 10^−6, sign test).
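
The concordance check described in this caption can be sketched as a count of eQTLs whose β values share a sign in the two tissues, followed by a sign test. The Python below is a minimal illustration, assuming an exact two-sided binomial test against a 50% null; the caption does not say which variant of the sign test was used, and the function names and toy β values are hypothetical.

```python
from math import comb


def count_concordant(beta_blood, beta_ileum):
    """Count eQTLs whose effect has the same sign in both tissues."""
    return sum(1 for b, i in zip(beta_blood, beta_ileum) if b * i > 0)


def sign_test_two_sided(n_concordant, n_total):
    """Exact two-sided binomial sign test against a 50% concordance null."""
    # Use the larger of the two tails so the test is symmetric.
    k = max(n_concordant, n_total - n_concordant)
    upper_tail = sum(comb(n_total, i) for i in range(k, n_total + 1)) / 2 ** n_total
    return min(1.0, 2 * upper_tail)


# Hypothetical toy effects for three eQTLs: two agree in sign, one does not.
toy_blood = [0.5, -0.2, 0.1]
toy_ileum = [0.3, -0.4, -0.1]
toy_concordant = count_concordant(toy_blood, toy_ileum)
toy_p = sign_test_two_sided(toy_concordant, len(toy_blood))
```

The caption reports the concordant fraction as a rounded percentage rather than a count, so the sketch is not used here to reproduce the published P value.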

Supplementary Figure 2 Performance of the GRS and TRS based on the initial set of 157 candidate genes.

(a,b) Each plot shows the TRS based on 157 IBD genes associated with 96 eQTLs that are also associated with IBD or in LD with a SNP associated with the disease (Supplementary Table 1). The discriminatory performance of the GRS versus TRS based on these genes is shown for disease status: comparison of samples with Crohn’s disease (n = 210) and controls (n = 35) (a) and disease course (3-year period after diagnosis): comparison of samples that remain in non-complicated Crohn’s disease (B1; n = 183) and those that develop complicated disease (B2 and/or B3; n = 27) (b). The standardized GRS and TRS are shown on the y axis. Differences between groups (in s.d. units) along with P values (two-sided t test) are reported for each comparison.
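
The group comparison reported in this caption (a difference between groups in s.d. units plus a t test) can be sketched as follows. This is a minimal Python illustration, assuming scores are standardized across all samples before the groups are compared and using Welch's unequal-variance t statistic; the caption does not specify which form of the t test was used, the P value step is omitted, and all names and toy values are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev, variance


def standardize(scores):
    """Rescale scores to mean 0 and s.d. 1 across all samples."""
    mu, sd = mean(scores), pstdev(scores)
    return [(s - mu) / sd for s in scores]


def group_difference(scores, is_case):
    """Return (difference in s.d. units, Welch t statistic) for cases vs controls.

    scores:  list of risk scores (GRS or TRS), one per sample
    is_case: list of booleans of the same length
    """
    z = standardize(scores)
    cases = [v for v, c in zip(z, is_case) if c]
    controls = [v for v, c in zip(z, is_case) if not c]
    diff = mean(cases) - mean(controls)
    # Welch standard error: each group contributes its own sample variance.
    se = sqrt(variance(cases) / len(cases) + variance(controls) / len(controls))
    return diff, diff / se


# Hypothetical toy scores: three controls followed by three cases.
toy_scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_labels = [False, False, False, True, True, True]
toy_diff, toy_t = group_difference(toy_scores, toy_labels)
```

Because the scores are standardized first, the difference in group means is directly in s.d. units, while the t statistic is unchanged by that rescaling.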

Supplementary Figure 3 Selection of genes based on SMR and coloc results.

(a) Each point represents the –log10(P value) (NLP) for the blood eQTL association and Crohn’s disease GWAS association for 157 candidate genes. Colors represent the significance of the SMR statistic, clearly showing that the most highly significant genes are strongly associated with both traits. Similar plots are observed for ulcerative colitis and IBD. All 39 genes with SMR P < 2.3 × 10^−4 (red and brown dots) for all three disease classifications were included in the final SMR-based TRS. (b) The coloc H4 score estimates the posterior probability that the same causal variant drives both the GWAS and eQTL associations. This plot shows that poor SMR values (small NLPs) tend also to have low coloc H4 scores; however, only approximately half of the strong SMR values (large NLPs) have strong coloc H4 posterior probabilities. The 29 genes with coloc H4 greater than 0.8 for the three disease phenotypes were included in the final coloc-based TRS. This includes 14 genes not in the SMR set.
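
The two selection rules in this caption can be sketched as simple filters. The Python below is a minimal illustration that uses the thresholds stated in the caption (SMR P below 2.3 × 10^−4 and coloc H4 above 0.8, each required for all three disease classifications). The data layout, phenotype labels, gene names and toy values are hypothetical and do not come from the supplementary tables.

```python
SMR_P_THRESHOLD = 2.3e-4      # from the caption
COLOC_H4_THRESHOLD = 0.8      # from the caption
PHENOTYPES = ("CD", "UC", "IBD")  # assumed labels for the three classifications


def select_genes(smr_p, coloc_h4):
    """Return (SMR gene set, coloc gene set).

    smr_p:    dict gene -> dict phenotype -> SMR P value
    coloc_h4: dict gene -> dict phenotype -> coloc H4 posterior probability
    A gene is kept only if it passes the threshold for every phenotype.
    """
    smr_set = {
        gene for gene, p in smr_p.items()
        if all(p[ph] < SMR_P_THRESHOLD for ph in PHENOTYPES)
    }
    coloc_set = {
        gene for gene, h4 in coloc_h4.items()
        if all(h4[ph] > COLOC_H4_THRESHOLD for ph in PHENOTYPES)
    }
    return smr_set, coloc_set


# Hypothetical toy inputs for three genes.
toy_smr = {
    "GENE_A": {"CD": 1e-6, "UC": 1e-5, "IBD": 1e-7},
    "GENE_B": {"CD": 1e-6, "UC": 0.01, "IBD": 1e-5},   # fails for UC
    "GENE_C": {"CD": 0.2, "UC": 0.3, "IBD": 0.1},
}
toy_h4 = {
    "GENE_A": {"CD": 0.95, "UC": 0.90, "IBD": 0.97},
    "GENE_B": {"CD": 0.50, "UC": 0.40, "IBD": 0.60},
    "GENE_C": {"CD": 0.85, "UC": 0.90, "IBD": 0.88},
}
toy_smr_set, toy_coloc_set = select_genes(toy_smr, toy_h4)
```

By the counts given in the caption, 29 coloc genes of which 14 are not in the SMR set implies 15 genes shared by both sets, and 39 + 14 = 53 genes in their union.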

Supplementary Figure 4 Relationship between transcriptional risk scores and location of inflammation.

Because the Paris classification of pediatric Crohn’s disease includes location of disease, which was strongly correlated with the degree of inflammation in the ileum from which biopsies were obtained, we plot here the relationship between disease location and the 29-gene coloc-derived TRS. RISK study patients were classified into two categories according to the presence or absence of visible ileal inflammation in endoscopies performed at diagnosis: L1 (ileum-only) and L3 (ileocolonic) cases were classified as ‘inflamed’, and L2 (colonic-only) cases were classified as ‘non-inflamed’. Only two of the cases that progressed to complicated disease were non-inflamed; these are not shown owing to the small sample size. The TRS is slightly elevated in inflamed versus endoscopically non-inflamed B1 cases (P < 0.02) and is also elevated in B1 cases with non-inflamed ilea as compared to non-IBD controls (P < 1 × 10^−6), confirming that the TRS picks up a signal that is related but complementary to inflammation. Complicated cases have an elevated TRS even relative to inflamed B1 cases (P < 7 × 10^−4). A box plot of values is shown for each group along with P values for pairwise comparisons (two-sided t test).

Supplementary Figure 5 Performance of the GRS and TRS based on 39 susceptibility genes detected by SMR.

(a,b) Thirty-nine genes were detected by SMR as being under the control of 29 causal variants that account for the association detected by GWAS and the eQTL effect reported in the Blood eQTL browser (Supplementary Table 4). The performance of the GRS versus TRS based on these genes is shown for disease status: comparison of samples with Crohn’s disease (n = 210) versus non-IBD controls (n = 35) (a) and disease course (3-year period after diagnosis): comparison of samples that remain in non-complicated Crohn’s disease (B1; n = 183) versus those that develop complicated disease (B2 and/or B3; n = 27) (b). The standardized GRS and TRS are shown on the y axis. Differences between groups (in s.d. units) along with P values (two-sided t test) are reported for each comparison.

Supplementary Figure 6 Performance of PRSs based on LD-pruned variants at different significance inclusion thresholds.

(a,b) PRSs at different thresholds (Online Methods) successfully separate Crohn’s disease cases from non-IBD controls (a) but fail to distinguish according to development of complicated disease (b). The performance of PRSs using SNPs that pass a range of liberal P-value thresholds in GWAS analysis is shown (the inclusion threshold and total number of variants used are reported on the y axis). Differences between groups (in s.d. units) along with P values for each comparison are reported on the x axis.
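
A polygenic risk score evaluated at several inclusion thresholds, as described in this caption, can be sketched as below. This is a minimal Python illustration, assuming the score is a sum of risk-allele dosages weighted by GWAS effect sizes over variants that pass the P-value threshold, and assuming LD pruning has already been applied upstream. The exact construction is given in the Online Methods, which are not part of this text, and the SNP identifiers, effect sizes and thresholds here are hypothetical.

```python
def polygenic_risk_score(dosages, effects, p_values, threshold):
    """Score one individual using variants with GWAS P below the threshold.

    dosages:  dict snp -> risk-allele count (0, 1 or 2) for the individual
    effects:  dict snp -> GWAS effect size (e.g. log odds ratio)
    p_values: dict snp -> GWAS P value
    Returns (score, number of variants included).
    """
    included = [
        snp for snp in effects
        if p_values[snp] < threshold and snp in dosages
    ]
    score = sum(effects[snp] * dosages[snp] for snp in included)
    return score, len(included)


# Hypothetical toy summary statistics and one individual's genotypes.
toy_effects = {"rs_a": 0.2, "rs_b": 0.1, "rs_c": 0.05}
toy_p_values = {"rs_a": 1e-8, "rs_b": 1e-4, "rs_c": 0.03}
toy_dosages = {"rs_a": 2, "rs_b": 1, "rs_c": 0}

# Evaluate the same individual at a strict and a liberal threshold.
toy_strict = polygenic_risk_score(toy_dosages, toy_effects, toy_p_values, 1e-5)
toy_liberal = polygenic_risk_score(toy_dosages, toy_effects, toy_p_values, 0.05)
```

Loosening the threshold increases the number of variants included, which is the quantity the caption reports alongside each threshold.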

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–6 (PDF 1128 kb)

Life Sciences Reporting Summary (PDF 170 kb)

Supplementary Table 1

eQTL association data in peripheral blood for 232 SNPs associated with IBD with genes <1 Mb away (7,389 SNP–gene pairs). (XLSX 1169 kb)

Supplementary Table 2

Replicability of blood eQTL effects in ileal tissue from the RISK study. (XLSX 58 kb)

Supplementary Table 3

coloc results for 163 SNP–gene pairs selected from the Blood eQTL browser. (XLSX 79 kb)

Supplementary Table 4

SMR results for 163 SNP–gene pairs selected from the Blood eQTL browser. (XLSX 86 kb)

Supplementary Table 5

eQTL association and coloc results for 46 genes controlled by SNPs associated with IBD in the RISK ileal eQTL mapping study. (XLSX 60 kb)

About this article

Cite this article

Marigorta, U., Denson, L., Hyams, J. et al. Transcriptional risk scores link GWAS to eQTLs and predict complications in Crohn's disease. Nat Genet 49, 1517–1521 (2017). https://doi.org/10.1038/ng.3936
