Genome-wide association study identifies distinct genetic contributions to prognosis and susceptibility in Crohn's disease


For most immune-mediated diseases, the main determinant of patient well-being is not the diagnosis itself but instead the course that the disease takes over time (prognosis)1,2,3. Prognosis may vary substantially between patients for reasons that are poorly understood. Familial studies support a genetic contribution to prognosis4,5,6, but little evidence has been found for a proposed association between prognosis and the burden of susceptibility variants7,8,9,10,11,12,13. To better characterize how genetic variation influences disease prognosis, we performed a within-cases genome-wide association study in two cohorts of patients with Crohn's disease. We identified four genome-wide significant loci, none of which showed any association with disease susceptibility. Conversely, the aggregated effect of all 170 disease susceptibility loci was not associated with disease prognosis. Together, these data suggest that the genetic contribution to prognosis in Crohn's disease is largely independent of the contribution to disease susceptibility and point to a biology of prognosis that could provide new therapeutic opportunities.

Figure 1: Within-cases GWAS identifies four loci that are associated with prognosis in Crohn's disease.
Figure 2: Association signal in the MHC region is linked to the ancestral 8.1 haplotype.
Figure 3: Pathway analysis implicates regulation of immune responses and mononuclear phagocytes in Crohn's disease prognosis.
Figure 4: Distribution of Crohn's disease susceptibility alleles does not differ between the prognostic subgroups.

Accession codes



Gene Expression Omnibus


  1. Jess, T. et al. Changes in clinical characteristics, course, and prognosis of inflammatory bowel disease during the last 5 decades: a population-based study from Copenhagen, Denmark. Inflamm. Bowel Dis. 13, 481–489 (2007).

  2. Pincus, T. Long-term outcomes in rheumatoid arthritis. Br. J. Rheumatol. 34 (Suppl. 2), 59–73 (1995).

  3. Weinshenker, B.G. et al. The natural history of multiple sclerosis: a geographically based study. I. Clinical course and disability. Brain 112, 133–146 (1989).

  4. Satsangi, J., Grootscholten, C., Holt, H. & Jewell, D.P. Clinical patterns of familial inflammatory bowel disease. Gut 38, 738–741 (1996).

  5. Chataway, J. et al. Multiple sclerosis in sibling pairs: an analysis of 250 families. J. Neurol. Neurosurg. Psychiatry 71, 757–761 (2001).

  6. Jawaheer, D., Lum, R.F., Amos, C.I., Gregersen, P.K. & Criswell, L.A. Clustering of disease features within 512 multicase rheumatoid arthritis families. Arthritis Rheum. 50, 736–741 (2004).

  7. Weersma, R.K. et al. Molecular prediction of disease risk and severity in a large Dutch Crohn's disease cohort. Gut 58, 388–395 (2009).

  8. Hilven, K., Patsopoulos, N.A., Dubois, B. & Goris, A. Burden of risk variants correlates with phenotype of multiple sclerosis. Mult. Scler. 21, 1670–1680 (2015).

  9. Chibnik, L.B. et al. Genetic risk score predicting risk of rheumatoid arthritis phenotypes and age of symptom onset. PLoS One 6, e24380 (2011).

  10. Ananthakrishnan, A.N. et al. Differential effect of genetic burden on disease phenotypes in Crohn's disease and ulcerative colitis: analysis of a North American cohort. Am. J. Gastroenterol. 109, 395–400 (2014).

  11. Jung, C. et al. Genotype/phenotype analyses for 53 Crohn's disease associated genetic polymorphisms. PLoS One 7, e52223 (2012).

  12. Jensen, C.J. et al. Multiple sclerosis susceptibility-associated SNPs do not influence disease severity measures in a cohort of Australian MS patients. PLoS One 5, e10003 (2010).

  13. Scott, I.C. et al. Do genetic susceptibility variants associate with disease severity in early active rheumatoid arthritis? J. Rheumatol. 42, 1131–1140 (2015).

  14. Plomin, R., Haworth, C.M. & Davis, O.S. Common disorders are quantitative traits. Nat. Rev. Genet. 10, 872–878 (2009).

  15. Heliö, T. et al. CARD15/NOD2 gene variants are associated with familially occurring and complicated forms of Crohn's disease. Gut 52, 558–562 (2003).

  16. Cleynen, I. et al. Inherited determinants of Crohn's disease and ulcerative colitis phenotypes: a genetic association study. Lancet 387, 156–167 (2016).

  17. McKinney, E.F., Lee, J.C., Jayne, D.R., Lyons, P.A. & Smith, K.G. T-cell exhaustion, co-stimulation and clinical outcome in autoimmunity and infection. Nature 523, 612–616 (2015).

  18. Lee, J.C. et al. Human SNP links differential outcomes in inflammatory and infectious disease to a FOXO3-regulated pathway. Cell 155, 57–69 (2013).

  19. Van Gestel, S., Houwing-Duistermaat, J.J., Adolfsson, R., van Duijn, C.M. & Van Broeckhoven, C. Power of selective genotyping in genetic association analyses of quantitative traits. Behav. Genet. 30, 141–146 (2000).

  20. Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447, 661–678 (2007).

  21. Walter, K. et al. The UK10K project identifies rare variants in health and disease. Nature 526, 82–90 (2015).

  22. Vallot, C. et al. Erosion of X chromosome inactivation in human pluripotent cells initiates with XACT coating and depends on a specific heterochromatin landscape. Cell Stem Cell 16, 533–546 (2015).

  23. Franceschi, C. et al. Genes involved in immune response/inflammation, IGF1/insulin pathway and response to oxidative stress play a major role in the genetics of human longevity: the lesson of centenarians. Mech. Ageing Dev. 126, 351–361 (2005).

  24. Padyukov, L. et al. A genome-wide association study suggests contrasting associations in ACPA-positive versus ACPA-negative rheumatoid arthritis. Ann. Rheum. Dis. 70, 259–265 (2011).

  25. Egea, E. et al. The cellular basis for lack of antibody response to hepatitis B vaccine in humans. J. Exp. Med. 173, 531–538 (1991).

  26. Modica, M.A., Cammarata, G. & Caruso, C. HLA-B8,DR3 phenotype and lymphocyte responses to phytohaemagglutinin. J. Immunogenet. 17, 101–107 (1990).

  27. Candore, G. et al. T-cell activation in HLA-B8, DR3-positive individuals. Early antigen expression defect in vitro. Hum. Immunol. 42, 289–294 (1995).

  28. Goyette, P. et al. High-density mapping of the MHC identifies a shared role for HLA-DRB1*01:03 in inflammatory bowel diseases and heterozygous advantage in ulcerative colitis. Nat. Genet. 47, 172–179 (2015).

  29. Sands, B.E. et al. Risk of early surgery for Crohn's disease: implications for early treatment strategies. Am. J. Gastroenterol. 98, 2712–2718 (2003).

  30. Mabbott, N.A., Baillie, J.K., Brown, H., Freeman, T.C. & Hume, D.A. An expression atlas of human primary cells: inference of gene function from coexpression networks. BMC Genomics 14, 632 (2013).

  31. Liu, J.Z. et al. Association analyses identify 38 susceptibility loci for inflammatory bowel disease and highlight shared genetic risk across populations. Nat. Genet. 47, 979–986 (2015).

  32. Jostins, L. et al. Host–microbe interactions have shaped the genetic architecture of inflammatory bowel disease. Nature 491, 119–124 (2012).

  33. Wei, Z. et al. Large sample size, wide variant spectrum, and advanced machine-learning technique boost risk prediction for inflammatory bowel disease. Am. J. Hum. Genet. 92, 1008–1012 (2013).

  34. Anderson, C.A. et al. Data quality control in genetic case–control association studies. Nat. Protoc. 5, 1564–1573 (2010).

  35. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).

  36. 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073 (2010).

  37. Howie, B.N., Donnelly, P. & Marchini, J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 5, e1000529 (2009).

  38. O'Connell, J. et al. A general approach for haplotype phasing across the full spectrum of relatedness. PLoS Genet. 10, e1004234 (2014).

  39. Delaneau, O., Zagury, J.F. & Marchini, J. Improved whole-chromosome phasing for disease and population genetic studies. Nat. Methods 10, 5–6 (2013).

  40. Liu, J.Z. et al. Meta-analysis and imputation refines the association of 15q25 with smoking quantity. Nat. Genet. 42, 436–440 (2010).

  41. Zeileis, A., Kleiber, C. & Jackman, S. Regression models for count data in R. J. Stat. Softw. 27, 1–25 (2008).

  42. Cleynen, I. et al. Genetic factors conferring an increased susceptibility to develop Crohn's disease also influence disease phenotype: results from the IBDchip European Project. Gut 62, 1556–1565 (2013).

  43. Wakefield, J. A Bayesian measure of the probability of false discovery in genetic epidemiology studies. Am. J. Hum. Genet. 81, 208–227 (2007).

  44. Dilthey, A. et al. Multi-population classical HLA type imputation. PLoS Comput. Biol. 9, e1002877 (2013).

  45. Wellek, S. & Ziegler, A. Cochran–Armitage test versus logistic regression in the analysis of genetic association studies. Hum. Hered. 73, 14–17 (2012).

  46. Nielsen, M.M. et al. Identification of expressed and conserved human noncoding RNAs. RNA 20, 236–251 (2014).

  47. Dobin, A. et al. STAR: ultrafast universal RNA–seq aligner. Bioinformatics 29, 15–21 (2013).

  48. Trapnell, C. et al. Transcript assembly and quantification by RNA–Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 28, 511–515 (2010).

  49. Slowikowski, K., Hu, X. & Raychaudhuri, S. SNPsea: an algorithm to identify cell types, tissues and pathways affected by risk loci. Bioinformatics 30, 2496–2497 (2014).

  50. Rossin, E.J. et al. Proteins encoded in genomic regions associated with immune-mediated disease physically interact and suggest underlying biology. PLoS Genet. 7, e1001273 (2011).

  51. Kundu, S., Aulchenko, Y.S., van Duijn, C.M. & Janssens, A.C. PredictABEL: an R package for the assessment of risk prediction models. Eur. J. Epidemiol. 26, 261–264 (2011).

  52. Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).

  53. Zheng, J. et al. LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis. Bioinformatics (2016).


We thank L. Hildyard, E. Gray and other members of the Wellcome Trust Sanger Institute DNA team for their help with sample coordination and A. Groff and C. Weiner for critical reading of the manuscript. This work was supported by NIHR Biomedical Research Centres in Cambridge and Guy's and St Thomas' (in particular, J. Todd and the NIHR Cambridge BRC Genomics Theme), Crohn's and Colitis UK (Medical Research Award M/14/2), the Evelyn Trust (17/07), and the Medical Research Council (programme grant MR/L019027/1). J.C.L. is supported by a Wellcome Trust Intermediate Clinical Fellowship (105920/Z/14/Z), and D.B. is supported by a Marie Curie PhD Fellowship (TranSVIR FP7-PEOPLE-ITN-2008 238756). N.J.P. is supported by a Wellcome Trust University Award (094491/Z/10/Z), and J.A.T. is supported by the European Research Council (695551). C.A.A. is supported by the Wellcome Trust (098051). K.G.C.S. is an NIHR Senior Investigator. This study makes use of data generated by the UK10K Consortium, derived from samples from the ALSPAC and DTR cohorts. A full list of the investigators who contributed to the generation of the data is available from the UK10K Consortium. Funding for UK10K was provided by the Wellcome Trust (WT091310).

Author information

The experiment was conceived by J.C.L., M.P., and K.G.C.S. J.C.L., D.B., and P.A.L. designed the analysis. D.B. performed the analysis with input from J.C.L., L.J., C.A.A., J.A.T., and P.A.L. Patient samples and phenotype data were provided by J.C.L., R.R., R.B.G., J.C.M., T.A., N.J.P., J.S., D.C.W., M.P., and other members of the UK IBD Genetics Consortium. J.C.L. and K.G.C.S. wrote the manuscript with input from D.B., P.A.L., and M.P. All authors reviewed and approved the manuscript prior to submission.

Corresponding authors

Correspondence to James C Lee or Kenneth G C Smith.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

A full list of members and affiliations appears in the Supplementary Note.

Integrated supplementary information

Supplementary Figure 1 Quantile–quantile plot for the combined analysis of cohorts 1 and 2.

Quantile–quantile plot of the observed –log10(P) values versus the expectation under the null hypothesis. Data are presented for the meta-analysis of cohorts 1 and 2 after imputation and quality control. The overall genomic control inflation factor (λGC) is 1.023, indicating that inflation due to population structure is negligible. SNPs at which the P value is smaller than 1 × 10⁻⁸ are represented by triangles at the top of the plot. The gray region represents the 95% concentration band.
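As a rough illustration of how a genomic control inflation factor such as the λGC quoted in this caption is commonly obtained, the sketch below converts two-sided P values to 1-d.f. χ² statistics and divides their median by the median expected under the null. The function name and the toy P values are assumptions made for illustration; this is not the authors' pipeline.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and s.d. 1


def genomic_inflation_factor(p_values):
    """Estimate lambda_GC from two-sided P values (hypothetical helper).

    Every P value must lie strictly between 0 and 1.
    """
    # A two-sided P value maps to a 1-d.f. chi-square statistic via chi2 = z^2.
    chi2_stats = [_STD_NORMAL.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    # Median of a 1-d.f. chi-square distribution under the null (about 0.455).
    expected_median = _STD_NORMAL.inv_cdf(0.75) ** 2
    return median(chi2_stats) / expected_median


# Toy input (assumed): evenly spaced P values mimic a well-calibrated null.
null_like_p = [(i + 0.5) / 1000 for i in range(1000)]
lambda_gc = genomic_inflation_factor(null_like_p)
```

For the evenly spaced toy input the estimate is very close to 1; values well above 1 would suggest inflation from population structure or other confounding.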

Supplementary Figure 2 Fine-mapping at the FOXO3 locus.

Prognosis GWAS results (combined cohorts) at the FOXO3 locus; adapted from the LocusTrack plot (ref. 1). Top, SNPs in the region with their –log10(P value) plotted against genomic position and colored according to LD with the lead SNP (rs147856773). Genes in the region are indicated. The expanded plot includes SNPs, genes, and ChIP–seq data from the ENCODE Project (ref. 2). H3K4me1 and H3K27ac data from CD14+ monocytes and p300 binding data from myeloid K562 cells are shown (no monocyte data were available). The transcription factor binding track displays regions of transcription factor binding identified in a large collection of ChIP–seq experiments performed by the ENCODE Project (further details available at

Supplementary Figure 3 Transcription of XACT in a range of human tissues.

RNA sequencing data from the XACT locus in a range of human tissues. Raw data were downloaded and aligned against the hg19 genome using STAR (ref. 3). (a–d) The data sets comprised GEO series GSE45326 (ref. 4; n = 1 per tissue) (a,b) and the Illumina Human BodyMap 2.0 project (ArrayExpress E-MTAB-513, n = 1 per tissue) (c,d). The bar plots in a and c depict FPKM for the human tissues studied. The tables in b and d contain the raw and normalized data for each tissue.

Supplementary Figure 4 Relationship between association at classical HLA alleles and the frequency with which these alleles occur in non-ancestral MHC 8.1 haplotypes.

Linear regression demonstrating the relationship between the classical HLA alleles that were associated with prognosis and the frequency with which they occur in haplotypes other than the ancestral MHC 8.1 haplotype in Caucasians. Allele frequency and haplotype data were obtained from the National Bone Marrow Donor Program (Six-Locus High Resolution HLA A~C~B~DRB3/4/5~DRB1~DQB1 Haplotype Frequencies). Data were not available for HLA-DQA1. In our data, the frequency with which the lead SNP (rs9279411) was associated with non-AH8.1 haplotypes was 0.0149, suggesting that rs9279411 is a better tag for AH8.1 than any of its constituent HLA alleles (and explaining the difference in P values between the HLA alleles and the SNP association).

Supplementary Figure 5 The genetic association signals for Crohn’s disease prognosis and susceptibility at the MHC region are distinct.

(a) Manhattan plots for 22,125 MHC SNPs that were common to this analysis of Crohn’s disease prognosis (top; blue) and a large recent meta-analysis of Crohn’s disease susceptibility (5,956 cases, 14,927 controls (ref. 5); bottom; red). (b) Scatterplot directly comparing the association P values between Crohn’s disease susceptibility and prognosis at these 22,125 common SNPs. Dotted lines indicate the significance threshold for suggestive association (P < 1 × 10⁻⁵). No SNPs that showed suggestive association in one analysis (of susceptibility or prognosis) were also suggestively associated in the other.
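The comparison described in panel b amounts to asking whether any SNP passes the suggestive threshold in both analyses. A minimal sketch of that check follows; the function name, the dictionary layout, and the SNP identifiers and P values are all invented for illustration and are not taken from the study.

```python
SUGGESTIVE_THRESHOLD = 1e-5  # suggestive-association cutoff quoted in the caption


def suggestive_in_both(prognosis_p, susceptibility_p, threshold=SUGGESTIVE_THRESHOLD):
    """Return sorted SNP IDs with P < threshold in both analyses (hypothetical helper).

    Each argument maps a SNP identifier to its association P value.
    Only SNPs present in both dictionaries are compared.
    """
    shared_snps = prognosis_p.keys() & susceptibility_p.keys()
    return sorted(
        snp
        for snp in shared_snps
        if prognosis_p[snp] < threshold and susceptibility_p[snp] < threshold
    )


# Made-up example values: each SNP is suggestive in at most one analysis.
prognosis = {"snp_a": 3e-9, "snp_b": 0.42, "snp_c": 0.07}
susceptibility = {"snp_a": 0.31, "snp_b": 2e-12, "snp_c": 0.50, "snp_d": 1e-8}
overlap = suggestive_in_both(prognosis, susceptibility)
```

An empty result, as in this toy input, corresponds to the situation the caption reports: no SNP is suggestively associated in both analyses.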

Supplementary Figure 6 Protein–protein interaction analysis of genes implicated at prognosis-associated loci.

DAPPLE analysis of prognosis-associated SNPs (meta P < 1 × 10⁻⁴) demonstrating known interactions between proteins at implicated loci. Colored dots represent genes at prognosis-associated loci. Gray dots represent proteins at other non-associated loci. Gray lines represent known interactions.

Supplementary Figure 7 Relationship between the observed P value and power for each of the 170 Crohn’s disease susceptibility variants.

Scatterplot of the statistical power to detect a weak general effect (OR = 1.25) plotted against the observed P value in the prognosis analysis for each of the 170 Crohn’s disease susceptibility variants. The line of best fit (dotted line) was calculated by linear regression. Lack of correlation between power and P value is consistent with the null hypothesis that none of the disease susceptibility variants are individually associated with prognosis.

Supplementary Figure 8 Genetic risk scores using the extended Crohn’s disease SNP list (P < 1 × 10⁻⁴).

(a–c) Box-and-whisker plots of weighted genetic risk scores between good- and poor-prognosis Crohn’s disease subgroups. (a) L1 (ileal disease, n = 742). (b) L2 (colonic disease, n = 724). (c) L3 (ileocolonic disease, n = 947). Boxes represent the mean and interquartile range. Whiskers represent maximum and minimum values. Genetic risk scores were calculated using an extended list of Crohn’s disease–associated SNPs (P < 1 × 10⁻⁴) and their published β values (ref. 6). (d) Distribution of unweighted risk allele counts in the extended list of Crohn’s disease SNPs between the good-prognosis and poor-prognosis Crohn’s disease subgroups. Purple histogram bars represent the poor-prognosis Crohn’s disease subgroup, and yellow histogram bars represent the good-prognosis Crohn’s disease subgroup. Statistical significance was assessed using unpaired two-tailed Student's t tests, stratified by disease location; n = 2,413.
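The caption does not spell out the scoring formula. One common formulation of a weighted genetic risk score sums each SNP's risk-allele count (0, 1 or 2) multiplied by its published β, while the unweighted version simply counts risk alleles. The sketch below assumes that formulation; the function names, SNP identifiers and β values are hypothetical.

```python
def weighted_risk_score(risk_allele_counts, betas):
    """Sum of beta * risk-allele count over SNPs (assumed formulation)."""
    return sum(betas[snp] * count for snp, count in risk_allele_counts.items())


def unweighted_risk_count(risk_allele_counts):
    """Total number of risk alleles carried across all scored SNPs."""
    return sum(risk_allele_counts.values())


# Hypothetical per-SNP effect sizes (log odds ratios) and one patient's genotypes.
example_betas = {"snp_1": 0.10, "snp_2": 0.25, "snp_3": 0.05}
example_patient = {"snp_1": 2, "snp_2": 1, "snp_3": 0}

weighted = weighted_risk_score(example_patient, example_betas)  # 0.10*2 + 0.25*1 + 0.05*0
unweighted = unweighted_risk_count(example_patient)             # 2 + 1 + 0
```

Scores computed this way for every patient could then be compared between prognostic subgroups, for example with the t tests mentioned in the caption.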

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–8, Supplementary Tables 1–6 and 9, and Supplementary Note (PDF 2834 kb)

Supplementary Table 7

SNPsea results for enrichment of prognosis-associated genes in known biological pathways (Gene Ontology). (XLSX 100 kb)

Supplementary Table 8

SNPsea results for enrichment of prognosis-associated genes in primary human cell types. (XLSX 22 kb)

Supplementary Table 10

Association statistics for 170 Crohn's disease susceptibility SNPs in GWAS of prognosis. (XLSX 30 kb)

About this article

Cite this article

Lee, J., Biasci, D., Roberts, R. et al. Genome-wide association study identifies distinct genetic contributions to prognosis and susceptibility in Crohn's disease. Nat Genet 49, 262–268 (2017).
