  • Letter

Genome-wide association study identifies distinct genetic contributions to prognosis and susceptibility in Crohn's disease


For most immune-mediated diseases, the main determinant of patient well-being is not the diagnosis itself but instead the course that the disease takes over time (prognosis)1,2,3. Prognosis may vary substantially between patients for reasons that are poorly understood. Familial studies support a genetic contribution to prognosis4,5,6, but little evidence has been found for a proposed association between prognosis and the burden of susceptibility variants7,8,9,10,11,12,13. To better characterize how genetic variation influences disease prognosis, we performed a within-cases genome-wide association study in two cohorts of patients with Crohn's disease. We identified four genome-wide significant loci, none of which showed any association with disease susceptibility. Conversely, the aggregated effect of all 170 disease susceptibility loci was not associated with disease prognosis. Together, these data suggest that the genetic contribution to prognosis in Crohn's disease is largely independent of the contribution to disease susceptibility and point to a biology of prognosis that could provide new therapeutic opportunities.

Figure 1: Within-cases GWAS identifies four loci that are associated with prognosis in Crohn's disease.
Figure 2: Association signal in the MHC region is linked to the ancestral 8.1 haplotype.
Figure 3: Pathway analysis implicates regulation of immune responses and mononuclear phagocytes in Crohn's disease prognosis.
Figure 4: Distribution of Crohn's disease susceptibility alleles does not differ between the prognostic subgroups.

Accession codes



Gene Expression Omnibus


  1. Jess, T. et al. Changes in clinical characteristics, course, and prognosis of inflammatory bowel disease during the last 5 decades: a population-based study from Copenhagen, Denmark. Inflamm. Bowel Dis. 13, 481–489 (2007).

  2. Pincus, T. Long-term outcomes in rheumatoid arthritis. Br. J. Rheumatol. 34 (Suppl. 2), 59–73 (1995).

  3. Weinshenker, B.G. et al. The natural history of multiple sclerosis: a geographically based study. I. Clinical course and disability. Brain 112, 133–146 (1989).

  4. Satsangi, J., Grootscholten, C., Holt, H. & Jewell, D.P. Clinical patterns of familial inflammatory bowel disease. Gut 38, 738–741 (1996).

  5. Chataway, J. et al. Multiple sclerosis in sibling pairs: an analysis of 250 families. J. Neurol. Neurosurg. Psychiatry 71, 757–761 (2001).

  6. Jawaheer, D., Lum, R.F., Amos, C.I., Gregersen, P.K. & Criswell, L.A. Clustering of disease features within 512 multicase rheumatoid arthritis families. Arthritis Rheum. 50, 736–741 (2004).

  7. Weersma, R.K. et al. Molecular prediction of disease risk and severity in a large Dutch Crohn's disease cohort. Gut 58, 388–395 (2009).

  8. Hilven, K., Patsopoulos, N.A., Dubois, B. & Goris, A. Burden of risk variants correlates with phenotype of multiple sclerosis. Mult. Scler. 21, 1670–1680 (2015).

  9. Chibnik, L.B. et al. Genetic risk score predicting risk of rheumatoid arthritis phenotypes and age of symptom onset. PLoS One 6, e24380 (2011).

  10. Ananthakrishnan, A.N. et al. Differential effect of genetic burden on disease phenotypes in Crohn's disease and ulcerative colitis: analysis of a North American cohort. Am. J. Gastroenterol. 109, 395–400 (2014).

  11. Jung, C. et al. Genotype/phenotype analyses for 53 Crohn's disease associated genetic polymorphisms. PLoS One 7, e52223 (2012).

  12. Jensen, C.J. et al. Multiple sclerosis susceptibility-associated SNPs do not influence disease severity measures in a cohort of Australian MS patients. PLoS One 5, e10003 (2010).

  13. Scott, I.C. et al. Do genetic susceptibility variants associate with disease severity in early active rheumatoid arthritis? J. Rheumatol. 42, 1131–1140 (2015).

  14. Plomin, R., Haworth, C.M. & Davis, O.S. Common disorders are quantitative traits. Nat. Rev. Genet. 10, 872–878 (2009).

  15. Heliö, T. et al. CARD15/NOD2 gene variants are associated with familially occurring and complicated forms of Crohn's disease. Gut 52, 558–562 (2003).

  16. Cleynen, I. et al. Inherited determinants of Crohn's disease and ulcerative colitis phenotypes: a genetic association study. Lancet 387, 156–167 (2016).

  17. McKinney, E.F., Lee, J.C., Jayne, D.R., Lyons, P.A. & Smith, K.G. T-cell exhaustion, co-stimulation and clinical outcome in autoimmunity and infection. Nature 523, 612–616 (2015).

  18. Lee, J.C. et al. Human SNP links differential outcomes in inflammatory and infectious disease to a FOXO3-regulated pathway. Cell 155, 57–69 (2013).

  19. Van Gestel, S., Houwing-Duistermaat, J.J., Adolfsson, R., van Duijn, C.M. & Van Broeckhoven, C. Power of selective genotyping in genetic association analyses of quantitative traits. Behav. Genet. 30, 141–146 (2000).

  20. Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447, 661–678 (2007).

  21. Walter, K. et al. The UK10K project identifies rare variants in health and disease. Nature 526, 82–90 (2015).

  22. Vallot, C. et al. Erosion of X chromosome inactivation in human pluripotent cells initiates with XACT coating and depends on a specific heterochromatin landscape. Cell Stem Cell 16, 533–546 (2015).

  23. Franceschi, C. et al. Genes involved in immune response/inflammation, IGF1/insulin pathway and response to oxidative stress play a major role in the genetics of human longevity: the lesson of centenarians. Mech. Ageing Dev. 126, 351–361 (2005).

  24. Padyukov, L. et al. A genome-wide association study suggests contrasting associations in ACPA-positive versus ACPA-negative rheumatoid arthritis. Ann. Rheum. Dis. 70, 259–265 (2011).

  25. Egea, E. et al. The cellular basis for lack of antibody response to hepatitis B vaccine in humans. J. Exp. Med. 173, 531–538 (1991).

  26. Modica, M.A., Cammarata, G. & Caruso, C. HLA-B8,DR3 phenotype and lymphocyte responses to phytohaemagglutinin. J. Immunogenet. 17, 101–107 (1990).

  27. Candore, G. et al. T-cell activation in HLA-B8, DR3-positive individuals. Early antigen expression defect in vitro. Hum. Immunol. 42, 289–294 (1995).

  28. Goyette, P. et al. High-density mapping of the MHC identifies a shared role for HLA-DRB1*01:03 in inflammatory bowel diseases and heterozygous advantage in ulcerative colitis. Nat. Genet. 47, 172–179 (2015).

  29. Sands, B.E. et al. Risk of early surgery for Crohn's disease: implications for early treatment strategies. Am. J. Gastroenterol. 98, 2712–2718 (2003).

  30. Mabbott, N.A., Baillie, J.K., Brown, H., Freeman, T.C. & Hume, D.A. An expression atlas of human primary cells: inference of gene function from coexpression networks. BMC Genomics 14, 632 (2013).

  31. Liu, J.Z. et al. Association analyses identify 38 susceptibility loci for inflammatory bowel disease and highlight shared genetic risk across populations. Nat. Genet. 47, 979–986 (2015).

  32. Jostins, L. et al. Host–microbe interactions have shaped the genetic architecture of inflammatory bowel disease. Nature 491, 119–124 (2012).

  33. Wei, Z. et al. Large sample size, wide variant spectrum, and advanced machine-learning technique boost risk prediction for inflammatory bowel disease. Am. J. Hum. Genet. 92, 1008–1012 (2013).

  34. Anderson, C.A. et al. Data quality control in genetic case–control association studies. Nat. Protoc. 5, 1564–1573 (2010).

  35. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).

  36. 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073 (2010).

  37. Howie, B.N., Donnelly, P. & Marchini, J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 5, e1000529 (2009).

  38. O'Connell, J. et al. A general approach for haplotype phasing across the full spectrum of relatedness. PLoS Genet. 10, e1004234 (2014).

  39. Delaneau, O., Zagury, J.F. & Marchini, J. Improved whole-chromosome phasing for disease and population genetic studies. Nat. Methods 10, 5–6 (2013).

  40. Liu, J.Z. et al. Meta-analysis and imputation refines the association of 15q25 with smoking quantity. Nat. Genet. 42, 436–440 (2010).

  41. Zeileis, A., Kleiber, C. & Jackman, S. Regression models for count data in R. J. Stat. Softw. 27, 1–25 (2008).

  42. Cleynen, I. et al. Genetic factors conferring an increased susceptibility to develop Crohn's disease also influence disease phenotype: results from the IBDchip European Project. Gut 62, 1556–1565 (2013).

  43. Wakefield, J. A Bayesian measure of the probability of false discovery in genetic epidemiology studies. Am. J. Hum. Genet. 81, 208–227 (2007).

  44. Dilthey, A. et al. Multi-population classical HLA type imputation. PLoS Comput. Biol. 9, e1002877 (2013).

  45. Wellek, S. & Ziegler, A. Cochran–Armitage test versus logistic regression in the analysis of genetic association studies. Hum. Hered. 73, 14–17 (2012).

  46. Nielsen, M.M. et al. Identification of expressed and conserved human noncoding RNAs. RNA 20, 236–251 (2014).

  47. Dobin, A. et al. STAR: ultrafast universal RNA–seq aligner. Bioinformatics 29, 15–21 (2013).

  48. Trapnell, C. et al. Transcript assembly and quantification by RNA–Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 28, 511–515 (2010).

  49. Slowikowski, K., Hu, X. & Raychaudhuri, S. SNPsea: an algorithm to identify cell types, tissues and pathways affected by risk loci. Bioinformatics 30, 2496–2497 (2014).

  50. Rossin, E.J. et al. Proteins encoded in genomic regions associated with immune-mediated disease physically interact and suggest underlying biology. PLoS Genet. 7, e1001273 (2011).

  51. Kundu, S., Aulchenko, Y.S., van Duijn, C.M. & Janssens, A.C. PredictABEL: an R package for the assessment of risk prediction models. Eur. J. Epidemiol. 26, 261–264 (2011).

  52. Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).

  53. Zheng, J. et al. LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis. Bioinformatics (2016).


We thank L. Hildyard, E. Gray and other members of the Wellcome Trust Sanger Institute DNA team for their help with sample coordination and A. Groff and C. Weiner for critical reading of the manuscript. This work was supported by NIHR Biomedical Research Centres in Cambridge and Guy's and St Thomas' (in particular, J. Todd and the NIHR Cambridge BRC Genomics Theme), Crohn's and Colitis UK (Medical Research Award M/14/2), the Evelyn Trust (17/07), and the Medical Research Council (programme grant MR/L019027/1). J.C.L. is supported by a Wellcome Trust Intermediate Clinical Fellowship (105920/Z/14/Z), and D.B. is supported by a Marie Curie PhD Fellowship (TranSVIR FP7-PEOPLE-ITN-2008 238756). N.J.P. is supported by a Wellcome Trust University Award (094491/Z/10/Z), and J.A.T. is supported by the European Research Council (695551). C.A.A. is supported by the Wellcome Trust (098051). K.G.C.S. is an NIHR Senior Investigator. This study makes use of data generated by the UK10K Consortium, derived from samples from the ALSPAC and DTR cohorts. A full list of the investigators who contributed to the generation of the data is available from Funding for UK10K was provided by the Wellcome Trust (WT091310).

Author information


The experiment was conceived by J.C.L., M.P., and K.G.C.S. J.C.L., D.B., and P.A.L. designed the analysis. D.B. performed the analysis with input from J.C.L., L.J., C.A.A., J.A.T., and P.A.L. Patient samples and phenotype data were provided by J.C.L., R.R., R.B.G., J.C.M., T.A., N.J.P., J.S., D.C.W., M.P., and other members of the UK IBD Genetics Consortium. J.C.L. and K.G.C.S. wrote the manuscript with input from D.B., P.A.L., and M.P. All authors reviewed and approved the manuscript prior to submission.

Corresponding authors

Correspondence to James C Lee or Kenneth G C Smith.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

A full list of members and affiliations appears in the Supplementary Note.

Integrated supplementary information

Supplementary Figure 1 Quantile–quantile plot for the combined analysis of cohorts 1 and 2.

Quantile–quantile plot of the observed –log10 (P values) versus the expectation under the null hypothesis. Data are presented for the meta-analysis of cohorts 1 and 2 after imputation and quality control. The overall genomic control inflation factor (λGC) is 1.023, indicating that inflation due to population structure is negligible. SNPs at which the P value is smaller than 1 × 10⁻⁸ are represented by triangles at the top of the plot. The gray region represents the 95% concentration band.
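As a rough illustration of how a genomic control inflation factor such as the λGC quoted above can be computed, the sketch below converts P values to chi-square statistics with 1 degree of freedom and divides their median by the median expected under the null. It uses only the Python standard library. The function name `genomic_inflation` is hypothetical, and this is an assumption-laden sketch rather than the authors' pipeline.

```python
from statistics import NormalDist, median

_NORM = NormalDist()

# Median of a chi-square distribution with 1 degree of freedom (about 0.455).
CHI2_1DF_MEDIAN = _NORM.inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Return lambda_GC for an iterable of two-sided P values in (0, 1].

    Each P value is mapped to the 1-df chi-square statistic that would
    produce it; lambda_GC is the median of those statistics divided by
    the median of the null chi-square distribution.
    """
    # A P value of exactly 1 maps to a statistic of 0; inv_cdf needs 0 < x < 1.
    chi2 = [0.0 if p >= 1.0 else _NORM.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN
```

A value close to 1 means the observed test statistics follow the null distribution in bulk, which is the usual reading of negligible inflation.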

Supplementary Figure 2 Fine-mapping at the FOXO3 locus.

Prognosis GWAS results (combined cohorts) at the FOXO3 locus; adapted from the LocusTrack plot1. Top, SNPs in the region with their –log10 (P value) plotted against genomic position and colored according to LD with the lead SNP (rs147856773). Genes in the region are indicated. The expanded plot includes SNPs, genes, and ChIP–seq data from the ENCODE Project2. H3K4me1 and H3K27ac data from CD14+ monocytes and p300 binding data from myeloid K562 cells are shown (no monocyte data were available). The transcription factor binding track displays regions of transcription factor binding identified in a large collection of ChIP–seq experiments performed by the ENCODE Project (further details available at

Supplementary Figure 3 Transcription of XACT in a range of human tissues.

RNA sequencing data from the XACT locus in a range of human tissues. Raw data were downloaded and aligned against the hg19 genome using STAR3. (a–d) The data sets comprised GEO series GSE45326 (ref. 4; n = 1 per tissue) (a,b) and the Illumina Human Body Map 2.0 project (ArrayExpress E-MTAB-513, n = 1 per tissue) (c,d). The bar plots in a and c depict FPKM for the human tissues studied. The tables in b and d contain the raw and normalized data for each tissue.

Supplementary Figure 4 Relationship between association at classical HLA alleles and the frequency with which these alleles occur in non-ancestral MHC 8.1 haplotypes.

Linear regression demonstrating the relationship between the classical HLA alleles that were associated with prognosis and the frequency with which they occur in haplotypes other than the ancestral MHC 8.1 haplotype in Caucasians. Allele frequency and haplotype data were obtained from the National Bone Marrow Donor Program (Six-Locus High Resolution HLA A~C~B~DRB3/4/5~DRB1~DQB1 Haplotype Frequencies). Data were not available for HLA-DQA1. In our data, the frequency with which the lead SNP (rs9279411) was associated with non-AH8.1 haplotypes was 0.0149, suggesting that rs9279411 is a better tag for AH8.1 than any of its constituent HLA alleles (and explaining the difference in P values between the HLA alleles and the SNP association).

Supplementary Figure 5 The genetic association signals for Crohn’s disease prognosis and susceptibility at the MHC region are distinct.

(a) Manhattan plots for 22,125 MHC SNPs that were common to this analysis of Crohn’s disease prognosis (top; blue) and a large recent meta-analysis of Crohn’s disease susceptibility (5,956 cases, 14,927 controls5; bottom; red). (b) Scatterplot directly comparing the association P values between Crohn’s disease susceptibility and prognosis at these 22,125 common SNPs. Dotted lines indicate the significance threshold for suggestive association (P < 1 × 10⁻⁵). No SNPs that showed suggestive association in one analysis (of susceptibility or prognosis) were also suggestively associated in the other.
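The comparison described in panel b amounts to asking whether any SNP falls below the suggestive threshold in both analyses. A minimal sketch of that check follows, assuming each analysis is available as a mapping from SNP identifier to P value; the function name `shared_suggestive` and the input layout are illustrative assumptions, not taken from the paper.

```python
SUGGESTIVE = 1e-5  # suggestive-association threshold quoted in the legend


def shared_suggestive(prognosis_p, susceptibility_p, threshold=SUGGESTIVE):
    """Return the SNP IDs below `threshold` in both analyses.

    Both arguments are dicts mapping SNP ID -> P value. Only SNPs present
    in both dicts are compared, mirroring the restriction to common SNPs.
    """
    common = prognosis_p.keys() & susceptibility_p.keys()
    return sorted(
        snp
        for snp in common
        if prognosis_p[snp] < threshold and susceptibility_p[snp] < threshold
    )
```

An empty result corresponds to the legend's statement that no SNP was suggestively associated in both analyses.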

Supplementary Figure 6 Protein–protein interaction analysis of genes implicated at prognosis-associated loci.

DAPPLE analysis of prognosis-associated SNPs (meta P < 1 × 10⁻⁴) demonstrating known interactions between proteins at implicated loci. Colored dots represent genes at prognosis-associated loci. Gray dots represent proteins at other non-associated loci. Gray lines represent known interactions.

Supplementary Figure 7 Relationship between the observed P value and power for each of the 170 Crohn’s disease susceptibility variants.

Scatterplot of the statistical power to detect a weak general effect (OR = 1.25) plotted against the observed P value in the prognosis analysis for each of the 170 Crohn’s disease susceptibility variants. The line of best fit (dotted line) was calculated by linear regression. Lack of correlation between power and P value is consistent with the null hypothesis that none of the disease susceptibility variants are individually associated with prognosis.

Supplementary Figure 8 Genetic risk scores using the extended Crohn’s disease SNP list (P < 1 × 10⁻⁴).

(a–c) Box-and-whisker plots of weighted genetic risk scores between good- and poor-prognosis Crohn’s disease subgroups. (a) L1 (ileal disease, n = 742). (b) L2 (colonic disease, n = 724). (c) L3 (ileocolonic disease, n = 947). Boxes represent the mean and interquartile range. Whiskers represent maximum and minimum values. Genetic risk scores were calculated using an extended list of Crohn’s disease–associated SNPs (P < 1 × 10⁻⁴) and their published β values6. (d) Distribution of unweighted risk allele counts in the extended list of Crohn’s disease SNPs between the good-prognosis and poor-prognosis Crohn’s disease subgroups. Purple histogram bars represent the poor-prognosis Crohn’s disease subgroup, and yellow histogram bars represent the good-prognosis Crohn’s disease subgroup. Statistical significance was assessed using unpaired two-tailed Student's t tests, stratified by disease location; n = 2,413.
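To make the scoring concrete, the sketch below shows one common way to form a weighted genetic risk score (the sum of each risk-allele count multiplied by its β value), the corresponding unweighted risk allele count, and the pooled-variance Student's t statistic for comparing two groups of scores. It is a minimal standard-library sketch; the function names are hypothetical, and the paper's exact scoring procedure may differ.

```python
from math import sqrt
from statistics import mean, variance


def weighted_grs(risk_allele_counts, betas):
    """Weighted score: sum of risk-allele count (0, 1 or 2) times beta per SNP."""
    if len(risk_allele_counts) != len(betas):
        raise ValueError("one beta is required per SNP")
    return sum(b * c for b, c in zip(betas, risk_allele_counts))


def unweighted_grs(risk_allele_counts):
    """Unweighted score: total number of risk alleles carried."""
    return sum(risk_allele_counts)


def student_t(group_a, group_b):
    """Unpaired Student's t statistic using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
```

Turning the t statistic into a two-tailed P value requires the t distribution with n_a + n_b − 2 degrees of freedom, which is not in the standard library and is left out here.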

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–8, Supplementary Tables 1–6 and 9, and Supplementary Note (PDF 2834 kb)

Supplementary Table 7

SNPsea results for enrichment of prognosis-associated genes in known biological pathways (Gene Ontology). (XLSX 100 kb)

Supplementary Table 8

SNPsea results for enrichment of prognosis-associated genes in primary human cell types. (XLSX 22 kb)

Supplementary Table 10

Association statistics for 170 Crohn's disease susceptibility SNPs in GWAS of prognosis. (XLSX 30 kb)

Cite this article

Lee, J., Biasci, D., Roberts, R. et al. Genome-wide association study identifies distinct genetic contributions to prognosis and susceptibility in Crohn's disease. Nat Genet 49, 262–268 (2017).
