
Statistical colocalization of genetic risk variants for related autoimmune diseases in the context of common controls

A Corrigendum to this article was published on 29 July 2015

This article has been updated


Determining whether potential causal variants for related diseases are shared can identify overlapping etiologies of multifactorial disorders. Colocalization methods disentangle shared and distinct causal variants. However, existing approaches require independent data sets. Here we extend two colocalization methods to allow for the shared-control design commonly used in comparison of genome-wide association study results across diseases. Our analysis of four autoimmune diseases—type 1 diabetes (T1D), rheumatoid arthritis, celiac disease and multiple sclerosis—identified 90 regions that were associated with at least one disease, 33 (37%) of which were associated with 2 or more disorders. Nevertheless, for 14 of these 33 shared regions, there was evidence that the causal variants differed. We identified new disease associations in 11 regions previously associated with one or more of the other 3 disorders. Four of eight T1D-specific regions contained known type 2 diabetes (T2D) candidate genes (COBL, GLIS3, RNLS and BCAR1), suggesting a shared cellular etiology.


Figure 1: Venn diagram showing a summary of the disease assignments for 90 regions that showed association with at least one disease, based on the results of the Bayesian analysis.
Figure 2: Distribution of the estimated proportionality coefficient, together with its 95% confidence interval.
Figure 3: The 2q33.1 region containing the candidate gene CTLA4.
Figure 4: The 6q23.3 region containing the candidate causal gene TNFAIP3.

Change history

  • 08 July 2015

    In the version of this article initially published, the two panels in Figure 2 were presented in the incorrect order. The error has been corrected in the HTML and PDF versions of the article.




M.D.F. is funded by the Wellcome Trust (099772). C.W. and H.G. are funded by the Wellcome Trust (089989).

This work was funded by the JDRF (9-2011-253), the Wellcome Trust (091157) and the National Institute for Health Research (NIHR) Cambridge Biomedical Research Centre. The Cambridge Institute for Medical Research (CIMR) is in receipt of a Wellcome Trust Strategic Award (100140). ImmunoBase is supported by Eli Lilly and Company.

We thank the UK Medical Research Council (MRC) and Wellcome Trust for funding the collection of DNA for the British 1958 Birth Cohort (MRC grant G0000934 and Wellcome Trust grant 068545/Z/02). Control DNA samples were prepared and provided by S. Ring, R. Jones, M. Pembrey, W. McArdle, D. Strachan and P. Burton.

This research uses resources provided by the Type 1 Diabetes Genetics Consortium, a collaborative clinical study sponsored by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), the National Institute of Allergy and Infectious Diseases (NIAID), the National Human Genome Research Institute (NHGRI), the National Institute of Child Health and Human Development (NICHD) and the JDRF and supported by grant U01DK062418 from the NIDDK.

Collection of the rheumatoid arthritis data was funded by the Arthritis Foundation and the US National Institutes of Health.

We are grateful to G. Trynka and D. van Heel for use of the celiac disease data. Funding was provided by the Wellcome Trust, by grants from the Celiac Disease Consortium and an Innovative Cluster approved by the Netherlands Genomics Initiative, by the Dutch government (BSIK03009 to C.W.) and the Netherlands Organisation for Scientific Research (NWO; grant 918.66.620), and by US National Institutes of Health grant 1R01CA141743 and Fondo de Investigación Sanitaria grants FIS08/1676 and FIS07/0353.

Multiple sclerosis data were provided by S. Sawcer and the International Multiple Sclerosis Genetics Consortium. Funding was provided by the US National Institutes of Health, the Wellcome Trust, the UK Multiple Sclerosis Society, the UK MRC, the US National Multiple Sclerosis Society, the Cambridge NIHR Biomedical Research Centre, DeNDRon, the Bibbi and Niels Jensens Foundation, the Swedish Brain Foundation, the Swedish Research Council, the Knut and Alice Wallenberg Foundation, the Swedish Heart-Lung Foundation, the Foundation for Strategic Research, the Stockholm County Council, Karolinska Institutet, INSERM, Fondation d'Aide à la Recherche sur la Sclérose en Plaques, Association Française contre les Myopathies, Infrastructures en Biologie Santé et Agronomie (GIS-IBISA), the German Ministry for Education and Research, the German Competence Network Multiple Sclerosis, Deutsche Forschungsgemeinschaft, Munich Biotec Cluster M4, the Fidelity Biosciences Research Initiative, Research Foundation Flanders, Research Fund KU Leuven, the Belgian Charcot Foundation, Gemeinnützige Hertie Stiftung, University Zurich, the Danish Multiple Sclerosis Society, the Danish Council for Strategic Research, the Academy of Finland, the Sigrid Juselius Foundation, Helsinki University, the Italian Multiple Sclerosis Foundation, Fondazione Cariplo, the Italian Ministry of University and Research, the Torino Savings Bank Foundation, the Italian Ministry of Health, the Italian Institute of Experimental Neurology, the Multiple Sclerosis Association of Oslo, the Norwegian Research Council, the South-Eastern Norwegian Health Authorities, the Australian National Health and Medical Research Council, the Dutch Multiple Sclerosis Foundation and Kaiser Permanente.

We thank M. Evangelou for motivating the investigation of the FASLG association.

Author information


M.D.F. conceived and designed experiments, performed statistical analyses, analyzed data and wrote the manuscript. H.G. conceived and designed experiments. O.B. analyzed data, prepared data and maintained ImmunoBase. E.S. prepared data and maintained ImmunoBase. N.M.W. prepared data. M.B. and S.J.S. contributed multiple sclerosis data and interpreted results. J.B., J.W., A.B. and S.E. contributed rheumatoid arthritis data and interpreted results. J.A.T. analyzed data, contributed T1D data and wrote the manuscript. C.W. conceived and designed experiments, analyzed the data and wrote the manuscript. All authors reviewed and contributed to the final manuscript.

Corresponding authors

Correspondence to Mary D Fortune or Chris Wallace.

Ethics declarations

Competing interests

The JDRF/Wellcome Trust Diabetes and Inflammation Laboratory receives funding from Hoffmann La Roche and Eli Lilly and Company. ImmunoBase, for which O.B. is a principal investigator, is funded in part by Eli Lilly and Company.

Integrated supplementary information

Supplementary Figure 1 The hypotheses being tested by our two approaches.

(a) The hypotheses being tested by the Bayesian approach are represented as collections of configurations. Each configuration is represented by a line, and each circle represents one of the Q SNPs in a region under consideration. Yellow circles represent SNPs that are causal for disease 1; blue circles represent SNPs that are causal for disease 2. We assume that at most one SNP can be causal for each disease. (b) The proportional approach tests the null hypothesis of proportionality: given colocalization, we expect effect estimates at any set of SNPs to be proportional for the two traits. In this plot, each point represents the effect estimate for the two traits at a SNP. Under colocalization, these should lie on a straight line through the origin. (c) The proportional null hypothesis does not correspond to H0 from the Bayesian approach. The null hypothesis of proportionality corresponds to colocalization, single-disease association or association with neither disease. A failure to reject the null hypothesis could also be caused by insufficient power.
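To make the idea of hypotheses as collections of configurations concrete, the following minimal sketch enumerates every configuration for a small region under the stated assumption that at most one SNP is causal for each disease. The labels H0 to H4 and the function name `enumerate_configurations` are assumptions made for illustration (they follow a common convention in Bayesian colocalization) and are not taken from the legend itself.

```python
from itertools import product


def enumerate_configurations(q):
    """Group (causal SNP for disease 1, causal SNP for disease 2) pairs by hypothesis.

    SNPs are indexed 0..q-1; None means "no causal SNP for that disease".
    The hypothesis labels are illustrative assumptions:
      H0: neither disease associated
      H1: disease 1 only
      H2: disease 2 only
      H3: both associated, different causal SNPs
      H4: both associated, same causal SNP (colocalization)
    """
    choices = [None] + list(range(q))
    groups = {"H0": [], "H1": [], "H2": [], "H3": [], "H4": []}
    for snp1, snp2 in product(choices, repeat=2):
        if snp1 is None and snp2 is None:
            label = "H0"
        elif snp2 is None:
            label = "H1"
        elif snp1 is None:
            label = "H2"
        elif snp1 != snp2:
            label = "H3"
        else:
            label = "H4"
        groups[label].append((snp1, snp2))
    return groups


# Count configurations per hypothesis for a toy region of 4 SNPs.
counts = {h: len(configs) for h, configs in enumerate_configurations(4).items()}
print(counts)
```

The number of configurations with distinct causal SNPs grows as Q(Q − 1), whereas the number with a shared causal SNP grows only as Q, which is why the per-SNP prior probabilities matter when comparing these two hypotheses.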

Supplementary Figure 2 τ, the probability of colocalization, given that both traits are associated with a region.

τ can be expressed as a function of the number of SNPs in the region and p12, the probability of any given SNP being associated with both traits (we assume that the probability of a SNP being associated with the first trait only is held constant at 10⁻⁴). The histogram shows the distribution of the number of SNPs over all regions analyzed. Superimposed upon this are lines showing τ for each of p12 = 10⁻⁵, p12 = 10⁻⁶ and p12 = 10⁻⁷. The dotted line shows τ = 0.50, which we believe to be a reasonable average value. From this, we conclude that p12 = 10⁻⁶ is the most appropriate value to use.
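The dependence of τ on region size can be illustrated with a short sketch. It assumes a prior structure that is not spelled out in this legend: each of the Q SNPs has prior probability p12 of being the shared causal variant, each ordered pair of distinct SNPs has prior probability p1 × p2 of carrying the two separate causal variants, and the probability for the second trait only (p2) equals the 10⁻⁴ stated for the first trait. Under those assumptions τ = Q·p12 / (Q·p12 + Q(Q − 1)·p1·p2). The function name `tau` and the example region sizes are hypothetical.

```python
import math


def tau(n_snps, p12, p1=1e-4, p2=1e-4):
    """Prior probability of colocalization given that both traits are associated.

    Assumed prior structure (illustrative, not quoted from the source):
      shared causal SNP:       n_snps configurations, each with prior p12
      distinct causal SNPs:    n_snps * (n_snps - 1) configurations, each with prior p1 * p2
    """
    shared = n_snps * p12
    distinct = n_snps * (n_snps - 1) * p1 * p2
    return shared / (shared + distinct)


# Compare the three candidate values of p12 across a few hypothetical region sizes.
for p12 in (1e-5, 1e-6, 1e-7):
    values = [round(tau(q, p12), 3) for q in (50, 100, 500, 1000)]
    print(f"p12={p12:g}: {values}")
```

Under these assumptions τ falls as the number of SNPs in a region grows, so a single choice of p12 implies a different prior weight on colocalization in small and large regions; the legend's choice of p12 targets an average τ near 0.50 across the regions analyzed.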

Supplementary Figure 3 A Manhattan plot of the 6q25.3 region containing candidate causal gene TAGAP.

There is strong evidence of colocalization between celiac disease (CEL) and multiple sclerosis (MS) (posterior probability ≈ 0.94). However, the proportional approach shows that the risk allele for celiac disease is protective for multiple sclerosis and vice versa.

Supplementary Figure 4 A Manhattan plot of the 19p13.2 region containing the candidate causal genes ICAM1, ICAM3 and TYK2.

The SNPs considered most likely to be causal by our analysis are highlighted. The green signal is shared by all diseases, whereas the magenta signal is unique to celiac disease (CEL).

Supplementary Figure 5 Information from the UCSC Genome Browser for the 1q24.3 FASLG region.

This region shows association with type 1 diabetes and celiac disease. Note that there is strong evidence of regulatory activity in the region of rs78037977, suggesting that this SNP may be significant.

Supplementary Figure 6 Signal clouds for rs78037977, a SNP within the 1q24.3 region containing candidate causal gene FASLG.

This SNP was removed from the celiac disease data in the original analysis owing to its failing a missingness check. However, the clustering shown here is of good quality, implying that the rs78037977 genotype can be considered reliable.

Supplementary Figure 7 A Manhattan plot of the 7p12.2 region containing the candidate causal gene IKZF1.

This gene overlaps two Immunochip regions separated by a recombination hotspot, one at the 5ʹ end and one at the 3ʹ end. The 5ʹ region contains a colocalized signal for multiple sclerosis (MS) and type 1 diabetes (T1D), whereas the 3ʹ end contains only a T1D signal.

Supplementary Figure 8 P values for type 2 diabetes at the peak SNP for all T1D-associated regions.

These regions are divided into those associated with T1D only and those associated with other autoimmune diseases. Regions associated with no other autoimmune disease tend to have lower type 2 diabetes (T2D) P values. T2D data were taken from the stage 1 GWAS and stage 2 Metabochip study (summary statistics downloaded from

Supplementary Figure 9 P-value and colocalization data from the regions with newly identified associations.

The most significant SNP for the known association is found, and its P value for the newly identified association is computed. This is plotted against the posterior probability of colocalization (as computed using the Bayesian colocalization approach).

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–9, Supplementary Tables 4 and 5, and Supplementary Note. (PDF 2234 kb)

Supplementary Table 1

The regions analyzed and the number of SNPs within each region (after quality control). (CSV 6 kb)

Supplementary Table 2

Detailed results from the two colocalization methods for each region/trait pair. (CSV 110 kb)

Supplementary Table 3

The results from the conditional Bayesian analysis. (CSV 7 kb)


Cite this article

Fortune, M., Guo, H., Burren, O. et al. Statistical colocalization of genetic risk variants for related autoimmune diseases in the context of common controls. Nat Genet 47, 839–846 (2015).
