Mendelian randomization accounting for correlated and uncorrelated pleiotropic effects using genome-wide summary statistics

A Publisher Correction to this article was published on 29 May 2020

This article has been updated


Mendelian randomization (MR) is a valuable tool for detecting causal effects by using genetic variant associations. Opportunities to apply MR are growing rapidly with the increasing number of genome-wide association studies (GWAS). However, existing MR methods rely on strong assumptions that are often violated, leading to false positives. Correlated horizontal pleiotropy, which arises when variants affect both traits through a heritable shared factor, remains a particularly challenging problem. We propose a new MR method, Causal Analysis Using Summary Effect estimates (CAUSE), that accounts for correlated and uncorrelated horizontal pleiotropic effects. We demonstrate, in simulations, that CAUSE avoids more false positives induced by correlated horizontal pleiotropy than other methods. Applied to traits studied in recent GWAS, we find that CAUSE detects causal relationships that have strong literature support and avoids identifying most unlikely relationships. Our results suggest that shared heritable factors are common and may lead to many false positives using alternative methods.
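The false-positive mechanism described above can be illustrated with a small simulation. The following is a minimal Python sketch, not the CAUSE method and not the simulation design used in the paper. It assumes a toy model in which a fraction q of variants act through a shared factor, so that their outcome effect is eta times their exposure effect even when the causal effect gamma is zero. It then applies a plain inverse-variance-weighted (IVW) estimate. The function names, the effect-size standard deviation (0.05), and the standard error (0.01) are all hypothetical choices made for illustration.

```python
import math
import random


def simulate_summary_stats(n_variants, gamma, eta, q, se=0.01, seed=0):
    """Simulate per-variant effect estimates on an exposure M and an outcome Y.

    Assumed toy model: every variant affects M. A fraction q of variants act
    through a shared factor and contribute eta * beta_m to the outcome effect
    (correlated pleiotropy). gamma is the causal effect of M on Y.
    """
    rng = random.Random(seed)
    stats = []
    for _ in range(n_variants):
        beta_m = rng.gauss(0.0, 0.05)       # true effect on the exposure
        shared = rng.random() < q           # does this variant act via the shared factor?
        beta_y = gamma * beta_m + (eta * beta_m if shared else 0.0)
        # Observed estimates are the true effects plus sampling noise.
        stats.append((beta_m + rng.gauss(0.0, se), beta_y + rng.gauss(0.0, se), se))
    return stats


def ivw_estimate(stats):
    """IVW estimate: weighted regression of outcome effects on exposure
    effects through the origin, with weights 1 / se^2."""
    num = sum(bm * by / se ** 2 for bm, by, se in stats)
    den = sum(bm * bm / se ** 2 for bm, by, se in stats)
    return num / den


if __name__ == "__main__":
    eta = math.sqrt(0.05)
    for q in (0.0, 0.25, 0.5):
        est = ivw_estimate(simulate_summary_stats(2000, gamma=0.0, eta=eta, q=q, seed=1))
        print(f"q={q:.2f}  IVW estimate with no causal effect: {est:.3f}")
```

In this toy setting the IVW estimate drifts away from zero as q grows, even though gamma is zero throughout. This is the kind of spurious signal that correlated pleiotropy can induce.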

Fig. 1: Assumptions of traditional MR and the CAUSE model.
Fig. 2: Performance of CAUSE and other MR methods in simulated data.
Fig. 3: False positives resulting from reverse causal effects.
Fig. 4: Effect-size estimates and variant-level contribution to CAUSE test statistics for four trait pairs.
Fig. 5: Tests for causal effects of blood cell composition on immune-mediated traits.

Data availability

All of the data analyzed are publicly available with the exception of blood pressure summary statistics from Ehret et al. (ref. 28). These are available through dbGaP accession phs000585.v2.p1. Download links for all other datasets are available in Supplementary Table 11. Instructions and code for formatting and processing data and reproducing CAUSE analysis results can be found on the website

Code availability

All software and analysis code is publicly available. The CAUSE method is implemented in an R package available through GitHub. The website includes pipelines and instructions for replicating all results presented in this paper. The CAUSE software (R package) can be found at The simulations software (R package) can be found at

Change history


References

  1. Smith, G. D. & Ebrahim, S. ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? Int. J. Epidemiol. 32, 1–22 (2003).

  2. Smith, G. D. & Hemani, G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum. Mol. Genet. 23, 89–98 (2014).

  3. Boef, A. G. C., Dekkers, O. M. & Le Cessie, S. Mendelian randomization studies: a review of the approaches used and the quality of reporting. Int. J. Epidemiol. 44, 496–511 (2015).

  4. Zheng, J. et al. Recent developments in Mendelian randomization studies. Curr. Epidemiol. Rep. 4, 330–345 (2017).

  5. Burgess, S., Dudbridge, F. & Thompson, S. G. Combining information on multiple instrumental variables in Mendelian randomization: comparison of allele score and summarized data methods. Stat. Med. 35, 1880–1906 (2016).

  6. Hemani, G., Bowden, J. & Davey Smith, G. Evaluating the potential role of pleiotropy in Mendelian randomization studies. Hum. Mol. Genet. 27, R195–R208 (2018).

  7. Verbanck, M., Chen, C., Neale, B. & Do, R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat. Genet. 50, 693–698 (2018).

  8. Bowden, J., Smith, G. D. & Burgess, S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int. J. Epidemiol. 44, 512–525 (2015).

  9. Barfield, R., Feng, H., Gusev, A., Wu, L. & Zheng, W. Transcriptome-wide association studies accounting for co-localization using Egger regression. Genet. Epidemiol. 42, 418–433 (2018).

  10. Zhu, Z. et al. Causal associations between risk factors and common diseases inferred from GWAS summary data. Nat. Commun. 9, 224 (2018).

  11. Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).

  12. Anttila, V. et al. Analysis of shared heritability in common disorders of the brain. Science 360, eaap8757 (2018).

  13. Pickrell, J. K. et al. Detection and interpretation of shared genetic influences on 42 human traits. Nat. Genet. 48, 709–717 (2016).

  14. O’Connor, L. J. & Price, A. L. Distinguishing genetic correlation from causation across 52 diseases and complex traits. Nat. Genet. 50, 1728–1734 (2018).

  15. Burgess, S. & Thompson, S. G. Multivariable Mendelian randomization: the use of pleiotropic genetic variants to estimate causal effects. Am. J. Epidemiol. 181, 251–260 (2015).

  16. Bowden, J., Davey Smith, G., Haycock, P. C. & Burgess, S. Consistent estimation in Mendelian randomization with some invalid instruments using a weighted median estimator. Genet. Epidemiol. 40, 304–314 (2016).

  17. Hartwig, F. P., Smith, G. D. & Bowden, J. Robust inference in summary data Mendelian randomization via the zero modal pleiotropy assumption. Int. J. Epidemiol. 46, 1985–1998 (2017).

  18. Vehtari, A., Gelman, A. & Gabry, J. Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Stat. Comput. 27, 1413–1432 (2016).

  19. Zhu, X. & Stephens, M. Bayesian large-scale multiple regression with summary statistics from genome-wide association studies. Ann. Appl. Stat. 11, 1561–1592 (2017).

  20. Burgess, S., Butterworth, A. & Thompson, S. G. Mendelian randomization analysis with multiple genetic variants using summarized data. Genet. Epidemiol. 37, 658–665 (2013).

  21. Hemani, G., Tilling, K. & Davey Smith, G. Orienting the causal relationship between imprecisely measured traits using GWAS summary data. PLoS Genet. 13, e1007081 (2017).

  22. Voight, B. F. et al. Plasma HDL cholesterol and risk of myocardial infarction: a Mendelian randomisation study. Lancet 380, 572–580 (2012).

  23. Mooradian, A. D. Dyslipidemia in type 2 diabetes mellitus. Nat. Clin. Pract. 5, 150–159 (2009).

  24. Sattar, N. et al. Statins and risk of incident diabetes: a collaborative meta-analysis of randomised statin trials. Lancet 375, 735–742 (2010).

  25. Crandall, J. P. et al. Statin use and risk of developing diabetes: results from the diabetes prevention program. BMJ Open Diabetes Res. Care 5, e000438 (2017).

  26. Swerdlow, D. I. et al. HMG-coenzyme A reductase inhibition, type 2 diabetes, and bodyweight: evidence from genetic analysis and randomised trials. Lancet 385, 351–361 (2015).

  27. Fall, T. et al. Using genetic variants to assess the relationship between circulating lipids and type 2 diabetes. Diabetes 64, 2676–2684 (2015).

  28. Ehret, G. B. et al. Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk. Nature 478, 103–109 (2011).

  29. Fulkerson, P. C. & Rothenberg, M. E. Targeting eosinophils in allergy, inflammation and beyond. Nat. Rev. Drug Discov. 12, 117–129 (2013).

  30. Stephens, M. False discovery rates: a new deal. Biostatistics 18, 275–295 (2017).

  31. Wen, X. & Stephens, M. Using linear predictors to impute allele frequencies from summary or pooled genotype data. Ann. Appl. Stat. 4, 1158–1182 (2010).


Acknowledgements

This work was supported by National Institutes of Health (NIH) grants MH110531 (to X.H.) and HG002585 (to M.S.), and a research grant from the March of Dimes (to X.H.).

Author information

Authors and Affiliations



Contributions

J.M. and X.H. conceived and designed the model. J.M. designed the algorithm, implemented the software, conducted analyses and performed simulations. J.M., X.H. and M.S. contributed to writing the manuscript. J.H.M. contributed to preparing GWAS data. N.K. contributed to software development and data preparation and computed LD.

Corresponding author

Correspondence to Xin He.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 False positive-power trade-offs for different proportions of correlated pleiotropic variants.

We compare the power when \(\gamma = \sqrt{0.05}\) and \(q = 0\) to the false positive rate when \(\gamma = 0\), \(q\) varies from 0 to 0.5 and \(\eta = \sqrt{0.05}\). There are 100 simulations each in the causal and non-causal scenarios. Curves are created by varying the significance threshold. Points indicate the power and false positive rate achieved at a threshold of p = 0.05.
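The curve construction described in this legend can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the example p-values are hypothetical, and the paper's own analysis code is not reproduced here. Given p-values from simulations with a true causal effect and from simulations without one, each significance threshold yields one (false positive rate, power) point.

```python
def rejection_rate(p_values, threshold):
    """Fraction of simulations whose p-value falls below the threshold."""
    return sum(p < threshold for p in p_values) / len(p_values)


def tradeoff_curve(causal_p, null_p, thresholds):
    """Return (false positive rate, power) pairs, one per threshold.

    causal_p: p-values from simulations with a true causal effect.
    null_p:   p-values from simulations with no causal effect.
    """
    return [
        (rejection_rate(null_p, t), rejection_rate(causal_p, t))
        for t in thresholds
    ]


if __name__ == "__main__":
    # Hypothetical p-values, for illustration only.
    causal_p = [0.001, 0.02, 0.04, 0.3]
    null_p = [0.01, 0.2, 0.5, 0.9]
    for fpr, power in tradeoff_curve(causal_p, null_p, [0.01, 0.05, 0.5]):
        print(f"false positive rate={fpr:.2f}  power={power:.2f}")
```

Sweeping the threshold over a fine grid traces the full curve. The single point at a threshold of 0.05 corresponds to the marked points in the figure.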

Extended Data Fig. 2 Tests for causal effects of risk factors on diseases.

Each cell summarizes the results of six methods for a pair of traits. In the left column of the cell, methods from bottom to top are CAUSE, IVW regression, and Egger regression. In the right column, methods from bottom to top are weighted median, weighted mode, and MR-PRESSO. Filled symbols indicate a nominally significant result (p < 0.05).

Extended Data Fig. 3 Tests for causal effects of disease outcomes on risk factors.

Tests for causal effects of disease outcomes on mediators. Each cell summarizes the results of six methods for a pair of traits. In the left column of the cell, methods from bottom to top are CAUSE, IVW regression, and Egger regression. In the right column, methods from bottom to top are weighted median, weighted mode, and MR-PRESSO. Filled symbols indicate a nominally significant result (p < 0.05).

Extended Data Fig. 4 Workflow of a CAUSE analysis.

Dashed boxes represent input data. Each solid box is an analysis step completed by the given function in the cause R package. LD pruning can be parallelized over chromosomes. Text at the bottom of boxes indicates user-provided parameters and their default values. All analyses presented are run with default parameters.

Supplementary information

Supplementary Information

Supplementary Tables 1, 2, 6 and 7 and Note

Reporting Summary

Supplementary Tables

Supplementary Tables 3–5 and 8–11

About this article

Cite this article

Morrison, J., Knoblauch, N., Marcus, J.H. et al. Mendelian randomization accounting for correlated and uncorrelated pleiotropic effects using genome-wide summary statistics. Nat Genet 52, 740–747 (2020).
