Abstract
Mendelian randomization (MR) is a valuable tool for detecting causal effects by using genetic variant associations. Opportunities to apply MR are growing rapidly with the increasing number of genome-wide association studies (GWAS). However, existing MR methods rely on strong assumptions that are often violated, leading to false positives. Correlated horizontal pleiotropy, which arises when variants affect both traits through a heritable shared factor, remains a particularly challenging problem. We propose a new MR method, Causal Analysis Using Summary Effect estimates (CAUSE), that accounts for correlated and uncorrelated horizontal pleiotropic effects. We demonstrate, in simulations, that CAUSE avoids more false positives induced by correlated horizontal pleiotropy than other methods. Applied to traits studied in recent GWAS, CAUSE detects causal relationships that have strong literature support and avoids identifying most unlikely relationships. Our results suggest that shared heritable factors are common and may lead to many false positives when alternative methods are used.
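To make the problem of correlated horizontal pleiotropy concrete, the following is a minimal Python sketch, not the authors' simulation code. It assumes a toy model in which a proportion q of variants act through a shared factor, so that their effects on the outcome are η times their effects on the exposure, while the true causal effect γ is zero. The function names, effect-size variances and standard errors are hypothetical choices made for illustration only. Under these assumptions, a standard inverse-variance weighted (IVW) estimate is pulled away from zero even though there is no causal effect.

```python
import numpy as np


def simulate_summary_stats(n_variants, gamma, eta, q, se=0.01, seed=0):
    """Simulate per-variant effect estimates for an exposure M and outcome Y.

    gamma: causal effect of M on Y.
    eta:   effect of the shared factor on Y, relative to its effect on M.
    q:     proportion of variants that act through the shared factor.
    All variances below are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    beta_m = rng.normal(0.0, 0.05, n_variants)   # true variant effects on M
    shared = rng.random(n_variants) < q          # variants acting via the shared factor
    theta = rng.normal(0.0, 0.01, n_variants)    # uncorrelated pleiotropic effects
    beta_y = gamma * beta_m + eta * shared * beta_m + theta
    # Observed estimates are the true effects plus sampling noise.
    bhat_m = beta_m + rng.normal(0.0, se, n_variants)
    bhat_y = beta_y + rng.normal(0.0, se, n_variants)
    return bhat_m, bhat_y


def ivw_estimate(bhat_m, bhat_y, se_y):
    """Inverse-variance weighted regression of outcome effects on exposure effects."""
    w = 1.0 / np.square(se_y)
    return np.sum(w * bhat_m * bhat_y) / np.sum(w * np.square(bhat_m))


# No causal effect (gamma = 0) in either scenario; only q differs.
for q in (0.0, 0.3):
    bm, by = simulate_summary_stats(5000, gamma=0.0, eta=0.05 ** 0.5, q=q, seed=1)
    print(f"q = {q}: IVW estimate = {ivw_estimate(bm, by, 0.01):.3f}")
```

In this sketch the IVW estimate stays near zero when q = 0 and moves clearly away from zero when q = 0.3, which is the kind of false positive signal that motivates modeling the shared factor explicitly.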
Data availability
All of the data analyzed are publicly available with the exception of blood pressure summary statistics from Ehret et al. (ref. 28). These are available through dbGaP Accession phs000585.v2.p1. Download links for all other datasets are available in Supplementary Table 11. Instructions and code for formatting and processing data and reproducing CAUSE analysis results can be found on the website https://jean997.github.io/cause/.
Code availability
All software and analysis code is publicly available. The CAUSE method is implemented in an R package available through GitHub. The website https://jean997.github.io/cause/ includes pipelines and instructions for replicating all results presented in this paper. The CAUSE software (R package) can be found at https://github.com/jean997/cause. The simulations software (R package) can be found at https://github.com/jean997/causeSims.
Change history
29 May 2020
A Correction to this paper has been published: https://doi.org/10.1038/s41588-020-0655-9
References
Smith, G. D. & Ebrahim, S. ‘Mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease? Int. J. Epidemiol. 32, 1–22 (2003).
Smith, G. D. & Hemani, G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum. Mol. Genet. 23, 89–98 (2014).
Boef, A. G. C., Dekkers, O. M. & Le Cessie, S. Mendelian randomization studies: a review of the approaches used and the quality of reporting. Int. J. Epidemiol. 44, 496–511 (2015).
Zheng, J. et al. Recent developments in Mendelian randomization studies. Curr. Epidemiol. Rep. 4, 330–345 (2017).
Burgess, S., Dudbridge, F. & Thompson, S. G. Combining information on multiple instrumental variables in Mendelian randomization: comparison of allele score and summarized data methods. Stat. Med. 35, 1880–1906 (2016).
Hemani, G., Bowden, J. & Davey Smith, G. Evaluating the potential role of pleiotropy in Mendelian randomization studies. Hum. Mol. Genet. 27, R195–R208 (2018).
Verbanck, M., Chen, C., Neale, B. & Do, R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat. Genet. 50, 693–698 (2018).
Bowden, J., Smith, G. D. & Burgess, S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int. J. Epidemiol. 44, 512–525 (2015).
Barfield, R., Feng, H., Gusev, A., Wu, L. & Zheng, W. Transcriptome-wide association studies accounting for co-localization using Egger regression. Genet. Epidemiol. 42, 418–433 (2018).
Zhu, Z. et al. Causal associations between risk factors and common diseases inferred from GWAS summary data. Nat. Commun. 9, 224 (2018).
Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat. Genet. 47, 1236–1241 (2015).
Anttila, V. et al. Analysis of shared heritability in common disorders of the brain. Science 360, eaap8757 (2018).
Pickrell, J. K. et al. Detection and interpretation of shared genetic influences on 42 human traits. Nat. Genet. 48, 709–717 (2016).
O’Connor, L. J. & Price, A. L. Distinguishing genetic correlation from causation across 52 diseases and complex traits. Nat. Genet. 50, 1728–1734 (2018).
Burgess, S. & Thompson, S. G. Multivariable Mendelian randomization: the use of pleiotropic genetic variants to estimate causal effects. Am. J. Epidemiol. 181, 251–260 (2015).
Bowden, J., Davey Smith, G., Haycock, P. C. & Burgess, S. Consistent estimation in Mendelian randomization with some invalid instruments using a weighted median estimator. Genet. Epidemiol. 40, 304–314 (2016).
Hartwig, F. P., Smith, G. D. & Bowden, J. Robust inference in summary data Mendelian randomization via the zero modal pleiotropy assumption. Int. J. Epidemiol. 46, 1985–1998 (2017).
Vehtari, A., Gelman, A. & Gabry, J. Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Stat. Comput. 27, 1413–1432 (2016).
Zhu, X. & Stephens, M. Bayesian large-scale multiple regression with summary statistics from genome-wide association studies. Ann. Appl. Stat. 11, 1561–1592 (2017).
Burgess, S., Butterworth, A. & Thompson, S. G. Mendelian randomization analysis with multiple genetic variants using summarized data. Genet. Epidemiol. 37, 658–665 (2013).
Hemani, G., Tilling, K. & Davey Smith, G. Orienting the causal relationship between imprecisely measured traits using GWAS summary data. PLoS Genet. 13, e1007081 (2017).
Voight, B. F. et al. Plasma HDL cholesterol and risk of myocardial infarction: a Mendelian randomisation study. Lancet 380, 572–580 (2012).
Mooradian, A. D. Dyslipidemia in type 2 diabetes mellitus. Nat. Clin. Pract. Endocrinol. Metab. 5, 150–159 (2009).
Sattar, N. et al. Statins and risk of incident diabetes: a collaborative meta-analysis of randomised statin trials. Lancet 375, 735–742 (2010).
Crandall, J. P. et al. Statin use and risk of developing diabetes: results from the diabetes prevention program. BMJ Open Diabetes Res. Care 5, e000438 (2017).
Swerdlow, D. I. et al. HMG-coenzyme A reductase inhibition, type 2 diabetes, and bodyweight: evidence from genetic analysis and randomised trials. Lancet 385, 351–361 (2015).
Fall, T. et al. Using genetic variants to assess the relationship between circulating lipids and type 2 diabetes. Diabetes 64, 2676–2684 (2015).
Ehret, G. B. et al. Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk. Nature 478, 103–109 (2011).
Fulkerson, P. C. & Rothenberg, M. E. Targeting eosinophils in allergy, inflammation and beyond. Nat. Rev. Drug Discov. 12, 117–129 (2013).
Stephens, M. False discovery rates: a new deal. Biostatistics 18, 275–295 (2017).
Wen, X. & Stephens, M. Using linear predictors to impute allele frequencies from summary or pooled genotype data. Ann. Appl. Stat. 4, 1158–1182 (2010).
Acknowledgements
This work was supported by National Institutes of Health (NIH) grants MH110531 (to X.H.) and HG002585 (to M.S.), and a research grant from the March of Dimes (to X.H.).
Author information
Contributions
J.M. and X.H. conceived and designed the model. J.M. designed the algorithm, implemented the software, conducted analyses and performed simulations. J.M., X.H. and M.S. contributed to writing the manuscript. J.H.M. contributed to preparing GWAS data. N.K. contributed to software development and data preparation and computed LD.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 False positive-power trade-offs for different proportions of correlated pleiotropic variants.
We compare the power when \(\gamma = \sqrt {0.05}\) and q = 0 to the false positive rate when γ = 0, q varies from 0 to 0.5 and \(\eta = \sqrt {0.05}\). There are 100 simulations each in the causal and non-causal scenarios. Curves are created by varying the significance threshold. Points indicate the power and false positive rate achieved at a threshold of p = 0.05.
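The construction of such curves can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' plotting code: the function name and the example p-values are hypothetical, and it assumes one p-value per simulation in each of the causal and non-causal scenarios. For every significance threshold, power is the fraction of causal simulations with a p-value below the threshold, and the false positive rate is the corresponding fraction among non-causal simulations.

```python
import numpy as np


def power_fpr_curve(p_causal, p_null, thresholds):
    """Return (false positive rate, power) at each significance threshold.

    p_causal: p-values from simulations with a true causal effect.
    p_null:   p-values from simulations with no causal effect.
    """
    p_causal = np.asarray(p_causal, dtype=float)
    p_null = np.asarray(p_null, dtype=float)
    # A test is called significant when its p-value is below the threshold.
    power = np.array([(p_causal < t).mean() for t in thresholds])
    fpr = np.array([(p_null < t).mean() for t in thresholds])
    return fpr, power


# Hypothetical p-values, for illustration only.
example_causal = [0.001, 0.01, 0.04, 0.2]
example_null = [0.03, 0.3, 0.6, 0.9]
fpr, power = power_fpr_curve(example_causal, example_null, np.linspace(0.0, 1.0, 101))
```

Plotting power against the false positive rate over the grid of thresholds traces one curve per method, and the single point at a threshold of 0.05 corresponds to the marked points described in the caption.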
Extended Data Fig. 2 Tests for causal effects of risk factors on diseases.
Each cell summarizes the results of six methods for a pair of traits. In the left column of the cell, methods from bottom to top are CAUSE, IVW regression, and Egger regression. In the right column, methods from bottom to top are weighted median, weighted mode, and MR-PRESSO. Filled symbols indicate a nominally significant p < 0.05.
Extended Data Fig. 3 Tests for causal effects of disease outcomes on risk factors.
Tests for causal effects of disease outcomes on mediators. Each cell summarizes the results of six methods for a pair of traits. In the left column of the cell, methods from bottom to top are CAUSE, IVW regression, and Egger regression. In the right column, methods from bottom to top are weighted median, weighted mode, and MR-PRESSO. Filled symbols indicate a nominally significant p < 0.05.
Extended Data Fig. 4 Workflow of a CAUSE analysis.
Dashed boxes represent input data. Each solid box is an analysis step completed by the given function in the cause R package. LD pruning can be parallelized over chromosomes. Text at the bottom of boxes indicates user provided parameters and their default values. All analyses presented are run with default parameters.
Supplementary information
Supplementary Information
Supplementary Tables 1, 2, 6 and 7 and Note
Supplementary Tables
Supplementary Tables 3–5 and 8–11
About this article
Cite this article
Morrison, J., Knoblauch, N., Marcus, J.H. et al. Mendelian randomization accounting for correlated and uncorrelated pleiotropic effects using genome-wide summary statistics. Nat Genet 52, 740–747 (2020). https://doi.org/10.1038/s41588-020-0631-4
DOI: https://doi.org/10.1038/s41588-020-0631-4