Understanding the assumptions underlying Mendelian randomization

de Leeuw, Christiaan; Savage, Jeanne; Bucur, Ioan Gabriel; Heskes, Tom; Posthuma, Danielle

doi:10.1038/s41431-022-01038-5

Review Article
Published: 26 January 2022

European Journal of Human Genetics volume 30, pages 653–660 (2022)

Abstract

With the rapidly increasing availability of large genetic data sets in recent years, Mendelian Randomization (MR) has quickly gained popularity as a novel secondary analysis method. Leveraging genetic variants as instrumental variables, MR can be used to estimate the causal effects of one phenotype on another even when experimental research is not feasible, and therefore has the potential to be highly informative. It is dependent on strong assumptions however, often producing biased results if these are not met. It is therefore imperative that these assumptions are well-understood by researchers aiming to use MR, in order to evaluate their validity in the context of their analyses and data. The aim of this perspective is therefore to further elucidate these assumptions and the role they play in MR, as well as how different kinds of data can be used to further support them.

Introduction

Genetic research has expanded enormously in the last two decades, and a wealth of genetic data is now available for a wide variety of human phenotypes [1]. Besides providing ever-increasing insight into the genetic etiology of these phenotypes, it may provide an opportunity to study causal relations between these phenotypes as well.

Although causal inference is generally considered the domain of experimental methods like randomized controlled trials (RCT), some nonexperimental methods can be applied to estimate causal relations indirectly [2]. Though less robust, these can be used when RCTs are not a viable option. Mendelian Randomization (MR), a form of instrumental variable analysis that uses genetic variants as instruments to investigate causal relations between phenotypes, is one such method [3]. MR has become very popular in recent years, with thousands of methodological and applied MR studies published to date [4, 5], and with the continued growth of available genetic data this trend will likely persist.

MR relies on strong assumptions however, yielding biased and misleading results if those assumptions fail [6, 7]. Given the widespread popularity of MR, it is therefore imperative that these assumptions are clearly understood by the researchers using it, to allow them to properly evaluate the validity of these assumptions in the context of their own data and analyses [8,9,10].

The aim of this Perspective is to outline the assumptions that are needed to perform MR, what role those assumptions play in the analysis and its interpretation, and what information different elements of input data contribute to the support of these assumptions. Our aim is not to give an exhaustive overview of individual methods, but rather to elucidate the underlying logic of MR in its different forms. As such, we will also abstract away from issues pertaining to estimation, assuming an idealized scenario in which all associations between observed variables are fully known, examining what challenges remain even when estimation uncertainty is entirely eliminated.

Core principle

The aim of an MR analysis is to estimate and test the causal effect of a putative causal phenotype X, the exposure, on another phenotype Y, the outcome. It uses the principles of instrumental variable analysis to do so, with the genotype G_j of a genetic variant j serving as the instrument [8, 11].

To serve as a valid instrument for the causal effect of exposure on outcome, there must be an association between G_j and the exposure. Moreover, it must be the case that any association of G_j with the outcome is mediated by the exposure, as depicted in Fig. 1A. In other words, associations of G_j directly with the outcome, or with a variable C that acts as a confounder of exposure and outcome cannot be present (Fig. 1B). There is no requirement that G_j itself has a causal effect (see also Supplementary Information—Relevance assumption); if variant j is in LD with causal variants that are valid instruments, then G_j is a valid instrumental variable as well (Fig. 1C). For ease of notation however, the graphs used throughout the paper will assume the selected variants used are causal.

**Fig. 1: Graphical representation of valid instrument causal scenarios, for a variant j.**

If we assume the effect sizes of all associations and causal effects to be constant (i.e., simple linear relations), we can easily see how this can provide the parameter β_XY of the causal effect of the exposure on the outcome. Denoting the marginal associations of G_j with exposure and outcome as γ_Xj and γ_Yj respectively, for the assumed scenario in Fig. 1A we can express these as \(\gamma _{Xj} \,=\, \alpha _{Xj}\) and \(\gamma _{Yj} \,=\, \alpha _{Xj}\beta _{XY}\). Because the association γ_Yj between G_j and the outcome is fully mediated by the exposure, it equals the causal effect β_XY scaled by the causal effect \(\alpha _{Xj}\) of G_j on the exposure.

Thus, defining the ratio of marginal effects \(\beta _j \,=\, \frac{{\gamma _{Yj}}}{{\gamma _{Xj}}}\), it follows that if variant j is a valid instrument then \(\beta _j \,=\, \frac{{\alpha _{Xj}\beta _{XY}}}{{\alpha _{Xj}}} \,=\, \beta _{XY}\) [11]. In other words, the variant-specific causal effect α_Xj cancels out in the ratio of the marginal genetic effects, making β_j equal to the causal effect parameter β_XY for every variant that is a valid instrument. Although not every MR method is explicitly defined in terms of β_j, they all ultimately depend on this property. To examine the impact of different causal scenarios, we will thus focus on the functional form β_j takes in those scenarios, and whether it still equals β_XY.
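The cancellation described above can be illustrated with a minimal numerical sketch. It assumes the idealized, noise-free setting used throughout this paper; the parameter values and the function name `ratio_estimate` are hypothetical and chosen only for illustration.

```python
def ratio_estimate(gamma_x: float, gamma_y: float) -> float:
    """Ratio of marginal associations: beta_j = gamma_Yj / gamma_Xj."""
    if gamma_x == 0:
        raise ValueError("variant has no association with the exposure")
    return gamma_y / gamma_x


# Hypothetical parameter values (not taken from any data set).
beta_xy = 0.3                  # causal effect of exposure on outcome
alpha_x = [0.05, 0.10, 0.20]   # variant-specific effects on the exposure

# Marginal associations under the valid-instrument scenario of Fig. 1A.
gamma_x = list(alpha_x)
gamma_y = [a * beta_xy for a in alpha_x]

# Every valid instrument yields the same ratio, equal to beta_xy.
beta_j = [ratio_estimate(gx, gy) for gx, gy in zip(gamma_x, gamma_y)]
```

Although the three variants differ in their effect on the exposure, each ratio equals the assumed causal effect (up to floating-point rounding).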

We can thus obtain β_XY using any genetic variant for which the instrumental variable assumptions hold [12], since all such variants provide the same causal parameter. However, the a priori plausibility of these assumptions varies greatly, depending particularly on the exposure being studied, and establishing that the variants used are indeed valid instruments requires further analysis and data. As such it is crucial that active steps are taken to ensure that all assumptions are met, since reliable interpretation of MR results is otherwise impossible.

MR also generally depends on some additional assumptions [8, 13], which are listed in Table 1. Different methods may relax these additional assumptions in various ways, so they are not always all required. In the next two sections, we will examine causal scenarios that violate the instrumental variable assumptions, and various strategies to deal with such violations, either by direct modeling and testing or by leveraging constrained data. Following that we discuss the role of the additional assumptions and what can happen if they do not hold. Throughout, we will use the simplest causal scenario that can illustrate the particular issue being discussed, rather than providing an exhaustive list of such scenarios. Additional discussion and mathematical details for these issues are found in the Supplementary Information. An overview of the main methods referenced is given in Table 2.

Table 1 Instrumental variable and other assumptions relevant for MR.

Table 2 Overview of referenced methods.

Evaluating instrumental variable assumptions

Heterogeneity of causal estimates

One common way in which the exclusion restriction can be violated is by a direct causal effect of the genetic variant on the outcome (Fig. 2A). The reason why this is a problem can be readily discerned when considering how this changes the functional form of the marginal association γ_Yj of the variant with the outcome, which becomes \(\gamma _{Yj} \,=\, \alpha _{Xj}\beta _{XY} \,+\, \alpha _{Yj}\). This means that the ratio parameter β_j now equals \(\beta _j \,=\, \frac{{\alpha _{Xj}\beta _{XY} \,+\, \alpha _{Yj}}}{{\alpha _{Xj}}} \,=\, \beta _{XY} \,+\, \frac{{\alpha _{Yj}}}{{\alpha _{Xj}}}\). The same thing happens in a scenario where there is LD between G_j and another variant G_k that has a causal effect on the outcome (Fig. 2B).

**Fig. 2: Graphical representation of several violations of instrumental variable assumptions, for a variant j.**

In other words, β_j becomes offset from the value of the true causal effect β_XY by a bias term specific to that variant. Although in this case we can no longer directly obtain the causal effect from β_j, the way this type of violation manifests itself makes it relatively straightforward to detect. Because this bias term is variant-specific it will tend to differ across (independent) variants, resulting in a heterogeneity of their β_j values (see also Supplementary Information—heterogeneity of estimated causal effects). By contrast, for a set of variants that are all valid instruments, their β_j will be the same, because as noted above they will all equal the causal effect parameter β_XY.
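As a rough illustration of how a direct effect produces heterogeneity, the sketch below computes β_j for a few variants under the scenario of Fig. 2A. All values are hypothetical, and estimation noise is ignored as elsewhere in this paper.

```python
# Hypothetical parameter values for the scenario of Fig. 2A.
beta_xy = 0.3
alpha_x = [0.05, 0.10, 0.20, 0.10]   # effects of each variant on the exposure
alpha_y = [0.00, 0.02, -0.04, 0.00]  # direct effects on the outcome (0 = valid)

# Marginal associations: gamma_Xj = alpha_Xj, gamma_Yj = alpha_Xj*beta_XY + alpha_Yj.
gamma_x = list(alpha_x)
gamma_y = [ax * beta_xy + ay for ax, ay in zip(alpha_x, alpha_y)]

# Ratio per variant, and its offset from the true causal effect.
beta_j = [gy / gx for gx, gy in zip(gamma_x, gamma_y)]
bias_j = [b - beta_xy for b in beta_j]   # equals alpha_Yj / alpha_Xj
```

With these values the first and last variants return 0.3, while the second and third return roughly 0.5 and 0.1, so the invalid instruments stand out through their differing ratios.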

Given this, if we have multiple variants available as potential genetic instruments, an obvious and commonly used way to leverage this is therefore to test for heterogeneity of the β_j. Then, if such heterogeneity is found to be present, we can prune away variants from the selection until we retain a subset of variants with homogeneous β_j. In this way we can rule out violations of the exclusion restriction of the kind depicted in Fig. 2A, B, and under the assumption that the remaining variants are valid instruments we can use those variants to obtain β_XY as before [14,15,16].

An alternative to explicit heterogeneity testing and pruning is to use “robust” models for multivariant MR analysis, which do not require that all variants used for their input are valid instruments (see also Supplementary Information—robust methods). These subdivide into two main types. The first type assumes that only a subset of the variants used are valid instruments, and takes either a median- or mode-based approach. Median-based methods only require that more than half of the variants are valid instruments, which guarantees that the median of the β_j equals β_XY [17]. Mode-based methods make an even weaker assumption, only requiring that the largest subset of variants with homogeneous β_j consists of valid instruments, in which case the mode of the β_j will equal β_XY [18,19,20].
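The difference between the median- and mode-based assumptions can be sketched with two hypothetical sets of noise-free β_j values, using a true causal effect of 0.3. This is a simplification: published mode-based methods work with estimated, noisy ratios and typically use a smoothed density rather than exact ties.

```python
from statistics import mean, median, mode

# Hypothetical ratios; valid instruments all equal the true effect 0.3.
majority_valid = [0.3, 0.3, 0.3, 0.8, 0.9]    # 3 of 5 variants valid
plurality_valid = [0.3, 0.3, 0.8, 0.9, 1.1]   # only 2 of 5 valid, but the
                                              # largest homogeneous subset

# Majority valid: the median recovers the causal effect, the mean does not.
median_majority = median(majority_valid)
mean_majority = mean(majority_valid)

# Plurality valid: the median now fails, the mode still recovers the effect.
median_plurality = median(plurality_valid)
mode_plurality = mode(plurality_valid)
```

In the second set fewer than half of the variants are valid, so the median lands on an invalid instrument, while the mode still identifies the largest homogeneous subset.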

The second type of robust model does not require that any variant is a valid instrument. Instead, it models the marginal association of each variant with the outcome as \(\gamma _{Yj} \,=\, \gamma _{Xj}\beta _{XY} \,+\, \delta _j\) with a heterogeneity term δ_j, and then makes an assumption about the distribution of these δ_j. The most prominent example of this second type is the MR-Egger model [21], which is based on the so-called InSIDE (Instrument Strength Independent of Direct Effect) assumption. This assumption states that these δ_j terms are independent of the marginal associations γ_Xj of the variant with the exposure, and based on this the MR-Egger model can estimate \(\beta _{XY}\) using essentially a linear regression of \(\gamma _{Yj}\) on γ_Xj. For valid instruments this assumption is automatically true, since \(\delta _j\) is zero, and for a scenario such as in Fig. 2A it is very plausible as well: in that case, \(\gamma _{Xj} \,=\, \alpha _{Xj}\) and \(\delta _j \,=\, \alpha _{Yj}\), and since \(\alpha _{Xj}\) and \(\alpha _{Yj}\) represent two distinct causal paths that share no mediating variables there is no clear mechanism by which they would become correlated.
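The regression idea behind this second type of model can be sketched as follows. This is an unweighted ordinary least-squares fit with an intercept, which is a simplification of MR-Egger as published (the actual method works with estimated associations and their precision). The values are hypothetical and constructed so that the δ_j are exactly uncorrelated with the γ_Xj; the helper `ols_slope_intercept` is ours.

```python
def ols_slope_intercept(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical values: no variant is a valid instrument (all delta_j != 0),
# but the delta_j have zero covariance with gamma_Xj, mimicking InSIDE.
beta_xy = 0.5
gamma_x = [0.1, 0.2, 0.3, 0.4]
delta = [0.07, -0.03, -0.03, 0.07]
gamma_y = [gx * beta_xy + d for gx, d in zip(gamma_x, delta)]

# Per-variant ratios are heterogeneous and none equals beta_xy ...
beta_j = [gy / gx for gx, gy in zip(gamma_x, gamma_y)]

# ... but the regression slope recovers beta_xy; the intercept absorbs
# the average of the delta_j.
slope, intercept = ols_slope_intercept(gamma_x, gamma_y)
```

If the δ_j were instead correlated with the γ_Xj, part of that correlation would end up in the slope, which is why the InSIDE assumption carries the weight of the inference here.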

Robust methods can thus in principle directly estimate the causal effect from a mixture of valid and invalid instruments, but this requires specific assumptions about the degree or structure of the heterogeneity, which are not directly testable. Even when using such robust methods, it is therefore still imperative that the heterogeneity, and the validity of the assumptions made about it (with specific valid subsets of variants present in the data for median- and mode-based methods, or the independence specified by InSIDE for MR-Egger), are explicitly considered.

Moreover, homogeneity of the \(\beta _j\) does not imply that the instrumental variable assumptions (or the InSIDE assumption) do hold, since there are other causal scenarios that violate the assumptions without resulting in heterogeneity. For the remainder of the paper, we will therefore generally assume that heterogeneity has been dealt with, and focus on scenarios where all variants used correspond to the same homogeneous causal graph, and with \(\beta _j\) equal to the same value \(\beta\).

Reverse causation

The “reverse causation” scenario is illustrated in Fig. 2C, the mirror image of Fig. 1A, with the genetic variant now exerting a direct causal effect on the outcome, which in turn has a causal effect on the exposure. This is also a violation of the exclusion restriction, but unlike in Fig. 2A, B this does not result in heterogeneity. This is because the marginal genetic associations of the variant are \(\gamma _{Xj} \,=\, \alpha _{Yj}\beta _{YX}\) and \(\gamma _{Yj} \,=\, \alpha _{Yj}\), which means that \(\beta \,=\, \frac{{\alpha _{Yj}}}{{\alpha _{Yj}\beta _{YX}}} \,=\, \frac{1}{{\beta _{YX}}}\), the inverse of the causal effect of the outcome on the exposure. As such, the value of \(\beta\) we would get in this scenario is completely different from the \(\beta _{XY}\) we are attempting to estimate, which in this case is simply zero. The InSIDE assumption also does not hold here, since the heterogeneity term \(\delta _j \,=\, \alpha _{Yj}\), meaning that both \(\delta _j\) and \(\gamma _{Xj}\) are dependent on the same parameter \(\alpha _{Yj}\).
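A small numerical sketch of the reverse causation scenario, with hypothetical values, shows that the ratios are homogeneous yet unrelated to the (zero) effect of the exposure on the outcome.

```python
# Hypothetical parameter values for the scenario of Fig. 2C.
beta_yx = 0.4                  # causal effect of the outcome on the exposure
alpha_y = [0.05, 0.10, 0.20]   # effects of each variant on the outcome

# Marginal associations under reverse causation.
gamma_y = list(alpha_y)
gamma_x = [a * beta_yx for a in alpha_y]

# Every variant returns 1 / beta_YX, here 2.5, with no heterogeneity.
beta_j = [gy / gx for gx, gy in zip(gamma_x, gamma_y)]
```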

When the genetic effect on the outcome is fully mediated by the exposure as in Fig. 1A, it follows that the correlations between the variant and the outcome are weaker than those between the variant and the exposure; unless the exposure fully determines the outcome in which case the correlations are equal. In case of reverse causation, as in Fig. 2C, the opposite is true, with the correlations between variant and exposure being weaker than those between variant and outcome. For Fig. 1A, since in our notation all variables are standardized, the correlations of the variant with the exposure and outcome equal the genetic associations \(\gamma _{Xj}\) and \(\gamma _{Yj}\) respectively, and the standardization also means that the absolute value of all causal parameters is at most one as well, including \(\beta _{XY}\). Since as previously noted \(\gamma _{Yj} \,=\, \gamma _{Xj}\beta _{XY}\), the absolute value of \(\gamma _{Yj}\) must therefore be smaller than (or at most equal to) that of \(\gamma _{Xj}\).

It is therefore generally possible to infer direction from the relative size of these correlations, or more directly from the causal estimate itself. In case of reverse causation \(\beta _j \,=\, \frac{1}{{\beta _{YX}}}\), which (since \(|\beta _{YX}|\) is at most 1) will have an absolute value greater than or equal to 1. As such we can decide between forward and reverse causation by determining whether the absolute value of \(\beta _j\) is smaller or greater than 1. This can be assessed manually by running MR analyses in both directions or using a model that incorporates both [22, 23]. Moreover, depending on the choice of exposure and outcome we will often already have strong a priori information about the causal direction, and in some cases reverse causation is inherently impossible because the exposure is known to occur before the outcome. In this regard, resolving the order of causation is often relatively straightforward in practice.
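The decision rule can be written down as a short sketch. It assumes standardized variables, a homogeneous noise-free ratio, and that one of the two scenarios (Fig. 1A or Fig. 2C) is in fact correct; the function name `infer_direction` is hypothetical.

```python
def infer_direction(beta: float) -> str:
    """Classify a homogeneous ratio estimate under standardized variables.

    Assumes that either forward causation (Fig. 1A) or reverse causation
    (Fig. 2C) holds; it cannot detect mediated confounding.
    """
    size = abs(beta)
    if size < 1:
        return "forward"
    if size > 1:
        return "reverse"
    return "undetermined"  # |beta| == 1 is compatible with both scenarios


forward_example = infer_direction(0.3)    # e.g. beta_XY = 0.3
reverse_example = infer_direction(-2.5)   # e.g. 1 / beta_YX with beta_YX = -0.4
```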

However, these methods and a priori information can only help to decide between forward and reverse causation as long as the independence assumption holds, and it is thus presumed that one of these two scenarios is correct. This therefore still requires ruling out the possibility of genetic effects on exposure and outcome being mediated by one of their confounders.

Analysing potential confounders

Two variations of what we will refer to as “mediated confounding” are depicted in Fig. 2D, E, with a causal effect \(\alpha _{Cj}\) of the variant on a confounder \(C\), violating the independence assumption. These scenarios result in a \(\beta\) value of \(\beta _{XY} \,+\, \frac{{\beta _{CY}}}{{\beta _{CX}}}\) (with \(\beta _{XY} \,=\, 0\) for Fig. 2D), demonstrating a bias away from the true causal effect of the exposure on the outcome. The InSIDE assumption is violated here as well, with both \(\gamma _{Xj} \,=\, \alpha _{Cj}\beta _{CX}\) and \(\delta _j \,=\, \alpha _{Cj}\beta _{CY}\) dependent on \(\alpha _{Cj}\). Note that these scenarios are specific to the particular confounder \(C\), and there may be other sets of variants operating on different confounder variables, with correspondingly different biases.
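To make the bias concrete, the following sketch works through the scenario of Fig. 2E with hypothetical parameter values, where every variant acts only through the confounder.

```python
# Hypothetical parameter values for mediated confounding (Fig. 2E).
beta_xy = 0.3                  # true causal effect of exposure on outcome
beta_cx = 0.5                  # effect of confounder on exposure
beta_cy = 0.2                  # direct effect of confounder on outcome
alpha_c = [0.05, 0.10, 0.20]   # effects of each variant on the confounder

# Marginal associations: the variant reaches X via C, and Y via both C and X.
gamma_x = [a * beta_cx for a in alpha_c]
gamma_y = [a * beta_cx * beta_xy + a * beta_cy for a in alpha_c]

# All variants give the same ratio, beta_XY + beta_CY / beta_CX = 0.7.
beta_j = [gy / gx for gx, gy in zip(gamma_x, gamma_y)]
```

Because the bias term does not depend on the variant, the ratios are homogeneous and heterogeneity testing gives no warning.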

Because the \(\beta _{XY} \,+\, \frac{{\beta _{CY}}}{{\beta _{CX}}}\) term can take any value that \(\beta _{XY}\) itself can take, it is impossible to rule out mediated confounding scenarios using just the genetic associations with exposure and outcome. Some methods have been developed that use a mixture model approach to explicitly include a mediated confounding component in their model, such as CAUSE [24] which assumes that the variants used are a mixture of ones conforming to Fig. 2A and others conforming to Fig. 2F. LHC-MR [23] offers an even more general model also allowing for reverse causation. However, the problem remains that for any forward causation scenario as in Fig. 2A, it is possible to formulate parameter values for the mediated confounding scenario like in Fig. 2F that result in an identical pattern of genetic associations. As such, the components of these mixture models that are assumed to capture forward causation may still be capturing mediated confounding instead (see also Supplementary Information—whole-genome methods).

Additional data is therefore required to resolve the issue of mediated confounding. If genetic associations conditioning on a putative confounder variable \(C\) are available for both exposure and outcome, evaluating and correcting for that particular \(C\) is relatively straightforward. If this \(C\) is indeed mediating (part of) the effect of the variants on the exposure and outcome, adding \(C\) as a covariate to compute the conditional associations will remove this confounding effect from a subsequent MR analysis based on them. Similarly, if separate GWAS results for a possible confounder \(C\) are available, these can be used to obtain corrected MR estimates. This can be accomplished by either first correcting the \(\gamma _{Xj}\) and \(\gamma _{Yj}\) and then performing a regular MR analysis [25], or by using an MR-Egger style regression approach, essentially regressing \(\gamma _{Yj}\) on both \(\gamma _{Xj}\) and \(\gamma _{Cj}\) (the genetic associations with the possible confounder) simultaneously. The latter approach can be considered a form of multiple-exposure model, treating \(C\) as a second exposure potentially correlated with \(X\) [26]. Note that correction using \(C\) directly and correction based on the \(\gamma _{Cj}\) are both susceptible to collider bias when \(C\) is not a confounder [27], which therefore needs to be considered when using such methods (see also Supplementary Information—mediated confounding).
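The regression-based correction can be sketched as a least-squares fit of the γ_Yj on both the γ_Xj and the γ_Cj. The sketch below is a simplified, unweighted version without intercept, using hypothetical noise-free values in which variants affect the exposure directly, via the confounder, or both; the helper `fit_two_exposures` is ours and does not correspond to any particular software package.

```python
def fit_two_exposures(gx, gc, gy):
    """Least-squares fit of gy = b_x * gx + b_c * gc (no intercept).

    Solves the 2x2 normal equations directly.
    """
    sxx = sum(x * x for x in gx)
    scc = sum(c * c for c in gc)
    sxc = sum(x * c for x, c in zip(gx, gc))
    sxy = sum(x * y for x, y in zip(gx, gy))
    scy = sum(c * y for c, y in zip(gc, gy))
    det = sxx * scc - sxc * sxc
    if det == 0:
        raise ValueError("associations with X and C are collinear")
    b_x = (sxy * scc - scy * sxc) / det
    b_c = (scy * sxx - sxy * sxc) / det
    return b_x, b_c


# Hypothetical parameter values.
beta_xy, beta_cx, beta_cy = 0.3, 0.5, 0.2
alpha_x = [0.10, 0.05, 0.20, 0.00]   # direct effects on the exposure
alpha_c = [0.00, 0.10, 0.05, 0.20]   # effects on the confounder

gamma_c = list(alpha_c)
gamma_x = [ax + ac * beta_cx for ax, ac in zip(alpha_x, alpha_c)]
gamma_y = [gx * beta_xy + ac * beta_cy for gx, ac in zip(gamma_x, alpha_c)]

# The coefficient on gamma_x recovers beta_XY; the one on gamma_c is beta_CY.
est_xy, est_cy = fit_two_exposures(gamma_x, gamma_c, gamma_y)
```

The fit only separates the two effects if the associations with X and C are not proportional across variants, which is why the helper raises an error in the collinear case.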

Although approaches like these can be effective in detecting and correcting for effects mediated by confounders, the obvious limiting factor is that this requires the potential confounders to be explicitly tested. If no data is available for a particular confounder, or if it was simply not considered as a potential confounder in the analysis, its effects will not have been accounted for. This poses a major challenge, since any confounder of the exposure and outcome is itself almost certainly heritable, and any variant directly associated with that confounder will also have associations with the exposure and outcome mediated by that confounder.

This implies that in practice all (potential) confounders of the exposure and outcome would need to be considered and evaluated in an MR context. This is particularly problematic with confounding endophenotypes such as those involved in specific biological pathways and processes, as their causal effects on exposure and outcome may be specific to a particular context such as a tissue or developmental time period, and measurements of such confounders would therefore need to be specific to that context as well.

Leveraging constrained data

Negative control populations

MR has sometimes been compared to RCTs, drawing a parallel between the random inheritance of alleles from parents to offspring and the randomized assignment of study participants to treatment groups, with the exposure taking the role that the actual treatment has in an RCT [28]. However, this analogy is problematic, because although part of the inferential strength of an RCT comes from random assignment of individuals to groups, such randomization only deals with pre-existing differences between individuals in the trial. Potential confounding that occurs after assignment remains a constant challenge even in RCTs and must be accounted for in the experimental design, by using well-designed control groups and strictly controlling other experimental and background variables. This level of control does not exist in the MR context, and since the exposure occurs at an unknown time possibly many years after the “randomized assignment” (and measurement of the exposure and outcome typically happens even later still), there is ample opportunity for confounding to arise.

An MR approach that more closely mimics the structure of an RCT, however, is the use of negative control populations [13, 29]. A negative control population is one where the exposure is constrained to a particular value, but that in other respects matches the population from which the main MR data was derived (i.e., the relations between all relevant variables are the same). An example of this is alcohol consumption as the exposure, using a population where people do not drink alcohol due to religious or cultural taboo as control [30]. A negative control population does need to have an actual constraint on the exposure; simply selecting a subset of a population for whom the exposure is zero does not work, as this would lead to collider bias (see Supplementary Information—negative control populations).

Because in such a control population the exposure does not vary, causal effects involving that exposure are essentially blocked. The constraint on the exposure stops other variables from affecting the exposure, and stops the exposure from affecting other variables. Genetic association between a variant and the outcome in this control population therefore only consists of effects not mediated by the exposure, and thus should be zero for valid genetic instruments like in Fig. 1A. Testing the genetic association between variants and the outcome can thus serve to validate them as instruments, provided the control sample is sufficiently well-powered.

This approach can be further extended to determine how much of the genetic association with the outcome \(\gamma _{Yj}\) is not mediated by the exposure (with some restrictions, see Supplementary Information—negative control populations) [31]. Modeling this genetic association as \(\gamma _{Yj} \,=\, \gamma _{Xj}\beta _{XY} \,+\, \delta _j\), similar to MR-Egger, this can essentially provide a direct estimate of the heterogeneity term \(\delta _j\) for each individual variant \(j\). With that, it becomes possible to obtain a corrected genetic association \(\gamma _{Yj} \,-\, \delta _j\), by subtracting out the heterogeneity from the overall association, and then using this corrected \(\gamma _{Yj}\) to perform MR analysis. However, although potentially quite powerful, using negative control populations in this way is also vulnerable: if the assumptions of the negative control population fail, the correction itself will introduce a hidden bias into the MR estimate. This is in contrast to using negative control populations to determine validity of variants as an instrument, which will instead only tend to generate false negatives (rejecting valid instruments as invalid) if the negative control population assumptions do not hold.
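The two uses of a negative control population can be sketched as follows. The sketch assumes the idealized case in which the control population matches the main population exactly, so that the variant-outcome association observed there equals δ_j; all values are hypothetical.

```python
# Hypothetical parameter values in the main population.
beta_xy = 0.3
gamma_x = [0.10, 0.20, 0.10]
delta = [0.00, 0.02, -0.03]    # effects on the outcome not via the exposure
gamma_y = [gx * beta_xy + d for gx, d in zip(gamma_x, delta)]

# In the negative control population the exposure cannot vary, so under the
# matching assumption the variant-outcome association there equals delta_j.
gamma_y_control = list(delta)

# Use 1: validate instruments (association in controls should be zero).
is_valid = [g == 0 for g in gamma_y_control]

# Use 2: subtract the control association, then take the ratio as usual.
beta_uncorrected = [gy / gx for gx, gy in zip(gamma_x, gamma_y)]
beta_corrected = [
    (gy - d) / gx for gx, gy, d in zip(gamma_x, gamma_y, gamma_y_control)
]
```

After correction all three variants return the assumed causal effect, whereas the uncorrected ratios of the second and third variants are offset from it.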

Other forms of constrained data

Using negative control populations leverages natural constraints on data to provide a means of validating the instrumental variable assumptions that does not require explicit testing of individual confounders. Other approaches that utilize such constraints can be employed as well, and a prime example of this is the use of longitudinal data, for either exposure, outcome, or both. Use of such data allows the timing of the causally relevant exposure and of the causal effects to be narrowed down.

If for example we have two measurements of the exposure, as in Richardson et al. [32], there are three main scenarios to consider: a direct causal effect on the outcome only by the early exposure \(X_1\) (Fig. 3A), only by the late exposure \(X_2\) (Fig. 3B), or by both (Fig. 3C). This can be resolved by a set of three MR analyses, including one that has \(X_2\) as the exposure with a set of variants such as in Fig. 3D that only affect the later exposure. Here, the early exposure essentially functions as a baseline value, allowing us to identify variants that only affect the change in exposure that occurred since the first time point (see also Supplementary Information—longitudinal data).

**Fig. 3: Graphical representation of scenarios involving longitudinal data and imperfect measurement of variables, for a variant j.**

This process can be generalized to more than two time points, allowing for better determination of the likely timing of the causal effects. If longitudinal measurements of the outcome are available, these can be used in the same way to narrow down the timing. Moreover, for later time points these models can be interpreted as conditioning on the value of the exposure or outcome at an earlier time point, which would block any confounder-mediated genetic effects that occurred prior to that time point from affecting the estimate of \(\beta _{XY2}\) [33]. Although confounders may still be present for the later time points (acting e.g., on \(X_2\) and \(Y\) in Fig. 3A), this is restricted to a more limited time window, making it easier to identify likely confounders and correct for them.

Another way of leveraging known constraints on data is the use of positive and negative control outcomes: outcomes which already have strong evidence that they respectively are or are not causally influenced by the exposure, which can be used to evaluate the validity of candidate genetic instruments [8, 34]. Positive control outcomes are subject to a causal effect of the exposure, and as such any variants causally acting on the exposure must be affecting such control outcomes as well. As such, if the variants used in our MR analysis show no association with this positive control outcome, beyond what could be explained by possible lack of statistical power, this suggests that the variants used do not in fact have such a causal effect on the exposure. Similarly, if we perform an MR analysis with a negative control outcome that should not be causally affected by the exposure, and the analysis suggests that there actually is a causal effect on that negative control outcome, this casts doubt on the validity of the variants used as genetic instruments.

Relaxing the additional assumptions

The causal graph in Fig. 1A is a common way of depicting the instrumental variable assumptions central to MR, clearly showing the causal paths that need to be either present or absent for the standard analysis to work. Less explicit in this graph are some of the additional assumptions implied by it, listed in Table 1, that the analysis depends on as well. These assumptions can be condensed to two general constraints: first, that the causal graph applies in the same way to every individual used in the analysis, both in its structure and in the value of the causal effect sizes; and second, that the variables as we have measured them in our data, correspond to the true causal variables depicted in the graph without bias or error. In this section we will discuss scenarios in which these assumptions may not hold, and the implications of this for the MR analysis.

Variable effect sizes across samples

In the commonly used two-sample approach to MR analysis, variable effect sizes can potentially occur and pose a problem when the genetic associations \(\gamma _{Xj}\) and \(\gamma _{Yj}\) are obtained from samples each derived from different populations with different values for the causal parameters in Fig. 1A. As described, MR works on the core premise that \(\gamma _{Xj} \,=\, \alpha _{Xj}\) and \(\gamma _{Yj} \,=\, \alpha _{Xj}\beta _{XY}\), and that therefore the variant-specific part \(\alpha _{Xj}\) will cancel out when we take their ratio \(\beta _j \,=\, \frac{{\gamma _{Yj}}}{{\gamma _{Xj}}}\), leaving only \(\beta _{XY}\). But this will fail if the value of \(\alpha _{Xj}\) in the population from which the exposure GWAS was drawn, differs from the value of \(\alpha _{Xj}\) in the population that the outcome GWAS was based on, resulting in \(\beta _j\) being biased away from \(\beta _{XY}\).
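The failure of the cancellation can be illustrated with hypothetical values in which the effects of the variants on the exposure differ between the population of the exposure GWAS and that of the outcome GWAS.

```python
# Hypothetical parameter values for a two-sample analysis.
beta_xy = 0.3
alpha_x_exposure_pop = [0.10, 0.20, 0.10]   # alpha_Xj in the exposure GWAS
alpha_x_outcome_pop = [0.10, 0.10, 0.15]    # alpha_Xj in the outcome GWAS

# gamma_Xj comes from one population, gamma_Yj from the other.
gamma_x = list(alpha_x_exposure_pop)
gamma_y = [a * beta_xy for a in alpha_x_outcome_pop]

# The alpha_Xj no longer cancel: beta_j = beta_XY * (alpha_out / alpha_exp).
beta_j = [gy / gx for gx, gy in zip(gamma_x, gamma_y)]
```

Only the first variant, whose effect is the same in both populations, returns 0.3; the other two return roughly 0.15 and 0.45.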

The extent to which this is a problem will depend on the way the MR analysis is conducted. The biases produced by this scenario will usually cause heterogeneity of the \(\beta _j\), and as such it should be possible to detect and remove the affected variants (see also Supplementary Information—variable effect sizes). The MR-Egger style models are more susceptible to this issue, as the average bias will tend to end up in their estimate of \(\beta _{XY}\), which may go unnoticed unless these are used in conjunction with other types of models. Differences in \(\alpha _{Cj}\) across the populations from which GWAS data was drawn will pose similar problems when using additional GWAS data with a putative confounder \(C\) as outcome to correct for confounding.

A similar issue can arise even when all data is taken from the same population, if the GWAS samples are subject to explicit or implicit selection criteria. If these criteria differ between the exposure and outcome GWAS, this can lead to the same kind of issue as between different populations described above, if the \(\alpha _{Xj}\) differ between the selected subpopulations. Moreover, selection effects occurring in the GWAS sample for the outcome also have the potential to result in collider bias, because selection implicitly conditions on the variables being selected on [27, 35]. For example, the outcome may be measured specifically in older individuals, thus selecting for individuals who have survived to that age [36] and resulting in collider bias if the exposure causally affects life expectancy and there are any confounders of the relation between the exposure and outcome [37] (see also Supplementary Information—variable effect sizes). This sort of bias will not generally result in any heterogeneity in the \(\beta _j\), as it will affect every variant in proportionally the same way. Addressing it will therefore often require identifying relevant selection processes and evaluating whether the specific variables involved may be causing collider bias.

Variable effect sizes within samples

Effect sizes may also vary across individuals within a population, for example due to interactions of causal variants with other variables. In this case, different individuals in the population have a different value of \(\alpha _{Xj}\), depending on their score on the interactor variable. In practice, the genetic associations \(\gamma _{Xj}\) would reflect an average of these different \(\alpha _{Xj}\) values across the levels of the interactor variable. The \(\gamma _{Yj}\) are based on this average \(\alpha _{Xj}\) as well, and thus as long as the distribution of the interactor variable is the same in both samples this will still cancel out in the ratio \(\beta _j \,=\, \frac{{\gamma _{Yj}}}{{\gamma _{Xj}}}\). If, on the other hand, the distribution differs, for example because the mean of the interactor is greater in one of the samples, this no longer holds. In that case, however, as with the differences in \(\alpha _{Xj}\) across samples described above, it should result in heterogeneous \(\beta _j\), and can therefore be addressed by careful application of heterogeneity testing and modeling.
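A minimal numerical sketch of this averaging argument follows, assuming a binary interactor that is independent of genotype, so that each genetic association is the prevalence-weighted average of the level-specific \(\alpha _{Xj}\). The level-specific effects, the prevalences, and the helper names are all hypothetical.

```python
beta_xy = 0.5  # true causal effect, assumed constant across individuals

# Hypothetical alpha_Xj for three variants at each level of a binary interactor Z
alpha_z0 = [0.10, 0.20, 0.30]  # individuals with Z = 0
alpha_z1 = [0.30, 0.20, 0.10]  # individuals with Z = 1


def average_alpha(p_z1):
    """Average alpha_Xj in a sample where a fraction p_z1 has Z = 1."""
    return [(1 - p_z1) * a0 + p_z1 * a1 for a0, a1 in zip(alpha_z0, alpha_z1)]


def ratio_estimates(p_exposure_sample, p_outcome_sample):
    """Per-variant beta_j = gamma_Yj / gamma_Xj for the given Z prevalences."""
    gamma_x = average_alpha(p_exposure_sample)
    gamma_y = [beta_xy * a for a in average_alpha(p_outcome_sample)]
    return [gy / gx for gx, gy in zip(gamma_x, gamma_y)]


same_distribution = ratio_estimates(0.5, 0.5)
different_distribution = ratio_estimates(0.5, 0.8)

print([round(b, 3) for b in same_distribution])
print([round(b, 3) for b in different_distribution])
```

With equal prevalence in both samples every ratio equals the assumed \(\beta _{XY}\) of 0.5. With prevalences of 0.5 and 0.8 the three ratios become roughly 0.65, 0.5 and 0.35, which is the heterogeneity the text refers to. The second variant is unaffected because its effect does not depend on the interactor.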

It is possible for the \(\beta _{XY}\) parameter itself to vary across individuals as well, with different causal effect sizes for different individuals in the population. This can arise as an interaction effect with another variable but also as a non-linear effect of the exposure, which can be seen as essentially an interaction of the exposure with itself. In effect, the value of \(\beta _{XY}\) that MR would estimate in this case is an average of the different \(\beta _{XY}\) values across the levels of the interactor variable. This does not substantially affect the MR analysis, since such an average causal effect is still generally interpretable and informative of the relation between exposure and outcome. It can make the result somewhat more difficult to generalize, however, since this average \(\beta _{XY}\) could be quite different in other populations if the distribution of the interactor variable in that population differs substantially from that in the population from which the outcome GWAS sample was drawn.
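The non-linear case can be sketched with a simulation in which the outcome depends quadratically on the exposure. The model, its coefficients, and the normally distributed variant are assumptions made for illustration only. Under this particular model the local causal effect is \(0.2 + 0.2X\), so the population-average effect is \(0.2 + 0.2\) times the mean exposure.

```python
import random

random.seed(2)


def slope(x, y):
    """OLS slope of y on x (regression with intercept)."""
    m = len(x)
    mean_x, mean_y = sum(x) / m, sum(y) / m
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def mr_ratio(mean_exposure, n=200_000):
    """Ratio estimate gamma_Y / gamma_X in a population with the given mean exposure."""
    g, x, y = [], [], []
    for _ in range(n):
        gi = random.gauss(0, 1)                             # standardized variant
        xi = mean_exposure + 0.5 * gi + random.gauss(0, 1)  # exposure
        yi = 0.2 * xi + 0.1 * xi ** 2 + random.gauss(0, 1)  # non-linear outcome model
        g.append(gi)
        x.append(xi)
        y.append(yi)
    return slope(g, y) / slope(g, x)


ratio_low = mr_ratio(0.0)   # average local effect: 0.2 + 0.2 * 0.0 = 0.2
ratio_high = mr_ratio(2.0)  # average local effect: 0.2 + 0.2 * 2.0 = 0.6
print(round(ratio_low, 2), round(ratio_high, 2))
```

In both populations the ratio estimate recovers the average causal effect for that population, but the two averages differ threefold, which is why an estimate obtained in one population may not carry over to another.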

Imperfectly observed variables

In the graphs in Figs. 1 and 2 it is implicitly assumed that the observed variables we use in the GWAS, the exposure and outcome, as well as putative confounder variables we may be trying to evaluate, are sufficiently good proxies for the causally relevant variables. Yet this can fail to be the case for a variety of reasons [38, 39]. There could be simple measurement or diagnostic error, where the observed variables in the data are a noisy representation of the variables of interest. The causal graph in Fig. 3E depicts a scenario like this, with the true exposure of interest \(X\) now unobserved, and with a noisy observed exposure variable \(X_{obs}\) from which the genetic associations \(\gamma _{Xj}\) are estimated. Such situations often also arise when using binary variables, such as a medical diagnosis or a dichotomized continuous variable (e.g., hypertension as dichotomized blood pressure) [40], where the relevant causal effects are likely related to the underlying biological state rather than to the diagnosis or dichotomized value.

This can arise from more systematic causes as well. It is possible that the context in which the variable was observed does not sufficiently match that of its causally relevant instance: if for instance we use gene expression as our exposure, it may well be that the tissue in which that gene’s expression causally affects the outcome is different from the tissue in which the exposure variable we are using in our analysis is measured. Similarly, there may be differences in timing and developmental period, or environmental triggers, or the observed variable may have a complex internal structure, with the causal effect only pertaining to a subtype or subscale of that variable. In case of large differences between the developmental timing of the causal effect of the exposure and when the exposure was measured, processes such as canalization and behavioral adaptive responses may also have amplified or dampened the changes induced by earlier causal effects [10, 41].

Regardless of the underlying mechanism, in a scenario such as in Fig. 3E where the “true” exposure \(X\) is imperfectly represented by the observed exposure \(X_{obs}\), the causal effect we would estimate becomes biased away from \(\beta _{XY}\). For the exposure the genetic effect changes to \(\gamma _{Xj} \,=\, \alpha _{Xj}\beta _{XO}\), and as such the ratio \(\beta _j \,=\, \frac{{\gamma _{Yj}}}{{\gamma _{Xj}}}\) becomes \(\frac{{\beta _{XY}}}{{\beta _{XO}}}\). Depending on the nature of the relation between the “true” and observed variables, the value we get may therefore differ considerably from the true value of \(\beta _{XY}\) (see also Supplementary Information—imperfectly observed variables). Note that this issue of imperfectly observed variables is not unique to MR, and would pose a problem even in the context of RCT.

All these same mechanisms can operate on the outcome as well, as depicted in Fig. 3F, in which case \(\beta _j\) will be \(\beta _{XY}\beta _{YO}\). Although this does affect interpretation, the value we are estimating does still represent a legitimate causal effect, in contrast to Fig. 3E where the causal structure would be misspecified. If for example our intended outcome is true schizophrenia status, and the \(Y_{obs}\) we use is diagnosis of schizophrenia, the causal effect we would obtain is that of our exposure on schizophrenia diagnosis, and as such does have a meaningful interpretation, even if it does not give us an estimate of the causal effect on true schizophrenia status. In this regard, full observation of the exposure is considerably more crucial than full observation of the outcome.
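The two expressions above, \(\frac{{\beta _{XY}}}{{\beta _{XO}}}\) for an imperfectly observed exposure and \(\beta _{XY}\beta _{YO}\) for an imperfectly observed outcome, can be checked with a short simulation. It is a sketch assuming linear relations, independent normal noise, and a single standardized variant. The parameter values are invented, and the observed variables are modeled as the true variable scaled by \(\beta _{XO}\) or \(\beta _{YO}\) plus noise.

```python
import random

random.seed(3)
n = 200_000
alpha_x = 0.8   # effect of the variant on the true exposure X (assumed)
beta_xy = 0.3   # true causal effect of X on Y (assumed)
beta_xo = 0.5   # effect of X on the observed exposure X_obs (assumed)
beta_yo = 0.8   # effect of Y on the observed outcome Y_obs (assumed)


def slope(x, y):
    """OLS slope of y on x (regression with intercept)."""
    m = len(x)
    mean_x, mean_y = sum(x) / m, sum(y) / m
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


g, x, x_obs, y, y_obs = [], [], [], [], []
for _ in range(n):
    gi = random.gauss(0, 1)
    xi = alpha_x * gi + random.gauss(0, 1)
    yi = beta_xy * xi + random.gauss(0, 1)
    g.append(gi)
    x.append(xi)
    y.append(yi)
    x_obs.append(beta_xo * xi + random.gauss(0, 1))  # noisy proxy of X
    y_obs.append(beta_yo * yi + random.gauss(0, 1))  # noisy proxy of Y

ratio_true = slope(g, y) / slope(g, x)              # both variables observed
ratio_exposure_proxy = slope(g, y) / slope(g, x_obs)  # scenario of Fig. 3E
ratio_outcome_proxy = slope(g, y_obs) / slope(g, x)   # scenario of Fig. 3F

print(round(ratio_true, 2), round(ratio_exposure_proxy, 2), round(ratio_outcome_proxy, 2))
```

With these values the three ratios should land near 0.3, 0.6 (that is, 0.3 / 0.5) and 0.24 (that is, 0.3 × 0.8). The second value is no longer the effect of any variable in the assumed causal structure, whereas the third is still the causal effect of the exposure on the observed outcome.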

It should also be noted that a further consequence of such issues is that it may no longer be possible to distinguish forward and reverse causation in the way described above [39], since the parameter constraints upon which this would be based would no longer apply in the same way. Similarly, imperfect observation of a putative confounder \(C\) will also tend to render corrections of confounding effects only partially effective, not fully removing the confounding effect. Other approaches for evaluating these alternative causal scenarios would therefore need to be employed.

A somewhat related issue is that even if the observed exposure is in fact a good proxy for the causally relevant exposure, it may also be a good proxy for any number of other instances of the exposure. For example, if the expression of a particular gene is relatively stable across various tissues, the expression in a specific tissue will likely be a good proxy for expression in other tissues. As such, even if we use expression in that tissue as the exposure, we cannot know if the causal effect \(\beta _{XY}\) is indeed specific to that tissue. Similarly, we also generally do not know other aspects of the exposure such as the dosage, duration and frequency, also limiting the specificity of our conclusions [10, 41, 42].

Conclusion

In this Perspective we have outlined how the different assumptions and elements of the data figure into an MR analysis. This outline is not exhaustive, but should provide further insight into how the different components of MR fit together, on both a mathematical and conceptual level. Throughout this paper we have entertained the hypothetical that we know all true associations, focusing specifically on the challenges that remain even in such an idealized scenario. These challenges become substantially harder when having to deal with all the uncertainty in the estimates as well.

As we have shown, causal inference with MR strongly depends on its assumptions. When performing an MR study, it is thus crucial that the validity of these assumptions is examined for each specific analysis, so that all alternative scenarios can be carefully considered and ruled out as much as possible. Consequently, performing a reliable MR study requires a considerable investment of time and effort, and access to high quality data for both exposures and outcomes. Despite all its complications, however, a well-executed MR study can be a valuable tool in providing greater insight into the relations between our phenotypes. Moreover, the data we have available continues to improve, with more detailed measurements of phenotypes in ever larger biobanks, and rapid innovation in new data and technologies in molecular genetics. With this growth of our data, and our understanding of phenotypes, opportunities for well-designed MR studies will continue to improve.

References

1. Mills MC, Rahal C. A scientometric review of genome-wide association studies. Commun Biol. 2019;2:9.
2. Pearl J. Causal inference in statistics: an overview. Stat Surv. 2009;3:96–146.
3. Davey Smith G, Hemani G. Mendelian randomization: genetic anchors for causal inference in epidemiological studies. Hum Mol Genet. 2014;23:R89–98.
4. von Hinke Kessler Scholder S, Smith GD, Lawlor DA, Propper C, Windmeijer F. Mendelian randomization: the use of genes in instrumental variable analyses. Health Econ. 2011;20:893–6.
5. Sleiman PMA, Grant SFA. Mendelian randomization in the era of genomewide association studies. Clin Chem. 2010;56:723–8.
6. Haycock PC, Burgess S, Wade KH, Bowden J, Relton C, Smith GD. Statistical commentary best (but oft-forgotten) practices: the design, analysis, and interpretation of Mendelian randomization studies. Am J Clin Nutr. 2016;103:965–78.
7. Lousdal ML. An introduction to instrumental variable assumptions, validation and estimation. Emerg Themes Epidemiol. 2018;15:1.
8. Burgess S, Davey Smith G, Davies NM, Dudbridge F, Gill D, Glymour MM, et al. Guidelines for performing Mendelian randomization investigations. Wellcome Open Res. 2020;4:186.
9. Skrivankova VW, Richmond RC, Woolf BAR, Davies NM, Swanson SA, VanderWeele TJ, et al. Strengthening the reporting of observational studies in epidemiology using mendelian randomisation (STROBE-MR): explanation and elaboration. BMJ. 2021;375:n2233.
10. Burgess S, Butterworth AS, Thompson JR. Beyond Mendelian randomization: How to interpret evidence of shared genetic predictors. J Clin Epidemiol. 2016;69:208–16.
11. von Hinke S, Davey Smith G, Lawlor DA, Propper C, Windmeijer F. Genetic markers as instrumental variables. J Health Econ. 2016;45:131–48.
12. Teumer A. Common methods for performing Mendelian randomization. Front Cardiovasc Med. 2018;5:51.
13. Hemani G, Bowden J, Davey Smith G. Evaluating the potential role of pleiotropy in Mendelian randomization studies. Hum Mol Genet. 2018;27:R195–208.
14. Zhu Z, Zheng Z, Zhang F, Wu Y, Trzaskowski M, Maier R, et al. Causal associations between risk factors and common diseases inferred from GWAS summary data. Nat Commun. 2018;9:224.
15. Dai JY, Peters U, Wang X, Kocarnik J, Chang-Claude J, Slattery ML, et al. Diagnostics for pleiotropy in Mendelian randomization studies: global and individual tests for direct effects. Am J Epidemiol. 2018;187:2672–80.
16. Verbanck M, Chen C-Y, Neale B, Do R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases. Nat Genet. 2018;50:693–8.
17. Bowden J, Davey Smith G, Haycock PC, Burgess S. Consistent estimation in Mendelian randomization with some invalid instruments using a weighted median estimator. Genet Epidemiol. 2016;40:304–14.
18. Hartwig FP, Davey Smith G, Bowden J. Robust inference in summary data Mendelian randomization via the zero modal pleiotropy assumption. Int J Epidemiol. 2017;46:1985–98.
19. Burgess S, Zuber V, Gkatzionis A, Foley CN. Modal-based estimation via heterogeneity-penalized weighting: model averaging for consistent and efficient estimation in Mendelian randomization when a plurality of candidate instruments are valid. Int J Epidemiol. 2018;47:1242–54.
20. Qi G, Chatterjee N. Mendelian randomization analysis using mixture models for robust and efficient estimation of causal effects. Nat Commun. 2019;10:1941.
21. Burgess S, Thompson SG. Interpreting findings from Mendelian randomization using the MR-Egger method. Eur J Epidemiol. 2017;32:377–89.
22. Bucur IG, Claassen T, Heskes T. Inferring the direction of a causal link and estimating its effect via a Bayesian Mendelian randomization approach. Stat Methods Med Res. 2020;29:1081–111.
23. Darrous L, Mounier N, Kutalik Z. Simultaneous estimation of bi-directional causal effects and heritable confounding from GWAS summary statistics. Genet Genom Med. 2020. http://medrxiv.org/lookup/doi/10.1101/2020.01.27.20018929.
24. Morrison J, Knoblauch N, Marcus JH, Stephens M, He X. Mendelian randomization accounting for correlated and uncorrelated pleiotropic effects using genome-wide summary statistics. Nat Genet. 2020;52:740–7.
25. Cho Y, Haycock PC, Sanderson E, Gaunt TR, Zheng J, Morris AP, et al. Exploiting horizontal pleiotropy to search for causal pathways within a Mendelian randomization framework. Nat Commun. 2020;11:1010.
26. Rees JMB, Wood AM, Burgess S. Extending the MR-Egger method for multivariable Mendelian randomization to correct for both measured and unmeasured pleiotropy. Stat Med. 2017;36:4705–18.
27. Gkatzionis A, Burgess S. Contextualizing selection bias in Mendelian randomization: how bad is it likely to be? Int J Epidemiol. 2019;48:691–701.
28. Swanson SA, Tiemeier H, Ikram MA, Hernán MA. Nature as a trialist?: deconstructing the analogy between Mendelian randomization and randomized trials. Epidemiology. 2017;28:653–9.
29. Lipsitch M, Tchetgen Tchetgen E, Cohen T. Negative controls. Epidemiology. 2010;21:383–8.
30. Chen L, Davey Smith G, Harbord RM, Lewis SJ. Alcohol intake and blood pressure: a systematic review implementing a Mendelian randomization approach. PLoS Med. 2008;5:e52.
31. Van Kippersluis H, Rietveld CA. Pleiotropy-robust Mendelian randomization. Int J Epidemiol. 2018;47:1279–88.
32. Richardson TG, Sanderson E, Elsworth B, Tilling K, Smith GD. Use of genetic variation to separate the effects of early and later life adiposity on disease risk: mendelian randomisation study. BMJ. 2020;369:m1203.
33. Streeter AJ, Lin NX, Crathorne L, Haasova M, Hyde C, Melzer D, et al. Adjusting for unmeasured confounding in nonrandomized longitudinal studies: a methodological review. J Clin Epidemiol. 2017;87:23–34.
34. Sanderson E, Richardson T, Hemani G, Smith GD. The use of negative control outcomes in Mendelian Randomisation to detect potential population stratification or selection bias. bioRxiv. 2020. https://doi.org/10.1101/2020.06.01.128264.
35. Hughes RA, Davies NM, Davey Smith G, Tilling K. Selection bias when estimating average treatment effects using one-sample instrumental variable analysis. Epidemiology. 2019;30:350–7.
36. Smit RAJ, Trompet S, Dekkers OM, Jukema JW, Le Cessie S. Survival bias in Mendelian randomization studies: a threat to causal inference. Epidemiology. 2019;30:813–6.
37. Swanson SA. A practical guide to selection bias in instrumental variable analyses. Epidemiology. 2019;30:345–9.
38. Pierce BL, Vanderweele TJ. The effect of non-differential measurement error on bias, precision and power in Mendelian randomization studies. Int J Epidemiol. 2012;41:1383–93.
39. Hemani G, Tilling K, Davey Smith G. Orienting the causal relationship between imprecisely measured traits using GWAS summary data. PLoS Genet. 2017;13:1–22.
40. Burgess S, Labrecque JA. Mendelian randomization with a binary exposure variable: interpretation and presentation of causal estimates. Eur J Epidemiol. 2018;33:947–52.
41. Burgess S, Butterworth A, Malarstig A, Thompson SG. Use of Mendelian randomisation to assess potential benefit of clinical intervention. BMJ. 2012;345:1–6.
42. Swanson SA, Hernan MA. The challenging interpretation of instrumental variable estimates under monotonicity. Int J Epidemiol. 2018;47:1289–97.

Acknowledgements

This work was funded by The Netherlands Organization for Scientific Research (NWO VICI 453-14-005 (DP), 645-000-003 (DP), CHiLL 617-001-451 (IGB)) and by F. Hoffman-La Roche AG (CdL).

Author information

Authors and Affiliations

Department of Complex Trait Genetics, Center for Neurogenomics and Cognitive Research, Amsterdam Neuroscience, VU University Amsterdam, Amsterdam, The Netherlands
Christiaan de Leeuw, Jeanne Savage & Danielle Posthuma
Department of Data Science, Institute for Computing and Information Sciences, Radboud University Nijmegen, Nijmegen, The Netherlands
Ioan Gabriel Bucur & Tom Heskes
Department of Clinical Genetics, Amsterdam Neuroscience, VU University Medical Center, Amsterdam, The Netherlands
Danielle Posthuma

Contributions

CdL wrote and revised the paper. The other authors contributed to revision and editing.

Corresponding author

Correspondence to Christiaan de Leeuw.

Ethics declarations

Competing interests

The authors declare no competing interests.

Supplementary information

Supplemental Information

About this article

Cite this article

de Leeuw, C., Savage, J., Bucur, I.G. et al. Understanding the assumptions underlying Mendelian randomization. Eur J Hum Genet 30, 653–660 (2022). https://doi.org/10.1038/s41431-022-01038-5

Received: 09 June 2021
Revised: 06 December 2021
Accepted: 04 January 2022
Published: 26 January 2022
Issue Date: June 2022
DOI: https://doi.org/10.1038/s41431-022-01038-5
