MR-PheWAS: hypothesis prioritization among potential causal effects of body mass index on many outcomes, using Mendelian randomization

Observational cohort studies can provide rich datasets with a diverse range of phenotypic variables. However, hypothesis-driven epidemiological analyses by definition only test particular hypotheses chosen by researchers. Furthermore, observational analyses may not provide robust evidence of causality, as they are susceptible to confounding, reverse causation and measurement error. Using body mass index (BMI) as an exemplar, we demonstrate a novel extension to the phenome-wide association study (pheWAS) approach, using automated screening with genotypic instruments to search for causal associations amongst any number of phenotypic outcomes. We used a sample of 8,121 children from the ALSPAC dataset, and tested the linear association of a BMI-associated allele score with 172 phenotypic outcomes (with variable sample sizes). We also performed an instrumental variable analysis to estimate the causal effect of BMI on each phenotype. We found that 21 of the 172 outcomes were associated with the allele score at an unadjusted P < 0.05 threshold, and used Bonferroni corrections, permutation testing and estimates of the false discovery rate to consider the strength of results given the number of tests performed. The most strongly associated outcomes included leptin, lipid profile, and blood pressure. We also found novel evidence of effects of BMI on a global self-worth score.

Calculating the BMI allele score
The BMI allele score was created using a weighted sum of allelic dosages, such that a higher score corresponds to a higher BMI:

$$\text{score}_i = \sum_{l} d_{i,l} \times \text{effect}_l$$

where $d_{i,l}$ is the allelic dosage of individual $i$ at locus $l$, such that $0 \le d_{i,l} \le 2$, and $\text{effect}_l$ is the effect size of locus $l$, scaled relative to the effect of FTO, which has the largest effect size of these loci.
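As an illustration of the weighted sum described above, the following minimal Python sketch computes the score for one individual. The function name, the locus labels other than FTO, and all effect sizes are hypothetical and are not the values used in the study; the sketch assumes that the weight of each locus is its effect size divided by the FTO effect size.

```python
def allele_score(dosages, effects, reference_locus="FTO"):
    """Weighted allele score for one individual.

    dosages: dict mapping locus -> allelic dosage in [0, 2]
             (fractional values are possible for imputed genotypes)
    effects: dict mapping locus -> per-allele effect size on BMI
    Each weight is the locus effect divided by the reference locus effect.
    """
    ref = effects[reference_locus]
    score = 0.0
    for locus, effect in effects.items():
        d = dosages[locus]
        if not 0.0 <= d <= 2.0:
            raise ValueError(f"dosage for {locus} outside [0, 2]: {d}")
        score += d * (effect / ref)
    return score


# Hypothetical effect sizes and dosages, for illustration only.
effects = {"FTO": 0.4, "locus_a": 0.2, "locus_b": 0.1}
dosages = {"FTO": 2.0, "locus_a": 1.0, "locus_b": 0.5}
score = allele_score(dosages, effects)  # 2*1.0 + 1*0.5 + 0.5*0.25
```

With this scaling the FTO weight is 1, so the score can be read as a count of BMI-increasing alleles expressed in FTO-equivalent units.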

Imputation methods
The imputed dataset consisted of all 8,121 individuals and 172 variables in the original dataset. We used multiple imputation by chained equations (the ice command in Stata) to impute missing values for all variables, and generated 20 imputed datasets (1). We used predictive mean matching (the match option) for non-normal (or log-normal) variables because it does not assume normality and, by drawing imputed values from observed values, prevents extrapolation beyond feasible values.
To inform the imputation, we included additional socio-economic position (SEP) variables that may help to explain missingness: household social class, maternal education, smoking during pregnancy, and ethnicity. The purpose of this is to make the missing at random (MAR) assumption of the imputation method more plausible; that is, that the probability of missingness does not depend on the missing data, conditional on the observed data. We included the BMI allele score and all outcomes in our imputation, to inform the prediction of each outcome. The large number of variables in our dataset should also help to satisfy the MAR assumption, as the variable set should include variables predictive of both the values of the incomplete variables and their missingness (2).
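The predictive mean matching step described above can be illustrated for a single incomplete variable with one predictor. This is a simplified sketch and not the Stata ice implementation: it omits the draw of regression parameters used in proper multiple imputation and the cycling over variables that chained equations perform. The function names, the donor pool size k and the toy data are assumptions made for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def pmm_impute(xs, ys, k=3, rng=None):
    """Fill None entries of ys by predictive mean matching on xs."""
    rng = rng or random.Random(0)
    observed = [(x, y) for x, y in zip(xs, ys) if y is not None]
    a, b = fit_line([x for x, _ in observed], [y for _, y in observed])
    # Pair each observed case's predicted mean with its real value.
    donors = [(a + b * x, y) for x, y in observed]
    completed = []
    for x, y in zip(xs, ys):
        if y is not None:
            completed.append(y)
            continue
        target = a + b * x
        # The k observed cases with the closest predicted means are donors.
        nearest = sorted(donors, key=lambda d: abs(d[0] - target))[:k]
        completed.append(rng.choice(nearest)[1])
    return completed


# Toy data: a bounded, skewed outcome with two missing values.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0.1, 0.4, None, 0.3, 0.9, None]
filled = pmm_impute(xs, ys)
```

Because every imputed value is copied from an observed case, the completed variable cannot contain values outside the observed range, which is the property the text relies on for non-normal variables.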
Adjusting P values to account for the number of independent tests performed
We have presented both unadjusted and Bonferroni corrected P values. Given the high degree of correlation among variables in observational data, the adjusted P values are likely to be conservative, as the Bonferroni correction treats all tests as independent. A more appropriate adjustment would need to take into account the degree of dependence among the outcome variables, or use a set of outcomes that are independent. This 'independence adjusted' P value would sit between the unadjusted and Bonferroni corrected values. We performed a sensitivity analysis to investigate this, by creating a correlation matrix of the outcome variables and removing variables such that no correlations remained higher than 0.8 (or lower than -0.8) (variables removed are shown in Supplementary Table 4). Thus, these results are a subset of our main results, where the corrected P values are adjusted for the number of approximately independent outcomes instead of the total number of outcomes. The number of results with P < 0.05 can decrease due to the removal of variables from the dataset, or increase because a smaller outcome set means that a smaller P value correction is needed. This approach removed 32 variables (none of which were in our validation set) to give an outcome set of size 140. This removed 2 of the outcomes with P < 0.05 in the stage 1 results: the Attention/activity symptoms score and Apolipoprotein B. Thus, after removing highly correlated variables, 12 of the 128 outcomes have P < 0.05 in the stage 1 tests. The Bonferroni approach found only 1 outcome with P < 0.05 after adjustment for the 160 tests performed. This shows that the correlation method is less conservative than the Bonferroni approach (test of proportions P = 0.0004).
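The sensitivity analysis above can be sketched in Python as follows. The text does not state which variable of a highly correlated pair was removed, so this sketch assumes a greedy rule that keeps the first-listed variable and drops any later variable whose absolute correlation with a retained variable exceeds 0.8. Variable names, data and P values are hypothetical.

```python
def pearson(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def prune_correlated(data, threshold=0.8):
    """Keep variables so that no retained pair has |r| > threshold.

    data: dict mapping name -> list of values (complete cases, same order).
    Assumption: the earlier-listed variable of an offending pair is kept.
    """
    kept = []
    for name in data:
        if all(abs(pearson(data[name], data[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


def bonferroni(p_values, n_tests):
    """Multiply each P value by the number of tests, capped at 1."""
    return {name: min(1.0, p * n_tests) for name, p in p_values.items()}


# Hypothetical outcomes: b and d are almost collinear with a.
data = {
    "outcome_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "outcome_b": [2.0, 4.0, 6.0, 8.0, 11.0],
    "outcome_c": [5.0, 1.0, 4.0, 2.0, 3.0],
    "outcome_d": [-1.0, -2.0, -3.0, -4.0, -5.0],
}
kept = prune_correlated(data)  # outcome_b and outcome_d are dropped

# Hypothetical unadjusted P values for the retained outcomes.
p_unadjusted = {"outcome_a": 0.01, "outcome_c": 0.04}
adjusted_all = bonferroni(p_unadjusted, n_tests=len(data))
adjusted_pruned = bonferroni(p_unadjusted, n_tests=len(kept))
```

Correcting for the pruned set multiplies each P value by a smaller number of tests, which is why the 'independence adjusted' values sit between the unadjusted and the full Bonferroni corrected values.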
All outcomes are transformed to normal distributions using a rank-based inverse normal transformation. Exposure and outcome variables are standardised. In each regression, the outcome is the dependent variable and the BMI allele score is the independent variable.
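The transformation and regression described above can be sketched as follows. The text does not specify the exact form of the rank-based inverse normal transformation, so this sketch assumes the common Blom-type offset c = 3/8 and average ranks for ties. The function names and toy data are hypothetical.

```python
from statistics import NormalDist


def rank_inverse_normal(values, c=3.0 / 8):
    """Rank-based inverse normal transform; ties receive the average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    nd = NormalDist()
    return [nd.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]


def standardise(values):
    """Centre to mean 0 and scale to sample standard deviation 1."""
    n = len(values)
    m = sum(values) / n
    sd = (sum((v - m) ** 2 for v in values) / (n - 1)) ** 0.5
    return [(v - m) / sd for v in values]


def slope(x, y):
    """OLS slope of y (dependent) on x (independent)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)


# Toy data: allele scores and a right-skewed outcome.
score = [0.5, 1.2, 2.0, 2.6, 3.1]
outcome = [10.0, 12.0, 11.0, 30.0, 95.0]
z = standardise(rank_inverse_normal(outcome))
beta = slope(standardise(score), z)
```

Because both variables are standardised, the slope is expressed in standard deviations of the transformed outcome per standard deviation of the allele score, which makes estimates comparable across outcomes measured on different scales.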