A multivariable Mendelian randomization to appraise the pleiotropy between intelligence, education, and bipolar disorder in relation to schizophrenia

Education and intelligence are highly correlated and inversely associated with schizophrenia. Counterintuitively, education genetically associates with an increased risk for the disease. To investigate why, this study applies a multivariable Mendelian randomization of intelligence and education. For those without college degrees, older age of finishing school associates with a decreased likelihood of schizophrenia—independent of intelligence—and, hence, may be entangled with the health inequalities reflecting differences in education. A different picture is observed for schooling years inclusive of college: more years of schooling increases the likelihood of schizophrenia, whereas higher intelligence distinctly and independently decreases it. This implies the pleiotropy between years of schooling and schizophrenia is horizontal and likely confounded by a third trait influencing education. A multivariable Mendelian randomization of schooling years and bipolar disorder reveals that the increased risk of schizophrenia conferred by more schooling years is an artefact of bipolar disorder – not education.

Charleen D. Adams
Schizophrenia is a heterogeneous neurological syndrome, typically presenting in early adolescence, and observationally associated with lower intelligence and lower educational attainment [1][2][3].
Education is positively associated with many health outcomes 4,5. Counterintuitively, more years of schooling are genetically associated with an increased risk for schizophrenia 3. Intelligence and education are highly positively correlated both phenotypically (r = 0.8) 6 and genetically (r = 0.7) 7. The traits are bidirectionally causally related: higher intelligence causes more years of schooling and more years of schooling increase intelligence 8. The interwoven traits are also pleiotropically related to schizophrenia: a recent genome-wide association (GWA) study found evidence of an increased risk for schizophrenia for the single-nucleotide polymorphisms (SNPs) tagging years of schooling ($P = 3.2 \times 10^{-4}$) and strong genetic covariance between cognitive performance and increased years of schooling ($P = 9.9 \times 10^{-50}$) 9.
Three possible explanations exist for the associations between intelligence, education, and schizophrenia: vertical, horizontal, and confounding pleiotropy (Fig. 1). Uncovering the nature of these relationships could inform interventional strategies. To that end, this study uses univariable and multivariable Mendelian randomization (MR) to appraise these pleiotropic relationships and considers two measures of education: 1) age at completion of full-time schooling without a college degree (Education Age) and 2) years of schooling inclusive of college (Education Years). Due to the nature of the pleiotropy suggested by the findings for Education Years and schizophrenia, the study also considers a multivariable MR appraisal of Education Years and bipolar disorder in relation to schizophrenia.

Table 1 and Table 2 contain the results for (i) the univariable (total) effects of education and intelligence on schizophrenia, (ii) the univariable results for the (total) effect of bipolar disorder on schizophrenia, and (iii) the bidirectional effects of Education Years and intelligence. In Tables 1 and 2, the MR-Egger intercept column is shaded grey. This is because its interpretation differs from that of the inverse-variance weighted (IVW) estimate and the other sensitivity estimators; the MR-Egger intercept provides a test for directional pleiotropy and an assessment of the validity of the instrument assumptions 10. If the intercept does not differ from 1 on the exponentiated scale (or 0 on the non-exponentiated scale), there is no evidence of bias in the IVW estimate. For all the univariable results, the MR-Egger intercept demonstrated no evidence for pleiotropy (P > 0.05).

Results
Education Years (Lee instrument) on schizophrenia. An increased (but null) effect on schizophrenia is observed for Education Years (odds ratio (OR) for schizophrenia per SD increase in years of schooling: IVW estimate 1.13; 95% CI 0.98, 1.29; P = 0.085). The sensitivity estimators are discrepant in both direction and magnitude of effect, indicating possible unwanted pleiotropy. Simulation extrapolation (SIMEX), which adjusts the MR-Egger estimate for potential regression dilution to the null 11, did not ameliorate the discrepancy for the MR-Egger estimate.

Education Years (Okbay instrument) on schizophrenia. In contrast, a robust increased risk for schizophrenia is observed for Education Years with the Okbay instrument (OR for schizophrenia per SD increase in Education Years: instrument estimate 1.49; 95% CI 1.23, 1.81; P < 0.001). The sensitivity estimators agree in the direction of effect. The weak F-statistic for the Lee instrument may explain the discrepancy between the Lee and Okbay results (see the Methods section for a discussion of the F-statistics).

Education Age on schizophrenia. A strong protective effect against schizophrenia is observed for Education Age (OR for schizophrenia per SD increase in Education Age: IVW estimate 0.46; 95% CI 0.28, 0.76; P = 0.002). The sensitivity estimators align in both direction and magnitude of effect.
Intelligence (Hill instrument) on schizophrenia. A protective effect of intelligence against schizophrenia is observed for both the Hill and UK Biobank instrumental variables. There is, however, substantial disagreement between the IVW and MR-Egger estimates for the Hill instrument, which was rescued by SIMEX correction (the direction of the effect is reversed towards that of the IVW). The remaining discordance in the sensitivity estimators for the Hill instrument likely indicates pleiotropy (OR for schizophrenia per SD increase in intelligence: IVW estimate 0.76; 95% CI 0.63, 0.93; P = 0.007).

Intelligence (UK Biobank instrument) on schizophrenia. A robust protective effect against schizophrenia is observed for the UK Biobank instrument (OR for schizophrenia per SD increase in intelligence: IVW estimate 0.86; 95% CI 0.78, 0.95; P = 0.006). The sensitivity estimators align.

Bipolar disorder on schizophrenia. An increased risk for schizophrenia is observed per genetic liability to bipolar disorder (IVW estimate 1.17; 95% CI 1.09, 1.26; P < 0.001). The effect estimate is reversed for the MR-Egger estimator, and the magnitudes of the various estimators vary, possibly indicative of some unwanted pleiotropy.

IQ = intelligence; UKBB = UK Biobank; EduYears = Education Years; EduAge = Education Age; P = P-value; F = F-statistic; OR = odds ratio; CI = confidence interval; IVW = inverse-variance weighted, the primary MR method. The MR-Egger, weighted median, and weighted mode estimators are included as sensitivity tests to examine horizontal pleiotropy. The magnitude and direction of their effects in comparison to the IVW are what are gauged, and these are more informative than their P-values. If the magnitudes and directions of effects are similar to those of the IVW, this provides some evidence against pleiotropy. (When P-values for the sensitivity estimators are > 0.05, this does not invalidate the results from the IVW estimate; it simply means that the sensitivity estimators do not provide additional evidence in support of the IVW findings.) SIMEX = simulation extrapolation, a correction that adjusts the MR-Egger estimate for potential regression dilution to the null 11. The MR-Egger intercept is shaded grey because it is interpreted differently from the IVW estimate and the sensitivity estimators; the MR-Egger intercept provides a test for directional pleiotropy 10. If the MR-Egger intercept is not different from 1 (P > 0.05), that indicates a lack of evidence for bias due to pleiotropy in the IVW estimate.

Figure 1. Possible explanations for the pleiotropy between intelligence, education, and schizophrenia. An example of vertical pleiotropy would be the SNPs for intelligence influencing schizophrenia (only) through their effect on education. Vice versa, the SNPs for education might influence schizophrenia (only) through their effect on intelligence. Since education influences intelligence, an increase in intelligence from education might influence risk for schizophrenia (a). An example of horizontal pleiotropy would be if the SNPs for intelligence and/or the SNPs for education have independent, direct effects on schizophrenia (b). An example of confounding pleiotropy would be if education has no influence on schizophrenia but appears to due to strong association with intelligence. Vice versa, intelligence might not influence schizophrenia but appears to due to strong association with education (c). Multivariable MR can be used to investigate these relationships. (Multivariable MR does not eliminate potential bias from pleiotropic pathways not tested for in a given model 5. For instance, in a multivariable MR of education and intelligence on schizophrenia, the multivariable analysis would not overcome possible bias from other traits, such as depression 43.)

Multivariable results.
Figure 2 compares the univariable and multivariable (adjusted) estimates for the effects of education and intelligence on schizophrenia, and of bipolar disorder and education (Education Years) on schizophrenia.

Intelligence, adjusting for Education Age. The impact of intelligence on schizophrenia attenuates to the null when adjusting for Education Age (adjusted OR for schizophrenia per SD increase in intelligence: IVW estimate 0.92; 95% CI 0.82, 1.04; P = 0.219). One explanation for the difference observed between the univariable and multivariable MR estimates for the effect of intelligence on schizophrenia is that intelligence affects schizophrenia through its effect on Education Age, rather than through a direct effect on schizophrenia.

Intelligence, adjusting for Education Years. The protective effect of intelligence remains after adjusting for Education Years (adjusted OR for schizophrenia per SD increase in intelligence: IVW estimate 0.84; 95% CI 0.74, 0.94; P = 0.004). This suggests intelligence has a robust and direct protective effect against schizophrenia. The effect attenuates somewhat in comparison to the univariable model, perhaps reflecting the loss of the contribution of Education Years to intelligence.

Table 2. Bidirectional relationship between Education Years and intelligence.
These findings strongly suggest that the underlying pleiotropy between intelligence and Education Years is horizontal in relationship to schizophrenia (Fig. 1b) and that the relationship is additionally complicated by the presence of an unmeasured confounder (similar to Fig. 1c). The horizontal pleiotropy and opposing directions of effect for Education Years and intelligence prompted a univariable investigation of bipolar disorder and schizophrenia and a multivariable Mendelian randomization of bipolar disorder and Education Years on schizophrenia. The proposed hypothesis is seen in Fig. 3.

Bidirectional relationship between Education Years and intelligence (Table 2). The MR-Egger intercept provides a test for directional pleiotropy 10. If the MR-Egger intercept is not different from 0 (P > 0.05), that indicates a lack of evidence for bias due to pleiotropy in the IVW estimate.

Discussion
The MR findings show that, for those without college degrees, older age of finishing school (Education Age) associates with a decreased likelihood of schizophrenia, independent of intelligence. For those without college degrees, education, not intelligence, acts as the mechanism conferring protection against schizophrenia. The implications of this are uncertain, since the protective effect is likely to be entangled with the social inequalities linked to educational attainment. Nonetheless, efforts to retain at-risk adolescents in school, especially those beginning to show features of cognitive impairment, may be worth exploring, even if difficult to implement societally.
A different picture is observed for years of schooling inclusive of college (Education Years): more schooling years increase the likelihood of schizophrenia, whereas higher intelligence distinctly and independently decreases it. This implies the pleiotropy between schooling years and schizophrenia is horizontal and likely confounded by a third trait also influencing Education Years. Further to this, bipolar disorder, associated observationally with both higher education and schizophrenia 3,12,13, was investigated along with Education Years, using multivariable MR. The findings suggest that the increased risk of schizophrenia conferred by more schooling years is an artefact of bipolar disorder, not Education Years.
Educational attainment has been described as a feature of bipolar disorder 12,13. Bipolar disorder shares some cognitive deficits and genetic overlap with schizophrenia, but also predisposes to cognitive adeptness and creativity that distinguish it from the more neurodevelopmental aspects of schizophrenia 3. This complex picture is reflected in the horizontal and confounding pleiotropy uncovered by the multivariable analyses here. Specifically, when bipolar disorder is not accounted for, it appears that more years of schooling increase risk for schizophrenia. Hence, bipolar disorder is a confounder of the relationship between Education Years and schizophrenia. Since more years of schooling increase intelligence and higher intelligence strongly protects against schizophrenia, these findings imply that staying in school is neuroprotective.
The bidirectional analysis of intelligence and Education Years revealed that higher intelligence increases years of schooling and years of schooling increase intelligence, replicating the findings by Anderson 8 . This comports with what was found in the present study. Given the multivariable finding that Education Years does not cause schizophrenia once bipolar disorder is accounted for, the bidirectional causation between intelligence and Education Years strengthens the implication that staying in higher education longer may have beneficial consequences against acquisition of schizophrenia.
The primary strength of this study is that it capitalizes on the power of seven large GWA studies to probe these complexly related traits. It is the most detailed and comprehensive joint investigation of them to date. An unintended benefit is that it demonstrates the value of these massive public datasets for etiologic discovery.
The study has several limitations. MR critically relies on the validity of the instrumental variables. As such, measures were taken to assess the robustness of the analyses to potential unwanted pleiotropy, including the use of instruments lacking between-SNP heterogeneity and comparison of the IVW estimate with a battery of sensitivity estimators, each making different assumptions.
Another possible limitation, which, like unwanted pleiotropy, cannot be entirely ruled out, is the possible introduction of bias caused by some instances of the same individuals being included in the GWA studies of both the exposures and the outcomes. The greatest overlap is likely to be for the Lee Education Years instrument on intelligence and the Hill intelligence instrument on Lee's Education Years. However, since that bidirectional ...

... 1) Mendel's Laws of Inheritance, 2) genotype assignment at conception, and 3) pleiotropy (genes influencing more than one trait) [14][15][16].
Two-sample MR (Fig. 5) uses summary statistics from two genome-wide association (GWA) studies 10,[17][18][19][20][21] . Bidirectional MR, as the name suggests, is an MR method for examining causal relationships in two directions. Bidirectional MR helps orient the causal direction and determine whether both traits causally influence each other-"bidirectional causation". Multivariable MR permits adjustment, similar to multivariable regression to adjust for potential confounders in observational studies 22 . Multivariable MR is especially useful when two variables are highly correlated with each other, as is the case for Education Years and intelligence. In a multivariable MR analysis of Education Years and intelligence on schizophrenia, the estimated effect of Education Years is the effect given a constant level of intelligence, and the effect for intelligence is the effect given a constant level of Education Years. The effect estimates from univariable and multivariable MR can be compared to obtain total (univariable, unadjusted) and direct (multivariable, adjusted) effects.
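The contrast between total and direct effects can be illustrated with a minimal sketch of a two-exposure multivariable IVW estimator, written here as a weighted least-squares regression of the SNP-outcome associations on both sets of SNP-exposure associations, with no intercept. This is an illustration under stated assumptions, not the analysis code used in the study: the function names and all numeric values are hypothetical, and the toy outcome associations are constructed so that the direct effects are known in advance.

```python
def mvmr_ivw(bx1, bx2, by, se_y):
    """Two-exposure multivariable IVW: weighted least squares of the
    SNP-outcome effects on both SNP-exposure effects, no intercept,
    with weights 1 / se_y^2. Returns the two direct-effect estimates."""
    w = [1.0 / se ** 2 for se in se_y]
    s11 = sum(wi * a * a for wi, a in zip(w, bx1))
    s22 = sum(wi * b * b for wi, b in zip(w, bx2))
    s12 = sum(wi * a * b for wi, a, b in zip(w, bx1, bx2))
    t1 = sum(wi * a * y for wi, a, y in zip(w, bx1, by))
    t2 = sum(wi * b * y for wi, b, y in zip(w, bx2, by))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("exposure associations are collinear")
    theta1 = (t1 * s22 - t2 * s12) / det
    theta2 = (t2 * s11 - t1 * s12) / det
    return theta1, theta2


def univariable_ivw(bx, by, se_y):
    """Single-exposure IVW slope through the origin (same weights)."""
    w = [1.0 / se ** 2 for se in se_y]
    num = sum(wi * x * y for wi, x, y in zip(w, bx, by))
    den = sum(wi * x * x for wi, x in zip(w, bx))
    return num / den


# Hypothetical SNP associations (illustrative values, not study data).
bx_edu = [0.10, 0.20, 0.05, 0.15]   # SNP -> Education Years
bx_iq = [0.08, 0.05, 0.12, 0.02]    # SNP -> intelligence
# Toy outcome built with known direct effects: +0.3 (education), -0.2 (IQ).
by_scz = [0.3 * e - 0.2 * i for e, i in zip(bx_edu, bx_iq)]
se_scz = [0.02, 0.02, 0.02, 0.02]

direct_edu, direct_iq = mvmr_ivw(bx_edu, bx_iq, by_scz, se_scz)
total_edu = univariable_ivw(bx_edu, by_scz, se_scz)
print(f"direct education effect: {direct_edu:.3f}")
print(f"direct intelligence effect: {direct_iq:.3f}")
print(f"unadjusted education effect on the same SNPs: {total_edu:.3f}")
```

In this toy setting the adjusted estimates recover the constructed direct effects, while the unadjusted education estimate is pulled towards zero by the correlated intelligence pathway. In practice the univariable and multivariable instruments would usually comprise different SNP sets.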

Mendelian randomization assumptions.
In order for MR to be valid, three assumptions must hold: (i) the SNPs acting as the instrumental variables must be strongly associated with the exposure; (ii) the instrumental variables must be independent of confounders of the exposure and the outcome; and (iii) the instrumental variables must be associated with the outcome only through the exposure 19,23. For example, for the present analysis, the following assumptions must hold: (i) genetic variants robustly associated with Education Years must be chosen as instruments to test the causal relationship between Education Years and schizophrenia; (ii) the genetic variants chosen to instrument Education Years must not be associated with confounders of the relationship between Education Years and schizophrenia; and (iii) the genetic variants chosen to instrument Education Years must only impact schizophrenia through their impact on Education Years. A violation of assumption (iii) constitutes horizontal pleiotropy (Fig. 1b), which can invalidate the causal inference about vertical pleiotropy (Fig. 1a) that univariable MR designs probe.
GWA study data sources for instruments.

Education Age on schizophrenia. Two measures of education were selected to instrument education: age at completion of full-time schooling without a college degree (Education Age) and years of schooling inclusive of college (Education Years). The Education Age measure was obtained from field 845 in the UK Biobank project 24,25. Participants were asked if they had a college or university degree. Those without a college or university degree were asked what age they left continuous full-time education. Summary statistics for a GWA study of Education Age (adjusted for sex and 10 principal components), including 226,899 UK Biobank participants who answered field 845, are publicly available; the GWA study was performed by the Neale lab, after transforming the item into a normally distributed quantitative variable 26 (SNP coefficients per standard deviation (SD) units of Education Age). Because the instrument for Education Age captures only those without college or university degrees, the inference from the use of Education Age as an instrument is restricted to those without college or university degrees.

Figure 5. Two-sample Mendelian randomization testing the causal effect of intelligence or education on schizophrenia. Estimates of the SNP-intelligence (or SNP-education) associations ($\hat{\beta}_{ZX}$) are calculated in sample 1 (from a GWA study of intelligence or a GWA study of education). The associations between these same SNPs and schizophrenia are then estimated in sample 2 ($\hat{\beta}_{ZY}$) (from a schizophrenia GWA study). These estimates are combined into Wald ratios ($\hat{\beta}_{XY} = \hat{\beta}_{ZY} / \hat{\beta}_{ZX}$). The $\hat{\beta}_{XY}$ estimates are meta-analyzed using the inverse-variance weighted analysis ($\hat{\beta}_{IVW}$) method. The IVW method produces an overall causal estimate of intelligence and/or education on schizophrenia.
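The two-sample procedure summarized in the Figure 5 caption can be sketched as follows: per-SNP Wald ratios are formed and then pooled with first-order inverse-variance weights, and the pooled log-odds estimate is exponentiated to give an odds ratio with a 95% confidence interval. This is a minimal fixed-effect sketch rather than the implementation used in the study; the function names and the three-SNP summary statistics are hypothetical.

```python
import math


def wald_ratios(beta_zx, beta_zy):
    """Per-SNP Wald ratio: SNP-outcome effect / SNP-exposure effect."""
    return [by / bx for bx, by in zip(beta_zx, beta_zy)]


def ivw_estimate(beta_zx, beta_zy, se_zy):
    """Fixed-effect IVW estimate as a weighted mean of Wald ratios.

    Uses first-order weights (beta_zx / se_zy)^2, which is equivalent
    to a weighted regression of beta_zy on beta_zx through the origin.
    """
    weights = [(bx / se) ** 2 for bx, se in zip(beta_zx, se_zy)]
    ratios = wald_ratios(beta_zx, beta_zy)
    total = sum(weights)
    beta_ivw = sum(w * r for w, r in zip(weights, ratios)) / total
    se_ivw = math.sqrt(1.0 / total)
    return beta_ivw, se_ivw


# Hypothetical summary statistics for three SNPs (not study data).
beta_zx = [0.10, 0.20, 0.15]       # SNP-exposure associations, sample 1
beta_zy = [-0.05, -0.10, -0.075]   # SNP-outcome log-odds, sample 2
se_zy = [0.02, 0.02, 0.03]         # standard errors of beta_zy

beta_ivw, se_ivw = ivw_estimate(beta_zx, beta_zy, se_zy)

# Convert the pooled log-odds estimate to an odds ratio with a 95% CI.
odds_ratio = math.exp(beta_ivw)
ci_low = math.exp(beta_ivw - 1.96 * se_ivw)
ci_high = math.exp(beta_ivw + 1.96 * se_ivw)
print(f"OR {odds_ratio:.2f} (95% CI {ci_low:.2f}, {ci_high:.2f})")
```

An odds ratio below 1 on this exponentiated scale corresponds to a protective effect per unit increase in the exposure.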
(2020) 10:6018 | https://doi.org/10.1038/s41598-020-63104-6 www.nature.com/scientificreports

The F-statistic, a function of how much variance in a trait is explained by an instrument ($R^2$), the sample size, and the number of SNPs in an instrument, provides an indication of instrument strength 27. F-statistics < 10 are conventionally considered to be weak 28. The F-statistic for the Education Age instrument is 13.3.
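As an illustration of how the three ingredients combine, one commonly used form of the instrument F-statistic is $F = \frac{R^2}{1 - R^2} \cdot \frac{n - k - 1}{k}$, where $n$ is the sample size and $k$ the number of SNPs. The text does not state which exact formula was applied, so the sketch below is an assumption, and the input values are hypothetical rather than taken from any of the instruments described here.

```python
def f_statistic(r_squared, n_samples, n_snps):
    """Instrument strength: F = R^2 / (1 - R^2) * (n - k - 1) / k.

    r_squared: proportion of trait variance explained by the instrument
    n_samples: GWA study sample size (n)
    n_snps:    number of SNPs in the instrument (k)
    """
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    return (r_squared / (1.0 - r_squared)) * (n_samples - n_snps - 1) / n_snps


# Hypothetical inputs (not the values behind the reported F-statistics).
f_value = f_statistic(r_squared=0.01, n_samples=10_000, n_snps=10)
is_weak = f_value < 10  # conventional weak-instrument threshold
print(f"F = {f_value:.2f}, weak instrument: {is_weak}")
```

For a fixed sample size and variance explained, adding more SNPs lowers F, which is one way an instrument built from a very large number of SNPs can end up with a small F-statistic.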
Education Years on schizophrenia. The primary years-of-schooling measure was obtained from the Lee et al. (2018) GWA study of 1,131,881 participants of European ancestry from 71 cohorts 29. Education Years was measured for those who were at least 30 years of age, and International Standard Classification of Education (ISCED) categories were used to impute a years-of-education equivalent (SNP coefficients per SD units of years of schooling). The F-statistic for the Lee Education Years instrument is 4.7, indicating the instrument may be weak. Due to this, a second measure of Education Years from a smaller GWA study of years of schooling was used to construct a second instrument for Education Years 9 (see Table 3 for a list of the number of selected SNPs for each of the instrumental variables).
Intelligence on schizophrenia (Hill instrument). Two GWA studies were used to create instruments for intelligence. The first came from Hill et al. (2019), which included 248,482 individuals of European ancestry (SNP coefficients per one SD increase in intelligence test scores) 7. The instrument's F-statistic is 14.9.

Intelligence on schizophrenia (UK Biobank instrument). A second instrument for intelligence was constructed from a GWA study performed by the Neale lab using the UK Biobank measure for fluid intelligence (field 20016) (n = 108,818). The participants answered 13 logic questions within two minutes and the number of correct answers was summed. The data were transformed into a normally distributed quantitative variable (SNP coefficients per one SD unit increase in fluid intelligence score) 26. The instrument's F-statistic is 26.

Instrument construction. For each instrument ($\hat{\beta}_{ZX}$), independent SNPs (those not in linkage disequilibrium, LD; $R^2 < 0.01$) associated at genome-wide significance ($P < 5 \times 10^{-8}$) with a trait were extracted from within their respective GWA study. The summary statistics for the instrument-associated SNPs were then extracted from an outcome GWA study ($\hat{\beta}_{ZY}$). SNP-exposure and SNP-outcome associations were harmonized with the "harmonise_data" function within the MR-Base "TwoSampleMR" package within R 17,33. Harmonized SNP-exposure and SNP-outcome associations were combined with the IVW method (Fig. 5).
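The core idea of harmonization is to make sure each SNP's exposure and outcome associations refer to the same effect allele. The sketch below is a deliberately simplified stand-in and not the TwoSampleMR implementation: it flips the sign of the outcome association when the two studies report swapped alleles and drops SNPs whose alleles cannot be reconciled. It does not handle strand flips or palindromic SNPs. The record layout, field names, and SNP identifiers are all hypothetical.

```python
def harmonise(exposure, outcome):
    """Align outcome effects to the exposure effect allele.

    exposure, outcome: dicts keyed by SNP id, each value a dict with
    'ea' (effect allele), 'oa' (other allele), 'beta', and 'se'.
    """
    harmonised = []
    for snp_id, exp in exposure.items():
        out = outcome.get(snp_id)
        if out is None:
            continue  # SNP not available in the outcome study
        if (exp["ea"], exp["oa"]) == (out["ea"], out["oa"]):
            beta_zy = out["beta"]        # same coding: keep as reported
        elif (exp["ea"], exp["oa"]) == (out["oa"], out["ea"]):
            beta_zy = -out["beta"]       # swapped coding: flip the sign
        else:
            continue  # alleles incompatible: drop the SNP
        harmonised.append({
            "snp": snp_id,
            "beta_zx": exp["beta"],
            "beta_zy": beta_zy,
            "se_zy": out["se"],
        })
    return harmonised


# Hypothetical records (placeholder SNP ids and values).
exposure = {
    "snp1": {"ea": "A", "oa": "G", "beta": 0.10, "se": 0.01},
    "snp2": {"ea": "C", "oa": "T", "beta": 0.20, "se": 0.02},
    "snp3": {"ea": "A", "oa": "C", "beta": 0.15, "se": 0.02},
}
outcome = {
    "snp1": {"ea": "A", "oa": "G", "beta": -0.05, "se": 0.02},
    "snp2": {"ea": "T", "oa": "C", "beta": 0.10, "se": 0.02},  # swapped
    "snp3": {"ea": "G", "oa": "T", "beta": 0.03, "se": 0.03},  # mismatch
}

aligned = harmonise(exposure, outcome)
for row in aligned:
    print(row)
```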
For the bidirectional associations between intelligence and schooling years, SNPs tagging both traits at genome-wide significance and/or SNPs that were in LD between intelligence and schooling years were excluded. This is because overlapping SNPs can invalidate bidirectional MR findings 21,34 . In addition, for all instrumental variables, RadialMR regression 35

Sensitivity analyses.
To address possible violations of MR assumption (iii), MR-Egger regression, weighted median, and weighted mode MR methods were run as complements to the IVW method for the univariable models. When the magnitudes and directions of effect agree across estimators, this lack of heterogeneity is a screen against pleiotropy. The reason is that the various MR sensitivity estimators make different assumptions about the underlying nature of pleiotropy. It is unlikely there would be homogeneity in the direction and magnitudes of their effect estimates if there were substantial violations of the pleiotropy assumption.
Comparing the IVW and the sensitivity estimators is a form of triangulation: integrating several approaches with different assumptions to weigh causal evidence 37. Briefly, a drawback of the primary IVW estimator is that its estimate can be biased if one or more of the SNPs in its multi-allelic genetic instrument are directionally pleiotropic 38. The MR-Egger sensitivity estimator can provide unbiased estimates of causal effects, even if all SNPs in an instrument are invalid due to pleiotropy. But the SNPs in the genetic instrument must not violate the Instrument Strength Independent of Direct Effect ("InSIDE") assumption, and measurement error in the genetic instrument must be negligible ("No Measurement Error" assumption). The weighted median estimator can provide unbiased causal effects, assuming at least 50% of the chosen SNPs are valid. The weighted mode estimator assumes the most common effect estimate among SNPs in an instrument comes from a valid instrument. Detailed descriptions of the various MR methods and the different assumptions they make about pleiotropy are provided elsewhere [38][39][40]. For the purposes of understanding how to interpret the IVW and sensitivity estimators in the present study, the IVW is the main estimator. The others are provided to compare their magnitudes and directions of effect with those of the IVW.
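To make the comparison concrete, two of the sensitivity estimators can be sketched in a few lines. MR-Egger is written here as a weighted regression of the SNP-outcome associations on the SNP-exposure associations with an intercept (SNPs first oriented so that the exposure associations are positive), and the weighted median as an interpolated weighted median of the Wald ratios. These are simplified illustrations with hypothetical names and toy data constructed to have a known intercept; they omit standard errors and are not the implementations used in the study.

```python
def mr_egger(beta_zx, beta_zy, se_zy):
    """Weighted regression of beta_zy on beta_zx with an intercept.

    Returns (intercept, slope). A non-zero intercept suggests
    directional pleiotropy; the slope is the causal estimate.
    """
    # Orient every SNP so its exposure association is positive.
    x = [abs(bx) for bx in beta_zx]
    y = [by if bx >= 0 else -by for bx, by in zip(beta_zx, beta_zy)]
    w = [1.0 / se ** 2 for se in se_zy]
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def weighted_median(beta_zx, beta_zy, se_zy):
    """Interpolated weighted median of the per-SNP Wald ratios."""
    ratios = [by / bx for bx, by in zip(beta_zx, beta_zy)]
    weights = [(bx / se) ** 2 for bx, se in zip(beta_zx, se_zy)]
    order = sorted(range(len(ratios)), key=lambda j: ratios[j])
    r = [ratios[j] for j in order]
    w = [weights[j] for j in order]
    total = sum(w)
    cumulative, midpoints = 0.0, []
    for wj in w:
        cumulative += wj
        midpoints.append((cumulative - 0.5 * wj) / total)
    for j, m in enumerate(midpoints):
        if m >= 0.5:
            if j == 0:
                return r[0]
            frac = (0.5 - midpoints[j - 1]) / (m - midpoints[j - 1])
            return r[j - 1] + frac * (r[j] - r[j - 1])
    return r[-1]


# Toy data built as beta_zy = 0.01 + 0.4 * beta_zx after orientation,
# i.e. a true slope of 0.4 with directional pleiotropy of 0.01.
beta_zx = [0.05, -0.10, 0.15, 0.20]
beta_zy = [0.03, -0.05, 0.07, 0.09]
se_zy = [0.01, 0.02, 0.01, 0.02]

egger_intercept, egger_slope = mr_egger(beta_zx, beta_zy, se_zy)
median_estimate = weighted_median(beta_zx, beta_zy, se_zy)
print(f"MR-Egger intercept {egger_intercept:.3f}, slope {egger_slope:.3f}")
print(f"weighted median {median_estimate:.3f}")
```

In this toy example MR-Egger recovers the constructed slope and a non-zero intercept, while the weighted median sits above the slope because every Wald ratio absorbs the pleiotropic offset. Discrepancies of this kind between estimators are what the comparison with the IVW is meant to surface.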

Number of tests.
In total, 14 MR tests were run. Table 3 contains a list of the tests and the number of instrumental variables (detailed characteristics for the individual SNPs used in each model are provided in Supplementary Tables 3, 6, 9, 12, 15, 20, 23, 26, and 29). These 14 tests are not independent; a false-discovery rate (FDR)-correction was applied to the raw P-values to assess whether the penalization changed the inference (Supplementary Table 2). As it did not, the raw P-values are reported for the following reasons: the inference remained unchanged, the FDR-adjustment is overly conservative in this case, and P-values alone are not the best guide for causal inference 42 .
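The text does not name the FDR procedure that was applied. Assuming the widely used Benjamini-Hochberg step-up method, the adjustment can be sketched as below; the P-values shown are hypothetical placeholders rather than the 14 raw P-values from the study.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in input order.

    Each P-value is scaled by m / rank (rank 1 = smallest), and a
    running minimum from the largest rank down enforces monotonicity.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P-values (assumption: not taken from the study).
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
for raw, adj in zip(raw_p, adj_p):
    print(f"raw P = {raw:.3f}, FDR-adjusted P = {adj:.3f}")
```

Comparing which tests fall below 0.05 before and after adjustment is one way to check whether the correction changes the inference.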
Statistical software. SIMEX corrections were performed in Stata SE/16.0. All other described analyses were performed in R version 3.5.2 with the "TwoSampleMR" package 17.