A direct test of the diathesis–stress model for depression

The diathesis-stress theory for depression states that the effects of stress on the depression risk are dependent on the diathesis or vulnerability, implying multiplicative interactive effects on the liability scale. We used polygenic risk scores for major depressive disorder (MDD) calculated from the results of the most recent analysis from the Psychiatric Genomics Consortium as a direct measure of the vulnerability for depression in a sample of 5 221 individuals from 3 083 families. In the same we also had measures of stressful life events and social support and a depression symptom score, as well as DSM-IV MDD diagnoses for most individuals. In order to estimate the variance in depression explained by the genetic vulnerability, the stressors and their interactions, we fitted linear mixed models controlling for relatedness for the whole sample as well as stratified by sex. We show a significant interaction of the polygenic risk scores with personal life events (0.12% of variance explained, p-value=0.0076) contributing positively to the risk of depression. Additionally, our results suggest possible differences in the aetiology of depression between women and men. In conclusion, our findings point to an extra risk for individuals with combined vulnerability and high number of reported personal life events beyond what would be expected from the additive contributions of these factors to the liability for depression, supporting the multiplicative diathesis-stress model for this disease.


Introduction
A popular explanation for the aetiology of depression is the diathesis-stress model [1][2][3][4][5][6] . Initially developed to explain the origins of schizophrenia in the 1960s 5,6 and adapted for the study of depression in the 1980s [1][2][3][4] , this model states that stress may activate a diathesis or vulnerability, transforming the potential of predisposition into the actuality of psychopathology 7 . The model proposes that there is a synergism between the diathesis and stress that yields an effect beyond their combined separate effects into depressive symptomatology and thus, the effects of stress on the depression risk are dependent on the diathesis. Implicit in this theory is that there will be not only additive but multiplicative interactive effects on the liability scale 7 .
Over fifty years ago David Rosenthal 6 described the diathesis-stress theories as "the ones in which genuine meaning attaches to the commonly repeated statement that heredity and environment interact". However, he criticised the vague formulations for the predispositions and stressors that these theories propose. This criticism has been highlighted by others like Monroe 7 , who call for more research and more precise measures on the "conceptual essence" of the diathesis-stress premise, i.e. "the nature of the interaction between elements in the etiologic process over time". The diathesis-stress theory and research have been criticised for being "unproductive, either theoretically or empirically" 8 .
The genetically driven sensitivity to environments proposed by the diathesis-stress model can be operationalised as a gene by environment interaction (GxE). GxE studies have commonly focused on single loci in candidate genes, such as the length polymorphism (5HTTLPR) in the serotonin transporter gene (SLC6A4), with mostly inconsistent or negative results [9][10][11][12][13] . This approach has limitations related to poor quality genotyping, inconsistent types of interactions, inconsistent grouping of genotypes, selective presentation of results, interactions arising from the scale of measurement, and publication bias 9 . Moreover, MDD is a polygenic trait, arising from the effect of multiple risk variants, each with small effect sizes 14,15 . Therefore, MDD is influenced by many genetic variants of small effect, and it is more likely that affected individuals carry a polygenic burden of risk alleles rather than any single genotype of large effect. However, the progress from a candidate gene to an hypothesis-free genome-wide approach is hampered by the need for extremely large samples due to expected small effect sizes as well as necessarily imperfect assessment of environmental stressors across large cohorts 16,17 .
Polygenic risk scores (PRS) provide a novel opportunity to test the diathesis-stress model, since PRS can be conceptualised as an indicator of the diathesis and will likely prove a much stronger instrument than any single risk gene. PRS estimation uses Genome-Wide Association Study (GWAS) results to predict the genetic risk of each individual in an independent genotyped sample; PRS are estimated as the sum of risk alleles weighted by their respective independently estimated effect sizes 18 . Note that, since GWAS are currently underpowered to detect all common genetic risk variants in complex traits, the variance explained by the PRS is usually lower than the twin heritability 18 .
The first ones to use PRS for MDD to test for GxE interaction in MDD were Peyrot et al. 17 . Using a sample of 1 645 participants with a DSM-IV diagnosis for MDD and 340 screened controls from the Netherlands Study of Depression and Anxiety, they showed increased effects of PRS on MDD in the presence of childhood trauma, with evidence for interaction. Musliner et al 19 studied the association between PRS-MDD, SLEs and depressive symptoms in a sample of 8 761 participants from the Health and Retirement Study in the United States. SLEs were operationalised as a dichotomous variable indicating whether participants had experienced at least one stressful event in the previous two years. Depressive symptoms were measured using an 8-item Center for Epidemiological Studies Depression subscale and operationalised as both a dichotomous and a continuous variable. They found that both SLEs and PRS were significantly and independently associated with depressive symptoms, but found no evidence that SLEs moderated the association between PRS-MDD and depressive symptoms. Instead, their results were compatible with an additive model. Most recently, Mullins et al. 20 examined the idea using 1 605 cases with recurrent MDD and 1 064 controls all with SLE data, and a subset of 240 cases and 272 controls with childhood trauma data from in the RADIANT UK study. Both PRS and SLEs were significant predictors of case/control status but no interactions were found between PRS for MDD and SLEs, in agreement with previous findings by Musliner et al. 19 . Significant interactions were found between PRS and childhood trauma but, contrary to Peyrot et al. 17 , there was an inverse association with depression status. In summary, these studies do not present consistent results. Studies to date have used the first wave of GWAS data (MDD1) from the Psychiatric Genomics Consortium (PGC) MDD working group (PGC-MDD), based on 9 240 cases and 9 519 controls 14 , and so are likely underpowered.
We report here a direct test of the diathesis-stress model for depression using PRS for MDD and measures of Stressful Life Events (SLEs) and Social Support (SS; lack of SS being considered a stressor); we predict diathesis using an updated version of PGC-MDD GWAS results (N total=159 601, after excluding QIMR data). Given the higher lifetime risk of MDD in women 21 , we also tested the hypothesis in sexes separately.

Materials and Methods
Phenotypic data were collected as part of a general Health and Lifestyle questionnaire (HLQ) mailed to adult twins enrolled in the Australian Twin Registry between 1988 and 1992 [22][23][24] . It included self-report questions about depression, recent personal or network stressful life events (PSLE, NSLE) and SS. The content and details of data collection have been previously described [22][23][24] . Data used in this analysis were collected in 3 waves. The first wave ran between 1988 and 1992 (N=5 843) and targeted adult twins (mean age 41.2, SD=12.8, range 24-86, 61.0% females) 23 . The second wave (N=3 646, collected between 1990 and 1992) focused on younger twins (mean age 23.2, SD=2.2, range 16-31, 65.6% females) and the questionnaire was slightly adapted to cover some of the more common issues of that age group 22,23 . Finally, the last wave (N=236) targeted twin pairs whose information was partially missing from the original 1980 survey, using the same questionnaire (collection between 1990-1992, mean age 42.0, SD=9.9, range 27-73, 58.5% females). This study was approved by the Queensland Institute of Medical Research Human Research Ethics Committee and the storage of the data follows national regulations regarding personal data protection. All of the participants provided informed consent.
Depression scores were calculated by combining the 7 depression items from the Delusions-Symptoms-States Inventory (DSSI) 25, 26 with 5 depression items from the Symptom CheckList (SCL-90) 27 . The factor structure of the scale has been reported previously in the younger dataset 22 and the score has been used in several publications 9,22,23,28,29 .
The HLQ also assessed Personal Stressful Life Events (PSLE) and Network Stressful Life Events (NSLE), adapted from the List of Threatening Experiences 30 . For PSLE, participants were asked to report adverse events (divorce, marital separation, broken engagement or steady relationship, separation from other loved one or close friend, serious illness or injury, serious accident, burgled or robbed, laid off or sacked from job, other serious difficulties at work, major financial problems, legal troubles or involvement with police, living in unpleasant surroundings) that happened in the last 12 months. In addition, they were asked if they had had serious problems getting along with their close network (spouse, someone living with you e.g. child/elderly parent, other family member, co-twin, a close friend, neighbour or workmate) in the past 12 months. These 19 yes/no items were summed to calculate the PSLE score. NSLE was calculated from 21 yes/no questions, in which the participants could report death, injury or crisis that their close network (spouse, child, mother/father, co-twin, other brother/ sister, other relative, someone else close to them) experienced in the last 12 months.
Perceived Social Support (PSS) was measured using the Kessler Perceived Social Support (KPSS) Measure 31 . Several publications from our group have made use of these data 9, 23, 28, 32-34 .
We used Item Response Theory 35,36 , which weights the item responses by their difficulty and discrimination, to calculate individuals' scores of depression, PSLE, NSLE and SS. First, we performed an exploratory non-parametric IRT analysis the KernSmoothIRT package in R 37 . It allows estimation of the probability of endorsing each option of each item as a function of the latent underlying trait (known as Item Response Step Functions: IRSF), without any constraint on the shape of the fitted curve 36 . We used it to confirm the monotonicity of IRSF necessary to ensure the property of stochastic ordering on the sum score 38 . It further allows choosing the most appropriate parametric IRT models based on the shapes of the non-parametric IRSF.
IRSF plotted in Supplementary Figures 1-4 show, for all scales and items, monotonic IRSF in the normal range of the latent trait continuum (−2 2). Small breaches of monotonicity were observed for extreme values of the latent trait and can be attributed to small numbers of participants that lead to unstable non-parametric kernel estimation (as indicated by widening 95% CIs). Consequently, we estimated the IRT scores using a 2-parameter logistic model in WinBUGS v. 1.4.3 39 that constrains all left asymptotes to be 0 and all right asymptotes to be 1. In such a model, the IRSF only differ in term of difficulty and discrimination 35,40 . Such IRT scores are maximum likelihood estimates of the latent trait and carry the same information as a sum score while presenting more normal distributions, thus reducing the influence of extreme values in later analyses.
Missingness in the depression items was limited to less than 2% of the respondents and most of the missing answers (88 or 60%) were found in the item "recently, I have lost interest in sex or have found not found sex pleasurable". The 1.6% of the respondents who omitted this item tended to be females (p-value=4.2e-04), 4 months older on average (p-value=9.7e-06), and with a slightly higher DSSI score (+0.1 pts, p-value=3.6e-05). Missingness not at random (i.e. potentially dependent on depression level) implies that excluding participants may create a sampling bias. Thus, we chose to impute the missing observations using WinBUGS (described in 34,41 ) using age, sex and the depression items as predictors. Overall, due the low missingness rate imputation should have little influence on the results.
Lifetime DSM-IV depression diagnoses were obtained in most of the cohort in two telephone interview follow-up studies using the clinical Semi-Structured Assessment for the Genetics of Alcoholism (SSAGA 42,43 in 1992-1993 and 1996-2000 (see Supplementary Figure 5 for a summary of the phenotypic data collection). Details of data collection are described elsewhere 29,[44][45][46][47] . The depression score significantly predicted lifetime DSM-IV MDD status assessed 4 to 7 years later 44, 45 (OR=1.96, 95%CI 1.85-2.08, p-value=3.0e-108, N=8 607), which translates to a 6.1 fold increased odds of MDD between participants in the top and bottom deciles of depression IRT scores (Figure 1a), so demonstrating the utility of the score. For our analysis we used the continuous IRT score rather than the binary diagnosis as continuous models provide greater statistical power than logistic regressions (>99.9% vs. 88.0%, N= 5 221, O.R.= 1.1, beta = 0.095, with α= 0.05, proportion of cases and SD of outcome and predictor measured in our sample) and is available for larger sample size (5 179 vs. 5 221 with IRT score).
PRS 58,59 were calculated from the imputed genotype dosages, using GWAS summary statistics from the most recent PGC MDD release [July 9 th 2016], with the exclusion of the contribution of QIMR, for a final sample of 49 524 cases and 110 074 controls (see Supplementary Table 1 for cohort contributions). For comparison, we also calculated the PRS using the first wave GWAS summary statistics published by the PGC-MDD 14 . From our data, we excluded SNPs with low imputation quality (r 2 <0.6) and MAF below 1%. We selected the most significant independent SNPs using PLINK1.9 53 in order to correct for signal inflation due to Linkage Disequilibrium (LD) (criteria LD r 2 <0.1 within windows of 10MBp). We calculated 8 different PRS using different p-value thresholding of the GWAS summary statistics (see Supplementary Table 2  Our final sample comprised 5 221 individuals (from 3 083 twin families) of European ancestry with available phenotypic and genetic data (mean age at questionnaire 35.7, SD=12.2, range 17-85, 65.6% females). Covariates (age, age 2 , sex, age*sex and age 2 *sex interactions, and the first four genetic principal components) were regressed from the PRS and the stress scores before inclusion in the models to guard against confounding influences on the PRS-stress interactions 60 .
In order to estimate the variance explained by the PRS, the stressors and their interactions in the depression score, we then fitted linear mixed models which controlled for relatedness for the whole sample as well as stratified by sex. ). The parameters of the model were estimated using GCTA 1.26.0 (student test to test the significance of the fixed effects) that accounts for twin relatedness using a Genetic Relatedness Matrix (GRM). The linear model used is as follows: Depression = intercept + b*Covariates + c*PRS_z + d*PSLE_z + e*NSLE_z + f*SS_z + g*PRS_z*PSLE_z + h*PRS_z*NSLE_z + i*PRS_z*SS_z + j*G With b,c,d,e,f,g,h,i the vectors of fixed effects Covariates used in this analyses were age, sex, age 2 , sex*age, sex*age 2 , GWAS array, wave, and first 4 genetic principal components. Note that sex and its interaction were not included when stratifying the analyses by sex.
PSLE_z, SS_z, NSLE_z and PRS_z are the residuals of the scores after regressing out the covariates listed above G is the random effect that models the sample relatedness G~ N(0, GRM), with GRM the NxN matrix of relatedness estimated from SNPs We used OpenMx 61 to calculate the heritability and correlations (likelihood-ratio test, using a kinship matrix to account for familial relatedness) of the depression score and the stressors, correcting for age, sex, age 2 , sex*age, sex* age 2 , and wave. Following the significant genetic correlations estimated from twin models, we investigated how much of the variance in stress scores could be accounted for by the MDD-PRS. We controlled for age, age 2 , sex (and their interactions), study, imputation batch and 4 genetic principal components. Model parameters were estimated using GCTA 1.26.0 62 that accounts from twin relatedness.

Results
PRS for MDD significantly predicted the depression score (maximum variance explained =0.46%. p-value= 5.01e-08, Figure 1b, right panel), which represents a substantial improvement compared to PRS predictions based on earlier GWAS 14 (Figure 1b, left panel, variance explained = 0.08%, p-value=0.018), reflecting the increased sample size of the GWAS discovery samples 63,64 . The main effects of PSLE, NSLE, and lack of SS were also significant, explaining respectively 12.9%, 0.3% and 3% of the depression score variance (Figure 1c), with effects in the expected directions. Lack of SS predicted more of the depression score in women than it did in men (between sex differences, p-value= 4.7e-03) but there were no other differences between sexes that reached significance.
The significance of main effects allowed testing the significance of the interaction between each stress type (PSLE, NSLE, lack of SS) and the most predictive PRS (using all SNPs: p<1). The interaction with PSLE was significant (0.12% of variance explained, p-value=0.0076) and contributed positively to the risk of depression, predominantly in women ( Figure 1d). Overall, the variance explained by PRS main effect plus the interaction was comparable in men (0.73%) and women (0.60%). The interaction was not significant in men while explaining almost as much variance as the main effect in women. However, there was no significant difference when comparing the size of the interaction across sexes (p-value=0.21). For completeness, interactions between each stressor and all PRS are also reported in Supplementary Figure 7.

Discussion
Our finding of a significant diathesis-PSLE interaction points to an extra risk for individuals with combined vulnerability and high number of reported PSLE beyond what would be expected from their additive contributions to liability (Figure 1d, 1e). In the full sample 0.58% of the depression score variance was explained by the PRS and interaction, of which ~80% corresponds to the main effects and ~20% to the interaction. As the power of the PRS increases with larger GWAS 63 , and if these proportions are maintained, the interaction explaining about 20% of the heritability would be typical of the size of GxE estimates for other traits in other species 65 . We cannot dismiss the possibility of diathetic interactions with NSLE or SS, as the power of our study is still limited by the PRS instrument 63 and our sample size. This is evidenced by our measure of genetic predisposition still only explaining a small fraction of the depression score variance (Figure 1b), in comparison with the twin based heritability (Supplementary Figure 8 and 66 ), or even the SNP heritability for MDD (h 2 SNP =0.21 18 ). Note that for all psychiatric and complex traits, it is common that the variance explained by PRS corresponds to a fraction of the heritability, especially when only a few variants are known 67 . Much larger GWAS samples are required to better differentiate the true SNP signals from the noise and to provide a greater level of prediction 63 .
We confirmed using a twin analysis of the dataset (1 110 MZ pairs, 1 032 DZ pairs, 961 singletons) that our measure of depression and all 3 stress measures are moderately heritable (30-40%, p-value<2.1e-25; Supplementary Figure 8), as reported previously 68,69 . Additive genetic and unique environment (AE) models showed the best fit to the data and shared environment could not explain the association (p-value<1.7e-03). We also replicated that self-reported measures of stress are genetically correlated with the depression score (Supplementary Figure 9) 70,71 .
PRS for MDD predicted PSLE and SS (p-value<0.001), no significant association was observed with NSLE (Supplementary Figure 10). This is consistent with heritability and genetic correlation results reported from twin models.
On the scale of measurement for depression that we have used, our results support the multiplicative diathesis-stress model for depression proposed in the 1980s. In addition, our results suggest possible differences in the aetiology of depression between women and men, which may have implications for the tailoring of treatments. However, we must caution that the presence and size of interaction are completely dependent on the scale of measurement and more perfectly normal scales for depression and stressors may have produced a different result 65,72 . For example, using a simple sum score for depression, with an extreme reverse-J distribution yields an even larger and more significant estimate of interaction (0.39% variance explained; p-value = 6.8e-07) whereas using logistic regression to analyse the binary DSM IV diagnosis, predicated on an underlying normal liability, produces a smaller and only marginally significant estimate (0.06% variance explained; p-value = 0.059), although this analysis has much lower power than using a continuous variable.
In addition, the substantial genetic correlation between PSLE and depression hinders attributing the interaction solely to a GxE effect. To investigate this point, we broke down PSLE into events in which the individual may have played an active role (PSLE-active: divorce, separation, having difficulty at work, financial or legal troubles, not getting along with people) as opposed to passive role 73,74 (PSLE-passive: illness, accident, being burgled, sacked or living in unpleasant surroundings) and calculated the IRT score for them. This follows previous publications reporting that PSLE-active was more heritable than PSLE-passive 73,74 , which we confirmed in our sample (twin h 2 PSLE-Active =0.29, 95%CI 0.24-0.34; h 2 PSLE-Passive =0.11, 0.05-0.17). We tested whether the less heritable PSLEpassive may drive the observed interaction, which may point towards a more likely GxE interaction. However, in the models, active PSLE explained most of the variance explained by the PSLE score (r 2 PSLE-Active =10.5%, p-value=3.2e-123 vs. r 2 PSLE-Passive =0.77%, p-value=4.1e-12) and the interaction (r 2 PSLE-Active * PRS =0.085%, p-value=0.030 vs. r 2 PSLE-Passive * PRS =0.0084%, p-value=0.46) (results given for the PRS "P<1", consistent with our other analyses, see Supplementary Figure 12 for all details). This approach was unable to confirm the nature of the interaction, since both PSLE-active and -passive are significantly heritable.
To tackle this question more directly, we calculated environmental and genetic factor scores for PSLE (PSLE-E and PSLE-A) via an independent pathway model fitted to the 19 items of the PSLE questionnaire 75 (Supplementary Figure 13). Conceptually, this divides the IRT PSLE score, which estimates the (phenotypic) latent trait underlying the participants' responses in the questionnaire, into its additive genetic and environmental dimensions ( Supplementary Figures 13 and 14). We then replaced the PSLE IRT score by each of its components in a mixed model to investigate the source of the interaction: GxG if interaction between PRS and PSLE-A or GxE if interaction between PRS and PSLE-E. As in our previous analyses, we regressed covariates out of the factor scores and also included them in the model. The model including PSLE-E was the best fitting as indicated by lower Akaike information criterion, AIC (AIC PSLE-E =2645, AIC PSLE-A =2798). Further, the interaction term points towards a greater GxE effect, as the interaction between PRS and PSLE-E explains 0.11% of the variance (p-value=0.0076), although we cannot rule out the presence of an interaction with PSLE-A (r 2 =0.076%, p-value=0.037) (results given for the PRS "P<1", see Supplementary Figure 15 for more details). Results were showed a similar pattern when we modelled the A and E factors of PLSE-active.
Notwithstanding the above caveats, more work is needed to evaluate different mechanisms of interaction, including a bi-causal relationship between PSLE and depression or molecular interaction (e.g. through methylation changes). We are aware of potential confounds of interaction analyses: in addition to sensitivity of the analyses to the properties of the scale 76 and unavoidable departures from normality in the outcome and predictors as discussed above ( Supplementary Figures 6 and 11), problems may also arise from the stress and depression measures being self-reported in the same questionnaire, and the fact that the stress measures are genetically correlated with the outcome variable. Replication of our findings in independent cohorts, consideration of other variables such as the perceived impact of stressors and improvement of PRS via larger GWAS and larger samples with both depression and risk factors evaluated will allow us further to refine our understanding of the aetiology of depression.

Supplementary Material
Refer to Web version on PubMed Central for supplementary material. by a QIMR Berghofer Fellowship. BCD is supported by a UQI scholarship. SEM is supported by a NHMRC research fellowship (1103623). We are grateful to the QIMR participants, data collectors and data managers. Association between MDD-PRS and depression scores (main effects, one-sided tests, results expressed in % of variance explained). Full sample analyses using the two versions of the PRS were run in the same target dataset with the exact same covariates. Red bars indicate positive correlation with the depression score. PRS were calculated using different p-value thresholds from the GWAS summary statistics. The most conservative only includes independent loci with genome wide significant SNPs (p-value<5e-8), while the least conservative include the most significant SNP of each haplotype (p-value<1). (c) Association between self-reported stress (personal stressful life events (PSLE) or network stressful life events (NSLE), lack of SS) and depression IRT score (main effect, one-sided tests, results expressed in % of variance explained). Blue bars indicate negative correlations and red bars indicate positive correlations with the depression score. Dashed bars indicate sex specific effects. (d) Variance of the depression score explained by the interaction between PRS and PSLE (2-sided tests). Dashed bars indicate sex specific effects. We focused on the association with the PRS comprising all haplotypes but the other associations are also reported for completeness. Blue bars indicate negative correlations and red bars indicate positive correlations with the depression score. (e) Increase in depression score (fitted values, vertical axis) as a function of PSLE and MDD-PRS. For example, the effect of the PSLE-diathesis interaction is visible when comparing the bottom (minimal PSLE) and top (maximal PSLE) edges of the surface. The difference in slopes indicates that PSLE mediates the effect of the genetic predisposition on the depression score. From right to left, results for the whole sample, females and males. In all analyses, we accounted for familial relatedness using a kinship matrix (a) or a genetic relatedness matrix calculated from SNPs (b-d). For (a) we used R package "hglm" 77 to estimate the odds ratios (student test to test the significance of the fixed effects) . For panels (b-d) the parameters of the model were estimated using GCTA 1.26.0 (student test to test the significance of the fixed effects) 62 . All analyses controlled for age, age 2 , sex, age*sex and age 2 *sex interactions, study, array, and the first four genetic principal components in the outcome variable and predictors 60 .