Embracing the positive: an examination of how well resilience factors at age 14 can predict distress at age 17

One-in-two people suffering from mental health problems develop such distress before or during adolescence. Research has shown that distress can predict itself well over time. Yet, little is known about how well resilience factors (RFs), i.e. those factors that decrease mental health problems, predict subsequent distress. Therefore, we investigated which RFs are the best indicators for subsequent distress and with what accuracy RFs predict subsequent distress. We examined three interpersonal (e.g. friendships) and seven intrapersonal RFs (e.g. self-esteem) and distress in 1130 adolescents, at age 14 and 17. We estimated the RFs and a continuous distress-index using factor analyses, and ordinal distress-classes using factor mixture models. We then examined how well age-14 RFs and age-14 distress predict age-17 distress, using stepwise linear regressions, relative importance analyses, as well as ordinal and linear prediction models. Low brooding, low negative and high positive self-esteem RFs were the most important indicators for age-17 distress. RFs and age-14 distress predicted age-17 distress similarly. The accuracy was acceptable for ordinal (low/moderate/high age-17 distress-classes: 62–64%), but low for linear models (37–41%). Crucially, the accuracy remained similar when only self-esteem and brooding RFs were used instead of all ten RFs (ordinal = 62%; linear = 37%); correctly predicting for about two-in-three adolescents whether they have low, moderate or high distress 3 years later. RFs, and particularly brooding and self-esteem, seem to predict subsequent distress similarly well as distress can predict itself. As assessing brooding and self-esteem can be strength-focussed and is time-efficient, those RFs may be promising for risk-detection and translational intervention research.

Every year, about 1 in 5 people experience mental disorders, 1,2 of which the most prevalent are depressive and anxiety disorders. 1 Half of such mental illnesses first emerge during adolescence. 3 About 1 in 3 adolescents have an episode of an anxiety disorder and more than 1 in 10 an episode of a mood disorder, between the ages of 13 and 18. 4 The prevalence of anxiety disorders tends to remain stable during adolescence; however, the prevalence of mood disorders doubles between the ages of 13 and 18. 4 Hence, adolescence seems to be a particularly sensitive time period for the emergence of mental health problems and it is therefore imperative to characterize and predict such vulnerability to psychopathological distress properly.
A growing number of studies have developed screening tools and risk prediction models, also known as risk calculators, for mental health problems. 5,6 For example, Dinga and colleagues (2018) 7 have shown that, among a large variety of psychological and biological variables, only mood severity predicted subsequent depressive symptomology significantly. 7 Still, their prediction model revealed an acceptable accuracy. 7 Similarly, Lewis and colleagues (2019) have shown that a constellation of demographics, psychopathology symptoms (i.e. psychotic and internalizing symptoms), and adversity variables can together satisfactorily predict whether adolescents develop post-traumatic stress disorder, following trauma exposure. 8 In a recent systematic review, summarizing literature on mental health screening tools and risk models, 60 studies were identified for depression related diagnoses, 13 for psychopathological stress, five for anxiety related diagnoses, and five for well-being. 6 Importantly, the majority of those studies used symptom-related (e.g. questionnaires and interviews), demographic (e.g. adverse life-events), or biological indicators (e.g. inflammatory markers, cortisol, metabolic syndrome, brain-derived neurotrophic factor, white and grey matter, and heart rate variables). [5][6][7]9 Thus, previous studies primarily examined predictors that are relatively static (e.g. ethnicity or grey matter) and/or risk factors that increase the development of mental health problems (e.g. negative life-events or prior psychiatric symptoms). Focussing on static and risk factors, however, is only half the story, as it fails to address factors that are amenable and promote mental health. The resilience literature has already identified various factors that are associated with improved subsequent mental health, [10][11][12][13][14] which seem to be overlooked in the development of screening tools and risk calculators.
A notable exception is the study of Chen and colleagues (2015) 15 in which self-esteem was used to predict subsequent anxiety. Another important exception is the recent study of Meehan and colleagues (2020), 16 which included, alongside various risk indicators, four potential resilience factors (sibling warmth, adult involvement, social cohesion and status among peers) to predict internalizing and externalizing disorders following victimization.
Here, we aim to extend the existing literature in several ways. Firstly, we use resilience factors (RFs) as predictors, i.e. factors that have been found to reduce the risk of psychological distress following adverse experiences. 17 We derived the RFs that we study here from a preregistered systematic review, 17 in which RFs were defined as those factors that moderate and/or mediate the relationship between childhood adversity and subsequent mental health problems. In the resilience literature there is a sparse but ongoing discourse about whether resilience and risk factors are opposing sides of the same coin (for a detailed discussion see Fritz, Stochl and colleagues 18 ). Some RFs and risk factors seem indeed to be on opposing sides of the same continuum (e.g. RF = high friendship support & risk factor = low friendship support), 19 whereas for others this apparent dichotomy seems more complex. For example, high rumination can be both an RF and a risk factor depending on whether its content is positive or negative (e.g. RF = high positive rumination & risk factor = low positive rumination; RF = low negative rumination & risk factor = high negative rumination; while high positive and high negative rumination often go together 20 ). Importantly, regardless of whether resilience and risk factors operate on the same continuum, studying the predictive value of RFs has universal appeal as it focuses on what promotes good mental health rather than on what increases mental health problems. 18 Secondly, we extend the existing literature by focusing exclusively on factors that are amenable to psychotherapeutic change, in contrast to the majority of the studies reviewed above, which mainly focused on relatively static demographic (e.g. ethnicity) and biological (e.g. grey or white matter volume) predictors. More specifically, we predict psychopathological distress from 10 amenable RFs.
Three of those RFs operate on an inter-individual level: friendship support, family support and family cohesion; and seven on an intra-individual level: high positive self-esteem, low negative self-esteem, low brooding, low ruminative reflection, high distress tolerance, a low aggression potential and low expressive suppression. 17 Importantly, each of those RFs on its own has been found to decrease subsequent mental health problems, yet research investigating multiple RFs at the same time is so far scarce. [21][22][23] Recently, we found that these RFs reduce concurrent psychopathological distress to a similar degree in adolescents with and without prior exposure to adversity. Moreover, we have shown that the RFs interrelate strongly and can be described as a complex interacting system. 18 This supports the notion that models that succeed in taking all those factors into account may be more ecologically valid and may successfully reveal those RFs that are particularly important in reducing the risk of mental health problems.
Recently, research has also shed light on the benefits of describing mental health problems as distress continua rather than as discrete diagnosis-specific constructs. For example, several studies show that modelling psychopathological symptoms as a continuous latent factor captures a wide range of mental health symptomatology, in terms of both severity and breadth of symptomatology, [24][25][26][27][28] and even seems to generalize well to other disorders. 25 Therefore, such latent continuous constructs may be particularly informative for transdiagnostic prevention and intervention research. Moreover, hybrid models have been developed that describe mental health symptoms as a continuous latent factor and then add categorical classes to the latent factor that differentiate between subgroups on the latent mental distress continuum (e.g. as defined by differences in the distress severity). 29 Categorical distress scores derived from those models may be particularly useful for prediction purposes, as they allow for the estimation of predictive sensitivity and specificity, while taking into account the continuous nature of distress. Yet, to the best of our knowledge, transdiagnostic distress indices have so far rarely been used for predictive purposes; using them as outcomes is therefore the third way in which we extend the existing literature.
In sum, we aim to extend the existing literature (a) by using resilience factors rather than risk markers as predictors for subsequent psychopathology, (b) by using amenable (i.e. social, emotional, cognitive and behavioural) rather than static variables (e.g. ethnicity or biological predispositions) as predictors, and (c) by using transdiagnostic distress indices rather than discrete diagnosis-specific variables as outcome variables. To this end, we use data from the ROOTS population cohort (n = 1130) 30 to predict distress at age 17 from RFs assessed at age 14, covering the adolescent period during which about half of all mental illnesses start emerging. Given the powerful predictive effects of past mental distress, we evaluate, in addition to the relative effects of RFs, the relative effect of distress at age 14 when predicting distress at age 17. A cascade of studies has shown that childhood adversity (CA) vastly increases the risk for mental health problems during adolescence and adulthood. [31][32][33][34] Therefore, throughout all analyses, we take the effect of CA before the age of 14 into account. Additionally, we control for gender effects, as being female has frequently been found to increase the risk for distress (e.g. 26 ). Specifically, we aim to examine: a) to what degree RFs can explain subsequent distress, b) which RFs are the best indicators for subsequent distress, and c) with what accuracy RFs can predict distress levels three years later.

Sample
The ROOTS study is a population cohort for which 1238 adolescents were recruited at age 14.

Participants
Here we included adolescents who had complete data on all RFs at age 14 as well as distress at age 14 and 17, and had information on gender and the presence/absence of adverse experiences before the age of 14 (N = 729).

RFs
In accordance with Fritz et al. 2018 23 and 2019, 18 we investigated 10 RFs that were identified in our preregistered systematic review 17 and were assessed in ROOTS. 30 All RFs were assessed at age 14:
1. Friendship support: five items of the Cambridge Friendships Questionnaire. 35
2. Family support: five items of the McMaster Family Assessment Device. 36
3. Family cohesion/climate: seven items of the McMaster Family Assessment Device. For brevity we write family cohesion throughout the manuscript. 36
4. Positive self-esteem: five items of the Rosenberg self-esteem scale. 37
5. Negative self-esteem: five remaining items of the Rosenberg self-esteem scale (of note, the items are reversed). 37
6. Reflective rumination: five items of the Ruminative Response Scale (RRS; of note, the items are reversed). 38,39
7. Ruminative brooding: five items of the RRS (of note, the items are reversed). 38

Childhood Adversity
CA was assessed with the Cambridge Early Experiences Interview (CAMEEI), which is a semi-structured interview performed with the primary carer. 46 CAs were defined as adverse experiences or severely stressful events that happened between birth and the age of 14. The assessed CAs include a wide range of intra-family events/experiences (e.g. sexual, physical or emotional maltreatment, or parental mental illness), but also cover external events (e.g. a fire or exposure to war). For a detailed description see Dunn and colleagues. 46 These authors clustered the adolescents based on their CA experiences into four latent classes (i.e. no, moderate, severe and atypical parenting CA), separately for the time periods early (age 0 to 5), middle (age 5 to 11) and late childhood (age 11 to 14). 46 As in previous reports on this sample, 23 we dichotomized the CA variable into CA+, which is 'moderate, severe and/or atypical parenting CA' in at least one of the three time periods, and CA-, which is 'no CA' in all three time periods.
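The dichotomization rule can be written down in a few lines. The sketch below is only an illustration of the rule as stated in the text; the class labels and the function name `dichotomize_ca` are hypothetical and are not taken from the ROOTS data files.

```python
# Hypothetical labels for the four latent CA classes described in the text.
NO_CA = "no"
ADVERSE_CLASSES = {"moderate", "severe", "atypical parenting"}


def dichotomize_ca(early, middle, late):
    """Return 'CA+' if any childhood period falls in an adverse class, else 'CA-'.

    early, middle, late: latent class label for age 0-5, 5-11 and 11-14.
    """
    periods = (early, middle, late)
    for label in periods:
        if label != NO_CA and label not in ADVERSE_CLASSES:
            raise ValueError(f"unknown CA class: {label!r}")
    # CA+ requires an adverse class in at least one period;
    # CA- requires 'no CA' in all three periods.
    return "CA+" if any(p in ADVERSE_CLASSES for p in periods) else "CA-"
```

For example, an adolescent classified as 'no CA' in early and late childhood but 'severe' in middle childhood would be assigned to CA+.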

Analyses
Variable estimation. Prior to the main analyses we computed the RFs based on unidimensional confirmatory factor analyses (CFAs; except for expressive suppression as this was assessed with only one item). We used factor scores rather than sum scores to avoid assuming tau-equivalence and to decrease measurement error as much as possible (for a rationale and explanation see Additional file V Part A in Fritz, Stochl et al., 2019 18 ). As all items had between three and six answer categories, we used categorical CFAs with a weighted least square mean and variance adjusted (WLSMV) estimator. The distress factor was similarly estimated using a longitudinal, unidimensional, categorical CFA (also with the WLSMV estimator), and was identified according to the strongly invariant model described by Wu and Estabrook (for a more detailed rationale see Supplement I). 47
Prediction analyses. First, we performed a series of multiple linear regressions to predict distress at age 17. The first two models functioned as baseline models: one only included CA (model B1) and the other one included CA and gender as regressors (model B2). The next three models were the main models of interest. All contained CA and gender as regressors; the first model additionally contained the ten RFs (model M1), the second model additionally contained distress at age 14 (model M2), and the third model additionally contained both the RFs and distress at age 14 (model M3). Those analyses were performed to examine the directionality of the regressors (i.e. +/- sign of the b-values) and to investigate which regressors add significant variance to the explanation of distress at age 17. We additionally compared the models against each other using Likelihood-Ratio tests. Moreover, we re-estimated the models separately for the CA+ and the CA- groups as well as for males and females, to explore group effects.
Second, we aimed to disentangle the relative importance (RI) of the regressors in explaining general distress at age 17. Disentangling the RIs is of particular importance when the regressors are (or are assumed to be) strongly correlated, as every order of regressors then results in a different decomposition of the sum of squares. 48 Here, we examined the RI metric "lmg" (cf. Lindeman, Merenda and Gold), 49 which calculates sequential R 2 s while permuting and then averaging over the regressor orders. 48 To this end, we ran the three main models described above (M1, M2, and M3) as RI analyses. Moreover, we repeated the analyses separately for the CA+ and the CA- group as well as for males and females, to investigate differences in result patterns between subgroups.
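To make the "lmg" idea concrete, the following is a minimal pure-Python sketch of averaging sequential R 2 contributions over all regressor orders. It is not the analysis code used in the paper (the R packages are listed in Supplement III); the function names are illustrative, and the brute-force enumeration of all orderings is only practical for a handful of regressors.

```python
from itertools import permutations


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(columns, y):
    """R^2 of an OLS regression of y on the given predictor columns plus an intercept."""
    n = len(y)
    mean_y = sum(y) / n
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    if not columns:
        return 0.0  # intercept-only model explains no variance
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * y[i] for i, row in enumerate(design)) for p in range(k)]
    beta = solve(xtx, xty)
    ss_res = sum(
        (y[i] - sum(b * x for b, x in zip(beta, row))) ** 2
        for i, row in enumerate(design)
    )
    return 1.0 - ss_res / ss_tot


def lmg(predictors, y):
    """Average each predictor's sequential R^2 gain over all orderings of the predictors.

    predictors: dict mapping a name to a list of values (same length as y).
    """
    names = list(predictors)
    shares = {name: 0.0 for name in names}
    orders = list(permutations(names))
    for order in orders:
        included, previous = [], 0.0
        for name in order:
            included.append(name)
            current = r_squared([predictors[v] for v in included], y)
            shares[name] += current - previous  # gain from adding this predictor
            previous = current
    return {name: total / len(orders) for name, total in shares.items()}
```

By construction the shares are non-negative and sum to the R 2 of the full model, which is what allows them to be reported as percentages of explained variance.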
Third, we conducted prediction analyses, to test with what accuracy the RFs and general distress at age 14 predict distress at age 17. We again used the three main models described above (M1, M2 and M3). All three prediction models were run once as categorical models, with general distress at age 17 as categorical outcome variable, and once as linear models, with general distress at age 17 as a continuous outcome variable. For the categorical distress variable we conducted a series of factor mixture models, 29 which are hybrid models that add latent classes on top of the latent factors, with different invariance levels between the classes. We did this to classify the adolescents based on their distress profiles into categorical distress classes, while also taking into account the continuous nature of distress. Firstly, we applied latent class analyses to identify possible class solutions and then conducted one-factor mixture models with the appropriate class solutions (factor mixture model analyses details can be found in Supplement II). For a factor mixture model solution with two classes we planned to use logistic prediction models, for a factor mixture model solution with three or more unordered classes we planned to use multinomial prediction models, and for a factor mixture model solution with three or more ordered classes we planned to use ordinal prediction models. For the prediction analyses the sample was quasi-randomly split into a training sample (75%; n = ~545) and a testing sample (25%; n = ~180; quasi-randomly means that the relative class proportion of age-17 distress was kept equal between the training and the testing sample).
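A quasi-random split of this kind is commonly called a stratified split. The sketch below illustrates one way to do it; the function name, the seed and the rounding rule are assumptions for illustration and do not reproduce the exact split used in the paper.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.75, seed=0):
    """Split indices into training and testing sets, class by class.

    Shuffling within each class and cutting at the same fraction keeps the
    class proportions (nearly) equal in both samples.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, test = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        cut = int(train_fraction * len(members) + 0.5)  # round half up
        train.extend(members[:cut])
        test.extend(members[cut:])
    return sorted(train), sorted(test)
```

Applied to class sizes of 343, 292 and 90 (the three-class solution reported in the Results), this rule gives 544 training and 181 testing cases, close to the approximate sizes stated above.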
We chose to have a larger training than testing sample, to be able to estimate prediction models that are as accurate as possible, particularly given that categorical prediction models require a substantial amount of power (relatively more than linear models, depending on the category number and size of the outcome variable). To determine the best link function for the categorical prediction models (i.e. logistic or probit) we used the Akaike Information Criterion (AIC) and the residual deviance as model comparison indices. We then used the models resulting from the training procedures to predict distress at age 17 in the testing sample. To evaluate the categorical prediction models, we calculated the proportion of adolescents whose predicted distress class matched their observed distress class. To evaluate the linear prediction models, we used the standard errors (SEs) of the age-17 distress factor scores and computed person-specific 95% confidence intervals (CI). We then calculated for how many adolescents our model could predict distress scores that fell into their respective 95% factor score CI. Again, we also ran the analyses separately for the CA+ and the CA- group as well as for males and females, to investigate differences in result patterns between subgroups. This time, we could quantify the differences between the CA and the gender subgroups using proportion comparison tests, as we could describe the determined accuracies as accuracy proportions.
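The two accuracy definitions can be sketched as follows. The function names are illustrative, and the interval is assumed here to be the factor score plus or minus 1.96 SE (a normal approximation); the text does not spell out how the person-specific CI was constructed.

```python
def class_accuracy(observed, predicted):
    """Proportion of cases whose predicted class equals the observed class."""
    hits = sum(o == p for o, p in zip(observed, predicted))
    return hits / len(observed)


def ci_coverage(predicted, factor_scores, standard_errors, z=1.96):
    """Proportion of predictions that fall inside the person-specific 95% CI.

    Assumption: CI = factor score +/- z * SE, with z = 1.96.
    """
    hits = 0
    for pred, score, se in zip(predicted, factor_scores, standard_errors):
        lower, upper = score - z * se, score + z * se
        if lower <= pred <= upper:
            hits += 1
    return hits / len(predicted)
```

Because the second criterion depends on each adolescent's SE, a narrow CI makes a "correct" linear prediction much harder to achieve than a correct class assignment, which is worth keeping in mind when comparing the two accuracy figures.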
Software. Most analyses were performed in R version 3.5.1 (R packages are reported in Supplement III). 50 The factor scores and SEs for age-14 and age-17 distress were estimated in Mplus 8.2, 51 as it was not possible to compute the SEs based on categorical data in R. Similarly, we performed the latent class and factor mixture model analyses in Mplus as this allowed us to specify the items as categorical. 51
Data availability. Data for this specific paper have been uploaded to the Cambridge Data Repository https://doi.org/10.17863/CAM.46642 and are password protected. Our participants did not give informed consent for their measures to be made publicly available, and it is possible that they could be identified from this data set. Access to the data supporting the analyses presented in this paper will be made available to researchers with a reasonable request to openNSPN@medschl.cam.ac.uk.
Code availability. Analysis code is available from http://jessica-fritz.com/.

Sample
After excluding four adolescents who qualified as outliers in the multivariate space, we could include 725 adolescents, of whom 377 were exposed (CA+) and 348 were not exposed to prior CA (CA-; see Table 1). The CA groups did not differ in age or gender proportions. SES was higher and a prior psychiatric history was less likely in the CA- than in the CA+ group. Of the 725 participants, 415 were female and 310 male. The male and the female groups differed in neither age nor SES. Female adolescents were more likely to have a prior psychiatric history.
Note. Chi-square tests were used for binary data and performed with Yates' continuity correction. The z-test was used for the SES variable and was conducted as an asymptotic linear-by-linear association test, to account for the ordering in the data. The t-tests were used for continuous data and were conducted as Welch's two-sample t-tests. SES was calculated based on the ACORN classification system (http://www.caci.co.uk). 52 Prior psychiatric history was measured with the Schedule for Affective Disorders and Schizophrenia for School-Age Children (Present and Lifetime Version) 53 and included learning disabilities, clinical sub-threshold diagnoses and deliberate self-harm at age 14; and clinical sub-threshold diagnoses and deliberate self-harm, but not learning disabilities, at age 17. *Please note, one participant has missing data on this variable which is why the numbers do not add up. Tests were conducted two-sided.

Disentangling the Amount of Variance that RFs and Age-14 Distress Explain in Age-17 Distress
First we performed two baseline models, one only included CA (model B1) and the other one CA and gender (model B2) as predictors for age-17 distress. Then we conducted three main models. In addition to CA and gender, the first model contained the ten RFs (model M1), the second model contained age-14 distress (model M2), and the third model contained both the RFs and age-14 distress (model M3) as predictors for age-17 distress. We conducted the three main models for two reasons. Firstly, when comparing the individual effects of the RFs (M1) with the individual effects of age-14 distress (M2) it is possible to find out whether RFs and age-14 distress are similarly predictive for subsequent age-17 distress. This comparison seemed important, as the predictive value of previous distress on future distress has been investigated frequently, but little is known about the predictive magnitude of the RFs. Secondly, exploring the effects of RFs on age-17 distress over and above the effects of age-14 distress (M3) seemed relevant, as it gives an indication for the magnitude with which RFs explain change in distress between age 14 and 17.
Adding the RFs to CA and gender significantly improved the model and increased the explained variance from 10 to 28% (see Likelihood-Ratio test for M1 in Table 2). Similarly, adding age-14 distress (instead of the RFs) to CA and gender significantly improved the model and increased the explained variance from 10 to 38% (see Likelihood-Ratio test for M2 in Table 2; see Supplement IV for Figures depicting change in distress). Adding age-14 distress to the model with CA, gender and the RFs improved the model significantly and increased the explained variance from 28 to 39% (see Likelihood-Ratio test for M3-D14 in Table 2). Adding the RFs to the model with CA, gender and age-14 distress did not improve the model significantly and increased the explained variance from 38 to 39% (see Likelihood-Ratio test for M3-RFs in Table 2). Hence, the RFs seemed to explain age-17 distress, but not the change in distress from age 14 to age 17. Importantly, there was no multicollinearity between the RFs and age-14 distress (see Supplement V). When computing the analyses separately for the CA+ and

Disentangling the Relative Importance of RFs and Age-14 Distress in Explaining Age-17 Distress
We next decomposed the individual variance contribution of the regressors. In the model including both age-14 distress and the RFs, the RFs explained slightly less variance in age-17 distress than age-14 distress (M3 RFs total variance = 42%; M3 age-14 distress total variance = 46%; see Table 3). Moreover, when taking age-14 distress into account the importance ranking of the RFs stayed mainly the same as in the model without age-14 distress (i.e. compare M1 and M3). The self-esteem and brooding RFs explained the most and expressive suppression the least variance. The results remained similar when computed separately for the CA+ (M3 RFs total variance = 45%; M3 age-14 distress total variance = 48%) and CA- groups (M3 RFs total variance = 47%; M3 age-14 distress total variance = 45%), as well as for female (M3 RFs total variance = 43%; M3 age-14 distress total variance = 52%) and male participants (M3 RFs total variance = 51%; M3 age-14 distress total variance = 46%). The RFs seemed to explain more relative variance in the CA- and the male group, whereas age-14 distress seemed to explain more relative variance in the CA+ and the female group.

Disentangling the Accuracy with which RFs and Age-14 Distress Predict Age-17 Distress
We first performed a series of factor mixture models to classify the adolescents based on their categorical distress profiles, while also taking into account the continuous nature of distress.
The three-class model, which allows the factor score mean to vary per distress class (called factor mixture model-1; for more specific analysis details see Supplement II), performed well (entropy = 0.95) and revealed a theoretically plausible solution, splitting the adolescents into "low/mild", "moderate" and "high" distress severity classes. Figure 1 shows the class solution plotted against the continuous general distress scores.
Figure 1. Three-class distress solution (low: n = 343; moderate: n = 292; high: n = 90) plotted against the continuous distress severity scores. Center line = median (50% quantile); lower box limit = 25% quantile; upper box limit = 75% quantile; lower whisker = smallest observation greater than or equal to the lower box limit - 1.5 x Inter Quartile Range (IQR); upper whisker = largest observation less than or equal to the upper box limit + 1.5 x Inter Quartile Range (IQR).
As the best class solution was ordered categorical, we conducted three ordinal prediction models with the three-class distress variable as outcome variable. Of the three models one again contained the RFs (M1), one age-14 distress (M2) and one both (RFs and age-14 distress; M3) in addition to gender and CA as predictors. Here, we conducted the three models to investigate whether RFs (M1) have a similar predictive accuracy as age-14 distress (M2), and to find out whether the combination of RFs and age-14 distress is better than one information source alone (M3 vs M1 and M2). The applied ordinal regression models have a proportional odds assumption, which was not met for all predictors. Therefore, we conducted the ordinal regressions as partial proportional odds models and relaxed the proportional odds assumption for those predictors that did not meet the assumption (see details in Supplement VI).
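For readers less familiar with ordinal regression, the sketch below shows how a cumulative link model with proportional odds turns a linear predictor into probabilities for ordered classes. The parameterization P(Y <= k) = link(theta_k - x·beta) is one common convention and is assumed here; the thresholds and coefficients in any usage are hypothetical, not estimates from this study. A partial proportional odds model would additionally let some coefficients differ per threshold, which this sketch does not implement.

```python
import math


def logistic(x):
    """Inverse logit link."""
    return 1.0 / (1.0 + math.exp(-x))


def probit(x):
    """Standard normal cumulative distribution function (inverse probit link)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def class_probabilities(x, beta, thresholds, link=logistic):
    """Class probabilities for ordered classes under proportional odds.

    x, beta: predictor values and their (shared) coefficients.
    thresholds: increasing cut-points theta_1 < ... < theta_{K-1}.
    Returns [P(Y = 1), ..., P(Y = K)].
    """
    eta = sum(b * v for b, v in zip(beta, x))
    # Cumulative probabilities P(Y <= k); the last class always closes at 1.
    cumulative = [link(theta - eta) for theta in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs
```

With two thresholds the function returns three probabilities (for example for the low, moderate and high classes); a predicted class can then be taken as the most probable one.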
The three models (M1-M3) had a low to acceptable accuracy ranging from 63% to 66% (see Table 4). Hence, about 2 out of 3 adolescents were correctly predicted into their distress severity class, regardless of using RFs, age-14 distress, or both as predictors for age-17 distress. The results were somewhat different when we split the adolescents into CA+ (accuracy: M1 = 45%, M2 = 55%, M3 = 52%), CA- (accuracy: M1 = 67%, M2 = 60%, M3 = 59%), female (accuracy: M1 = 50%, M2 = 56%, M3 = 53%) and male groups (accuracy: M1 = 63%, M2 = 62%, M3 = 65%). Yet, most of the prediction accuracies did not formally differ between the CA and gender subgroups (for details see Supplement VII); only model M1 revealed a significant accuracy difference between the CA subgroups (Chi 2 = 8.16, df = 1, p = 0.004).
Note. D14 = age-14 distress. All models were computed with childhood adversity and gender as predictors. ROC = receiver operating characteristic. Accuracy = relative number of correctly predicted cases. Sensitivity = e.g. for low distress: the number of adolescents who are correctly predicted into the low distress group divided by all adolescents who are actually in the low distress group. Specificity = e.g. for low distress: the number of adolescents who are correctly not predicted into the low distress group divided by all adolescents who are actually not in the low distress group. Variables for which the proportional odds assumption was relaxed can be found in Supplement VI.
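The per-class sensitivity and specificity defined in the note correspond to a one-versus-rest evaluation of each distress class. A minimal sketch, with an illustrative function name and toy labels rather than study data:

```python
def one_vs_rest_metrics(observed, predicted, target):
    """Sensitivity and specificity for one class against all other classes."""
    pairs = list(zip(observed, predicted))
    tp = sum(o == target and p == target for o, p in pairs)  # correctly in class
    fn = sum(o == target and p != target for o, p in pairs)  # missed members
    tn = sum(o != target and p != target for o, p in pairs)  # correctly not in class
    fp = sum(o != target and p == target for o, p in pairs)  # wrongly assigned
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity
```

Calling the function once per class (low, moderate, high) yields the class-wise values of the kind summarized in Table 4.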
We next tested the prediction accuracy for linear models with the continuous distress severity variable as outcome measure. These analyses revealed that in contrast to the ordinal models, the prediction accuracy for all three linear models was low (39 to 47%; Table 5).
Note. D14 = age-14 distress. All models were computed with childhood adversity and gender as predictors. RMSE = root mean squared error, MAE = mean absolute error, Accuracy = relative number of correctly predicted cases. Model accuracy was based on 1000 bootstraps.

Post-hoc Exploration: Disentangling the Accuracy for fewer RFs Predicting Age-17 Distress
In our RF regression models (i.e. the M1s), two RFs were (at least marginally, p < 0.10) significant in three of the four subgroups, namely negative self-esteem and brooding. Moreover, those two RFs had the highest relative importance in all four subgroups. Therefore, we next reran all prediction models, this time including only these two RFs and gender instead of all 10 RFs, CA and gender. We did this to investigate whether the assessment of just two RFs and gender would provide similar information as all 10 RFs, CA and gender (i.e. M1). This is important as such an assessment may be more feasible and efficient in many non-clinical settings (e.g. in school assessments). Interestingly, in these post-hoc analyses, both the ordinal and the linear models performed similarly to the models including all RFs (change in accuracy: ordinal models from 63% to 61%, Chi 2 = 0.047, df = 1, p = 0.83; linear models from 38.89% to 37.78%, Chi 2 = 0.012, df = 1, p = 0.91). Moreover, the models including gender, the two RFs and age-14 distress were rather comparable to the models including gender, CA, all 10 RFs, and age-14 distress (i.e. M3; change in accuracy: ordinal models from 66% to 66%, Chi 2 = 0, df = 1, p = 1; linear models from 46.67% to 44.44%, Chi 2 = 0.101, df = 1, p = 0.75). For completeness, we also conducted the prediction analyses with a subset of the RFs separately in the subgroups, which can be found in Supplement VIII.
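The accuracy comparisons above can be illustrated with a two-proportion Chi-square test with Yates' continuity correction. This is a sketch under assumptions: that the comparison was an unpaired 2 x 2 test of this form, and that the testing sample comprised 180 adolescents, so that 38.89% and 37.78% correspond to 70 and 68 correct predictions. These counts are inferred from the reported percentages, not taken from the source tables.

```python
import math


def two_proportion_chi2(hits_a, n_a, hits_b, n_b):
    """Chi-square test (df = 1, Yates-corrected) comparing two accuracy proportions.

    Returns (chi2, p_value).
    """
    a, b = hits_a, n_a - hits_a  # model A: correct, incorrect
    c, d = hits_b, n_b - hits_b  # model B: correct, incorrect
    n = n_a + n_b
    # Yates' correction shrinks |ad - bc| by n / 2, but never below zero.
    numerator = n * max(0.0, abs(a * d - b * c) - n / 2.0) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Under these assumptions, `two_proportion_chi2(70, 180, 68, 180)` gives a Chi-square of about 0.012 with p of about 0.91, and `two_proportion_chi2(84, 180, 80, 180)` gives about 0.101 with p of about 0.75, in line with the values reported for the linear models.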

DISCUSSION
We aimed to shed light on potentially promising RF targets that reduce subsequent distress, by pursuing three sub-goals. First, we intended to find out to what degree RFs can explain subsequent distress. Our results suggest that RFs alone explained less variance in age-17 distress than age-14 distress alone, but when the predictors were used together RFs explained a slightly lower but similar amount of variance compared with age-14 distress. Second, we aimed to find out which RFs are the best indicators for subsequent distress. Our results showed that negative self-esteem and brooding RFs explained most variance and revealed significance in the multivariable regression models. Third, we intended to explore with what accuracy RFs can predict distress levels three years later. We found that RFs and distress at age 14 were similarly accurate in predicting distress at age 17, with age-14 distress reaching a slightly higher accuracy. The prediction accuracy was low and highly unsatisfactory when we tried to predict continuous distress scores. When we predicted cruder ordinal ("low", "moderate" and "high") distress classes the accuracy was again not good, but acceptable. As such, both RFs and distress at age 14 (as well as their combination) are able to correctly predict the categorical distress class of about 2 in 3 adolescents.
RFs and/or age-14 distress explained more than one-fourth of the overall variance in distress three years later. Importantly, this was after CA and gender were taken into account. Hence, despite the fact that we have used gender, life-history information (i.e. CA), a broad range of distress symptoms and as many as 10 empirically supported RFs, we could not even explain half of the variance in distress three years later. This is alarming and interesting at the same time. Dinga and colleagues 7 put forward the explanation that the way psychopathology is defined may lack important information (i.e. content validity), such as biological components, which may make it so difficult to predict it well. Another explanation could be derived from the time period we have investigated. We assessed the adolescents during early (age 14) and later (age 17) adolescence, which is generally described as a particularly malleable period during which a lot of mental health problems develop. 3 That is, distress predictions over a period during which many mental health problems manifest themselves may be particularly difficult. A third account may come from the instructions that were provided for the assessment of the distress symptoms: "please tick how often you have felt or acted in this way over the past two weeks". The instructions assess distress during the past two weeks, which for some adolescents may have captured state rather than trait distress. An outcome construct that at least to some extent captures state characteristics may complicate the prediction even further. In sum, insufficient content validity, a sensitive developmental time period, and state-like characteristics of the distress variable may all help explain why it was so difficult to predict subsequent distress.
While the RFs explained age-17 distress significantly, the RFs did not explain change in distress from age 14 to age 17 significantly. Yet, the importance ranking of the RFs for explaining age-17 distress remained similar when taking age-14 distress into account. Moreover, the RFs and age-14 distress had a similar relative importance. Importantly, there was no overlap between RFs and distress items content-wise, and no multicollinearity between RFs and distress at age 14. Besides the comparable relative importance, RFs and age-14 distress had a similar accuracy for predicting age-17 distress. This clearly is a notable finding, as RFs could similarly well predict distress over the course of three years, as distress could predict itself over the course of three years. Moreover, a combination of the two information sources (RFs and age-14 distress) did not necessarily seem advantageous above either source alone. Therefore, if our results were to be replicated, we would assume that knowledge on the RFs may, due to its "conceptual commitment to strengths and assets" (see 54 , p. 136), be highly interesting for various public health and clinical settings. More specifically, in settings where a strengths-focus would be more feasible than a symptom-focus, RFs could be assessed to screen, monitor and potentially promote mental health.
If we had to judge which of the RFs may be the most promising for screening, monitoring and potentially promoting mental health, we probably would choose negative self-esteem and brooding RFs, as those two had the strongest relative importance in reducing the risk of subsequent distress and were significant in the multivariable RF models (M1). Importantly, our prediction results remained rather stable when we used only those two instead of all 10 RFs. Moreover, those two RFs together are measured with only 10 items. Hence, assessing brooding and negative self-esteem RFs would not only have a relatively low stigma risk, but would also be highly time and money efficient. The finding that both self-esteem and brooding seem to play such an important role in the development of mental health problems has been noted in previous research and has led to the suggestion to use self-esteem 55 or brooding 56 as time-efficient and less stigma-prone mental health screens. Young and Dietrich (2014) 56 for example employed the same brooding subscale as used in our study (5 items of the RRS) 38 and detected a screening accuracy of 91 percent for concurrent depressive symptoms in young adolescents. Moreover, both self-esteem and brooding have already been found to be successful intervention targets, 57,58 particularly for interventions aimed at reducing internalizing disorders and/or increasing mental well-being. Interventions targeting self-esteem are suggested to be most successful when provided earlier during adolescence, as self-esteem often is more amenable during early than during late adolescence. 55 Moreover, rumination-focused cognitive behaviour therapy has been shown to be a promising prevention intervention for adolescents at risk for internalizing mental health problems. 58 Yet, our results require replication in an independent sample and ideally need to be tested in translational studies, before screening and intervention-related recommendations can be made. Moreover, additional replication in other populations would be ideal, to ensure a clear scope for generalization.
It is important to note that our linear prediction models, which are derived from the group level, are not good enough to predict individual-level distress scores three years later. Those models translated correctly to the individual level for only two in five adolescents. Our categorical prediction models, which are also derived from the group level, did predict individual-level distress severity classes better, but there is still plenty of room for improvement. Those models translated correctly to the individual level for about two-thirds of the adolescents. Hence, the generalization from group to individual level is limited, particularly when predicting continuous transdiagnostic distress severity. Therefore, it is crucial that future research identifies ways to increase the prediction accuracy for subsequent distress severity. In sum, we recommend that future research (a) examines whether our findings replicate, (b) tests additional RFs that were not measured in our adolescent cohort but are empirically found to reduce subsequent distress, (c) identifies ways to further increase the prediction accuracy (e.g. shorter prediction intervals), (d) is conducted at the individual rather than (or in addition to) the group level, and (e) explores in which prevention and intervention settings targeting RFs may be most helpful.
Last but not least, our study is not without limitations. First, ROOTS has a slightly higher than average SES and thus may mainly generalize to more wealthy populations. 30 Second, our analyses were constrained to those people who provided data for both age 14 and 17, which is not ideal as we cannot rule out a possible increase in selection bias. Third, the binary CA variable may not be ideal as it omits the type of the adversity experience, as well as its severity and frequency. Particularly CA severity may be a valuable consideration and addition in future research. 59 However, justification for using CA as a binary indicator stems from research showing that CAs are likely to co-occur and that clustered CA indices have a robust, negative effect on mental health problems. 32,46,59 For future research it would be ideal if adversity would also be assessed, and controlled, for the interim period between the assessment of the RFs and the assessment of subsequent mental distress. Fourth, the RFs were not all assessed with measures developed to particularly reflect the RF construct at hand (e.g. aggression or expressive suppression). Hence, future research should aim to replicate our results with scales particularly developed for the specific RFs, to increase the content validity. Fifth, we only tested 10 RFs, as only those were assessed in our adolescent cohort. However, in the realm of complexity we think that it would be advantageous if future research could assess and test more than 10 empirically-supported RFs. Sixth, our distress index was mainly defined by internalizing (and not externalizing) symptoms and does not contain information on the distress chronicity. Seventh, we built the prediction models on a subset of the ROOTS cohort (n ~545) to predict distress three years later for another ROOTS subset (n ~180). This means that we used data from the same cohort for training and testing our model. However, it may be that adolescents in our cohort are more comparable to each other than to the general population. This would mean that our prediction accuracy would be lower when using our model to predict distress scores for adolescents who did not take part in ROOTS. Therefore, replication of our findings in a different sample is crucial. Eighth, here we mainly focussed on the overall sample and not so much on findings within the subgroups (CA+ vs. CA-, females vs. males). Yet, there were slight differences in the relative importance of the RFs between the subgroups. Future research should more specifically focus on those differences, for example with moderation analyses.
Critics might argue that investigating age-17 distress as both a categorical and a continuous outcome is superfluous. Yet, we believe that there are good reasons from a scientific as well as a clinical point of view that justify the usage of both (categorical and continuous outcomes) in conjunction. From a statistical point of view it may perhaps seem neater to investigate distress continua. But, first of all our distress classes did take the distress continuum into account, and more importantly, as prior research often only looked at categorical outcomes we feel that it is high time to gain information on the comparison of precise continuous versus more crude categorical outcomes. As our findings showed, it seems like we are not good enough yet to predict precise distress continua, but we are getting into an acceptable range for predicting crude distress classes (from either RFs, distress, or their combination). From a translational point of view, one may favour a categorical outcome as this is often used in clinics, such as cut-offs like "low risk", "at risk/sub-threshold", and "diagnosed". Although crude categorical outcomes may be more easily translatable, providing results of both approaches has given rise to the clinically relevant finding that RFs and prior distress may be promising targets for screens aiming at predicting rough distress risk-categories (e.g. "low", "moderate", "high"), but not yet for screens aiming at predicting precise distress risk levels.
As pointed out in the introduction, there is a sparse but ongoing discourse about whether resilience and risk factors are opposing sides of the same coin, which cannot fully be done justice within the scope of this manuscript. However, we suggest that future studies could conduct more idiographic rather than group level research, as the "relationship between resilience and risk factors is likely to additionally depend on biological predispositions, type(s) of adversity experienced, the specific environmental circumstances, and the developmental stage" (see p. 3 in Supplement XVI of Fritz et al., 2019 18) of the adolescent. Moreover, while this manuscript specifically focusses on using RFs that predict mental health problems (in individuals with and without CA exposure), it would be interesting to see future research taking the same modelling approach but focussing on those factors that predict a resilient functioning outcome. To this end one could for example focus on resilience predictors reviewed by Kalisch and colleagues (2015; including hair cortisol concentration, trait self-enhancement, expression of specific gene networks, and cortisol stress reactivity), 60 or on factors that predict resilient growth trajectories and resilient functioning outcomes as reviewed by Bonanno and colleagues (2011; including perceived control, high positive affectivity, low negative affectivity, trait resilience, low brooding, coping self-efficacy, emotional support, social support, instrumental support, favorable worldviews, and positive emotions), 61 or on factors that relate to resilient functioning specifically following childhood maltreatment, as reviewed in Ioannidis and colleagues (2020; including the social environment as well as biological factors related to the hypothalamic-pituitary-adrenal axis and polygenetics). 62
Overall, our results showed that the RFs were able to correctly predict the categorical ('low'/'moderate'/'high') distress class of 2 in 3 adolescents three years later. This finding was highly similar when predicting age-17 from age-14 distress. The two RFs that were most promising in predicting and reducing subsequent distress were negative self-esteem and brooding. Hence, those two RFs may potentially be promising targets for risk-detection and interventions, if they hold up in replication and translational research.
Note. WLSMV = weighted least squares mean and variance corrected estimator; CFI = Comparative fit index; TLI = Tucker-Lewis index; RMSEA = Root mean square error of approximation; CI = Confidence interval; chisq = chi-square; BM = baseline model; C IM = configural invariance model; L+T+I IM = loadings, thresholds, and intercepts invariance model. All models were conducted with the delta parameterization.

SUPPLEMENTS
We used modification indices only when statistically necessary and theoretically defensible. All CFA models fitted reasonably. For aggression the resulting factor scores were notably poorly distributed and we therefore binarized this variable. The continuous latent distress scores used in the manuscript are based on strongly invariant, categorical CFAs (i.e. L+T+I IM models in the above table), to ensure the latent mean comparability between distress at age 14 and age 17. More specifically, we applied the delta parameterization, equated item loadings and item thresholds across the two time points (i.e. age 14 and 17), and fixed all item intercepts to 0, the item scales of the first time point to 1, the latent factor mean of the first time point to 0, and the latent factor variance of the first time point to 1.

Supplement II
For the categorical prediction model we aimed to classify the adolescents based on their distress profiles into a categorical distress variable. Firstly, we applied latent class analysis (LCA) with ordinal items, an MLR estimator, and a logit link (see Table 1), to identify possible class solutions. We used the same 41 anxiety and depression items for the LCA as for the general distress factor model. We tested a 2-, a 3- and a 4-class solution. The 3-class solution had the highest entropy (= 0.961), but did not differ significantly from the 2-class solution (entropy = 0.960; likelihood-ratio test (LRT) = 2690.16, p = 0.76). Based on those results we conducted a series of factor mixture models (FMMs), 29 which are hybrid models that add latent classes on top of the latent factors, with different invariance levels between the classes. We tested those FMMs with 2, 3 and 4 classes. The FMM1 is the factor mixture model with the most invariance constraints between classes, as it only allows the factor mean to vary between classes. The FMM1 with 2 classes fitted better than the FMM1 with 1 class (LRT = 7462.22, p < .001).
Moreover, the FMM1 with 3 classes fitted better than the FMM1 with 2 classes (LRT = 2143.07, p < .01). The FMM1 with 4 classes fitted better than the FMM1 with 3 classes (LRT = 906.57, p < .05), but had a lower entropy (0.922, vs 0.952 for 3 classes) and revealed one very small class (class 4). In the prediction models, 32 adolescents of this class were sampled in the training and 10 in the test sample. Hence, this is already a small group to be predicted, but when we then split the sample further into CA+ vs CA- and into female vs male, the high distress class in the CA- group had only 6 adolescents in the training and 2 in the test sample. Similarly, the female group had only 5 adolescents in the training and 1 in the test sample. We therefore considered this class practically too small. We also tested the FMM2 model, in which the factor variance can vary between classes in addition to the factor mean. The FMM2 model for the 2-, the 3- and the 4-class solution had however a noticeably low entropy (2 classes: 0.371; 3 classes: 0.532; 4 classes: 0.314) and did not fit better than a 1-class solution. In sum, we decided to go forward with the FMM1 3-class solution, to have sufficiently predictable class sizes. Moreover, the 3-class model revealed a theoretically plausible and practical solution, which is described in the main text. For completeness we also computed the prediction analyses with the FMM1 with 4 classes as outcome variable, which can be found in Supplement IX.
Note. Mod = model; CA = childhood adversity; Frn = friendship support; Fms = family support; Fmc = family cohesion; Ngt = negative self-esteem; Pst = positive self-esteem; Brd = brooding; Rfl = reflection; Dst = distress tolerance; Agg = aggression; Exp = expressive suppression; D14 = age-14 distress; B2 = baseline model with CA and gender as predictors; M1 = main model with CA, gender and RFs as predictors; M2 = main model with CA, gender and age-14 distress as predictors; M3 = main model with CA, gender, RFs and age-14 distress as predictors. When taking the square root of the variance inflation factors, none is bigger than 2, which additionally underpins the absence of multicollinearity.
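The square-root rule for variance inflation factors mentioned in the note can be sketched as follows. This is a minimal illustration, assuming the textbook definition VIF = 1 / (1 - R²), where R² comes from regressing one predictor on all other predictors. The R² values below are hypothetical and are not taken from our models.

```python
import math


def vif(r_squared):
    """Variance inflation factor of a predictor, given the R^2 obtained
    when regressing that predictor on all other predictors."""
    return 1.0 / (1.0 - r_squared)


def exceeds_sqrt_vif_rule(r_squared, threshold=2.0):
    """True if sqrt(VIF) is bigger than the threshold (rule of thumb: 2)."""
    return math.sqrt(vif(r_squared)) > threshold


# Hypothetical R^2 values, for illustration only
hypothetical_r2 = {"Ngt": 0.45, "Brd": 0.30, "D14": 0.50}
for name, r2 in hypothetical_r2.items():
    print(name, round(math.sqrt(vif(r2)), 2), exceeds_sqrt_vif_rule(r2))
```

Under this definition, sqrt(VIF) only exceeds 2 when more than 75% of the variance in a predictor is shared with the remaining predictors.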

Supplement VI
Ordered categorical models, also called proportional odds models, rest on a "proportional odds" (also called "parallel slopes") assumption. This assumption requires that when the tested ordinal categories are dichotomized (e.g. here "a": low vs moderate and high, and "b": low and moderate vs high), the logistic prediction of the respective dichotomized categories results in two slopes (i.e. one for scenario "a" and one for scenario "b") that do not differ significantly from each other. If the slopes differ significantly, the proportional odds assumption does not hold and needs to be relaxed. The assumption can be tested for each predictor in the model, and separate slope values need to be estimated only for those predictors that do not meet the assumption. This then results in a partial proportional odds model. It would also be possible to estimate a non-proportional odds model to circumvent the assumption for every variable in the model. However, this would be highly disadvantageous as it requires a vast amount of power. Hence we opted for the partial proportional odds model to ensure that we have as much power as possible. The below table depicts all the variables for which the proportional odds assumption was relaxed:
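The idea behind the two dichotomizations can be sketched for a single binary predictor. This is a minimal illustration with hypothetical counts, not the procedure or data used in our analyses. For a binary predictor, the slope of each dichotomized logistic model equals the log odds ratio of the corresponding 2 x 2 table. Under proportional odds the two values should be similar, and judging whether they differ significantly would require a formal test.

```python
import math


def cumulative_log_odds_ratios(counts_reference, counts_exposed):
    """Log odds ratios for every cumulative split of an ordinal outcome.

    Each argument lists counts per ordered category, e.g. [low, moderate, high].
    Split "a" is low vs moderate and high; split "b" is low and moderate vs high.
    """
    def log_odds_upper(counts, cut):
        upper = sum(counts[cut:])  # categories at or above the cut
        lower = sum(counts[:cut])  # categories below the cut
        return math.log(upper / lower)

    return [
        log_odds_upper(counts_exposed, cut) - log_odds_upper(counts_reference, cut)
        for cut in range(1, len(counts_reference))
    ]


# Hypothetical counts of low / moderate / high distress in two groups
slope_a, slope_b = cumulative_log_odds_ratios([60, 30, 10], [40, 35, 25])
print(round(slope_a, 3), round(slope_b, 3))
```

If the two slopes differed significantly, a partial proportional odds model would estimate a separate slope per split for that predictor only.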

Supplement VII
The below tables depict the prediction accuracy of the prediction models described in the main manuscript. The first two tables depict subgroup accuracy comparisons for CA and gender models, respectively. The third table depicts accuracy comparisons for models including all RFs versus models that only include a subset of the RFs.
Chi-squared = 0.1008, df = 1, p-value = 0.7509
Note. M1 = Model 1 contains the ten RFs, M2 = Model 2 contains age-14 distress, M3 = Model 3 contains both the RFs and age-14 distress as predictors for age-17 distress. Correct predictions = number of correctly predicted adolescents, Total predictions = number of adolescents that could have been predicted correctly, Accuracy = ratio of correct predictions to total predictions. df = degrees of freedom.

Supplement VIII
For the CA+ group we tested four RFs in addition to gender, as those were significant in the multivariable model, namely: friendship support, family cohesion, brooding, and aggression. Those models were similarly predictive as the models with all 10 RFs and gender (change in accuracy: ordinal models from 45% to 50%; linear models from 43.01% to 35.48%). We also tested those two models while additionally including age-14 distress, which were again similar to the models with gender, the 10 RFs and age-14 distress (change in accuracy: ordinal models from 52% to 55%; linear models from 43.01% to 36.56%). Interestingly, while the accuracy of the ordinal models seems to increase with fewer RFs, the accuracy of the linear models seems to decrease.
Note. D14 = age-14 distress. All models were computed with childhood adversity and gender as predictors. ROC = receiver operating characteristic. Accuracy = relative number of correctly predicted cases. Sensitivity = e.g. for low distress: the number of adolescents who are correctly predicted into the low distress group divided by all adolescents who are actually in the low distress group. Specificity = e.g. for low distress: the number of adolescents who are correctly not predicted into the low distress group divided by all adolescents who are actually not in the low distress group. Variables for which the proportional odds assumption was relaxed can be found in Supplement VI.
Figure 1. Four-class distress solution (low: n = 263; low/moderate: n = 260; moderate/high: n = 160; high: n = 42) plotted against the continuous distress severity scores. Center line = median (50% quantile); lower box limit = 25% quantile; upper box limit = 75% quantile; lower whisker = smallest observation greater than or equal to the lower box limit - 1.5 x interquartile range (IQR); upper whisker = largest observation less than or equal to the upper box limit + 1.5 x IQR.
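The accuracy, sensitivity and specificity definitions given in the note above can be sketched as follows. This is a minimal illustration using hypothetical class labels for six adolescents, not data from our cohort. Sensitivity and specificity are computed one class at a time, by contrasting the target class with all remaining classes.

```python
def accuracy(actual, predicted):
    """Relative number of correctly predicted cases."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def sensitivity_specificity(actual, predicted, target):
    """Sensitivity and specificity for one distress class versus the rest."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(a == target and p == target for a, p in pairs)
    false_neg = sum(a == target and p != target for a, p in pairs)
    true_neg = sum(a != target and p != target for a, p in pairs)
    false_pos = sum(a != target and p == target for a, p in pairs)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical actual and predicted age-17 distress classes
actual = ["low", "low", "low", "moderate", "moderate", "high"]
predicted = ["low", "low", "moderate", "moderate", "high", "high"]

overall_accuracy = accuracy(actual, predicted)
low_sens, low_spec = sensitivity_specificity(actual, predicted, "low")
print(round(overall_accuracy, 2), round(low_sens, 2), round(low_spec, 2))
```

In this hypothetical example, four of six adolescents are classified correctly, two of the three adolescents with actual low distress are predicted as low, and none of the remaining adolescents is wrongly predicted as low.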