Abstract
The exposome represents the totality of environmental effects, but systematic evaluation between it and depressive symptoms is scant. Here we sought to comprehensively identify the association of the exposome with depressive symptoms in late adolescence and early adulthood and determine genetic and environmental covariances between them. Based on the FinnTwin12 cohort (3,025 participants in young adulthood and 4,127 at age 17), the exposome-wide association study (ExWAS) design was used to identify significant exposures from 12 domains. Bivariate Cholesky twin models were fitted to an exposome score and depressive symptoms. In ExWASes, 29 and 46 exposures were significantly associated with depressive symptoms in young adulthood and at age 17, respectively, and familial exposures were the most influential. Twin models indicated considerable genetic and environmental covariances between the exposome score and depressive symptoms with sex differences. The findings underscore the systematic approach of the exposome and the consideration of relevant genetic effects.
Similar content being viewed by others
Main
Depressive symptoms are a type of chronic mental health condition with complex etiology, and major depressive disorder (MDD) is the clinical disorder diagnosed when depressive symptoms reach a threshold of severity and duration. Depressive symptoms and MDD lead to a serious public health burden. The updated Global Burden of Diseases study showed that the age-standardized prevalence of MDD was 4% (3,951 per 100,000 people) in Western Europe, higher than the global level, and underlined the heavy burden on people aged between 15 and 24 (ref. 1). Among adolescents, a 2021 systematic review indicated that the pooled prevalence of self-reported depressive symptoms was 34% and of MDD was 5% from the studies between 2001 to 2020, and the prevalence is increasing2. The COVID-19 pandemic exacerbated the already growing trend of hardship. Given a growing body of evidence on the environmental effect on depressive symptoms and MDD3,4, more systematic investigation is urgently needed, especially among youth.
The concept of the ‘exposome’, which depicts the dynamic totality of the environment that an individual experiences, was raised in 20055. The exposome is divided into three parts—specific external, general external, and internal exposomes—and the external exposome could be further subdivided into the familial, social, built exposome, and so on. Instead of studying a single or small group of exposures, an exposome study aims to investigate the overall effect of the environment while, unavoidably, complexities such as interaction or ubiquity increase the difficulty6. An exposome- (environmental, exposure) wide association study (ExWAS), like other ‘WAS’ studies, denotes an agnostic and systematic method for hypothesis generating, which is comparatively appropriate to the exposome’s spatiotemporal variabilities and multi-level structure7. Several ExWAS studies have targeted mental health8,9,10, and Choi et al11. used clinical incident depression as the outcome and identified multiple modifiable factors. As it is the early warning sign of MDD, focusing on depressive symptoms in adolescence or young adulthood could be easier to guide translational intervention as early as possible, which would be more cost effective.
Despite the benefits of the exposome approach, there are some other hindrances. First, under the current technique, we cannot measure every possible exposure (far from reaching ‘1-genome’), and the exposome keeps updating, expanding, and enriching. Moreover, some studies have emphasized exposures’ non-genetic properties, which ignores how the environment interacts with genetics through multiple mechanisms among many traits, including depression12,13. Medda and colleagues, on the basis of the Italian Twin Registry, demonstrated the substantial genetic role in exogenous metallomics, where the estimations of standardized genetic variance, as a proportion of total variance of the measured exposures, ranged from 0.15 (arsenic) to 0.79 (zinc)14. As a natural experiment, twin and family studies provide a method to evaluate genetic and environmental relationships between traits and exposures. This design decomposes the variance of traits into additive genetic (A), dominant genetic (D), common environmental (C), and unique environmental (E) components, which contain the distinct features of the exposome as the overall environmental effect. Such indirect evidence of genetic effects based on genetic relationships of family members is an efficient way to demonstrate the presence (or lack of) genetic effects. Thus, the combination of exposome and twin studies could advance our knowledge of the complexities between genes and environments, improve our understanding of existing deficiencies in exposome measures, and produce further research questions. A natural extension is then to include measured genotypes, either targeting specific genes such as those involved in the metabolism of external compounds or more broad-based genome-wide approaches to derive polygenic scores of genetic susceptibility.
In this study, based on the FinnTwin12 cohort, we aim to (1) comprehensively and systematically determine exposures that are significantly associated with depressive symptoms and MDD in late adolescence and early adulthood through three ExWASes and (2) estimate to what extent the exposome and depressive symptoms share the same genetic and environmental risk factors.
Results
Characteristics of the study, participants, and exposures
Figure 1 shows the flowchart of the analysis pipeline, which consisted of three ExWASes and the following bivariate twin modeling. Per the FinnTwin12 cohort, there were 3,025, 1,236, and 4,127 individual twins included in three separate ExWASes with the outcomes of general behavior inventory (GBI) score in young adulthood (primary), the incidence of MDD in young adulthood, and GBI score at age 17, respectively. The characteristics of each ExWAS are shown in Table 1.
For individual twins included in ExWASes of all outcomes (Table 2), the majority were female and from dizygotic (DZ) pairs, and their parental education levels were limited (less than high school). At age 17, 25.4% of individual twins reported being current smokers, and 82.6% were full-time students and not working. In young adulthood, 25.3% of individual twins reported that they were currently smoking, and 51.4% had a full-time job. The mean GBI scores at age 17 and in young adulthood were 5.0 (s.d.: 4.9) and 4.4 (s.d.: 4.7), respectively, and the two measures correlated with 0.49. The incidence of lifetime MDD in young adulthood was 12.3%.
Exposures’ code names, description, and statistics based on twins included in the ExWAS of GBI in young adulthood (before imputation) are presented in Supplementary Table 1. There are 12 domains of exposures, colored in the following plots: air pollution, building, blue and green spaces, population density, geocoordinates, prenatal exposures, passive smoking, family and parents, friend and romantic relationships, school and teachers, stressful life events, and social indicators. In principal component analysis (PCA), the first principal component (PC1) attributed only 10.93% and 10.66% to the total variability of all included exposures in young adulthood and at age 17, respectively (Extended Data Fig. 1). From the scatter plots of PC1 and PC2, we identified some potential clusters of exposures from domains of building, blue and green spaces, and social indicators via visual assessment.
ExWAS of depressive symptoms and MDD in young adulthood
The adjusted coefficient and –log10(P value) of all exposures included for both adult outcomes are presented in Supplementary Table 2. There were 40 significant P values in 29 exposures, which were associated with log-transformed GBI score in young adulthood, identified from 385 exposures (Fig. 2a). There were 24, 2, and 3 exposures belonging to the domains of family and parents, friend and romantic relationships, and school and teachers, respectively. For the most protective exposure, compared with twins who felt their home environment was completely unfair, quite unfair, or somewhat unfair at age 17 (unfair_A17), twins who felt it was not at all unfair at age 17 were associated with a 0.40 lower log-transformed GBI score (95% confidence interval (CI): −0.50, −0.31) (Fig. 2b). For the most harmful exposure, compared with twins who were completely satisfied with their relationships with friends at age 14 (sat_friend_A14), twins who felt somewhat satisfied, mainly not satisfied, or not at all satisfied at age 14 were associated with a 0.42 higher log-transformed GBI score (95% CI: 0.29, 0.55) (Fig. 2b). By contrast, none of the exposures showed a significant association with MDD (Extended Data Fig. 2).
ExWAS of depressive symptoms at age 17
The adjusted coefficient and –log10(P value) for the age 17 outcome are presented in Supplementary Table 2. There were 71 significant P values in 46 exposures, which were significantly associated with log-transformed GBI score, identified from 286 exposures (Extended Data Fig. 3a). There were 32, 6, 4, and 4 exposures belonging to the domains of family and parent, friend and romantic relationship, school and teachers, and stressful life events, respectively. For the most harmful exposures, compared with twins who were completely satisfied with their success at work or studies at age 17 (sat_studywork_A17), twins who felt mainly not satisfied or not at all satisfied at age 17 were associated with a 0.65 higher log-transformed GBI score (95% CI: 0.55, 0.74) (Extended Data Fig. 3b). For the most protective exposure, the same as the result in young adulthood, compared with twins who felt their home environment was completely unfair, quite unfair, or somewhat unfair at age 17 (unfair_A17), twins who felt it was not at all unfair at age 17 were associated with a 0.50 lower log-transformed GBI score (95% CI: −0.57, −0.43) (Extended Data Fig. 3b). There are 27 exposures that are significantly associated with log-transformed GBI scores both in young adulthood and at age 17, and 22 exposures belong to the domain of family and parents.
Twin modeling of depressive symptoms with exposome scores
Before the bivariate modeling, the best-fit univariate AE model (had the lowest Akaike information criterion compared with ADE and E models) indicated E explained 61% of the variance of depressive symptoms in males and 45% in females at age 17, and the numbers slightly reduced to 59% and 42%, respectively, in young adulthood (Supplementary Table 3). The exposome score was created by confirmatory factor analysis (CFA) based on the significant exposures from ExWASes. The standardized root mean square residual of models in young adulthood and at age 17 were 0.100 and 0.078, respectively, indicating acceptable model fit. MDD was not included in the CFA or following twin modeling due to the smaller sample size and no significant exposure being identified. Then we used the exposome score to conduct bivariate twin modeling between the exposome score and depressive symptoms. Given the sex differences in the prevalence of depressive symptoms, the differences in heritability, and the fact that sex-limited bivariate models also indicated significant sex differences (Supplementary Table 4) at both age points, we ran the bivariate models separately for males and females.
Figure 3 and Supplementary Table 5 show the path coefficients for the model for exposome score and log-transformed GBI score in young adulthood (mean age: 23.9). Unique environmental factors accounted for 23% and 13% of the covariances in males and females, respectively, while additive genetic factors accounted for 77% in males and 87% in females. In males, standardized variances of Eexposome and EGBI were 0.32 (95% CI: 0.26, 0.39) and 0.51 (95% CI: 0.42, 0.62); the numbers reduced to 0.25 (95% CI: 0.21, 0.30) and 0.50 (95% CI: 0.42, 0.58) in females. The remaining share of variance was accounted for by additive genetic effects.
Extended Data Fig. 4 and Supplementary Table 5 show the path coefficients for the model for exposome score and log-transformed GBI score at age 17. Unique environmental factors accounted for 31% and 13% of the covariances in males and females, respectively. Additive genetic factors accounted for 69% in males and 87% in females. The standardized variances of Eexposome at age 17 are similar to Eexposome in young adulthood regardless of sex. The standardized variance of Eexposome is 0.26 (95% CI: 0.22, 0.30) and 0.22 (95% CI: 0.19, 0.25) and of EGBI is 0.64 (95% CI: 0.55, 0.73) and 0.44 (95% CI: 0.38, 0.50) in males and females, respectively. The remaining share of variance was accounted for by additive genetic effects.
Post hoc mixed model repeated measures
On the basis of the longitudinal design and 27 significant exposures selected by both ExWASes of log-transformed GBI score, after adjusting for covariates and baseline effect, all the exposures were still significantly associated with log-transformed GBI score in young adulthood. The results are presented in Supplementary Table 6.
Discussion
Using data on depressive symptoms and diagnosed MDD from the FinnTwin12 study and a wide range of exposures from multiple sources, we applied a two-stage analysis to first screen the exposome and then estimate the environmental sources of correlation between the exposome and depressive symptoms via twin modeling. First, multiple exposures by self-report have been identified across domains of family and parents, friend and romantic relationships, school and teachers, and stressful life events, which were significantly associated with depressive symptoms in young adulthood and at age 17. By contrast, none of the exposures correlated with the incidence of MDD in young adulthood. Second, after generating an exposome score based on significantly associated exposures, the best-fitting bivariate AE models indicated that unique environmental effects accounted for a marked fraction of the covariance between the exposome score and depressive symptoms. This environmental fraction was higher in males than in females, suggesting a notable sex difference. Our result implies that environmental effects are more impactful compared with genetic effects in males than in females.
Influence from the familial component of the social exposome, especially from the familial atmosphere, was demonstrated by our evidence as having the most substantial impact on depressive symptoms in late adolescence and early adulthood and their trajectory. A large Chinese survey also found that familial factors such as cohesion, conflict, and control correlated with the occurrence of depressive symptoms among university students15. Other studies have revealed the connection of family triangulation (parent–child coalition and alliance) and satisfaction with depressive symptoms from childhood to late adolescence across countries16,17. Fairness (largest protective effect size of GBI at both age points), as a dimension of parentification, was demonstrated as a unique predictor of mental health symptoms18. These existing conventional investigations were consistent with ours, while our ExWAS more systematically evaluated a wide range of exposures and reduced the chance of type I error without any pre-identified hypothesis. Moreover, instead of traditional scales for assessing familial and interpersonal relationships, we treated each scale component as an ‘independent’ exposure in models, which helped us to identify new correlations, detect the relative importance, and prepare for further analysis of more intricate relationships between different components and depressive symptoms.
Results from bivariate twin modeling reveal a complex relationship among genes, environments, and depressive symptoms. Although the unique environmental factor explains a notable amount of covariance between exposome score and depressive symptoms, the additive genetic factor explained relatively more. Many significant exposures were chosen under the guidance of the exposome paradigm, but it does not necessarily imply a pure environmental effect. Many familial influences are considered ‘inheritable factors’ between generations to a certain extent, according to the intergenerational transmission theory. Such effects can be transmitted from parents to children through shared genes but also by shared environments. Early studies have found that life satisfaction or family violence from parents and origin families led to an important impact on the development of subsequent similar familial environments among offspring19,20. Moreover, we should consider the existence of the gene–environment interaction (G×E), which suggests the different effects of a genotype on disease risk in persons with different environmental exposures21. Choi et al11. stratified the ExWAS by polygenic risk scores of major depression and found that some significant factors in the full sample became null in the genetically at-risk sample. Another study suggested the multiple modulation pathways by exposure to DNA methylation, through numerous testing, regarded as the G×E-WAS22. In addition, previous twin studies found geographic confounding in the assessment of A, C, and E variances, possibly attributable to differences in genetic ancestry. Results from the Netherlands Twin Register found 1.8% of the variance in children’s height was captured by regional clustering23. In the Netherlands, there were strong genetic differentiations between the north and south, between the east and west, and between the middle band and the rest of the country by PCA on genome-wide data24. In the Finnish population, also a substantial population structure difference is observed between the east and west parts of country25. In brief, the hidden heritable and genetic factors critically influence the association between the exposome and depressive phenotype through various mechanisms, which potentially lead to a propensity to weak associations in our findings.
Furthermore, exposures from the more external domains, particularly in the physical exposome, also showed, at most, weak connections with depressive symptoms. While it may be the case that the relative importance of the physical exposome is much less than that of the social and familial exposome with respect to depressive symptoms, there are possibly other explanations. First, a more complex structure of the exposome, such as the interaction or correlation between individual exposures and external exposome, may exist. Some previous exposome analyses have indicated this26,27, but the ExWAS design cannot characterize it. For example, the social exposome is an explaining part of the physical exposome, which could not be completely separated. We aim to investigate the complicated effect of the depressive phenotype in the pluralistic platform like machine learning on the basis of our findings in the future. Second, Finland has been ranked very high in the beneficial environmental effect on the child by UNICEF (United Nations Children’s Fund), providing environments with low air pollution, high greenness, safe water, and other constructive aspects relatively equally to most residents in childhood and adolescence28. It could explain null results with external living environments due to a lack of individual variation in exposures. Another matter contributing to large familial effects is the overlap between interpersonal relationships and depressive symptoms. In a Swedish twin study among females, interpersonal relationships contributed between 18% and 31% of the variance for depressive symptoms29. Some personality disorders are tightly connected with interpersonal relationships, for example, borderline, avoidant, and paranoid personality disorders’ liability factors overlapped substantially with MDD’s, in particular, clusters among Norwegian young adults30. This overlapping may have led to an overestimation of the importance of interpersonal relationships.
For social indicators, besides the critical period, various risk models such as accumulation or trajectory may exist, which may also explain the null results. Morrissey and Kinderman confirmed the hypothesis that accumulation of adverse financial hardship negatively affects mental health, but not the hypothesis of critical periods31, while our risk model is the ‘critical period’. Another study demonstrated the complicated effect among changes in racial composition, neighborhood socioeconomic status, and depressive symptoms32. The social indicators derived from Statistics Finland’s (stat.fi/tilastotieto) registers are at the postal code or municipality level, which leads to some concern about the inaccurate measurement of an individual’s exposure (information bias).
Several previous ExWAS studies linking the exposome to mental health had some similar or heterogeneous results to ours. van de Weijer et al10. identified several social indicators such as safety and income being linked to mental well-being, but the links were weak in our analysis. This may be due to using different outcomes, the older age in their samples, and different statistical methods between the two countries’ authorities10. Although the ExWAS of Choi et al. was on the general population in the United Kingdom, they also found that a higher frequency of visits with family/friends reduced the odds of depression incidence, and Mendelian randomization reinforced the causality of this association11. However, we do not have many common variables with Choi et al11. in which they included many lifestyle factors (specific external exposome), while we have more general external exposome variables. Another ExWAS on psychotic experiences identified many stressful life-event factors, a result that was similar to our study8. Despite the divergent findings, the accumulation of ExWAS findings from different countries, populations, and age groups enhances our understanding of growing concepts of the exposome on depression, as well as broad mental health. The inclusion of a large number of exposures about interpersonal and person–societal relationships is also an important addition to the existing evidence. Notably, some of the information was provided by the parents, not only the twins. Furthermore, some scientists have raised the concept of an ‘eco-exposome’ to thoroughly assess the internal exposome, including molecules affected by exogenous exposures33, which could be assimilated into further research.
The sex difference is notable. Our previous study found that male twins tend to stay together longer, implying more exposure to any familial impact34. In a Swedish study, family structure, conflict, and child disclosure of information to parents were associated with offending behavior in boys, while only one factor was salient in girls35. Another British study found that boys in detrimental familial environments were increasingly disadvantaged in school achievement compared with girls36. The evidence hints that males are more easily affected by the family environment, which could explain the higher contribution of E on the covariance between the exposome and depressive symptoms in males. This inference is not certain, and there is contrary evidence37. Moreover, sex differences exist in many biological mechanisms regarding how the body neurophysiologically reflects the external environment. Several sex-differentially expressed neurotransmitters or hormones, such as progesterone in females, are involved in systemic dysregulation, inducing depression38. Furthermore, environmental endocrine-disrupting chemicals are able to alter neurodevelopment with sex-specific effects at very early developmental stages39. In the future, integrating with the internal exposome such as metabolites and other omics will help us advance the study of sex-difference mechanisms on the relationship between the exposome and depressive phenotype.
As a part of the European Human Exposome Network, our overarching goal is to evaluate the impact of the exposome on human health across various age groups and with respect to multiple outcomes. The present analysis represents one individual analysis, and by pooling our collective efforts, important implications for clinical practice can be drawn in the future. Our findings suggest that studies on the familial component of social exposome should be noticed and investigated in the improvement of current therapy. It does not mean that we should ignore the physical exposure group, due to ubiquity, even though their relevance is not salient40. In addition, it is imperative to incorporate the consideration of familial effects and genetic liability at the same time for a more thorough understanding in future studies.
There are some other limitations in our study. First, compared with other ExWASes, our sample size is relatively small. Although Chung et al. indicated that a sample size between 1,795 and 3,625 participants is adequate when using the Bonferroni correction41, we did not stratify the ExWAS by sex due to the sample size being reduced by half. Second, we did not further assess the causality. Causal inferences are critical for further policymaking and intervention. Mendelian randomization in larger samples is a future direction. Third, the ExWAS, CFA, and twin modeling were all performed on the basis of the FinnTwin12 cohort, which raises concerns about model overfitting and leakage. Different models with different purposes, hypotheses, and methodologies in two stages reduce the risk of overfitting and leakage. ExWAS was used to identify salient exposure, while CFA and twin modeling were used to explore. The observational unit was each twin pair in twin modeling, while in ExWAS and CFA, it is each individual twin. Replication on other twin cohorts and in family datasets is warranted.
Conclusion
This study applied a two-stage analysis. First, in ExWAS, we identified that exposures from family and parents, friend and romantic relationships, school and teachers, and stressful life events were significantly associated with depressive symptoms in late adolescence and young adulthood. The family and parent exposures were the most influential. Second, twin modeling between the exposome and depressive symptoms uncovered a complex relationship among genes, environments, and depressive symptoms with sex differences. The findings underline the importance of systematic evaluation of the environmental effects on depressive symptoms and recommend the consideration of genetic effects in future studies.
Methods
Study participants
The participants came from the FinnTwin12 cohort, which is a nationwide prospective cohort among all Finnish twins born between 1983 and 1987. First, the overall epidemiological study consisted of all 5,184 twins who responded (age 11–12) at wave 1, and there are three general following waves at ages 14, 17, and in young adulthood (mean age: 21.9). Moreover, 1,035 families with 2,070 twins were invited to take part in an intensive study with psychiatric interviews, some biological samples, and additional questionnaires42. At age 14 (wave 2), 1,854 twins participated. They were then invited to participate again as young adults (wave 4) of the study. Psychiatric interviews in young adulthood were completed for 1,347 twins in the intensive study, including assessment of MDD using the Semi-Structured Assessment for Genetics of Alcohol based on Diagnostic and Statistical Manual of Mental Disorders IV criteria43,44. The twins also completed questionnaires on health, health behaviors, work, and multiple psychological scales. The flowchart of general FinnTwin12 cohort is presented in Extended Data Fig. 5. An updated review has been published45. .
The ethics committee of the Department of Public Health of the University of Helsinki and the Institutional Review Board of Indiana University approved the FinnTwin12 study protocol from the start of the cohort. The ethical approval of the ethics committee of the Helsinki University Central Hospital District (HUS) is the most recent and covers the most recent data collection (wave 4) (HUS/2226/2021). The HUS reviews the study annually, and 2023’s statement is number 4/2023, dated 1 February 2023. All participants and their parents/legal guardians gave informed written consent to participate in the study. The authors assert that all procedures contributing to this work comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2008.
Measures
The primary outcome is the short-version GBI scores in young adulthood. It is a self-reported inventory to evaluate the occurrence of depressive symptoms, which is composed of ten Likert-scale questions46. The total score ranges from 0 to 30, and a higher score implies more depressive symptoms occurred. There are two secondary outcomes: GBI scores at age 17 and incidence of MDD in young adulthood.
In total, we curated 385 environmental exposures under the concept of the Equal-Life project47 from multiple sources and grouped them into 12 domains. Air pollution exposures came from the annual average air quality of each observation station from the Finnish Meteorological Institute. Domains of building, blue and green spaces, population density, and a part of geocoordinates were from Equal-Life enrichment. Their description can be found in a previous study48 and is presented in Supplementary Note 1. Exposures from prenatal exposures, passive smoking, family and parents, friend and romantic relationships, school and teachers, and stressful life events domains were from FinnTwin12 questionnaires by self-report or parent report and are described in a published review45. Social indicators were from Statistics Finland and are described in Supplementary Note 1. Except for FinnTwin12 questionnaires, exposures from other sources were linked to individual twins via EUREF-FIN geocoordinates. The full residential history of the twins from birth onward until 2020 was obtained as geocoordinates and dates of moving in and out of specific addresses from the Digital and Population Data Services Agency in Finland34. The types of exposures are continuous, binary, and categorical. Considering the temporality, we included repeated exposures for the critical-period risk model, and Extended Data Fig. 6 presents the timeline of the study. There are three exposure inclusion criteria: (1) twins have available residential history, (2) twins and their family completed at least one questionnaire at any wave, and (3) the percentage of missing values is less than 20% in ExWAS. The code names of each exposure were developed from the description as closely as possible, and their domains, resources, and dates are presented in Supplementary Table 1. The missing patterns of each exposure in each ExWASes are presented in Supplementary Table 7
For analysis of outcomes in young adulthood, we a priori identified seven covariates: sex (male, female), zygosity (monozygotic (MZ), DZ, unknown), parental education (limited, intermediate, high)49, smoking (never, former, occasional, current), work status (full-time, part-time, irregular, not working), secondary-level school (vocational, senior high school, none), and age. The latter four variables were reported by twins as young adults (wave 4). For analysis of outcome at age 17, sex, zygosity, parental education, smoking (reported at age 17) remained. Study and working status (neither study nor work, only study, only work) were included when most participants were in school at age 17 (wave 3). The inclusion of covariates, besides sex, zygosity, and age, was based on the previous literature, which shows correlations with the environment and depressive symptoms50,51,52. Parental education was adjusted for to represent the family resources and resilience 49.
Data pre-processing and descriptive statistics
Participants missing information on outcome or covariates were excluded from the corresponding age’s analyses. Due to the skewness of the GBI score, we added one to the GBI score and log-transformed it. Appropriate regrouping was conducted for categorical exposures, and then we used multivariate imputation by chained equation to replace the missing values of exposures. As a dimension reduction technique, PCA was utilized to measure the proportion of total variability of all included exposures attributed to each PC and visually assess the potential clusters of exposures (correlated) on the basis of the two-dimensional coordinate with the first and second components. It was conducted only for outcomes of GBI at age 17 and in young adulthood, not for the incidence of MDD.
Exposome-wide association study
To conduct the ExWAS, a generalized linear regression model with Gaussian distribution (essentially linear regression) for the outcomes of log-transformed GBI score was repeatedly performed for each exposure. We used Bonferroni correction by the number of effective tests (calculated by PCA) to adjust for multiple testing and account for correlation between exposures53. Covariates were adjusted and the cluster effect of sampling based on families of twin pairs was controlled for by the robust standard error. For the outcome of the incidence of MDD, the distribution was switched to be binomial. The number of included exposures of secondary outcomes was smaller due to the third exposure inclusion criteria, and the sample size varied; thus, the P-value thresholds varied. Due to categorical exposures, the number of P values was higher than the number of exposures. We used the rexposome package in the R environment (version 4.2.3) 54.
We further calculated power using the R package WebPower (R environment, version 4.2.3) for ExWASes of the log-transformed GBI score at both age points. These calculations were based on the smallest absolute effect size among significant results (0.12 in young adulthood and 0.10 at age 17), sample size (3,025 in young adulthood and 4,127 at age 17), number of predictor variables in a single model (8 in young adulthood and 6 at age 17), and significant thresholds (3.09 × 10−4 in young adulthood and 3.63 × 10−4 at age 17). The powers were 1 for ExWASes both in young adulthood and at age 17, indicating adequate sample sizes in this study.
Generating exposome score
Based on the significant exposures selected from the ExWAS, CFA was used to estimate an exposome score, preparing for the following twin modeling. According to the concept of the environment’s totality, we indicated a one-factor structure for the exposome. The CFA assumes the correlation between exposures due to the exposome score and verifies it based on structural equation modeling as theory driven. We used maximum likelihood to estimate the score and standardized root mean square residual to evaluate the model fit55. The cluster effect was controlled like before. Due to multiple subgroups in categorical exposures, we included the whole exposure variable when there was at least one subgroup that was significant compared with the reference in ExWAS. The coefficients of significant exposures were presented in Supplementary Tables 8 and 9 for outcomes of GBI in young adulthood and at age 17, respectively. In addition, we conducted exploratory factor analysis (EFA) estimated by maximum likelihood with 100 optimizations, whereas a large number of retained factors indicated potential overfitting of EFA. The CFA and EFA were performed using Stata 18.0 (StataCorp), and package sem was used.
Twin modeling
In twin modeling, the genetic effect is usually divided into additive and dominant genetic effects56. Since MZ twins are roughly genetically identical and DZ twins share roughly half of their segregating genes, the correlation of A is set to 1.0 and 0.5 and of D is set to 1.0 and 0.25 within MZ and DZ twin pairs, respectively. The epistatic effect is a part of A. The environmental effect is also divided into two components: common environment, whose correlation is assumed to be 1.0 regardless of zygosity, and unique environment (no correlation), which includes unmeasured errors. The use of the twin model assumes the absence of assortative mating for the trait under study among the parents and equal effects of the environment by zygosity.
The intrapair correlations of GBI in DZ (ρ = 0.22 in young adulthood and 0.16 at age 17) and MZ (ρ = 0.52 in young adulthood and 0.51 at age 17) indicated to use an ADE model initially, instead of the ACE model (ρMZ > 2ρDZ). Due to using only the twin pair design, instead of the extended family design, we could not use an ACDE model. The saturated twin model was performed to test the assumptions of equal means and variances for twin order and for zygosity, via constraint means and variances, and to detect the sex difference via sex limitation. In the saturated model (Supplementary Table 10), the Akaike information criterion and likelihood ratio test between models suggested that the assumptions were basically met. Results of the sex-limitation saturated model (Supplementary Table 10) indicated a notable sex difference.
Finally, to assess how the current exposome score explains the variance of depressive symptoms, we employed the bivariate Cholesky AE model to fit the exposome score and log-transformed GBI score (Extended Data Fig. 7) at both age points, which efficiently decomposes the phenotypic correlation and offers the attribution (%) to genetic and environmental factors57. Two latent factors (Aexposome and Eexposome) influence both the exposome score (a11 and e11) and log-transformed GBI score (a21 and e21), and another two latent factors (AGBI and EGBI) influence only the log-transformed GBI score (a22 and e22). The overall correlation between the exposome score and GBI could be calculated as \({a}_{11}\times {a}_{12}+{e}_{11}\times {e}_{12}\). Variances of Aexposome, Eexposome, AGBI, and EGBI were calculated as \({a}_{11}^{2}+{a}_{12}^{2}\), \({e}_{11}^{2}+{e}_{12}^{2}\), \({a}_{22}^{2}\), and \({e}_{22}^{2}\), respectively. We also re-assess the sex difference via an additional sex-limited saturated bivariate twin model.
Only full MZ and DZ twin pairs were included in the twin modeling. We dropped the opposite-sex DZ pairs and stratified the univariate and bivariate twin models by sex. The characteristics of included and excluded individual twins in the twin modeling are presented in Supplementary Table 11, and we did not observe a large difference, suggesting low selection bias risk due to sex, zygosity, and twin pair. Age, reported in the young adulthood survey, was adjusted in univariate and bivariate models for the outcome in young adulthood. We used the OpenMx package in the R environment (version 4.2.3) 58.
Post hoc mixed models for repeated measures
On the basis of the exposures significantly associated with GBI at both time points, we performed the mixed models for repeated measures (MMRM) as a post hoc analysis to further explore the effects on the trajectory of depressive symptoms. This method analyzes the influence on the log-transformed GBI in young adulthood by both exposures of interest (fixed effect) and ‘baseline’ log-transformed GBI at age 17 (random effect)59. The sample size and covariates of the MMRM were the same as in the ExWAS of log-transformed GBI score in young adulthood. The cluster effect was controlled by the robust standard error. The multiple testing was controlled by the false discovery rate (Q value < 0.05 was considered statistically significant). These post hoc analyses were performed using Stata 18.0 (StataCorp).
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
The FinnTwin12 data are not publicly available due to the restrictions of informed consent. However, the FinnTwin12 data are available through the Institute for Molecular Medicine Finland (FIMM) Data Access Committee (DAC) (fimm-dac@helsinki.fi) for authorized researchers who have IRB/ethics approval and an institutionally approved study plan. To ensure the protection of privacy and compliance with national data protection legislation, a data use/transfer agreement is needed, the content and specific clauses of which will depend on the nature of the requested data. Requests will be addressed in a reasonable time frame (generally two to three weeks), and the primary mode of data access is by either personal visit or remote access to a secure server.
Code availability
No new software, package, or algorithm was developed. All code for data cleaning and analysis associated with the current submission is available upon reasonable request to the corresponding author or Z.W. (zhiyang.wang@helsinki.fi).
References
GBD 2019 Mental Disorders Collaborators. Global, regional, and national burden of 12 mental disorders in 204 countries and territories, 1990–2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet Psychiatry 9, 137–150 (2022).
Shorey, S., Ng, E. D. & Wong, C. H. J. Global prevalence of depression and elevated depressive symptoms among adolescents: a systematic review and meta-analysis. Br. J. Clin. Psychol. 61, 287–305 (2022).
Köhler, C. A. et al. Mapping risk factors for depression across the lifespan: an umbrella review of evidence from meta-analyses and Mendelian randomization studies. J. Psychiatr. Res. 103, 189–207 (2018).
Mair, C., Roux, A. V. D. & Galea, S. Are neighbourhood characteristics associated with depressive symptoms? A review of evidence. J. Epidemiol. Community Health 62, 940 LP–940946 (2008).
Wild, C. P. Complementing the genome with an ‘exposome’: the outstanding challenge of environmental exposure measurement in molecular epidemiology. Cancer Epidemiol. Biomarkers Prev. 14, 1847–1850 (2005).
Guloksuz, S., van Os, J. & Rutten, B. P. F. The exposome paradigm and the complexities of environmental research in psychiatry. JAMA Psychiatry 75, 985–986 (2018).
Zheng, Y. et al. Design and methodology challenges of environment-wide association studies: a systematic review. Environ. Res. 183, 109275 (2020).
Lin, B. D. et al. Nongenetic factors associated with psychotic experiences among UK biobank participants: exposome-wide analysis and Mendelian randomization analysis. JAMA Psychiatry 79, 857–868 (2022).
Ni, M. Y. et al. Determinants of physical, mental and social well-being: a longitudinal environment-wide association study. Int. J. Epidemiol. 49, 380–389 (2020).
van de Weijer, M. P. et al. Expanding the environmental scope: an environment-wide association study for mental well-being. J. Expo. Sci. Environ. Epidemiol. 32, 195–204 (2022).
Choi, K. W. et al. An exposure-wide and Mendelian randomization approach to identifying modifiable factors for the prevention of depression. Am. J. Psychiatry 177, 944–954 (2020).
Leffers, H. C. B., Lange, T., Collins, C., Ulff-Møller, C. J. & Jacobsen, S. The study of interactions between genome and exposome in the development of systemic lupus erythematosus. Autoimmun. Rev. 18, 382–392 (2019).
Wichers, M. et al. Mechanisms of gene–environment interactions in depression: evidence that genes potentiate multiple sources of adversity. Psychol. Med. 39, 1077–1086 (2009).
Medda, E. et al. The response to oxidative stress and metallomics analysis in a twin study: the role of the environment. Free Radic. Biol. Med. 97, 236–243 (2016).
Yu, Y. et al. The role of family environment in depressive symptoms among university students: a large sample survey in China. PLoS ONE 10, e0143612 (2015).
Stavropoulos, V., Lazaratou, H., Marini, E. & Dikeos, D. Low family satisfaction and depression in adolescence: the role of self-esteem. J. Educ. Dev. Psychol. 5, 109–118 (2015).
Wang, L. & Crane, D. R. The relationship between marital satisfaction, marital stability, nuclear family triangulation, and childhood depression. Am. J. Fam. Ther. 29, 337–347 (2001).
Hooper, L. M. & Wallace, S. A. Evaluating the parentification questionnaire: psychometric properties and psychopathology correlates. Contemp. Fam. Ther. 32, 52–68 (2010).
Cappell, C. & Heiner, R. B. The intergenerational transmission of family aggression. J. Fam. Violence 5, 135–152 (1990).
Powdthavee, N. & Vignoles, A. Mental health of parents and life satisfaction of children: a within-family analysis of intergenerational transmission of well-being. Soc. Indic. Res. 88, 397–422 (2008).
Ottman, R. Gene–environment interaction: definitions and study design. Prev. Med. 25, 764–770 (1996).
Carreras-Gallo, N. et al. The early-life exposome modulates the effect of polymorphic inversions on DNA methylation. Commun. Biol. 5, 455 (2022).
Tamimy, Z. et al. Multilevel twin models: geographical region as a third level variable. Behav. Genet. 51, 319–330 (2021).
Abdellaoui, A. et al. Population structure, migration, and diversifying selection in the Netherlands. Eur. J. Hum. Genet. 21, 1277–1285 (2013).
Kerminen, S. et al. Fine-scale genetic structure in Finland. G3 7, 3459–3468 (2017).
Ohanyan, H. et al. Machine learning approaches to characterize the obesogenic urban exposome. Environ. Int. 158, 107015 (2022).
Ohanyan, H. et al. Associations between the urban exposome and type 2 diabetes: results from penalised regression by least absolute shrinkage and selection operator and random forest models. Environ. Int. 170, 107592 (2022).
Places and Spaces: Environments and Children’s Well-Being (UNICEF IRC, 2022).
Spotts, E. L. et al. Accounting for depressive symptoms in women: a twin study of associations with interpersonal relationships. J. Affect. Disord. 82, 101–111 (2004).
Reichborn-Kjennerud, T. et al. Major depression and dimensional representations of DSM-IV personality disorders: a population-based twin study. Psychol. Med. 40, 1475–1484 (2010).
Morrissey, K. & Kinderman, P. The impact of childhood socioeconomic status on depression and anxiety in adult life: testing the accumulation, critical period and social mobility hypotheses. SSM Popul. Health 11, 100576 (2020).
Lee, H. & Estrada-Martínez, L. M. Trajectories of depressive symptoms and neighborhood changes from adolescence to adulthood: latent class growth analysis and multilevel growth curve models. Int. J. Environ. Res. Public Health 17, 1829 (2020).
Scholz, S. et al. The eco-exposome concept: supporting an integrated assessment of mixtures of environmental chemicals. Environ. Toxicol. Chem. 41, 30–45 (2022).
Wang, Z., Whipp, A., Heinonen-Guzejev, M. & Kaprio, J. Age at separation of twin pairs in the FinnTwin12 study. Twin Res. Hum. Genet. https://doi.org/10.1017/thg.2022.17 (2022).
Nilsson, E.-L. Analyzing gender differences in the relationship between family influences and adolescent offending among boys and girls. Child Indic. Res. 10, 1079–1094 (2017).
Mensah, F. K. & Kiernan, K. E. Gender differences in educational attainment: influences of the family environment. Br. Educ. Res. J. 36, 239–260 (2010).
Blaauboer, M. & Mulder, C. H. Gender differences in the impact of family background on leaving the parental home. J. Hous. Built Environ. 25, 53–71 (2010).
Sikes-Keilp, C. & Rubinow, D. R. In search of sex-related mediators of affective illness. Biol. Sex Differ. 12, 55 (2021).
Rebuli, M. E. & Patisaul, H. B. Assessment of sex specific endocrine disrupting effects in the prenatal and pre-pubertal rodent brain. J. Steroid Biochem. Mol. Biol. 160, 148–159 (2016).
Maria, F. et al. Association of long-term exposure to traffic-related air pollution with blood pressure and hypertension in an adult population–based cohort in Spain (the REGICOR study). Environ. Health Perspect. 122, 404–411 (2014).
Chung, M. K., Buck Louis, G. M., Kannan, K. & Patel, C. J. Exposome-wide association study of semen quality: systematic discovery of endocrine disrupting chemical biomarkers in fertility require large sample sizes. Environ. Int. 125, 505–514 (2019).
Whipp, A. M. et al. Early adolescent aggression predicts antisocial personality disorder in young adults: a population-based study. Eur. Child Adolesc. Psychiatry 28, 341–350 (2019).
Bucholz, K. K. et al. A new, semi-structured psychiatric interview for use in genetic linkage studies: a report on the reliability of the SSAGA. J. Stud. Alcohol 55, 149–158 (1994).
Bell, C. C. DSM-IV: Diagnostic and Statistical Manual of Mental Disorders. JAMA 272, 828–829 (1994).
Rose, R. J. et al. FinnTwin12 cohort: an updated review. Twin Res. Hum. Genet. 22, 302–311 (2019).
Kokko, K. & Pulkkinen, L. Unemployment and psychological distress: mediator effects. J. Adult Dev. 5, 205–217 (1998).
van Kamp, I. et al. Early environmental quality and life-course mental health effects: the Equal-Life project. Environ. Epidemiol. 6, e183 (2022).
Julvez, J. et al. Early life multiple exposures and child cognitive function: a multi-centric birth cohort study in six European countries. Environ. Pollut. 284, 117404 (2021).
Huppertz, C. et al. The effects of parental education on exercise behavior in childhood and youth: a study in Dutch and Finnish twins. Scand. J. Med. Sci. Sports 27, 1143–1156 (2017).
Salmela-Aro, K. et al. Depressive symptoms and career-related goal appraisals: genetic and environmental correlations and interactions. Twin Res. Hum. Genet. 17, 236–243 (2014).
Edelbrock, C., Rende, R., Plomin, R. & Thompson, L. A. A twin study of competence and problem behavior in childhood and early adolescence. J. Child Psychol. Psychiatry 36, 775–785 (1995).
Rose, R. J. & Kaprio, J. Genes, environments, and adolescent substance use: retrospect and prospect from the FinnTwin studies. Acta Psychol. Sinica 40, 1062–1072 (2008).
Li, M.-X., Yeung, J. M. Y., Cherny, S. S. & Sham, P. C. Evaluating the effective numbers of independent tests and significant P-value thresholds in commercial genotyping arrays and public imputation reference datasets. Hum. Genet. 131, 747–756 (2012).
Hernandez-Ferrer, C. et al. Comprehensive study of the exposome and omic data using rexposome bioconductor packages. Bioinformatics 35, 5344–5345 (2019).
Kline, R. B. Principles and Practice of Structural Equation Modeling 4th edn (Guilford Publications, 2015).
Polderman, T. J. C. et al. Meta-analysis of the heritability of human traits based on fifty years of twin studies. Nat. Genet. 47, 702–709 (2015).
Røysamb, E. & Tambs, K. The beauty, logic and limitations of twin studies. Nor. Epidemiol. https://doi.org/10.5324/nje.v26i1-2.2014 (2016).
Neale, M. C. et al. OpenMx 2.0: extended structural equation and statistical modeling. Psychometrika 81, 535–549 (2016).
Detry, M. A. & Ma, Y. Analyzing repeated measurements using mixed models. JAMA 315, 407–408 (2016).
Acknowledgments
We acknowledge A. Pia and H. Heidi from the Finnish Meteorological Institute for providing the annual summary of air quality data in Finland. We appreciate A. Ambros and B. Raimbault from ISGlobal for the calculation of the exposome for this study. We thank X. Escribà-Montagut from ISGlobal for technical support on the R package rexposome. FinnTwin12 thanks all participating twins, their parents, and teachers. This research was partly funded by the European Union’s Horizon 2020 research and innovation program under grant agreement no. 874724 (Equal-Life). Equal-Life is part of the European Human Exposome Network. Data collection in FinnTwin12 has been supported by the National Institute on Alcohol Abuse and Alcoholism (grants AA-12502, AA-00145, and AA-09203 to R. J. Rose) and the Academy of Finland (grants 100499, 205585, 118555, 141054, 264146, 308248, 312073, 336823, and 1352792 to J.K.). J.K. acknowledges support by the Academy of Finland (grants 265240, 263278).
Funding
Open Access funding provided by University of Helsinki including Helsinki University Central Hospital.
Author information
Authors and Affiliations
Contributions
All authors conceived and conceptualized the research idea and developed the analysis plan. Z.W., M.F., J.J., and J.K. acquired the data. Z.W. and S.Z. conducted data analysis. Z.W. drafted the original draft of the paper. S.Z., A.M.W., M.H.-G., M.F., J.J., I.v.K., and J.K. reviewed the paper and provided critical comments. I.v.K. and J.K. gave critical supervision and guidance. All authors approved the paper.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Mental Health thanks Ran Barzilay, Sinan Guloksuz, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Principal component analysis for exposures.
A, In young adulthood (individual twin n = 3025). B, At age 17 (individual twin n = 1236).
Extended Data Fig. 2 Association results between exposure and incidence of MDD, adjusted for covariates (individual twin n = 1236), using generalized binomial regression.
The adjusted covariates were: sex, zygosity, parental education, smoking in young adulthood, work status in young adulthood, secondary level school in young adulthood, and age when twins provided the GBI assessment in young adulthood.
Extended Data Fig. 3 Association results between exposure and log-transformed GBI score at age 17, adjusted for covariates (individual twin n = 4127), using generalized linear regressiona.
Panel A is a Manhattan association plot for exposures in relation to log-transformed GBI score at age 17. The y-axis is showing statistical significance as –log10(P value) for the adjustment for multiple testing. Panel B presents the adjusted beta for significant exposures in descending order from top to bottom (from harmful to protective). In panel B, the center dot and bar present the effect size (coefficient of linear regression) and 95% confidence interval, and the size of the dots presents the effect size relatively. The color legend applies to both Panel A (Manhattan association plot) and B (forest plot). The adjusted covariates were: sex, zygosity, parental education, smoking at age 17, and study and working status at age 17.
Extended Data Fig. 4 Bivariate Cholesky AE model for the exposome score and log-transformed GBI score at age 17 (twin pair n = 1000).
A stands for standardized variance of additive genetic effect. E stands for standardized variance of unique environmental effect. MZ and DZ stand for monozygotic and dizygotic twin pairs, respectively. The a and e stand for pathway coefficients from A and E, respectively, to both the exposome score and log-transformed GBI score. The 95% confidence intervals of standardized variances and pathway coefficients are presented in Extended Supplementary Table 4.
Extended Data Fig. 5 Flowchart of general FinnTwin12 cohort.
The cohort is composed of the overall epidemiological study with one baseline and three general follow-ups and the intensive study with two detailed interviews.
Extended Data Fig. 6 Calendar timeline of included exposures and outcomes.
The information started to be recorded in 1983 during the pregnancy of twins’ mothers and until 2015.
Extended Data Fig. 7 Diagram of bivariate Cholesky AE decomposition model (r: correlation).
Note: circles denote latent factors of additive genetic (A) and unique environmental (E) components, while rectangles denote measured/calculated variables.
Supplementary information
Supplementary Information
Supplementary Tables 1–11 and Note 1.
Supplementary Table 1
Exposures’ description and summary statistics based on twins included in ExWAS of GBI in young adulthood (before imputation).
Supplementary Table 2
Results of ExWAS for each outcome.
Supplementary Table 3
Univariate twin models of GBI score in both sexes.
Supplementary Table 4
Sex-limitation saturated bivariate model fit comparisons.
Supplementary Table 5
Bivariate Cholesky AE models for the exposome score and log-transformed GBI score in both sexes.
Supplementary Table 6
Association of selected exposures with log-transformed GBI score in young adulthood using MMRM.
Supplementary Table 7
Number of participants with missing data in ExWAS for each outcome.
Supplementary Table 8
Results of CFA for exposome score in young adulthood.
Supplementary Table 9
Results of CFA for exposome score at age 17.
Supplementary Table 10
Saturated univariate model fit comparisons for equal means and variances assumption and sex difference.
Supplementary Table 11
The characteristics of included and excluded individual twins in twin modeling.
Supplementary Note 1
Additional description of exposures.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Wang, Z., Zellers, S., Whipp, A.M. et al. The effect of environment on depressive symptoms in late adolescence and early adulthood: an exposome-wide association study and twin modeling. Nat. Mental Health 1, 751–760 (2023). https://doi.org/10.1038/s44220-023-00124-x
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1038/s44220-023-00124-x