Main

The rising prevalence of major depressive disorder (MDD) is imposing a substantial burden on public health worldwide1,2,3. The factors affecting the onset of depression are complicated and involve both behavioral and biological factors. Therefore, to prevent the onset of depression, comprehensive strategies are needed to explore the risk factors and mechanisms so as to explain all aspects of the pathophysiology of depression. A recent meta-review assessed the evidence regarding how multiple lifestyle factors impact the risk and treatment outcomes across a range of mental health disorders, including depression, and discussed the potentially shared neurobiological pathways4. In contrast to previous studies, which focused on investigating a specific aspect of lifestyle in the prevention of depression and its associated genetic and neurobiological mechanisms5,6,7,8,9,10, we incorporate a wide range of lifestyle factors, including alcohol consumption, smoking habits, physical activity, diet, circadian habits, sedentary behavior and social connection. Research has revealed that, when exercising, muscles secrete myokines, which contribute to the regulation of hippocampal function, improving mood symptoms11. It has also been found that overeating and a sedentary lifestyle may increase the risk of depression by suppressing adaptive cellular stress responses12. Circadian disruption is associated with an increased risk of depression resulting from insufficient sleep and the disturbance of melatonin13. Smoking and alcohol dependence are associated with lesions in brain circuits14,15, and reduced social connection is known to impact metabolic and brain health status, which is associated with depression16. By integrating a number of lifestyle factors into a comprehensive score and systematically revealing its relationship with incident depression and the underlying mechanisms, we aim to provide an effective strategy for reducing the risk of depression.

Compelling evidence has indicated that the genetic architecture of psychiatric risk is complex and is dominated by multiple contributing factors17, so we also aim to investigate the combined effect of a stratified polygenic risk score (PRS) and lifestyle on the risk of depression. To understand the neurobiological mechanisms by which genetic variation increases risk, the previous literature has treated the brain as an intermediate phenotype affected by risk gene variants, so as to elucidate the mechanistic aspects of brain function implicated in psychiatric disease17. Genetic studies have also revealed the contributions of genetic variation in innate immune mechanisms to disease18. Therefore, we further investigate the intermediate neurobiological processes influenced by genetic variation that underlie the association between lifestyle and depression.

We hypothesized that adherence to a healthy lifestyle would contribute to reducing the risk of depression across a population with different genetic profiles. We also hypothesized that there would be multiple shared neurobiological mechanisms modulated by genetic variants underlying the association of lifestyle with depression. Therefore, our study aims to examine the composite lifestyle risk for depression and its underlying neurobiological mechanisms. We made use of the UK Biobank (UKB), a large prospective cohort, with information on behavioral, brain imaging, biochemistry and genetic measures. The present study has three objectives. The first is to investigate the impact of a composite lifestyle score on depression risk and the combined effect of polygenic risk and lifestyle on depression risk. Second, we aim to perform a correlation analysis to estimate the relationships of lifestyle with brain structural imaging, blood cells, biochemistry and metabolic markers to unravel how lifestyle factors regulate neurobiological processes. Because brain morphology alteration and disruption of immunometabolic systems have been linked to depression in previous studies, our third aim is to construct a structural equation model (SEM) to elucidate the relationships between lifestyle, PRS, brain structure, immunometabolic function and depression.

Results

Population characteristics

We utilized UKB data from a total of 502,409 participants to perform survival analysis and correlation analysis and to construct a structural equation model. For survival analyses we retrieved 287,282 individuals (mean age 57.52 years, 50.70% female) who had participated in the assessment of lifestyle factors at baseline (2006–2010) and had linkage to clinical diagnosis records. The numbers of participants falling into healthy or unhealthy lifestyles for each factor are listed in Table 1. The mean score for the seven lifestyles is 4.75 (s.d. = 1.36). Among the participants, 1.25% were categorized as following an unfavorable lifestyle (scores in the range 0–1), 38.90% followed an intermediate lifestyle (scores of 2–4), and 59.85% followed a favorable lifestyle (scores of 5–7). Of the 287,282 participants, 12,916 had an onset of depression during a median follow-up of 9.01 years. The demographic characteristics of the participants and the distribution of lifestyle factors are provided in Table 1. The PRSs for depression from 197,344 participants were normally distributed (Supplementary Fig. 5) and categorized into three levels, low (25.09%), intermediate (49.97%) and high (24.94%). Brain structural imaging data were collected in 2014 and contained T1-weighted structural magnetic resonance imaging (MRI) results from 32,839 participants, which were used in the correlation analysis. Blood chemistry and cell count indicators from ~480,000 participants and NMR metabolic biomarkers from ~120,000 participants were collected at baseline. The sample size of each biomarker used in the correlation analyses is presented in Supplementary Tables 18 and 19. We also used data from 448,849 participants who had completed a Patient Health Questionnaire-4 (PHQ-4, collected from 2006–2010) score. 
Combining the data of lifestyle score, PRS, immunometabolic markers, brain structural imaging and depression, the final analysis included 18,244 participants for estimation of the structural equation model. Extended Data Fig. 1 shows the research guidelines for the study. Supplementary Fig. 1 and Supplementary Table 4 present the sample size utilized in each analysis.
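The scoring and classification described above can be sketched in a few lines. This is a minimal illustration, assuming that each of the seven factors is coded as a binary indicator (1 = healthy, 0 = unhealthy) and that the score is their unweighted sum; the factor key names are hypothetical and are not taken from the UKB data dictionary.

```python
# Hypothetical key names for the seven lifestyle factors (illustrative only).
FACTORS = (
    "alcohol", "diet", "physical_activity", "smoking",
    "sleep", "sedentary_behavior", "social_connection",
)


def lifestyle_score(indicators):
    """Sum seven binary indicators (1 = healthy, 0 = unhealthy)."""
    for name in FACTORS:
        if indicators[name] not in (0, 1):
            raise ValueError(f"{name} must be coded 0 or 1")
    return sum(indicators[name] for name in FACTORS)


def lifestyle_class(score):
    """Map a 0-7 score to the three classes used in the text."""
    if not 0 <= score <= 7:
        raise ValueError("score must lie between 0 and 7")
    if score <= 1:
        return "unfavorable"   # scores 0-1
    if score <= 4:
        return "intermediate"  # scores 2-4
    return "favorable"         # scores 5-7


# A made-up participant who meets five of the seven healthy criteria.
example = dict.fromkeys(FACTORS, 1)
example["sleep"] = 0
example["diet"] = 0
print(lifestyle_score(example), lifestyle_class(lifestyle_score(example)))
```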

Table 1 Baseline characteristics of participants

Survival analyses on the association between lifestyle and depression

Multivariable Cox proportional hazard regression models were used to examine the association between lifestyle and depression. In terms of lifestyle factors, each showed a significant association with depression independently. For each factor, the healthy category showed a lower risk of depression as compared with the unhealthy category. Specifically, the results showed that moderate alcohol consumption decreased the risk of depression by 11% (hazard ratio (HR) of 0.89 (95% CI, 0.85–0.92)), healthy diet decreased the risk of depression by 6% (0.94, 0.90–0.97), regular physical activity decreased the risk of depression by 14% (0.86, 0.83–0.90), never smoking decreased the risk of depression by 20% (0.80, 0.78–0.83), healthy sleep decreased the risk of depression by 22% (0.78, 0.75–0.81), low-to-moderate sedentary behavior decreased the risk of depression by 13% (0.87, 0.84–0.90), and frequent social connection decreased the risk of depression by 18% (0.82, 0.78–0.86). For the lifestyle class, compared with the unfavorable lifestyle, intermediate and favorable lifestyle were associated with 41% (0.59, 0.53–0.65) and 57% (0.43, 0.38–0.47) lower risk of depression, respectively. When all factors were combined, the HRs for depression for participants with a lifestyle score of 1, 2, 3, 4, 5, 6 and 7 compared to those with a score of 0 were 0.83 (95% CI, 0.62–1.11), 0.61 (0.46–0.80), 0.53 (0.40–0.70), 0.45 (0.34–0.59), 0.39 (0.30–0.52), 0.33 (0.25–0.43) and 0.28 (0.21–0.37), respectively (Fig. 1). When treated as quantitative variable, each one-point increment in healthy lifestyle score was associated with an HR of 0.85 (95% CI, 0.84–0.86) (Supplementary Table 5).
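The percentage reductions quoted above follow directly from the hazard ratios as (1 − HR) × 100. The sketch below reproduces that arithmetic for the reported values and, as an illustration only, shows what the per-point HR of 0.85 would imply for a seven-point difference if the effect were exactly log-linear in the score; that log-linearity is an assumption of the quantitative model, not something the category-specific estimates are constrained to follow.

```python
def percent_risk_reduction(hr):
    """Convert a hazard ratio below 1 into a rounded percentage reduction."""
    return round((1.0 - hr) * 100)


# Hazard ratios reported in the text (healthy versus unhealthy category).
reported_hr = {
    "moderate alcohol consumption": 0.89,
    "healthy diet": 0.94,
    "regular physical activity": 0.86,
    "never smoking": 0.80,
    "healthy sleep": 0.78,
    "low-to-moderate sedentary behavior": 0.87,
    "frequent social connection": 0.82,
}

for factor, hr in reported_hr.items():
    print(f"{factor}: {percent_risk_reduction(hr)}% lower risk")

# Under a strictly log-linear model, a k-point difference multiplies the
# hazard by HR_per_point ** k. For k = 7 this can be compared with the
# category-specific HR of 0.28 reported for a score of 7 versus 0.
per_point_hr = 0.85
implied_hr_7_points = per_point_hr ** 7
print(f"implied HR for a 7-point difference: {implied_hr_7_points:.2f}")
```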

Fig. 1: The association of lifestyle factor, class and score with depression risk.

We utilized participants (n = 287,282) from UKB with completed lifestyle factor assessments in the survival analyses. The upper panel presents the association of lifestyle factors with depression. The bottom panel presents the association of a per-one-point increment in healthy lifestyle score with depression. The middle panel presents the association between each class of lifestyle and depression, with each lifestyle score reclassified into three classes (favorable scoring 5 to 7, intermediate scoring 2 to 4 and unfavorable scoring 0 to 1). All these models were adjusted for age, sex, Townsend deprivation index, BMI and education level. Two-sided unadjusted association P values from multivariate Cox models are given. Data are presented as mean values ± s.e.m. The widths of the lines extending from the center points represent 95% CI.

Combining effect of genetic risk and lifestyle on depression

Risk of depression was reduced monotonically across PRS classes, and the HR values for depression in participants with intermediate and low PRS compared to those with high PRS were 0.87 (95% CI, 0.83–0.91) and 0.75 (95% CI, 0.71–0.80), respectively (Supplementary Table 8). We also found a monotonically decreasing trend of depression risk across increasingly favorable lifestyle classes when controlling for the genetic risk group (Supplementary Table 10). When combining genetic risk and lifestyle classes, with participants with high genetic risk and unfavorable lifestyle as the reference group, participants with low genetic risk and favorable lifestyle showed the greatest reduction in depression risk (HR, 0.36 (95% CI, 0.28–0.46); Fig. 2). However, there was no significant interaction between PRS and lifestyle score in their association with risk of depression (P for interaction = 0.41), indicating that the effect of lifestyle may not be substantially altered by the genetic risk for depression, and that lifestyle has a strong protective role across populations with different levels of polygenic risk. By further performing comparisons among the eight non-reference groups using permutation tests, we found that participants with high genetic risk but favorable lifestyle had a lower risk of depression than those with intermediate or low genetic risk but unfavorable lifestyle (intermediate genetic risk, P = 3.0 × 10−4; low genetic risk, P = 5.8 × 10−3) and those with intermediate genetic risk and intermediate lifestyle (P = 1.1 × 10−2) (Supplementary Table 9). Participants with intermediate genetic risk and favorable lifestyle also showed lower risk of depression than those with low genetic risk and adherence to unfavorable and intermediate lifestyles (unfavorable, P = 2.9 × 10−3; intermediate, P = 2.1 × 10−2; Supplementary Table 9). The results also indicate that a favorable lifestyle may mitigate the genetic vulnerability to depression in populations with different levels of polygenic risk and safeguard the high-risk population against depression.
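The grouping used in this analysis can be sketched as follows. The reported proportions (about 25%, 50% and 25%) are consistent with cut points at the lower and upper quartiles of the PRS distribution, but the exact cut points are an assumption here, as are the function names; the PRS values are made up.

```python
import statistics
from itertools import product

GENETIC_CLASSES = ("low", "intermediate", "high")
LIFESTYLE_CLASSES = ("unfavorable", "intermediate", "favorable")
REFERENCE_GROUP = ("high", "unfavorable")  # reference category in the text


def prs_cutpoints(prs_values):
    """Return the lower and upper quartiles of the PRS (assumed cut points)."""
    q1, _median, q3 = statistics.quantiles(prs_values, n=4)
    return q1, q3


def genetic_risk_class(prs, q1, q3):
    """Assign low / intermediate / high genetic risk from quartile cut points."""
    if prs < q1:
        return "low"
    if prs > q3:
        return "high"
    return "intermediate"


# Three genetic classes crossed with three lifestyle classes give nine groups;
# removing the reference leaves the eight groups compared with each other.
all_groups = list(product(GENETIC_CLASSES, LIFESTYLE_CLASSES))
non_reference_groups = [g for g in all_groups if g != REFERENCE_GROUP]

# Made-up standardized PRS values, for illustration only.
toy_prs = [-1.8, -1.1, -0.6, -0.2, 0.0, 0.3, 0.7, 1.2, 1.9]
q1, q3 = prs_cutpoints(toy_prs)
print([genetic_risk_class(v, q1, q3) for v in toy_prs])
print(len(all_groups), len(non_reference_groups))
```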

Fig. 2: Risk of depression according to genetic risk and lifestyle.

We utilized 197,344 participants with both PRS and lifestyle scores in the analyses. The results were adjusted for age, sex, Townsend deprivation index, BMI and education level. Two-sided unadjusted association P values from multivariate Cox models are given. Individuals with high risk for depression and unfavorable lifestyle were used as the reference category. Data are presented as mean values ± s.e.m. The widths of the lines extending from the center points represent 95% CI.

Mendelian randomization evidence for the association between lifestyle and depression

Mendelian randomization (MR) analysis showed a protective causal relationship between lifestyle and depression. By utilizing genetic variants as proxies, the results from the inverse variance weighted (IVW) method found that per one-point increment in lifestyle score was associated with 35% reduced risk of depression (odds ratio (OR), 0.65; 95% CI, 0.55–0.76; P = 1.0 × 10−7; Fig. 3). Model-based sensitivity analyses showed that the estimates were similar in size in the simple median method (0.63, 0.50–0.80; P = 1.8 × 10−4), weighted median method (0.65, 0.51–0.83; P = 4.7 × 10−4) and weighted mode method (0.52, 0.31–0.90; P = 2.6 × 10−2), illustrating the robustness of the causal relationship. Scatter plots of lifestyle and depression risk association for the instruments are presented in Fig. 3a, with colored lines representing the slopes of different regression analyses. MR estimates for the effects of single-nucleotide polymorphisms (SNPs) associated with lifestyle on depression risk are presented in a forest plot in Supplementary Fig. 13. Sensitivity analyses also examined the reverse causality and showed significant results (0.85, 0.78–0.92; P = 5.9 × 10−5) with the IVW method (Supplementary Fig. 14), potentially indicating a bidirectional causal relationship between lifestyle and depression.
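For readers unfamiliar with the IVW estimator, a minimal sketch of its fixed-effect form is given below. It assumes per-SNP summary statistics (SNP–exposure effect, SNP–outcome effect and the standard error of the latter) and first-order weights; the function names and the three toy SNPs are hypothetical, and this is not the software or the 26-SNP instrument set used in the study.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance-weighted MR estimate and standard error.

    Equivalent to a weighted average of the per-SNP ratios
    beta_outcome / beta_exposure with weights beta_exposure**2 / se_outcome**2.
    """
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)
    )
    return numerator / denominator, math.sqrt(1.0 / denominator)


def odds_ratio_with_ci(beta, se, z=1.96):
    """Exponentiate a log-odds estimate into an OR with an approximate 95% CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Toy summary statistics in which every SNP has the same ratio (-0.4),
# so the IVW estimate must equal -0.4 on the log-odds scale.
toy_bx = [0.10, 0.20, 0.05]
toy_by = [-0.04, -0.08, -0.02]
toy_se = [0.010, 0.020, 0.015]

beta_ivw, se_ivw = ivw_estimate(toy_bx, toy_by, toy_se)
or_ivw, ci_low, ci_high = odds_ratio_with_ci(beta_ivw, se_ivw)
print(f"OR = {or_ivw:.2f} (95% CI, {ci_low:.2f}-{ci_high:.2f})")
```

The median- and mode-based methods mentioned above replace the weighted average with more robust summaries of the same per-SNP ratios, which is why agreement across methods is read as a sign of robustness.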

Fig. 3: MR plots for the relationship between lifestyle and depression.

a, Scatter plot of SNP effects on lifestyle versus depression, with the slope of each line corresponding to the estimated MR effect per method. The data are expressed as raw β values with 95% CI values. b, Model-based sensitivity analyses from four methods (n = 26 SNPs). The effect estimates represent the odds ratio for depression per one-point increment in healthy lifestyle score. Data are presented as mean values ± s.e.m. The widths of the lines extending from the center points represent the 95% CI. Two-sided unadjusted association P values from four models are given.

Association of lifestyle with brain volume and peripheral markers

The correlations between lifestyle and brain structures are in line with expectations that higher lifestyle scores are associated with larger brain volumes (for example, in cortical structures such as the superior prefrontal cortex, orbitofrontal cortex, precentral cortex and insula, and subcortical structures including the pallidum, thalamus, amygdala and hippocampus; Fig. 4a). The full results for the correlation between each lifestyle factor and brain structure volumes are listed in Supplementary Table 14. We also found that these brain structures are negatively associated with depression symptoms assessed by PHQ-4 (including the precentral, orbitofrontal and middle temporal cortexes, the thalamus, hippocampus and putamen; Supplementary Fig. 16 and Supplementary Table 16). We used the lifestyle and depression scores assessed at the neuroimaging visit timepoint (2014+) to calculate the correlation with brain volumes and obtained similar results (Supplementary Figs. 15 and 17 and Supplementary Tables 15 and 17). The spatial correlation between lifestyle and depression-associated brain maps at the neuroimaging visit was −0.52 (P = 5.4 × 10−7; Supplementary Fig. 18), which supports the consistency of the relationship of brain structural volumes and lifestyle and depression.
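The spatial correlation reported above compares two brain maps region by region. A minimal sketch, assuming that each map is a vector of per-region association values in the same region order and that a Pearson coefficient is used, is shown below; the four-region toy maps are made up and do not correspond to any reported values.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-region association values (same region order in both maps).
lifestyle_map = [0.04, 0.03, 0.01, 0.02]
depression_map = [-0.03, -0.02, 0.00, -0.02]

# A negative value means that regions more positively associated with the
# lifestyle score tend to be more negatively associated with depression.
print(pearson_r(lifestyle_map, depression_map))
```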

Fig. 4: Association of lifestyle with brain structure and peripheral markers.

a, A total of 32,839 participants were utilized in the correlation analyses between lifestyle and brain structural imaging. Associations between lifestyle scores and brain volumes were adjusted for age, sex, BMI, Townsend deprivation index, education levels, neuroimaging scanning sites and estimated total intracranial volume; the associations shown are those that remained significant after FDR correction (α = 0.05). b, Associations of lifestyle with peripheral markers, adjusted for age, sex, BMI, Townsend deprivation index and education level. As the different biomarkers have different sample sizes, we list the sample size corresponding to each specific biomarker in the full correlation results tables (Supplementary Tables 18 and 19). The height, color and size of each data point indicate the Pearson correlation value (r) between the lifestyle score and one marker. The horizontal dashed line denotes the positive and negative correlation boundary. NS, not significant; HDL, high-density lipoprotein; VLDL, very-low-density lipoprotein.

For the peripheral markers, 48 blood markers and 130 metabolic markers passed the significance threshold after Bonferroni adjustment (Fig. 4b). The r values for each marker are presented in Fig. 4b. We found that C-reactive protein (r = −0.065, P < 1.0 × 10−295) and triglycerides (r = −0.075, P < 1.0 × 10−295) were the most significant of the blood biochemistry markers, and neutrophil count (r = −0.106, P < 1.0 × 10−295) and leukocyte count (r = −0.094, P < 1.0 × 10−295) were the most significant among the blood cells. In terms of metabolic markers, the degree of unsaturation (r = 0.153, P < 1.0 × 10−295) and glycoprotein acetyls (r = −0.109, P = 1.1 × 10−240) were the most significant markers positively and negatively correlated with lifestyle, respectively. Full results are presented in Supplementary Tables 18 and 19.
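The two multiple-testing corrections mentioned in this section can be sketched as follows. The Bonferroni rule compares each P value with α divided by the number of tests. For the FDR correction, the Benjamini–Hochberg step-up procedure is assumed here because the text does not name the specific procedure. The P values are made up.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag tests whose P value is below alpha divided by the number of tests."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


def benjamini_hochberg_significant(p_values, alpha=0.05):
    """Flag tests rejected by the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    largest_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * alpha:
            largest_rank = rank
    # Every test ranked at or below that k is declared significant.
    significant = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= largest_rank:
            significant[index] = True
    return significant


# Hypothetical P values for four markers.
toy_p = [0.001, 0.012, 0.03, 0.2]
print(bonferroni_significant(toy_p))
print(benjamini_hochberg_significant(toy_p))
```

In this toy example the third marker passes the FDR criterion but not the Bonferroni one, which illustrates why Bonferroni is the more conservative of the two.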

Structural equation model

We employed a structural equation model to analyze the relationship between lifestyle, PRS, brain structure, immunometabolic function and depression in 18,244 participants, further revealing the neurobiological mechanisms underlying the association between lifestyle and depression. First, confirmatory factor analyses were used to examine the latent variables in the structural equation model, including depression, brain structure and immunometabolic function. The left superior frontal cortex in cortical volume and left insula in subcortical volume were the most significant markers to predict the latent variable brain structure (both β = 0.58; P < 0.001). C-reactive protein, triglycerides, glycated hemoglobin (HbA1c) and glucose were the four significant markers predicting the latent variable immunometabolic function (β = 0.06, 0.09, 0.86 and 0.55, respectively; P < 0.001). The latent variable depression was represented by the combined four PHQ items (β = 0.81, 0.75, 0.59 and 0.51, respectively; P < 0.001). The loading coefficients for each marker to the corresponding latent variables are shown adjacent to the model panel in Fig. 5.

Fig. 5: Structural equation model.

A total of 18,244 participants were utilized in structural equation model analyses. Standardized coefficients are shown. Lifestyle was a significant predictor of depression (β = −0.157, P < 1.0 × 10−20), immunometabolic function (β = −0.043, P = 4.7 × 10−7) and brain structure (β = 0.038, P = 2.9 × 10−6). PRS (β = 0.036, P = 1.5 × 10−5), brain structure (β = −0.023, P = 1.2 × 10−2) and immunometabolic function (β = 0.020, P = 3.5 × 10−2) were also significant predictors of depression. PRS was also a significant predictor of lifestyle (β = −0.022, P = 3.5 × 10−3) and immunometabolic function (β = 0.018, P = 3.4 × 10−2). All paths represent significant associations except for the association between brain structure and PRS, and the association between brain structure and immunometabolic function. Latent variables including brain structure, depression and immunometabolic function were estimated in the model and are shown adjacent to the model panel, respectively. All significant paths could pass the FDR correction (α = 0.05). *Two-sided unadjusted P < 0.05; **two-sided unadjusted P < 0.01; ***two-sided unadjusted P < 0.001.

Next, we input the prepared latent variables into the structural equation model to estimate the path coefficients. We tested all hypothetical paths in the model (Fig. 5). Lifestyle was a significant predictor of depression (β = −0.157, P < 1.0 × 10−20), immunometabolic function (β = −0.043, P = 4.7 × 10−7) and brain structure (β = 0.038, P = 2.9 × 10−6). PRS (β = 0.036, P = 1.5 × 10−5), brain structure (β = −0.023, P = 1.2 × 10−2) and immunometabolic function (β = 0.020, P = 3.5 × 10−2) were also significant predictors of depression. PRS was a significant predictor of lifestyle (β = −0.022, P = 3.5 × 10−3) and immunometabolic function (β = 0.018, P = 3.4 × 10−2). All paths represented significant associations and passed the false discovery rate (FDR) correction except for the association between brain structure and PRS, and the association between brain structure and immunometabolic function.
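To make the relative sizes of these paths concrete, the standardized coefficients reported above can be combined with the usual product-of-coefficients rule for an indirect path. The sketch below is illustrative arithmetic only: the products and ratios are not reported in the study, no significance test is implied, and the dictionary keys are hypothetical labels.

```python
# Standardized path coefficients as reported in the text: (source, target).
paths = {
    ("lifestyle", "depression"): -0.157,
    ("lifestyle", "immunometabolic"): -0.043,
    ("lifestyle", "brain"): 0.038,
    ("prs", "depression"): 0.036,
    ("brain", "depression"): -0.023,
    ("immunometabolic", "depression"): 0.020,
    ("prs", "lifestyle"): -0.022,
    ("prs", "immunometabolic"): 0.018,
}


def indirect_effect(source, mediator, target):
    """Product of the two path coefficients along source -> mediator -> target."""
    return paths[(source, mediator)] * paths[(mediator, target)]


via_immunometabolic = indirect_effect("lifestyle", "immunometabolic", "depression")
via_brain = indirect_effect("lifestyle", "brain", "depression")
direct = paths[("lifestyle", "depression")]

print(f"direct: {direct:.3f}")
print(f"via immunometabolic function: {via_immunometabolic:.6f}")
print(f"via brain structure: {via_brain:.6f}")

# Ratio of the direct lifestyle -> depression path to every other path,
# using absolute values because the signs differ across paths.
ratios = {
    path: abs(direct) / abs(beta)
    for path, beta in paths.items()
    if path != ("lifestyle", "depression")
}
print(f"ratio range: {min(ratios.values()):.1f} to {max(ratios.values()):.1f}")
```

Both illustrative indirect products carry the same negative sign as the direct path, meaning that a higher lifestyle score is linked to lower depression along each route.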

Sensitivity analyses

The main survival analysis results, including the association of lifestyle factors, classes and scores with depression risk, were also confirmed in the sensitivity analyses across three subtypes of depression, including single depressive episode, recurrent depressive disorder and treatment-resistant depression (TRD). For lifestyle factors, healthy sleep remained the factor associated with the greatest reduction in the risk of a single depressive episode (HR, 0.78 (95% CI, 0.75–0.81)) and TRD (HR, 0.64 (95% CI, 0.58–0.71)) (Supplementary Figs. 6 and 8). However, for recurrent depressive disorder, frequent social connection reduced the risk most (HR, 0.61 (95% CI, 0.51–0.72)) (Supplementary Fig. 7). To further explore the risk of depression according to previous and current smoking status, with never smoking as the reference group, we found a significant monotonically increasing risk of depression in the previous–never smoking comparison and current–never smoking comparison across the three subtypes (Supplementary Table 6). The consistent monotonic reduction of depression risk across lifestyle classes was also found in each subtype of depression (Supplementary Figs. 6–8). In addition, using the favorable lifestyle class, which has a larger sample size, as the reference group also gave similar significant results (Supplementary Table 7). In terms of the combined lifestyle score, each one-point increment in the healthy lifestyle score was associated with an HR of 0.85 (95% CI, 0.84–0.86) for a single depressive episode, 0.82 (95% CI, 0.78–0.87) for recurrent depressive disorder and 0.81 (95% CI, 0.78–0.84) for TRD (Supplementary Table 5). Additionally, similar patterns of depression risk according to genetic risk category were found in the three subtypes of depression (Supplementary Table 8). For the three subtypes of depression, we also confirmed a monotonic decrease in depression risk across increasingly favorable lifestyle classes when the comparison was restricted to the same genetic risk group (Supplementary Tables 11–13). Similarly, the risk of depression in the three subtypes was not altered by the interaction of genetics and lifestyle, indicating that adherence to a healthy lifestyle would attenuate genetic risk across the different PRS groups.

Two possible structural equation models were additionally estimated to account for multiple directionalities among the latent variables. The first model supported the reverse effect of depression on lifestyle (Supplementary Fig. 19) and the second model validated the mediation mechanism of immunometabolic function and brain structure underlying the path from depression to lifestyle (Supplementary Fig. 20). The two additional models further provide evidence for a bidirectional and multifactorial relationship between lifestyle and depression.

Discussion

In the current study we have integrated multimodal data to investigate the association of lifestyle with depression. We identified the protective roles of seven lifestyle factors and the monotonic reduction of depression risk across lifestyle classes and scores. The findings regarding the combined effect of lifestyle and polygenic risk on depression revealed that a favorable lifestyle plays a strongly protective role in the prevention of depression across a population with different levels of polygenic risk. MR analysis added convincing causal evidence of a bidirectional relationship between lifestyle and depression. In addition, we found that differences in brain volume and peripheral biological markers are associated with lifestyle score, which implies that lifestyle is an important phenotype reflecting both brain and physical health status. Furthermore, structural equation modeling was performed to demonstrate the coherent mediating pathway involving genetics, lifestyle, brain structure, immunometabolic function and depression, supporting the existing hypothesis of a multifactorial pathogenesis in depression.

Previous studies identifying the relationship between single lifestyle factors and depression found that lack of social connection19, higher sedentary behavior levels20, failure to engage in physical activities21,22, smoking23,24 and an unhealthy diet25 are associated with an increased risk of depression. Similar to previous studies showing that the relationships between depression and sleep duration26,27 or alcohol consumption are U-shaped28, we also found that light-to-moderate drinking and 7–9-h sleep duration were optimal to reduce the risk of depression. Among all these protective risk factors, having optimal sleep showed the greatest risk reduction for depression, including a single depressive episode and treatment-resistant depression. These findings are consistent with previous evidence showing improvements in depressive symptoms following non-pharmacological insomnia-specific treatment29. As for recurrent depressive disorder, frequent social connection was the most protective factor, indicating the benefits of social activity prescription in helping patients better manage their depressive conditions30. By incorporating multidimensional factors, our study shows the impact of a broad spectrum of lifestyle factors on depression as well as an approximate comparison between the impacts of these separate factors while controlling for their interaction.

To further determine the generalized effect of a composite lifestyle index on depression, we combined the seven lifestyle factors and stratified participants into unfavorable, intermediate and favorable lifestyle classes31,32. In addition, we conducted analyses of combined lifestyle and genetic risk classes, aiming to better compare depression risk for individuals with different lifestyles and polygenic risk profiles. Our findings of an overall monotonic increase in depression risk with increasing genetic risk and increasingly unfavorable lifestyle, together with a non-significant gene-by-lifestyle interaction, are consistent with a previous study33 that revealed that adherence to a healthy lifestyle may lower the risk of depression, regardless of the genetic risk. These results indicate that a healthy lifestyle plays a strong protective role at every level of polygenic risk. They further highlight the importance of adopting a healthy lifestyle to reduce the risk of depression.

The potential mechanism by which lifestyle factors impact depression risk is complex and multifactorial, involving genetic, behavioral and neurobiological aspects. A previous study found that the impacts of unfavorable lifestyle factors on dysregulated pathways are associated with depression, and involve neurotransmitter processes and immuno-inflammatory measures34. Another review has also suggested that the contribution of genetic variation to the mechanistic aspects of brain function is implicated in psychiatric disorders17. Our study employed a structural equation model to investigate the precise mechanisms by which lifestyle, brain structure, immunometabolic function and genetic risk affect depression. We found that the pathway from lifestyle to immunometabolic function is significant. Previous research has associated life stress exposures with indicators of adverse glycemic measures, deterioration of immune function and the accelerated accumulation of age-associated molecular and cellular damage35. Studies have reported that poor physical activity and sleep restriction result in dysregulation of the hypothalamic-pituitary-adrenal (HPA) axis36,37. Dietary habits are related to the levels of specific metabolites present in plasma38,39. Social status, which may affect stress levels, influences the immune system at multiple levels and alters signaling pathways in response to infection40. The influence of immunometabolic factors on depression is noteworthy and consistent with previous investigations. A previous study reported that immunometabolic dysregulation is also critical in the pathophysiology of depression41. Increased inflammation and disrupted energy-regulating neuroendocrine signaling (for example, leptin and insulin) in depressed patients have also been found42, suggesting that a dysfunctional molecular condition in the neuroimmune system is potentially the cause of the exacerbation of depression symptoms.

Altered brain morphology is another plausible mechanism underlying the association of lifestyle with depression. Specifically in our results, higher lifestyle scores are correlated with a larger volume of the orbitofrontal cortex and the medial prefrontal cortex, which might suggest improved cognitive control and emotion regulation43,44. It is noted that the posterior lateral and medial orbitofrontal cortex and medial prefrontal cortex have shown gray-matter volumetric reductions and functional impairment in patients with mood disorders44. For specific aspects of lifestyle, previous research has demonstrated that amygdala neurons play a pivotal role in mediating social behaviors45 and that the acquisition of knowledge through social networks is associated with activation of the lateral prefrontal cortex and medial prefrontal cortex46. Consumption of an unhealthy diet and addiction to smoking may lead to encephalopathy and focal brain damage, which are associated with depression symptoms24,47. Adequate sleep also promotes synaptic plasticity mechanisms in the hippocampus that optimize emotional responses to future behavioral stressors48. Lack of physical activity and excessive consumption of alcohol may lead to oxidative stress, neuroadaptive changes and differences in brain structure and function49,50, which are associated with mental health disorders. The findings described above support our results regarding the lifestyle-related brain structures and validate the modulatory role of brain function in the association between lifestyle and depression. However, in the context of the SEM framework, the association between the brain and immunometabolic function was not significant, although the brain–immune interaction has previously been suggested as a biological basis for major depressive disorder51,52. 
We suppose that the two strongly significant paths linking brain measures and immunometabolic function to depression may attenuate the effect of the brain–immune interaction in the comprehensive mathematical model. Further studies could use experimental and clinical approaches to precisely test the molecular and cellular mechanisms of the brain–immune interaction associated with depression.

In addition, we further considered the bidirectional and multifactorial relationship between lifestyle and depression. The reverse MR result indicated a causal effect of depression on lifestyle, supporting a bidirectional relationship. We also added two possible SEMs to explain, respectively, the reverse effect of depression on lifestyle and the mediating roles of immunometabolic function and brain structure underlying the path from depression to lifestyle. The findings of the two models are aligned with previous evidence indicating that major depressive disorder is linked to an unhealthy lifestyle, characterized by insomnia or hypersomnia, psychomotor agitation or retardation, and fluctuations in appetite53. Additionally, depression is accompanied by changes in the motivation to socialize and by decreased social participation54. Depressed patients also exhibit signs of immune activation, including elevated levels of proinflammatory cytokines, C-reactive protein and cortisol, which may interact with the HPA axis and sympathetic nervous system, leading to a disruption of circadian rhythms55, changes in appetite56, motor abnormalities57 and stress58. Another study utilizing functional MRI revealed that depression-related increases in appetite are associated with hyperactivation of the mesocorticolimbic reward circuitry, and depression-related appetite loss is linked to hypoactivation of insular regions59.

Furthermore, we found that the impact of lifestyle on depression is notably higher than that of any other significant relationship in the SEM, with an effect size three to ten times larger. The effect sizes of our results are similar to those of previous studies33,60 and this could be attributed to a variety of reasons. First, in the SEM, the impact of comprehensive lifestyle on depression is notably higher due to the combined effect of multiple mechanisms. Previous studies have identified that single lifestyle factors may influence the prevalence of depression through diverse genetic and biological pathways, as seen in the case of sleep. Sleep not only impacts the HPA axis, hippocampal oscillations and sympathetic nervous system36, but also changes the levels of proinflammatory cytokines, C-reactive protein and cortisol in the peripheral immune system55. Furthermore, immunometabolic markers in peripheral systems may be modulated by a negative feedback mechanism to ensure the stability of the environment in the body and the maintenance of immune homeostasis61. As for brain structural imaging, the association between brain structure or function and complex cognitive or mental health phenotypes may be limited by sample sizes that are too small to capture reproducible brain–behavioral phenotype associations62. Therefore, there may be only a mild effect of immunometabolic and brain markers on depression symptoms. It is also worth noting that previous studies have shown that the HR of lifestyle on depression is more pronounced than that of genetic factors33. Importantly, environmental factors, such as lifestyle, can be modified, in contrast to genetic factors.

There are three main strengths of the present study. First, it utilizes the large UKB cohort and investigates the association of multidimensional lifestyle factors with depression and the underlying neurobiological mechanisms involving brain structural imaging and biochemical, metabolic and genetic data. The large sample size supports precise estimation of effect sizes. We conducted an exploratory analysis to identify broad phenotypes of brain structures and peripheral systems associated with lifestyle. Because the dataset is longitudinal, we also considered the time difference between behavioral and neuroimaging data collection and added analyses to examine the consistency over time. Second, we define a healthy lifestyle based on validated national guidelines and previous studies, which provides a standard lifestyle criterion for populations. In addition, our findings regarding the combined effect of genetic and lifestyle factors on depression highlight that adherence to a favorable lifestyle is beneficial for populations with different genetic vulnerabilities. Although our findings address the risk of depression and how it can be mitigated by a healthy lifestyle, it is likely that maintaining a healthy lifestyle would also be beneficial for those with depression. Finally, we performed structural equation modeling to specify the interplay of multiple factors and the potential mechanisms underlying this association, including lifestyle, genetics, brain structure, immunometabolic function and depression.

Our study has several limitations. First, lifestyle was assessed by subjective self-report questionnaires, which might be prone to measurement error. In future, objective measures, such as accelerometer-based ones, could be used. However, due to time limitations and the availability of technology, these are often not feasible for use in large-scale populations. In this study, the use of actigraphy data would result in a relatively smaller sample size and an inadequate follow-up time, which would limit our statistical model. As for outcomes, we used the date of depression diagnosis, which is a typical procedure for Cox model estimation, but does not take into account the time lapsing between the first onset of an episode and being diagnosed. Second, our results should be further validated in an independently ascertained population. Our study is limited due to the selection bias of the UKB cohort, as the participants are relatively healthier than the general population. There are limited data on ethnic minorities. We also acknowledge that the large number of variables collected by the UKB can make it difficult to identify and control for all potential confounding factors, which may have biased our observational results. Future research utilizing longitudinal brain imaging and peripheral markers would provide a comprehensive illustration of the changing association pattern during the aging process. Third, we have acknowledged the incongruity of the stratification methods employed in lifestyle and PRS categorization according to previously published studies. Although the categories were not established based on a uniformly validated standard, we characterized better comparisons for the risk of depression between participants distributed in different lifestyle and genetic risk levels. 
The outcomes of categorized and continuous lifestyle scores were generally consistent, with an increase in lifestyle score being associated with a lower risk of depression, which confirms the robustness of our results. Finally, we used the PHQ-4 score as an assessment of depression symptoms to estimate the SEM, instead of the depression diagnosis, considering that the diagnosis of depression is a binary variable, and the dates were distributed over a longer period ranging from 2006 to 2021. We additionally employed an independent t-test on the PHQ-4 score between the diagnosed-depression group and the non-depression group and found a significant difference (t = 52.589, P < 2.2 × 10−16), so we suggest that the continuous PHQ-4 score better reflects the depressive status. The limited number of participants with a diagnosis of depression would result in a smaller sample size (310 with onset depression in a total of 13,324 participants with completed multimodal data used in the SEM), which would limit the accuracy of the model estimation. Future research could utilize clinical diagnostic records and larger sample sizes of patients with a diagnosis of depression.

In conclusion, the present study has identified a protective relationship between multidimensional lifestyle factors and the risk of depression, with the MR analysis supporting a causal interpretation. Adherence to a healthy lifestyle would aid in the prevention of depression across a population with different genetic risk profiles. A structural equation model was used to reveal the underlying neurobiological mechanisms involving lifestyle, brain structures, immunometabolic function, genetics and depression. Our results highlight the importance of promoting a comprehensive healthy lifestyle for depression intervention.

Methods

Study population

The UK Biobank (UKB) is a prospective cohort that recruited more than 500,000 participants aged 37–73 years who attended one of 22 assessment centers between 2006 and 201063. The UKB has received approval from the National Information Governance Board for Health and Social Care and the National Health Service North West Centre for Research Ethics Committee (ref. 11/NW/0382). All participants provided informed consent through electronic signature. The data utilized in the analyses contain demographic and behavioral assessments, depression diagnoses, brain structural imaging, blood biochemistry markers, blood cell counts and NMR metabolic markers.

Lifestyle factors and healthy lifestyle score

We constructed a lifestyle score for 394,053 participants based on seven factors—smoking, physical activity, alcohol consumption, diet, sleep duration, sedentary behavior and social connection—that were assessed at baseline using a touchscreen questionnaire32. Full details for each factor, corresponding questionnaires and national guidelines are listed in Supplementary Table 3. Participants scored one point for the healthy category of each factor based on national recommendations. According to the quit smoking guidelines from the National Health Service (NHS) and National Institute for Health and Care Excellence (NICE) 2015 quality standard on ‘Smoking: reducing and preventing tobacco use’, never smoking was classified as the healthy category31,64, and current and previous smoking were considered as the unhealthy category. Physical activity was assessed using the International Physical Activity Questionnaire-Short Form (IPAQ-SF)65. Regular physical activity was defined as meeting the American Heart Association recommendations and World Health Organization guidelines (at least 150 min of moderate activity or 75 min of vigorous activity per week (or an equivalent combination) or engaging in moderate physical activity at least five days a week or vigorous activity once a week (more than 10 min))66,67. Due to the U-shaped relationship between alcohol consumption and depression risk28, moderate alcohol consumption was defined as never drink or 0–14 g per day for women and 0–28 g per day for men, with the maximum limit reflecting US Dietary Guidelines for Americans (DGA)68,69. Following the DGA70,71,72 and recommendations on dietary priorities for cardiometabolic health68, healthy diet was based on an appropriate consumption of at least four of seven food groups including fruits, vegetables, fish, processed meats, unprocessed red meats, whole grains and refined grains, with specific cutoff criteria as listed in Supplementary Table 3. 
In accordance with the NHS, the American Academy of Sleep Medicine (AASM) and the Sleep Research Society (SRS) recommendation73, sleep duration of 7–9 h was considered healthy. According to the WHO guidelines66 and previously published literature32, screen-based sedentary behavior was estimated according to the Global Physical Activity Questionnaire (GPAQ)74 as the sum of self-reported hours spent watching TV and using a computer (not including using a computer at work) during a typical day, with 0–4 h reclassified as low-to-moderate and healthy. Social connection was assessed according to the social isolation index75. The index sums up the following three items: number of people in the household (one point was given for living alone), frequency of friend/family visits (one point was given for answering about once a month, once every few months, never or almost never, or no friends or family outside household), and participation in leisure/social activity (one point was given for answering none)75. Individuals were defined as least isolated when scoring 0, moderately isolated when scoring 1 and most isolated when scoring 2 or 375. Participants who were least and moderately isolated were defined as healthy with frequent social connection32,75. The lifestyle score ranged from 0 to 7, with a higher score indicating adherence to a healthier lifestyle31. The lifestyle score was subsequently categorized as favorable (5 to 7 lifestyle factors), intermediate (2 to 4) and unfavorable (0 to 1) for further analyses31,32.
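The scoring rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Participant`, `social_isolation_index`, `lifestyle_score` and `lifestyle_category` are hypothetical, and the factor-level cutoffs for physical activity, alcohol and diet (Supplementary Table 3) are assumed to have been applied already and are passed in as booleans.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical container for one participant's baseline answers."""
    never_smoked: bool                # True = healthy category
    regular_physical_activity: bool   # cutoffs assumed applied upstream
    moderate_alcohol: bool            # cutoffs assumed applied upstream
    healthy_diet: bool                # at least four of seven food groups
    sleep_hours: float
    screen_hours: float               # TV plus non-work computer use per day
    isolation_index: int              # 0-3, see social_isolation_index


def social_isolation_index(lives_alone, infrequent_visits, no_social_activity):
    """Sum the three isolation items; each contributes one point."""
    return int(lives_alone) + int(infrequent_visits) + int(no_social_activity)


def lifestyle_score(p):
    """Return the 0-7 score: one point per factor in the healthy category."""
    factors = [
        p.never_smoked,
        p.regular_physical_activity,
        p.moderate_alcohol,
        p.healthy_diet,
        7 <= p.sleep_hours <= 9,    # 7-9 h of sleep is healthy
        0 <= p.screen_hours <= 4,   # 0-4 h of screen time is healthy
        p.isolation_index <= 1,     # least or moderately isolated is healthy
    ]
    return sum(factors)


def lifestyle_category(score):
    """Map the 0-7 score onto the three categories used in the analyses."""
    if not 0 <= score <= 7:
        raise ValueError("lifestyle score must lie between 0 and 7")
    if score >= 5:
        return "favorable"
    if score >= 2:
        return "intermediate"
    return "unfavorable"
```

For example, a never-smoker with moderate alcohol intake, 8 h of sleep and a household of one (isolation index 1), but with no regular activity, an unhealthy diet and 5 h of screen time, scores 4 and falls into the intermediate category.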

Depression diagnoses

Depression diagnoses were ascertained using hospital inpatient records from Hospital Episode Statistics for England, Scottish Morbidity Record data for Scotland and Patient Episode Database for Wales. Additional cases were detected through linkage to death register data from the National Health Service Digital, National Health Service Central Register and National Records of Scotland. Diagnoses were recorded according to the International Classification of Diseases (ICD-10) system codes, including depressive episode (F32.0-9) and recurrent depressive disorder (F33.0-9). Depression diagnoses were also retrieved from primary care data using Read codes (version 2 (Read v2) and version 3 (CTV3 or Read v3)), which are presented in Supplementary Table 1. We integrated the data from hospital inpatient, primary care and death register sources, and excluded cases identified from self-report only. The date of diagnosis was defined as the earliest date recorded.

Structural MRI data

Quality-controlled T1-weighted neuroimaging data (n = 39,932) were acquired at the UKB assessment center. The scanner was a standard Siemens Skyra 3T system with a standard Siemens 32-channel RF receive head coil. The details of image acquisition are provided on the UKB website (http://biobank.ctsu.ox.ac.uk/crystal/refer.cgi?id=2367). T1 images were processed with FreeSurfer. Surface templates were used to extract imaging-derived phenotypes, referred to as atlas regions’ surface volume76. Subcortical regions were extracted via FreeSurfer’s aseg tool77. FreeSurfer aparc (category ID 192) and the aseg (category ID 190) atlas corresponding to 68 cortical regions76 and 14 subcortical regions77 were used in this study. We used the Qoala-T approach to check the FreeSurfer outputs and excluded those that failed to pass quality control from the FreeSurfer imaging-derived phenotypes. After merging these data with the lifestyle score, brain structural imaging data from 32,839 participants were used for correlation analyses.

Peripheral markers

Blood biochemistry (category ID 17518) and blood count (category ID 100081) data were taken from ~480,000 participants at the baseline assessment (2006–2010). The biomarker assay quality procedures are provided in an open-source document (https://biobank.ndph.ox.ac.uk/showcase/ukb/docs/biomarker_issues.pdf). Beckman Coulter LH750 instruments were used to analyze the blood count samples, which were collected in 4 ml of ethylenediaminetetraacetic acid (EDTA) vacutainers. More information about the hematology analysis is provided at https://biobank.ndph.ox.ac.uk/showcase/ukb/docs/haematology.pdf. Thirty blood biochemistry markers and 31 blood cell counts were available from these two UKB categories. The study used 59 markers from 61, excluding nucleated red cell count and percentage because the count or percentage of the majority (>90%) was 0. We then categorized the blood biochemistry markers as ‘liver function’, ‘renal function’, ‘endocrine’, ‘immunometabolic’ or ‘bone and joint’ according to previous literature78. Blood count was categorized as ‘white blood cell’, ‘red blood cell’ or ‘platelet’. More details are provided in Supplementary Table 18.

Metabolic markers were measured in randomly selected EDTA plasma samples using an NMR-based metabolic biomarker profiling platform, which included ~120,000 UKB participants at the baseline assessment. The process for assessing the metabolic markers is available at https://biobank.ndph.ox.ac.uk/showcase/label.cgi?id=220. The NMR metabolomics (category ID 220) included 249 metabolic markers (168 directly measured and 81 ratios of these). We used 168 directly measured metabolic markers and categorized metabolic markers as ‘amino acids’, ‘apolipoproteins’, ‘lipoprotein particle sizes’, ‘lipoprotein particle concentrations’, ‘fatty acids’, ‘triglycerides’, ‘phospholipids’, ‘cholesteryl esters’, ‘free cholesterol’, ‘cholesterol’, ‘other lipids’, ‘total lipids’, ‘ketone bodies’, ‘glycolysis-related metabolites’, ‘fluid balance’ and ‘inflammation’. More details are available in Supplementary Table 19.

PRS for depression

The computation of PRSs for individuals required two distinct datasets. The first dataset, consisting of individual-level genotype data, was obtained from all 502,409 participants in UKB v3 imputation. Detailed genotyping and quality control procedures are available in a previous publication79. We excluded SNPs with call rates of <95%, minor allele frequency of <0.1%, deviation from the Hardy–Weinberg equilibrium with P < 1 × 10−10 and selected subjects with British ancestry that had no more than ten putative third-degree relatives in the kinship table. Finally, we obtained 8,239,652 SNPs from 337,151 participants.

The second dataset was a meta-analysis of large-scale genome-wide association (GWAS) results for depression80, which included 9,874,289 total SNPs from 480,359 individuals (135,458 cases and 344,901 controls). After excluding data from the 23andMe (75,607 cases and 231,747 controls, restricted by privacy) and UKB (14,260 cases and 15,480 controls) to avoid sample overlap with the UKB genotype data, this collective GWAS encompassed 6,435,918 variants from 143,265 individuals (45,591 cases and 97,674 controls).
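The sample sizes quoted above can be checked with simple subtraction. The short sketch below reproduces that arithmetic; the dictionary layout and the function name `remaining_samples` are illustrative choices, not part of the original pipeline.

```python
# Case/control counts as quoted in the text.
FULL_META_ANALYSIS = {"cases": 135_458, "controls": 344_901}
EXCLUDED_COHORTS = {
    "23andMe": {"cases": 75_607, "controls": 231_747},
    "UKB": {"cases": 14_260, "controls": 15_480},
}


def remaining_samples(full, excluded):
    """Subtract every excluded cohort from the full meta-analysis counts."""
    cases = full["cases"] - sum(c["cases"] for c in excluded.values())
    controls = full["controls"] - sum(c["controls"] for c in excluded.values())
    return {"cases": cases, "controls": controls, "total": cases + controls}


if __name__ == "__main__":
    # Prints the counts left after removing 23andMe and UKB samples.
    print(remaining_samples(FULL_META_ANALYSIS, EXCLUDED_COHORTS))
```

The result, 45,591 cases and 97,674 controls (143,265 individuals), matches the figures reported for the GWAS used here.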

We utilized PRSice-2 (http://www.prsice.info) to calculate the PRS for depression. First, SNPs were clumped with a cutoff of r2 = 0.1 in a 250-kb window81. Second, the P thresholds were set at P < 0.0005, P < 0.001, P < 0.005, P < 0.01, P < 0.05, P < 0.1, P < 0.5 and P < 1 (ref. 82), and we finally used PRS (P < 0.05) for further analyses. The PRS was then categorized into low (lowest quintile), intermediate (quintiles 2 to 4) and high (highest quintile) risk levels31,32. We also used the original GWAS, excluding only the samples from 23andMe, to calculate the PRS, which included 5,401,556 variants and 173,005 individuals (59,851 cases and 113,154 controls), and found a high correlation between the two PRSs (r = 0.512, P < 1.0 × 10−295).
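The quintile-based categorization of the PRS can be illustrated with the Python standard library. This is a sketch only: the function name `prs_risk_levels` is hypothetical, and the handling of scores that fall exactly on a cut point (assigned to the lower group here) is an assumption, as the text does not specify it.

```python
from collections import Counter
from statistics import quantiles


def prs_risk_levels(scores):
    """Label each score as low, intermediate or high genetic risk.

    Low = lowest quintile, high = highest quintile, and the three
    middle quintiles are pooled as intermediate.
    """
    cuts = quantiles(scores, n=5)      # four cut points between quintiles
    lower, upper = cuts[0], cuts[-1]
    levels = []
    for score in scores:
        if score <= lower:
            levels.append("low")
        elif score > upper:
            levels.append("high")
        else:
            levels.append("intermediate")
    return levels


if __name__ == "__main__":
    # Toy scores standing in for standardized PRS values.
    toy_scores = list(range(1, 101))
    print(Counter(prs_risk_levels(toy_scores)))
```

With 100 evenly spaced toy scores, the split is 20 low, 60 intermediate and 20 high, mirroring the 1:3:1 design of the risk levels.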

Covariates

The data for covariates were collected at baseline. All models were adjusted for age at baseline (field ID 21022), sex (field ID 31), Townsend deprivation index (field ID 189, referring to an area-based measure of socioeconomic deprivation), BMI (field ID 21001) and education level (field ID 6138). Association analyses involving brain structural imaging data were additionally adjusted for neuroimaging scanning sites and estimated total intracranial volume. Association analyses involving genetic data were additionally adjusted for the top 20 ancestry principal components. The detailed covariates utilized in each statistical analysis are presented in Supplementary Fig. 2.

Statistical analyses

Cox proportional hazard regression model

Cox proportional hazard regression models were utilized to examine the association of lifestyle categories, genetic risk categories and the combination of genetic and lifestyle categories (nine categories using low genetic risk and favorable lifestyle as reference) with time to depression. The missing data of covariates were imputed by their mean value. The proportion of missing data is presented in Supplementary Table 2. HRs and corresponding 95% CIs were calculated in complete-case analyses. Participants were considered at risk for depression from baseline (2006–2010) and followed up until the date of first diagnosis, death, loss to follow-up or last date with available information (April 2021), whichever came first. The proportional hazards assumption was assessed using the Schoenfeld residuals method83 and was satisfied.
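The "whichever came first" rule that defines each participant's follow-up time and event indicator can be sketched as follows. The function name `follow_up` and its arguments are hypothetical, the conversion to years via 365.25 days is an assumption, and the exact censoring date must be supplied by the caller because the text gives only the month (April 2021).

```python
from datetime import date


def follow_up(baseline, study_end, diagnosis=None, death=None, lost=None):
    """Return (years at risk, event indicator) for one participant.

    Follow-up ends at the earliest of first diagnosis, death, loss to
    follow-up or the end of available data. The event indicator is True
    only when the earliest of these is the depression diagnosis.
    """
    candidates = [
        ("diagnosis", diagnosis),
        ("death", death),
        ("lost", lost),
        ("study_end", study_end),
    ]
    observed = [(label, d) for label, d in candidates if d is not None]
    # min() keeps the first entry on ties, so a diagnosis recorded on the
    # same day as another endpoint still counts as an event.
    label, end = min(observed, key=lambda item: item[1])
    years = (end - baseline).days / 365.25
    return years, label == "diagnosis"


if __name__ == "__main__":
    # Hypothetical dates for illustration only.
    print(follow_up(date(2008, 1, 1), date(2021, 4, 1),
                    diagnosis=date(2015, 1, 1), death=date(2018, 6, 1)))
```

The resulting pairs of duration and event indicator are the inputs that a Cox model fitting routine would then consume.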

Association of lifestyle scores with brain volume and peripheral markers

To explore the potential mediating role of brain structure and peripheral markers in the association between lifestyle and depression, we conducted an exploratory Pearson correlation analysis of lifestyle at baseline with these neurobiological phenotypes. FDR correction was conducted for multiple comparisons (Supplementary Table 14). Considering the time difference from baseline (2006–2010) to imaging visit timepoint (2014+), we also conducted correlation analyses between brain volumes and lifestyle scores assessed at the imaging visit (Supplementary Table 15). Depression scores at the two timepoints were also correlated with brain volumes (Supplementary Tables 16 and 17). Lifestyle and depression scores at the two timepoints were highly correlated (lifestyle, r = 0.85, P = 1.2 × 10−23; depression, r = 0.85, P = 3.2 × 10−24), indicating that neither changed to a great extent. We further calculated the spatial correlation to compare the similarity of lifestyle and depression-related brain maps at baseline and imaging visit (Supplementary Fig. 18).

The associations of lifestyle at baseline with peripheral markers were also tested, with Bonferroni correction conducted for multiple comparisons (Supplementary Tables 18 and 19).
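The two multiple-comparison corrections mentioned above can be sketched in plain Python. The text does not state which FDR procedure was used, so the Benjamini–Hochberg step-up procedure shown here is an assumption; the function names are illustrative.

```python
def bonferroni(pvals):
    """Multiply each P value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    toy_pvals = [0.01, 0.04, 0.03, 0.005]  # illustrative values only
    print(bonferroni(toy_pvals))
    print(benjamini_hochberg(toy_pvals))
```

Bonferroni controls the family-wise error rate and is the stricter of the two, which is why the same raw P value can survive FDR correction but not Bonferroni correction.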

Mendelian randomization

We further investigated the possible causal relationship between lifestyle and depression using bidirectional two-sample MR analysis, which employs genetic variants as the instrumental variables to estimate the corresponding causal effect size.

For lifestyle, we performed GWAS on lifestyle scores. The genotype data were the same as used in the PRS calculation (including 8,239,652 SNPs from 337,151 participants). We used PLINK 2.0 (https://www.cog-genomics.org/plink/2.0/) for the calculation, adjusting for age, sex and the first 20 ancestry principal components. For depression, we also used the meta-analysis of depression GWAS80 and excluded the participants from UKB and 23andMe to avoid sample overlap, which was consistent with the PRS calculation.

Two-sample MR analyses used four methods (IVW, weighted median, simple median and weighted mode) implemented in the R package TwoSampleMR (https://mrcieu.github.io/TwoSampleMR/). The genetic instruments were selected with P < 1 × 10−7 and we removed correlated SNPs (r2 > 0.01, kb < 1,000) to avoid linkage disequilibrium. Cochran’s Q statistic was used to assess the heterogeneity of IVW, and radial MR was used to remove the SNPs contributing the most heterogeneity. After removing two SNPs with significant heterogeneity (rs4930349, PIVW.Q = 3.4 × 10−2; rs66994942, PIVW.Q = 4.2 × 10−3), we performed the heterogeneity test again by using the Q statistic and the Mendelian Randomization-Pleiotropy Residual Sum and Outlier (MR-PRESSO) test. No further heterogeneity was observed (PIVW.Q = 0.26). The MR-Egger method showed no sign of pleiotropy (Ppleiotropy = 0.52). In the main MR analysis, we utilized 26 SNPs and prioritized the fixed-effect IVW method, because neither pleiotropy nor heterogeneity was observed. For sensitivity analyses, we performed the simple median, weighted median and weighted mode methods to ensure the robustness of the IVW result. We also assessed reverse causality by similarly testing the effect of genetic liability to depression (using the same P value and clumping thresholds) on lifestyle. Finally, we created scatter and forest plots to visually inspect the data for a strong influence by single variants (Fig. 4 and Supplementary Figs. 13 and 14).
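To make the fixed-effect IVW estimate and Cochran's Q concrete, the sketch below computes both from per-SNP summary statistics. It is not the TwoSampleMR implementation: the function name `ivw_fixed_effect` is hypothetical, and it uses the common first-order weights that ignore uncertainty in the SNP–exposure estimates.

```python
from math import sqrt


def ivw_fixed_effect(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate.

    Each SNP contributes a Wald ratio (outcome effect / exposure effect)
    weighted by (exposure effect / outcome standard error) squared.
    Returns the pooled estimate, its standard error and Cochran's Q.
    """
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    weights = [(bx / se) ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    std_error = sqrt(1.0 / total_weight)
    # Cochran's Q: weighted squared deviations of the ratios from the pool.
    q_stat = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    return estimate, std_error, q_stat


if __name__ == "__main__":
    # Toy summary statistics in which every SNP implies the same effect.
    bx = [0.1, 0.2, 0.4]
    by = [-0.05, -0.1, -0.2]
    se = [0.01, 0.02, 0.02]
    print(ivw_fixed_effect(bx, by, se))
```

Under the null of homogeneity, Q is compared with a chi-squared distribution with one fewer degree of freedom than the number of SNPs; a large Q flags instruments whose ratios disagree with the pooled estimate.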

Structural equation model

A structural equation model was estimated for participants with lifestyle scores, depression scores, brain structural imaging, PRS and immunometabolic markers (implemented in the R package lavaan 0.8). Three latent variables were estimated using confirmatory factor analysis. The latent variable representing depression was estimated using four PHQ-4 items. The latent variable of immunometabolic function was calculated from the first four significant markers correlated with lifestyle score. Finally, the latent variable for brain structures was derived from the first 20 cortical and subcortical brain volumes significantly correlated with lifestyle score. These three latent variables were investigated to determine the directional dependencies with PRS and lifestyle via path modeling. Before inputting into the model, we normalized the PRS, lifestyle score and each sub-score of the latent variables, respectively, to maintain the same scale. FDR corrections were conducted to adjust the P values accounting for multiple comparisons of all paths.

Sensitivity analyses

In the sensitivity analyses, we further tested the association of lifestyle with risk of three depression subtypes: single depressive episode, recurrent depressive disorder and treatment-resistant depression (TRD). The ICD-10 defines ‘depressive episode’ as F32.0-9, indicating a singular occurrence, and ‘recurrent depressive disorder’ as F33.0-9. The data were also retrieved from primary care by utilizing Read codes (Supplementary Table 1). Antidepressant use was gathered from the treatment records in the primary care prescriptions table (gp_scripts, UKB Field ID 42039), utilizing British National Formulary codes 0403010 (tricyclic and related antidepressant drugs), 0403020 (monoamine oxidase inhibitors), 0403030 (selective serotonin reuptake inhibitors) and 0403040 (other antidepressant drugs). TRD was defined by at least two switches between antidepressant drugs, each prescribed for at least six weeks84. TRD has partially distinct genetic85 and clinical-demographic characteristics86 compared with non-TRD and, as such, should be considered a unique subtype of depression87,88,89. Non-response after the second medication switch was confirmed for subjects considered treatment-resistant.

Due to the limited number of participants scoring 0 and 1 in recurrent depressive disorder and TRD, we combined scores 0 and 1 as the reference group (Supplementary Figs. 7 and 8). Considering the difference between previous smokers and current smokers, we further calculated the risk of depression according to different smoking statuses with never smoking as reference (Supplementary Table 6). To address the small reference group, we also used the favorable lifestyle class, which has a larger sample size, as reference (Supplementary Table 7). In addition, we calculated depression risk according to genetic and lifestyle categories across three subtypes of depression (Supplementary Figs. 9–11).

Considering the change of depression symptoms during the follow-up period from baseline to imaging visit, we selected two groups based on the PHQ-4 classifying criterion90. One group was severely depressed at baseline (PHQ-4 ≥ 9) and recovered at imaging visit (PHQ-4 < 6); this was labeled the recovered group (n = 538). The other group was suffering persistent symptoms over time (PHQ-4 ≥ 9 at both baseline and imaging visit); this was labeled the persistent group (n = 809). To investigate whether the change in depression symptoms would impact the association between depression and brain volumes, we utilized an independent-sample t-test to examine the difference in brain volumes between the two groups. We found there were only two significant brain volumes (left middle temporal and right insula), both with P = 0.04, but neither survived the FDR correction (Supplementary Table 20). We also considered whether antidepressant treatment would impact the association between depression and brain volumes in these two groups. In the recovered group, we further classified the treated-recovered group (n = 132) and not-treated-recovered group (n = 451) based on antidepressant use. We found there were only two significant brain volumes (right cuneus and right posterior cingulate), with P = 0.03 and 0.01, but neither survived the FDR correction (Supplementary Table 21). In the persistent group, we also classified the treated-persistent group (n = 251) and the not-treated-persistent group (n = 558) based on antidepressant use. We found that there was only one significant brain volume (right accumbens area), with P = 0.03, but it did not survive FDR correction (Supplementary Table 22). As such, we suggest that the time difference between baseline and imaging visit did not substantially influence the correlation between depression and brain structure.

Finally, we estimated two possible SEMs accounting for multiple directionalities among the latent variables. In the first model, we changed the direction from depression to lifestyle and kept the other paths’ direction unchanged to test the reversed effect (Supplementary Fig. 19). In the second model, we further examined the mediation mechanism of immunometabolic function and brain structure underlying the path from depression to lifestyle, by changing the directions between depression and immunometabolic function, depression and brain structure, immunometabolic function and lifestyle, and brain structure and lifestyle based on possible model 1 (Supplementary Fig. 20). Both of these possible models were fitted under the same procedures as the original model and provided an improved interpretation of our results.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this Article.