Associated factors of white matter hyperintensity volume: a machine-learning approach

Grosu, Sergio; Rospleszcz, Susanne; Hartmann, Felix; Habes, Mohamad; Bamberg, Fabian; Schlett, Christopher L.; Galie, Franziska; Lorbeer, Roberto; Auweter, Sigrid; Selder, Sonja; Buelow, Robin; Heier, Margit; Rathmann, Wolfgang; Mueller-Peltzer, Katharina; Ladwig, Karl-Heinz; Grabe, Hans J.; Peters, Annette; Ertl-Wagner, Birgit B.; Stoecklein, Sophia

doi:10.1038/s41598-021-81883-4

Published: 27 January 2021

Scientific Reports volume 11, Article number: 2325 (2021)

Abstract

To identify the most important parameters associated with cerebral white matter hyperintensities (WMH), in consideration of potential collinearity, we used a data-driven machine-learning approach. We analysed two independent cohorts (KORA and SHIP). WMH volumes were derived from cMRI images (FLAIR). In total, 90 (KORA) and 34 (SHIP) potential determinants of WMH, including measures of diabetes, blood pressure, medication intake, sociodemographics, lifestyle factors, somatic/depressive symptoms and sleep, were collected. Elastic net regression was used to identify relevant predictor covariates associated with WMH volume. The ten most frequently selected variables in KORA were subsequently examined for robustness in SHIP. The final KORA sample consisted of 370 participants (58% male; age 55.7 ± 9.1 years); the SHIP sample comprised 854 participants (38% male; age 53.9 ± 9.3 years). The most often selected and highly replicable parameters associated with WMH volume were, in descending order, age, hypertension, components of the social environment (i.e. widowed, living alone) and prediabetes. A systematic machine-learning based analysis of two independent, population-based cohorts showed that, besides age and hypertension, prediabetes and components of the social environment might play important roles in the development of WMH. Our results enable personal risk assessment for the development of WMH and inform prevention strategies tailored to the individual patient.

Introduction

White matter hyperintensities (WMH) of the brain were long considered incidental findings without clinical significance. However, several studies over the last decade have shown that WMH are associated with various severe morbidities such as stroke and cognitive decline^1,2,3. Considering that WMH are more prevalent in old age and that our population is continuously aging, it is becoming increasingly important to draw a clear picture of the risk factors associated with WMH in order to identify appropriate preventive measures and early treatment strategies.

The pathophysiology and aetiology of WMH, however, are not yet fully understood. Histopathological findings in regions of WMH are myelin paleness, tissue rarefaction with loss of myelin and axons, as well as mild gliosis^4,5. Hypoxic-ischemic axonal loss and demyelination, hypoperfusion due to altered cerebrovascular autoregulation as well as blood–brain barrier dysfunction have been established as leading pathophysiological causes for WMH^5,6.

A wide range of potential risk factors for WMH has been described. Our targeted literature search, which included the 100 most relevant publications on WMH over the past 5 years, identified N = 45 different parameters potentially associated with WMH. Further details are presented in the Supplementary Material. Frequently described factors include hypertension^7,8,9, age^1,10,11, diabetes^12,13,14, dyslipidaemia^15,16,17, and renal impairment^18,19,20. Yet, studies assessing the effect of these parameters on WMH show partially conflicting results. For instance, hypertension is one of the most commonly identified factors associated with WMH^1,7,8,9. It was demonstrated that adequate antihypertensive treatment may slow WMH progression^8,9. On the other hand, from age 73 to 76 years neither measured blood pressure nor self-reported hypertension was a significant predictor of WMH volume change¹⁷. Wardlaw et al. showed in 2 prospective cohorts that vascular risk factors including hypertension explained only 0.1–2.0% of WMH variance, suggesting that “nonvascular” factors largely contribute to the aetiology of WMH²¹. As another example, studies assessing the effect of diabetes status on WMH show inconsistent results. While some studies demonstrated an association of WMH burden with diabetes, but not prediabetes^12,13, a recent population-based cohort study showed a significant association of WMH burden with prediabetes and a further WMH increase in type 2 diabetes¹⁴. These inconsistencies might be partially explained by strong collinearity and the partially small effect sizes of many potential risk factors.

Given the plethora of parameters influencing the development of WMH, drawing a clear picture of the most powerful parameters associated with WMH volume is challenging, especially as these factors usually do not occur in isolation but in combination, with multiple risk factors jointly determining WMH volume. In this context, classic regression models have major downsides. First, they are inherently hypothesis-driven, requiring prior knowledge about relevant confounding variables. Second, they are unable to deal with multiple correlated, possibly collinear variables. Third, they have limitations in datasets with a large number of variables in relation to sample size. Thus, more complex methods are required to thoroughly evaluate predictors of WMH. Elastic net regularization is a machine-learning method able to select and rank the most important factors out of a large number of variables, even when intercorrelation is present²². For instance, elastic net regularization was previously applied to identify the most important parameters associated with gray matter volume, showing better performance than a constant linear regression model²³.

The aim of this study was to use elastic net machine-learning algorithms in order to disentangle and better understand the respective roles of different variables on WMH volume in two independent population-based cohorts.

Methods

Study design

Our study comprises cross-sectional data from two independent, population-based cohorts. The MRI substudy of the Cooperative Health Research in the Region of Augsburg (KORA) study served to train and internally test the elastic net regression model²⁴. This yielded a “top 10 list” of important predictors of WMH volume, ranked according to their relative importance. The MRI substudy of the Study of Health in Pomerania (SHIP-TREND-0) was used as a second independent sample to externally assess the robustness of these predictors of WMH volume²⁵.

KORA-MRI was sampled in Southern Germany and consists of 400 participants who underwent whole-body MRI to investigate differences in subclinical disease between diabetic, prediabetic and normoglycemic participants. Therefore, the study was enriched with participants with impaired glucose metabolism, i.e. 54 individuals (13.5%) had established Type 2 diabetes and 103 individuals (25.8%) had prediabetes²⁶. SHIP-TREND-0 was sampled in North-eastern Germany and comprises 2188 MRI participants. There was no enrichment for impaired glucose metabolism. Further details of the respective study designs are presented in the Supplementary Material.

This study was approved by the ethics committee of the Bavarian Chamber of Physicians, the ethics committee of the Ludwig-Maximilians-University Munich and the ethics committee of the University of Greifswald and complies with the Declaration of Helsinki, including written informed consent of all participants.

Collected parameters

As potential predictors of WMH volume, a panel of 90 parameters including measures of diabetes, blood pressure, adipose tissue, medication intake, sociodemographics including gender, anthropometrics, life-style factors, somatic and depressive symptoms, and sleep was collected in a standardised manner as part of the KORA study design and has been described previously²⁴.

Briefly, the applied definition of hypertension was systolic blood pressure ≥ 140 mmHg, diastolic blood pressure ≥ 90 mmHg and/or use of antihypertensive medication, given that the individuals were aware of being hypertensive. “Hypertension, unknown” was defined as unawareness of hypertension in a participant with hypertension, “controlled hypertension” as self-reported diagnosis of hypertension by a physician and intake of antihypertensive medication. Medications were classified as antihypertensive if the compounds were regarded as antihypertensively effective by recent guidelines. Diabetes was determined as either established type 2 diabetes validated by a physician or by fasting glucose level and OGTT. For the definition of prediabetes and diabetes the World Health Organization/International Diabetes Federation criteria were applied²⁷. HbA_1c values were assessed. Details on the collected parameters are provided in Supplementary Table 2.
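The blood-pressure definitions above can be read as a small set of rules. The following is a minimal sketch of one possible reading, not the study's own coding scheme: the class `Participant`, the function `hypertension_flags` and all field names are hypothetical, and "awareness" is assumed here to be equivalent to a self-reported physician diagnosis.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    systolic_mmhg: float
    diastolic_mmhg: float
    on_antihypertensives: bool
    physician_diagnosis_reported: bool  # assumption: used as the marker of awareness


def hypertension_flags(p: Participant) -> dict:
    """Return the three hypertension-related flags described in the text."""
    # Criteria from the text: SBP >= 140 mmHg, DBP >= 90 mmHg and/or medication use.
    meets_criteria = (
        p.systolic_mmhg >= 140
        or p.diastolic_mmhg >= 90
        or p.on_antihypertensives
    )
    aware = p.physician_diagnosis_reported
    return {
        "hypertension": meets_criteria and aware,
        "hypertension_unknown": meets_criteria and not aware,
        "controlled_hypertension": aware and p.on_antihypertensives,
    }


# Toy participant: elevated systolic pressure, untreated, no reported diagnosis.
print(hypertension_flags(Participant(150, 85, False, False)))
```

The flags are returned side by side rather than as one mutually exclusive category, because the text does not state how overlapping cases were resolved.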

MRI

In the KORA sample, image acquisition was performed on a single 3 T MRI system (Magnetom Skyra; Siemens AG, Healthcare Sector, Erlangen, Germany). WMH volume was assessed on T2w 3D-FLAIR sequences (SPACE, slice thickness (ST): 0.9 mm, 0.5 mm × 0.5 mm in-plane spatial resolution, repetition time (TR): 5000 ms, echo time (TE): 389 ms, inversion time (TI): 1800 ms, flip angle: 120°), in accordance with STRIVE recommendations²⁸.

In the SHIP sample, imaging was performed on a single 1.5 T MRI system (Magnetom Avanto; Siemens AG, Healthcare Sector, Erlangen, Germany). WMH volume was assessed on T1w sequences (ST: 1.0 mm, 1 × 1 mm in-plane spatial resolution, TR: 1900 ms, TE: 3.4 ms, flip angle: 15°) and T2w 3D-FLAIR sequences (ST: 3.0 mm, 0.9 × 0.9 mm in-plane spatial resolution, TR: 5000 ms, TE: 325 ms, flip angle: 15°), in accordance with STRIVE recommendations²⁸.

WMH volume

In the KORA sample, ITK-SNAP Version 3.6.0 was used for segmentation²⁹. Cerebral WMH were manually segmented by a radiology resident (2 years of experience in neuroimaging), and edited and modified where necessary by a board-certified radiologist (7 years of experience in neuroimaging) on sagittally acquired FLAIR images reconstructed in the axial plane with a ST of 0.5 mm (see Fig. 1). For homogeneous image intensity, the tool “auto-adjust contrast” in ITK-SNAP Version 3.6.0 was used²⁹. WMH were defined as signal abnormalities of variable size in the white matter of the brain that show a hyperintense signal on FLAIR images²⁸. WMH in the brainstem and the cerebellum were not included.

In the SHIP sample an automated multimodal segmentation algorithm for WMH quantification was used. The algorithm produced a probabilistic map that was further thresholded to generate a binary image³. Furthermore, to calculate WMH volume within specific regions of interest a multiatlas segmentation method was applied³⁰. This included nonlinear registration of multiple atlases with ground-truth labels for every individual scan. Finally, WMH volume was determined for every region of the brain by masking WMH from all other regions³¹. For the present analysis, measurements from the frontal, parietal, temporal and occipital lobes were summarised.
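As a rough illustration of the thresholding step, the sketch below converts a probabilistic lesion map into a binary mask and a volume. It is not the segmentation algorithm used in SHIP: the function name `wmh_volume_mm3`, the toy map and the threshold of 0.5 are assumptions (the text does not report the threshold); only the voxel size is taken from the SHIP FLAIR protocol given above.

```python
def wmh_volume_mm3(prob_map, threshold, voxel_dims_mm):
    """Threshold a 3D probability map and return the lesion volume in mm^3."""
    dx, dy, dz = voxel_dims_mm
    voxel_volume = dx * dy * dz
    # Count voxels whose lesion probability reaches the threshold (binary mask).
    n_lesion_voxels = sum(
        1
        for slab in prob_map
        for row in slab
        for probability in row
        if probability >= threshold
    )
    return n_lesion_voxels * voxel_volume


# Toy 2 x 2 x 2 probability map (made-up values, not study data).
toy_map = [
    [[0.1, 0.7], [0.4, 0.9]],
    [[0.5, 0.2], [0.0, 0.6]],
]
# Voxel size from the SHIP FLAIR protocol: 0.9 mm x 0.9 mm in-plane, 3.0 mm slices.
volume = wmh_volume_mm3(toy_map, threshold=0.5, voxel_dims_mm=(0.9, 0.9, 3.0))
print(round(volume, 2))
```

Four of the eight toy voxels reach the threshold, so the result is four voxel volumes of 2.43 mm³ each.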

Image analyses were performed blinded to all clinical data as well as other measurements.

Descriptive statistics

Continuous predictor covariates are described as arithmetic means with standard deviation (SD) or medians with 1st and 3rd quartile. Categorical predictor variables are presented as counts and percentages. P-values < 0.05 are considered to denote statistical significance.
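The summaries described above can be sketched with the Python standard library. This is an illustration only, since the study used R: the helper names `describe_continuous` and `describe_categorical` are hypothetical, and the quartile rule (`method="inclusive"`, which matches R's default quantile type 7) is an assumption because the text does not state which rule was applied.

```python
import statistics
from collections import Counter


def describe_continuous(values):
    """Mean, sample SD, median and 1st/3rd quartile of a continuous covariate."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample SD (n - 1 denominator)
        "median": median,
        "q1": q1,
        "q3": q3,
    }


def describe_categorical(labels):
    """Count and percentage for each level of a categorical covariate."""
    total = len(labels)
    return {level: (count, 100 * count / total) for level, count in Counter(labels).items()}


# Toy values, not study data.
summary = describe_continuous([1, 2, 3, 4, 5])
sexes = describe_categorical(["male", "male", "female", "male"])
print(summary)
print(sexes)
```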

Analysis model

In both the KORA and SHIP samples, the outcome of interest was WMH volume on a continuous scale. We identified relevant predictor covariates associated with WMH volume in the KORA cohort by penalised zero-inflated negative binomial (ZINB) regression models based on elastic net regularization³². The ZINB model accounts simultaneously for the skewed distribution of WMH volumes with overdispersion (“count part”) and the large point mass at zero stemming from those participants without WMH (“structural zeros”). Elastic net combines the properties of both Ridge and least absolute shrinkage and selection operator (LASSO) regression and is therefore appropriate for variable selection on potentially correlated covariates²². The amount of blending between Ridge and LASSO regression is regulated by the hyperparameter α (Ridge: α = 0, LASSO: α = 1). All analyses were computed on a grid of α values from 0.1 to 1 with stepwise increments of 0.1.
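For orientation, the model described above can be written out as follows. This is a sketch in the standard textbook parameterisation, not notation taken from the study: the symbols π_i (probability of a structural zero), μ_i (negative binomial mean), θ (dispersion), z_i and γ (covariates and coefficients of the zero part), and the glmnet-style scaling of the penalty are assumptions. The penalty is shown for the count coefficients only; an analogous term for the zero part, possibly with its own λ, is likewise assumed.

```latex
% ZINB outcome model for the WMH volume Y_i of participant i
\[
\begin{aligned}
P(Y_i = 0) &= \pi_i + (1-\pi_i)\left(\frac{\theta}{\theta+\mu_i}\right)^{\theta},\\
P(Y_i = y) &= (1-\pi_i)\,\frac{\Gamma(y+\theta)}{\Gamma(\theta)\,y!}
  \left(\frac{\theta}{\theta+\mu_i}\right)^{\theta}
  \left(\frac{\mu_i}{\theta+\mu_i}\right)^{y}, \qquad y = 1, 2, \dots\\
\log \mu_i &= \mathbf{x}_i^{\top}\boldsymbol{\beta}, \qquad
\operatorname{logit}\pi_i = \mathbf{z}_i^{\top}\boldsymbol{\gamma}
\end{aligned}
\]

% Elastic net penalised estimation: alpha = 0 gives Ridge, alpha = 1 gives LASSO
\[
(\hat{\boldsymbol{\beta}}, \hat{\boldsymbol{\gamma}}, \hat{\theta})
= \arg\min_{\boldsymbol{\beta},\,\boldsymbol{\gamma},\,\theta}
\left\{ -\ell(\boldsymbol{\beta}, \boldsymbol{\gamma}, \theta)
+ \lambda \sum_{j} \left[ \alpha\,\lvert\beta_j\rvert
+ \frac{1-\alpha}{2}\,\beta_j^{2} \right] \right\}
\]
```

Here ℓ denotes the ZINB log-likelihood. The absolute-value term can set coefficients exactly to zero, which is what makes a selection frequency meaningful, while the squared term stabilises the estimates of correlated covariates.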

The model was derived and evaluated on 1000 data splits. A data split was defined as a random division of the full data set into 90% training data and 10% testing data. By evaluating the model on 1000 data splits with mutually exclusive training and testing data, we ensured a very comprehensive internal model validation. Continuous covariates were standardised to mean = 0 and SD = 1. A ZINB model was computed on the training data, with WMH volume data being modelled by a negative binomial model using a logarithmic link. The shrinkage parameter λ was determined by internal tenfold cross-validation on the training data with upper thresholds being fractions of 0.5 and 0.1 of λ_max (the smallest value of λ for which all coefficients are shrunk to zero). Selection frequency across the 1000 splits served as a measure of variable importance. To disentangle and assess the roles of different variables in their association with WMH volume, both the model’s explanatory performance and its predictive performance have to be evaluated. Root Mean Squared Error (RMSE) served as a measure of predictive performance on the testing data and Akaike’s Information Criterion (AIC) served as a measure of model fit, i.e. explanatory performance, on the training data. Coefficient estimates are reported as raw beta values which have to be exponentiated to obtain incidence rate ratios. For comparison, we calculated the Null ZINB model that includes no covariates and predicts the mean WMH volume for each participant. A likelihood-ratio test was used to formally assess the model fit of the final model compared to the Null model.
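The resampling scheme can be illustrated with a short, self-contained sketch. It deliberately swaps in a simpler model: a Gaussian-outcome elastic net fitted by coordinate descent stands in for the penalised ZINB model, λ is fixed rather than tuned by tenfold cross-validation, covariates are not re-standardised, and only 100 splits are drawn to keep the run short. All function names and the synthetic data are hypothetical.

```python
import random


def soft_threshold(z, gamma):
    """Soft-thresholding operator arising from the L1 part of the penalty."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net(X, y, lam, alpha, n_sweeps=50):
    """Coordinate descent for a Gaussian elastic net (stand-in for penalised ZINB)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    resid = [yi - intercept for yi in y]
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_sweeps):
        shift = sum(resid) / n  # unpenalised intercept update
        intercept += shift
        resid = [r - shift for r in resid]
        for j in range(p):
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                resid = [resid[i] - X[i][j] * delta for i in range(n)]
                beta[j] = new
    return intercept, beta


def selection_frequency(X, y, lam, alpha, n_splits, test_fraction=0.1, seed=0):
    """Share of random splits in which each coefficient is non-zero, plus test RMSE."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    n_test = max(1, round(test_fraction * n))
    counts = [0] * p
    squared_errors = []
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]  # mutually exclusive 10% / 90%
        b0, beta = elastic_net([X[i] for i in train], [y[i] for i in train], lam, alpha)
        for j, b in enumerate(beta):
            if b != 0.0:
                counts[j] += 1
        for i in test:
            pred = b0 + sum(b * x for b, x in zip(beta, X[i]))
            squared_errors.append((y[i] - pred) ** 2)
    rmse = (sum(squared_errors) / len(squared_errors)) ** 0.5
    return [c / n_splits for c in counts], rmse


# Synthetic toy data: x0 drives y, x1 is collinear with x0, x2 and x3 are noise.
rng = random.Random(1)
X, y = [], []
for _ in range(60):
    x0 = rng.gauss(0, 1)
    X.append([x0, x0 + 0.5 * rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)])
    y.append(2.0 * x0 + 0.5 * rng.gauss(0, 1))

freq, rmse = selection_frequency(X, y, lam=0.3, alpha=0.8, n_splits=100)
print([round(f, 2) for f in freq], round(rmse, 2))
```

In this toy setting the true driver `x0` is retained in every split, whereas the remaining columns are selected only when their sample covariance with the residual happens to exceed the L1 threshold. Ranking variables by such frequencies is the idea behind the selection-frequency measure described above.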

Predictors of WMH volume identified in the KORA sample were then ranked according to selection frequency across 1000 data splits. A cut-off value based on selection frequency was not applicable due to the varying numbers of parameters in the KORA sample (N = 90) and SHIP sample (N = 34). Consequently, the ten most frequently selected variables were subsequently examined in the SHIP sample. A negative binomial regression model was evaluated on the whole sample and compared to the Null model predicting constant WMH volume in terms of RMSE and AIC. Furthermore, analogous to the procedure on the KORA cohort, variables were evaluated on 1000 data splits with 90% training and 10% testing data and ranked according to selection frequency.
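The evaluation quantities named in the Methods reduce to a few lines of arithmetic. The sketch below is illustrative only: the helper names are hypothetical, the inputs are toy numbers rather than study data, and the AIC is the generic definition, with the log-likelihood and parameter count supplied by whichever model was fitted.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted volumes."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def null_rmse(observed):
    """RMSE of a model that predicts the mean volume for every participant."""
    mean = sum(observed) / len(observed)
    return rmse(observed, [mean] * len(observed))


def aic(log_likelihood, n_parameters):
    """Akaike's information criterion: lower values indicate better fit."""
    return 2 * n_parameters - 2 * log_likelihood


def incidence_rate_ratio(beta):
    """Exponentiate a raw log-link coefficient to obtain an incidence rate ratio."""
    return math.exp(beta)


# Toy numbers only.
print(rmse([1, 2, 3], [1, 2, 5]), null_rmse([1, 2, 3]))
print(aic(log_likelihood=-100.0, n_parameters=3))
print(incidence_rate_ratio(math.log(2)))
```

A coefficient of ln 2 ≈ 0.69 on a standardised covariate therefore corresponds to a doubling of the expected WMH volume per one-SD increase, all else being equal.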

As a sensitivity analysis, the complete machine-learning pipeline was re-run on a subsample of N = 333 participants of the KORA study with available data on intracranial volume (ICV), including ICV as an additional predictor to all the other variables.

R version 3.4.4 was used for all statistical analyses, including descriptive statistics. Package zipath 0.3–5 was used for calculation of ZINB models.

Results

Study population

Of the 400 participants in the KORA sample, 12 had to be excluded due to insufficient MRI image quality, 2 due to visible lesions of other aetiology (1 participant with lesions suspicious for multiple sclerosis; 1 participant with a non-WMH-like FLAIR-hyperintense lesion in the left parietal lobe) and 16 due to missing covariate data. The final KORA sample consisted of 370 participants (58% male; age: 55.7 ± 9.1 years). Of the 2188 participants in the SHIP sample, 229 had to be excluded due to missing WMH data, 322 due to insufficient MRI image quality and 415 due to missing covariate data. In addition, for consistency between the KORA and SHIP studies, 86 SHIP participants with prior myocardial infarction, stroke or revascularization and 368 SHIP participants younger than 39 years or older than 73 years were excluded. The final SHIP sample consisted of 854 participants (38% male; age 53.9 ± 9.3 years), as presented in Fig. 2. Further details are presented in Table 1. In the KORA sample, mean WMH volume was 2798 ± 7392 mm³ (median: 997 mm³) compared to 532 ± 1750 mm³ (median: 135 mm³) in the SHIP sample. The distribution of WMH volume is presented in Fig. 3, and an example of different WMH volumes in Fig. 4.

Table 1 Description of predictor covariates for the KORA sample and SHIP sample.

Identification of predictors of WMH volume—KORA sample

In the KORA sample, the best model in terms of RMSE and AIC was obtained for α = 0.8 (RMSE = 4742 mm³, AIC = 4130). For comparison, the Null elastic net model (without adjusting for any cofactors) yielded RMSE = 4829 mm³ and AIC = 1,396,092, which serves as a proof-of-concept that the elastic net model provides additional explanatory value for WMH volume. The likelihood ratio test showed that the final model fitted the data significantly better than the Null model (p < 0.001). The ten most frequently selected variables for α = 0.8 are presented in Fig. 5. Variables included “age” (22.4% selection frequency), “controlled hypertension” (16.9%), “HbA_1c” (14.8%), “widowed” marital status (14.5%), “prediabetes” assessed by OGTT (13.5%), “antiplatelet medication” (13.4%), “hypertension, unknown” defined as hypertension unawareness of a participant with hypertension (10.6%), “NSAID medication” (6.5%), “physical activity of 2 h/week” (4.6%; negative mean beta) and “alcohol consumption up to 20 g/day” (4.6%). Variable selection was relatively stable across all values of α, as presented in Supplementary Figure 1. In the sensitivity analysis including only KORA participants with available ICV, ICV was not selected among the top ten predictors (Supplementary Table 4).

Robustness testing of predictors of WMH volume—SHIP sample

In the SHIP sample, a negative binomial model incorporating the top ten predictors from the KORA sample yielded an RMSE of 1667 mm³ and an AIC of 11,600 on the whole cohort, whereas the Null model (predicting constant WMH volume) yielded an RMSE of 1749 mm³ and an AIC of 12,000 on the whole cohort.

When evaluating the elastic net regression model on 1000 data splits, the best model in terms of RMSE and AIC was obtained for α = 1 (RMSE = 1499 mm³, AIC = 10,443). For both KORA and SHIP, prediction seemed to be worse for high WMH volumes, i.e. on average the model underestimated true WMH volumes (Supplementary Figure 5). Ranking of the variables according to selection frequency is presented in Fig. 5 and Supplementary Figure 4. “Age” (selection frequency 100%), “controlled hypertension” (100%), “unknown hypertension” (97%) and “prediabetes” (66%) were replicated as important predictors. Furthermore, while “widowed” family status was not replicated in the SHIP sample, “separated or divorced” (87%) and “living alone” (80%) were selected. Other variables were either not replicated (HbA_1c, antiplatelet medication) or showed different effect directions compared to the KORA sample (alcohol consumption, physical activity, NSAID medication).

Discussion

In a population-based sample, we performed a data-driven analysis without a-priori hypotheses including 90 different parameters in order to disentangle and better understand the respective roles of these parameters on WMH volume. Relevant parameters were re-examined in an independent population-based sample.

Considering that WMH are associated with cognitive decline, increased stroke risk and worse outcome post stroke, decreased mobility due to gait disturbance as well as increased risk of depression, having a clear picture of the associated risk factors is important, especially regarding treatment and prevention^2,33,34,35,36,37. Although a lot of information is nowadays available on the epidemiology and risk factors of WMH, some of these data are conflicting¹. Given that many partially intercorrelated factors with potentially small effect sizes impact WMH volume, drawing a clear picture of the most powerful determinants of WMH volume is challenging. In order to overcome limitations of traditional regression models we used a machine-learning approach that allows for the selection and ranking of the most important factors out of a large number of variables, even when intercorrelation is present. Our machine-learning based model identified age and hypertension as well-established determinants of WMH volume. Additionally, the model identified the less established, more controversial parameters “prediabetes”, “HbA_1c”, “alcohol consumption”, “NSAID medication” and components of the social environment such as “widowed marital status” as potential factors that contribute to WMH burden. Interestingly, ICV was not selected among the ten most important predictors. We hypothesise that in our analysis the association of ICV with WMH was captured by other variables in the model.

Diabetes-related atherosclerosis appears to be an important component in the development of WMH^38,39. However, the relation between diabetes and particularly prediabetes and WMH is under debate^1,12,14. Studies assessing the correlation between WMH and HbA_1c show conflicting results^40,41. Interestingly, our analysis yielded prediabetes, but not diabetes, as a relevant predictor of WMH volume. This result was replicated in the SHIP sample. Furthermore, HbA_1c was only identified in the KORA sample, but failed to replicate in the SHIP sample. These results might be due to the co-occurrence of diabetes and hypertension and the different risk factor distribution in the two studies: Individuals with diabetes in SHIP are more likely to have hypertension (prevalence of “controlled hypertension” is 41%) compared to KORA (prevalence of “controlled hypertension” is 31%). Therefore, unfavourable effects of diabetes might be superimposed by stronger effects of unfavourable blood pressure profiles.

Our results thus suggest that the prediabetic phenotype as a dynamic state between normoglycemia and type 2 diabetes represents an independent risk factor for the development of WMH. Hence, individuals with prediabetes would need more comprehensive assessments for signs of early pathophysiological changes, preventive measures and adequate treatment not only to stop the underlying development of diabetes, but also to avoid the development of WMH-associated morbidity.

The relation between WMH and alcohol consumption is still unclear. Heavy alcohol consumption might lead to a higher WMH burden through cerebrovascular effects of associated hypertension⁴². However, after correcting for hypertension, heavy alcohol consumption did not show a significant association with WMH in previous analyses^43,44. Prior studies also demonstrated a protective effect of moderate alcohol consumption on the development of WMH through multiple potential pathways, including anti-atherosclerotic, anti-thrombotic and anti-inflammatory mechanisms reducing the risk of cardiocerebrovascular morbidity^43,45,46. In our study, however, while moderate alcohol consumption was associated with decreased WMH volume in the SHIP sample, it was associated with increased WMH volume in the KORA sample; heavy alcohol consumption was associated with increased WMH volume in the SHIP sample.

In both the KORA sample and the SHIP sample, “NSAID medication” was identified as a relevant predictor of WMH volume, albeit with different effect directions. NSAID medications are generally used for pain and inflammation treatment, which comprises multiple conditions, thus rendering the groups under treatment quite heterogeneous. In KORA, all participants under NSAID medication reported regular intake of their medication, as opposed to intake as needed. No distinction between regular intake and intake as needed of NSAID medication was made in SHIP. This not only explains the difference in prevalence (2.2% in KORA vs. 8.5% in SHIP), but might also indicate that participants in KORA are affected by more severe and chronic pain. Severity of pain could be a relevant determinant of WMH volume. In this regard, our findings might support those of previous studies that report associations between pain and increased WMH burden⁴⁷.

The elastic net model showed a strong association between age and WMH burden. This result is in keeping with literature suggesting that age is the most important risk factor for WMH^1,10. WMH are a common finding in elderly people, in whom WMH burden increases with age¹. As such, WMH may to some extent be part of the normal aging process of the brain, yet precise data on the burden of WMH that can be regarded as “normal” at a certain age do not exist. As populations are aging, WMH-related morbidities, such as cognitive decline and increased stroke risk, will have an increasing impact on individuals and health care systems^34,35.

Hypertension is strongly associated with WMH and is probably their most important modifiable risk factor. Several studies clearly indicate that increased systolic as well as diastolic blood pressure favours the development of WMH^1,3,5,7,8,9. Our results are consistent with these findings. The variables “hypertension, controlled” and “hypertension, unknown (hypertension unawareness of a participant with hypertension)” were both among the 10 most frequently selected variables of the elastic net model associated with WMH. The frequent selection of the variable “hypertension, controlled” in the elastic net model could be indicative of irreversible brain damage caused by micro- and macroangiopathy in the pre-treatment episode of hypertension³⁸.

A meta-analysis showed that higher physical activity was cross-sectionally associated with lower WMH volume, although effect sizes were small and many studies reported null findings⁴⁸. A recent longitudinal study and a recent intervention trial found no effect of physical activity on WMH volumes^49,50. In our study, high physical activity was associated with decreased WMH volume in the KORA sample, but not in the SHIP sample, indicating that the role of physical activity is unstable.

“Widowed family status”, “separated or divorced” and “living alone” are components of the social environment that were revealed to be relevant predictors of WMH volume in our study. It can be hypothesised that these predictors might comprise a cluster of mental-health related factors, such as loneliness, anxiety or post-traumatic stress disorder, which were not assessed in this study³⁸. The importance of social networks and stressful life events for mental well-being and health in general^51,52, as well as the association of socioeconomic factors with WMH, have been established⁵³. It was also shown that widowhood accelerates cognitive decline in cognitively normal older adults⁵⁴. Our results support this finding, bearing in mind that WMH are associated with cognitive decline³. However, further studies are needed to clarify the association of components of the social environment with WMH volume.

The results of this study need to be interpreted in light of its limitations. The regularised regression employed here is not a causal model in the formalised sense and thus cannot identify whether the reported variables are etiologically linked to WMH volume. For observational data, different statistical tools can be used to evaluate causality, e.g. graphical models such as directed acyclic graphs, methods based on counterfactuals from the potential outcomes framework, methods based on instrumental variables such as Mendelian Randomization which emulate the design of randomised controlled trials, or structural causal models⁵⁵. These methods, however, require prior knowledge and assumptions about the potential etiologic layout in the analysed variables. With our statistical model, we used a hypothesis-free approach without assumptions about the underlying etiologic factors. It has been emphasised, especially within the epidemiologic field, that evidence from different study designs and models should be taken into account to investigate causality⁵⁶. We therefore believe that the results of our prediction-based analysis can provide useful starting points to inform further, more formalised, causal reasoning.

For (zero-inflated) negative binomial models as employed here, easily interpretable metrics of the proportion of outcome variance explained are not straightforward. Therefore, the relative contribution of the respective predictor variables has to be assessed by the inclusion frequencies only.

In the same vein, the regularization by elastic net and the underlying ZINB regression represent an intricate multi-layered model with complex interpretation. However, elastic net regularization is an appropriate and established method for variable selection, and the ZINB model captures the data distribution best. By presenting a ranking according to inclusion frequencies of the identified variables, we can still provide an adequate interpretation of the findings.

Furthermore, in the KORA sample mean WMH volume was significantly higher than in the SHIP sample. Possible explanations for this discrepancy in WMH volume are different measurement methods and different study collectives. However, identical methodologies across large population-based studies are not to be expected, and the fact that some parameters were consistently associated with WMH volume in both KORA and SHIP does show a certain robustness of the association. Further well-characterised MRI studies are needed to corroborate these findings.

Conclusion

In conclusion, a systematic machine-learning based analysis of 90 parameters showed in two independent samples that, besides age and hypertension, prediabetes and components of the social environment (i.e. widowed, living alone) might play important roles in the development of WMH. Our results therefore enable personal risk assessment for high WMH burden and prevention strategies tailored to the individual patient.

Abbreviations

AIC: Akaike’s information criterion
DEEX: Depressed mood and exhaustion
FLAIR: Fluid-attenuated inversion recovery
KORA: Cooperative Health Research in the Region of Augsburg
MRI: Magnetic resonance imaging
NSAID: Nonsteroidal anti-inflammatory drug
OGTT: Oral glucose tolerance test
RMSE: Root mean squared error
SHIP: Study of Health in Pomerania
SD: Standard deviation
WMH: White matter hyperintensities
ZINB: Zero-inflated negative binomial

Funding

Open Access funding enabled and organized by Projekt DEAL. The KORA study was initiated and financed by the Helmholtz Zentrum München—German Research Center for Environmental Health, which is funded by the German Federal Ministry of Education and Research (BMBF) and by the State of Bavaria. Furthermore, KORA research was supported within the Munich Center of Health Sciences (MC-Health), Ludwig-Maximilians-Universität, as part of LMUinnovativ. The KORA MRI sub-study received funding by the German Research Foundation (DFG, Deutsche Forschungsgemeinschaft). The KORA MRI sub-study was supported by an unrestricted research grant from Siemens Healthcare.

Author information

Authors and Affiliations

Department of Radiology, University Hospital, LMU Munich, Marchioninistraße 15, 81377, Munich, Germany
Sergio Grosu, Felix Hartmann, Franziska Galie, Roberto Lorbeer, Sigrid Auweter, Sonja Selder, Birgit B. Ertl-Wagner & Sophia Stoecklein
Institute of Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Munich-Neuherberg, Germany
Susanne Rospleszcz, Margit Heier, Karl-Heinz Ladwig & Annette Peters
Biggs Institute Neuroimaging Core (BINC), Glenn Biggs Institute for Neurodegenerative Disorders, University of Texas Health Science Center At San Antonio, San Antonio, USA
Mohamad Habes
Department of Radiology and Penn Memory Center, Center for Biomedical Image Computing and Analytics, University of Pennsylvania, Philadelphia, USA
Mohamad Habes
Institute of Community Medicine and Department of Psychiatry, University of Greifswald, Greifswald, Germany
Mohamad Habes & Hans J. Grabe
Department of Diagnostic and Interventional Radiology, Medical Center - University of Freiburg, Freiburg, Germany
Fabian Bamberg, Christopher L. Schlett & Katharina Mueller-Peltzer
Institute of Diagnostic Radiology and Neuroradiology, University of Greifswald, Greifswald, Germany
Robin Buelow & Hans J. Grabe
Institute for Biometrics and Epidemiology, German Diabetes Center, Duesseldorf, Germany
Wolfgang Rathmann
German Center for Diabetes Research (DZD), Munich-Neuherberg, Germany
Wolfgang Rathmann
Department of Psychosomatic Medicine and Psychotherapy, Klinikum Rechts Der Isar, Technical University Munich, Munich, Germany
Karl-Heinz Ladwig
German Center for Neurodegenerative Diseases (DZNE), Rostock, Greifswald, Germany
Hans J. Grabe
German Centre for Cardiovascular Research (DZHK E.V.), Munich, Germany
Annette Peters
Chair of Epidemiology, Ludwig-Maximilians-University München, Munich, Germany
Susanne Rospleszcz & Annette Peters
Department of Radiology, The Hospital for Sick Children, University of Toronto, Toronto, Canada
Birgit B. Ertl-Wagner
KORA Study Centre, University Hospital of Augsburg, Augsburg, Germany
Margit Heier

Contributions

S.G., S.R. and S.S. wrote the main manuscript text and prepared the figures and tables. S.G., F.H. and S.S. performed the analyses of the imaging data. S.R. performed the statistical analyses. S.G., S.R., M.H., F.G., M.H. and S.S. analysed and interpreted the findings. R.L., S.A., S.Se., R.B., F.B., C.L.S., W.R., K.M-P., K-H.L., H.J.G., A.P., B.E-W. and S.S. were involved in the design and supervision of the research. All authors contributed to the interpretation of the results and reviewed the manuscript.

Corresponding authors

Correspondence to Sergio Grosu or Sophia Stoecklein.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Grosu, S., Rospleszcz, S., Hartmann, F. et al. Associated factors of white matter hyperintensity volume: a machine-learning approach. Sci Rep 11, 2325 (2021). https://doi.org/10.1038/s41598-021-81883-4

Received: 14 June 2020
Accepted: 11 January 2021
Published: 27 January 2021
DOI: https://doi.org/10.1038/s41598-021-81883-4
