## Introduction

Alzheimer’s disease (AD) is a gradually progressive neurodegenerative disorder in which memory deficit is typically the most salient cognitive symptom1. Patients with amnestic mild cognitive impairment (aMCI) are at higher risk of developing AD, and aMCI is frequently considered an early stage of AD2,3. Converging evidence suggests that both AD and aMCI are associated with large-scale functional network dysconnectivity, especially in the default mode network (DMN), which consists of the posterior cingulate cortex (PCC), precuneus, medial prefrontal cortex (mPFC), and bilateral angular gyrus4. DMN dysconnectivity is often associated with worsened memory4,5. In parallel, grey matter volume (GMV) loss in the medial temporal lobe (MTL) and DMN regions6,7,8 is typically related to memory decline in AD patients9,10. Moreover, diffusion tensor imaging (DTI) studies have revealed that compromised white matter (WM) microstructures, particularly in the corpus callosum, cingulum, and fornix11, are associated with memory deficit in AD12,13. Recently, free-water (FW) imaging using diffusion MRI data was proposed to address the partial volume effect problem14. Using this approach, FW increases have been associated with extracellular processes such as inflammation and small vascular damage in neurodegenerative diseases15. On the other hand, the FW-corrected DTI metrics represent microstructural tissue changes such as degeneration and myelin sheath alterations16. However, one critical gap is whether and how these brain structural and functional degenerative processes differ in the temporal sequence of their influence on memory performance in the AD continuum.

The spectrum of AD spans from clinically asymptomatic to severely impaired17. Based on the hypothetical AD cascade model11,18,19,20,21, the influences of abnormal brain imaging measures on memory in AD would be more appropriately considered as a multifaceted process moving along a seamless continuum rather than as discrete clinical stages1. Recent evidence suggests that pathophysiological abnormalities of AD precede overt memory decline and progress in a non-linear manner18,22,23. For example, atrophy rates of MTL and DMN regions are not uniform across disease stages and they exhibit differential trajectories24,25,26. The vascular damage and neuroinflammation-related brain changes also vary along the AD continuum11,27,28. However, previous studies have mostly associated abnormal brain measures with memory decline in AD patients using linear regression models12,29. These models assumed a constant linear relationship between brain measures and cognition over stages, which ignored the possibility of a varying brain-cognition relationship across the disease spectrum30. Taken together, we speculate that the influence of brain abnormalities on memory varies according to disease stage. However, the significance of these dynamic associations and their potential role in the AD continuum have not been characterized.

To address this gap, we examined the stage-dependent associations between multimodal brain measures and memory decline in the AD continuum using a novel sparse varying coefficient (SVC) model31. The SVC model allows us to use one model to simultaneously compare the trajectories from multiple brain measures32. Furthermore, unlike conventional linear models in previous studies4,11,18, the SVC model does not assume a constant linear association between brain measures and memory performance across stages; instead, it allows the association to vary non-linearly with dementia severity. Specifically, based on prior evidence that WM microstructural abnormalities and functional network degeneration might occur earlier than the MTL atrophy in AD6,11,19, we hypothesized that the influence of WM microstructural abnormalities and DMN functional dysconnectivity on memory impairment would take place in the aMCI stage, while the influence of MTL atrophy would be more prominent later.

## Results

### Specific brain structural and functional abnormalities are associated with memory deficit

To determine regions of interest for SVC modelling, we performed several whole-brain voxel-wise analyses on the associations between brain abnormalities and memory deficit in patients. The whole-brain voxel-wise analysis on the FW-corrected diffusion MRI metrics showed that lower memory scores in aMCI and AD patients were associated with higher FW in most WM regions (Fig. 1A, Supplementary Table 1). In contrast, lower memory score was associated with lower fractional anisotropy (FAT) in the body of the fornix only (Fig. 1B, Supplementary Table 1).

The voxel-wise analysis on grey matter volume revealed that lower GMV in the bilateral MTL (particularly in the hippocampus, HIP), PCC, and mPFC was associated with lower memory scores across all patients (Fig. 2A, Supplementary Table 2).

Finally, the voxel-wise analysis on the DMN FC revealed that lower memory score was associated with lower FC in the precuneus and part of PCC regions across all patients (Fig. 2B, Supplementary Table 3).

These findings remained significant after controlling for years of education. Further details are provided in Supplemental Data (Supplementary Fig. 5, Supplementary Results).

In addition, we found greater brain abnormalities (FW, FAT, GMV and FC) in AD patients compared with aMCI patients as expected (Supplementary Fig. 2), which included those memory-related brain measures. Further details for group difference results among HC, aMCI and AD are provided in Supplemental Data (Supplementary Results).

### Differential stage-dependent associations of multimodal brain abnormalities with memory performance

To investigate the severity-dependent contributions of both brain functional and structural measurements simultaneously, with the CDR sum of boxes (CDR-SB) denoting dementia severity, we built an SVC model with memory as the dependent variable and FW, FAT in the fornix, GMV-mPFC, GMV-PCC, GMV-HIP, and FC-DMN derived from the significant regions of voxel-wise analysis as predictors.

We found these brain measures exhibited differential severity-dependent associations with memory (Fig. 3). For DTI, FW had the greatest influence on memory deficit in the early aMCI phase, where higher FW was associated with lower memory score (peak beta = −0.9). However, this influence gradually decreased in the late aMCI and AD stages (i.e., less negative betas approaching zero). Similarly, the association of FAT in the fornix with memory score was the greatest in the early aMCI stage (peak beta = 4.5), where higher FAT was associated with better memory score. However, this association quickly diminished in the AD stage (i.e., smaller positive betas approaching zero).

For GMV, both PCC and mPFC had the strongest associations with memory in the early aMCI stage, where larger volume was associated with better memory score (mPFC peak beta = 4.5; PCC peak beta = 1.4). Similar to FAT, this relationship gradually diminished in the AD stage (i.e., smaller positive betas). In contrast, the relationship between the hippocampus (and MTL) and memory was more evident in the late aMCI stage and peaked at the early AD phase (beta = 2.4), where larger volume was associated with better memory (i.e., greater positive betas). The association between FC-DMN and memory was evident throughout the disease continuum. Higher FC was associated with higher memory score regardless of severity (i.e., comparable positive betas).

We also evaluated the specificity of the SVC model following our previous approach32. We randomly permuted the memory scores 100 times across the subjects and repeated SVC modelling 100 times on each of the 100 permuted data sets (dependent variable: memory z-scores; 10 predictors: brain measures [FW, FAT, FC-DMN, GMV-PCC, GMV-mPFC, GMV-HIP] together with nuisance variables [age, gender, handedness and ethnicity]). In 52 out of the 100 permuted data sets, no variable was selected by all 100 repetitions. For each of the remaining 48 permuted data sets, the SVC model selected one variable from the 10 predictors as the key predictor of verbal memory scores based on 100 repetitions. However, the frequency distribution of variable selection across these 48 data sets was random. None of the predictors was selected for all 100 repetitions (Supplementary Fig. 3). Overall, the null distribution did not favour the variables selected using our original data set. This indicates the high specificity of the SVC models built on the original data set.

Lastly, when years of education was added to the SVC model as a covariate, the estimated severity-dependent relationships of all brain regions with memory remained similar to those from the SVC model without education (Supplementary Results and Supplementary Fig. 6).

## Discussion

Another important finding in our study is that the FAT in the body of the fornix is associated with memory deficit. Our SVC model showed that this association peaked at aMCI and then decreased during the AD stage. The fornix is a predominant tract connecting the hippocampus to the septal nuclei and the mammillary bodies in the hypothalamus. It is particularly susceptible to pathological assaults and shows early changes in AD37. Moreover, the fornix microstructure has been used to classify AD diagnosis and assess cognitive changes and response to therapy in both humans13 and animal models38. Recent studies have demonstrated that fornix microstructure accounts for both age-related and age-independent variations in free recall test39. A prior longitudinal study also indicated that FA in the fornix could predict memory decline and progression to AD in MCI patients12. Of note, this focal fornix tissue damage had a greater association (in terms of beta) with memory deficit than the global FW increase, which suggests that memory-related WM tract deterioration may play a more dominant role than the widespread ‘background’ vascular/inflammatory damage in memory performance decline. Therefore, our SVC results further bolstered the plausibility that the fornix may be one of the earliest damaged regions that potentially contribute to worse memory outcome in AD.

In contrast to the stronger influence of hippocampal atrophy in the AD stage, we found atrophy in the DMN hubs (mPFC and PCC) to be more strongly associated with poorer memory performance in the aMCI phase. Past studies have reported that both MTL atrophy and DMN damage occur at the early stage of AD6,40. However, the stage-dependent contribution of these GM regions to memory deficits remains unknown. Using the multivariate SVC model that combined mPFC, PCC and MTL regions, our findings provide evidence that DMN atrophy may have a greater influence on memory decline at the aMCI stage, while MTL atrophy has a greater contribution at the AD stage. Furthermore, studies have demonstrated that GM atrophy mediates the effects of amyloid and Tau on memory41,42. Our results on the differential stage-dependent atrophy-memory association are consistent with the pathophysiological mechanisms of AD progression: neuronal degeneration in the DMN related to early amyloid burden and hypometabolism, and medial temporal atrophy related to later Tau pathology in the clinical stage of AD7,18,43. The hippocampus and DMN hubs support complementary functions in episodic memory. The hippocampus organizes memories in the context in which they were experienced (a defining feature of episodic memory), whereas the DMN hubs control the retrieval of memories by suppressing competing memories and are responsible for flexibly switching between memory ‘tracks’ according to contextual rules9,44. Indeed, interference suppression and retrieval processes have been compromised in healthy elderly and patients with aMCI45,46, consistent with the observation of an early stronger GMV-memory association in the DMN than in the MTL. Additionally, we observed that mPFC had a slightly higher association (in terms of beta) with memory than the PCC at the early aMCI stage, which was consistent with previous literature that the prefrontal cortex plays an essential role in the memory processing pathway9.

In contrast to the structural measures, the FC of the DMN hubs showed positive associations with memory performance across both prodromal and clinical AD stages. These findings are consistent with prior studies5,19. Synaptic dysfunction and grey and white matter deteriorations could impact the functional organization of the DMN and lead to memory deficit7,10,47. As a result, the association between PCC-based DMN FC and memory remained relatively stable across disease progression.

Overall, our results suggest a possible mechanism of memory deficit in AD. During the early stage of AD, the structure and function of the DMN hubs (particularly PCC and mPFC) may be targeted due to selective vulnerability48 and/or early amyloid burden43, accompanied by the associated WM deterioration, disconnection from the hippocampus, and widespread WM inflammation and vascular damage15,35,40. Taken together, these factors may impair memory performance. As AD progresses, the impact of WM damage on memory would be greatly reduced due to possible ceiling effects11. Along with this process, MTL atrophy and more severe functional network breakdown become the dominant factors contributing to further memory impairments6,18, supporting the hypothetical AD cascade model11,18,19. Therefore, our results imply that extracellular FW increases and DMN degeneration may be potential targets for early intervention strategies to slow down memory decline in AD, while MTL atrophy in late AD may be used as an imaging marker to monitor progression of memory deficit18.

Although we have demonstrated the significance of stage-dependent contributions of multimodal brain structural and functional deterioration to memory impairment in AD progression, our study has limitations. First, the associations between brain function/structure and memory derived from the cross-sectional dataset may be confounded by inter-subject anatomical variability and may not fully reflect within-subject longitudinal stage-dependent brain-cognition associations. However, our findings are consistent with the AD cascade hypothesis and can serve as a working model for future longitudinal studies. Secondly, no amyloid PET imaging or cerebrospinal fluid markers were available for this cohort. Therefore, we could not rule out the possibility of other pathologies besides AD in patients with aMCI and AD. Thirdly, although we used global signal regression to remove physiological noise, residuals of physiological signals could still remain49,50. Advanced methods such as RETROICOR51 making use of concurrent physiological recordings are needed in the future to mitigate the influence of physiological noise. Fourthly, there was a relatively limited sample size of participants in those bins with severe dementia symptoms (CDR-SB > 10), leading to a non-uniform CDR-SB distribution (Supplementary Fig. 4), which might affect the estimation accuracy of the SVC modelling at the severe end of the dementia spectrum. Future studies on larger samples with longitudinal follow-ups would help characterize finer severity-dependent brain-cognition trajectories. Furthermore, the initial screening step of linear regression might miss some brain regions whose structural or functional properties influence memory in a non-linear manner, which would require more complex statistical modelling to infer nonlinear stage-dependent brain-behaviour relationships28. Lastly, compared to the current single-shell diffusion MRI data, advanced FW correction based on multi-shell data would further improve the accuracy of FW separation16.

## Conclusion

Based on the sequential but temporally overlapping patterns of brain-memory associations, our study supports the hypothetical progression models of multimodality brain integrity related to memory dysfunction in the AD continuum. Furthermore, our results underscore the importance of WM microstructure, extracellular water, and DMN degeneration in the early stage of the disease, which may guide treatment options to slow down cognitive decline.

## Methods

### Ethics approval and consent to participate

This study was conducted in accordance with the Declaration of Helsinki, and written informed consent was obtained from each participant. Ethical approval was provided by the National Healthcare Group Domain-Specific Review Board, Singapore.

### Participants

All patients were recruited from the National University Hospital of Singapore and St. Luke’s Hospital in Singapore15. Trained psychologists assessed each participant with a comprehensive clinical and neuropsychological evaluation including the Clinical Dementia Rating Scale (CDR), the Mini-Mental State Examination (MMSE), the Montreal Cognitive Assessment, the informant questionnaire on cognitive decline, and a formal neuropsychological battery, all of which had been validated for older Singaporeans. The neuropsychological battery assessed seven cognitive domains, two of which were memory domains: verbal (word list recall and story recall) and visual (picture recall and Wechsler memory scale-revised visual reproduction) memories52 (see details in supplementary). Both visual and verbal memory domain scores were combined into a composite memory z-score for further analyses.

Of the 172 eligible HC, aMCI and AD subjects who were selected between August 12, 2010, and June 22, 2016, 5 participants did not have full MRI scans; 16 participants did not pass quality control criteria for structural MRI, resting-state functional MRI, or DTI (see quality control criteria in supplementary); and 5 participants did not complete the neuropsychological assessments. The remaining 151 participants (51 HC, 54 aMCI, 46 AD) were included in the analyses (Table 1).

### Image acquisition

Each subject underwent MRI scanning at the Clinical Imaging Research Centre, National University of Singapore (3-T MAGNETOM Trio™, A Tim® System; Siemens, Germany). High-resolution T1-weighted structural MRI was performed using a magnetization-prepared rapid gradient echo (MPRAGE) sequence (192 continuous sagittal slices, repetition time (TR) = 2300 ms, echo time (TE) = 1.9 ms, inversion time = 900 ms, flip angle = 9°, field of view (FOV) = 256 × 256 mm², matrix = 256 × 256, voxel size = 1-mm isotropic, bandwidth = 240 Hz/pixel). Diffusion MRI scans were acquired using a single-shot fast echo-planar imaging sequence (TR = 6800 ms, TE = 85 ms, slices = 48, FOV = 256 × 256 mm², voxel size = 3-mm isotropic, b value = 1150 s/mm², 61 diffusion directions, and 7 b0). A 5-minute task-free functional MRI scan was acquired using a T2*-weighted echo-planar sequence (TR = 2300 ms, TE = 25 ms, flip angle = 90°, FOV = 192 × 192 mm², voxel size = 3-mm isotropic, and 48 axial slices, with interleaved acquisition). Fluid-attenuated inversion recovery (FLAIR) imaging was also performed (TR = 11,000 ms, TE = 125 ms, inversion time = 2,800 ms, FOV = 256 × 256 mm², sensitivity encoding factor 1.5, voxel size = 1.02 × 1.02 mm², 60 slices, and slice thickness = 2.5 mm).

### Diffusion MRI data pre-processing

The diffusion MRI data were pre-processed using FSL (http://www.fmrib.ox.ac.uk/fsl)32. Head movements and eddy current distortions were corrected to the first b = 0 volume via affine registration of the diffusion-weighted images. Data were discarded if the maximum displacement relative to the first b = 0 volume was greater than 3 mm. The diffusion gradients were rotated to compensate for the registration. Individual maps were visually inspected for signal dropout, artefacts, and additional motion. Individual fractional anisotropy (FA) maps were created by fitting the DTI model to the pre-processed diffusion data at each voxel. FA images were non-linearly registered to the high-resolution (1 mm³) FMRIB58 FA image and then skeletonized using TBSS for further statistical analysis.

### Free-water imaging method

We employed the free-water imaging method on the pre-processed diffusion MRI data to estimate the fractional volume of freely diffusing extracellular water molecules (FW) and the fractional anisotropy of water molecules in the proximity of tissue (FAT)14,15. Briefly, the FW compartment models water molecules that are free to diffuse and not restricted or hindered during the diffusion process. This compartment has a fixed diffusivity of 3 × 10⁻³ mm²/s (the diffusion coefficient of free-water at body temperature), and the fractional volume of this compartment in each voxel forms the FW map. The FW-corrected DTI compartment models water molecules in the proximity of cellular membranes of brain tissue using a diffusion tensor, from which the FAT measure is derived. Therefore, the FW-corrected DTI compartment is corrected for contamination with freely diffusing extracellular water and is consequently expected to be more sensitive and specific to axonal changes than the measures derived from the single tensor model33. Voxel-wise FW and FAT were obtained for each subject16. The aligned FW and FAT maps of each participant were then projected onto the standardized FA skeleton, resulting in subject-level skeletonized images.
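The two-compartment description above is commonly written as a bi-tensor signal model. The following is a sketch in notation assumed here for illustration (the symbols f, D_T, g_k, S_k, S_0 and d_FW do not appear in the text), not necessarily the exact formulation fitted in this study:

$$\frac{S_k}{S_0}=(1-f)\,\exp\left(-b\,\mathbf{g}_k^{\top}\mathbf{D}_{T}\,\mathbf{g}_k\right)+f\,\exp\left(-b\,d_{\mathrm{FW}}\right),$$

where S_k is the signal for diffusion direction g_k, S_0 is the unweighted (b = 0) signal, f is the FW fractional volume, D_T is the tissue diffusion tensor from which FAT would be derived, and d_FW = 3 × 10⁻³ mm²/s. With the b value of 1150 s/mm² used here, the free-water term decays to exp(−1150 × 0.003) = exp(−3.45) ≈ 0.03 of its unweighted value, whereas tissue with lower diffusivity retains more signal, which is what makes the two compartments separable in principle.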

### Voxel-based morphometry

We applied optimized voxel-based morphometry (Computational Anatomy Toolbox 12) using Statistical Parametric Mapping (SPM12)55. Briefly, we derived the subject-level GMV probability maps from the T1 structural images using an approach that included: (1) segmentation of individual T1-weighted images into the GM, WM and CSF; (2) creation of a study-specific template using non-linear DARTEL (Diffeomorphic Anatomical Registration Through Exponentiated Lie Algebra) registration of the affine-registered GM and WM segments; (3) registration of each GM/WM probability map to the study-specific template in Montreal Neurological Institute (MNI) space; (4) modulation by multiplying the voxel values by the Jacobian determinants to account for individual brain volumes; and (5) smoothing of the normalized GM maps by an 8-mm isotropic Gaussian kernel.

### Functional image pre-processing

Task-free functional MRI images were pre-processed using the Analysis of Functional NeuroImages software (https://afni.nimh.nih.gov/) and FSL55. The pre-processing steps included: (1) removal of the first five volumes to allow for magnetic field stabilization; (2) motion correction; (3) time series de-spiking; (4) spatial smoothing; (5) grand mean scaling; (6) band pass temporal filtering; (7) removal of linear and quadratic trends; (8) co-registration of T1 images using boundary-based registration and subsequent registration of the functional images into an MNI-152 space using a non-linear registration tool (FNIRT); and (9) regression of nine nuisance signals (WM, CSF, global signals and six motion parameters) from the pre-processed functional images. To determine whether global signal regression was preferred, we calculated the global negative index for each subject, taken as the percentage of voxels showing a negative correlation with the global signal55. The majority of our subjects (90.1%) had a global negative index of <3%, suggesting that the global signal was more representative of non-neural noise and should be regressed out from the images.
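The global negative index described above can be illustrated with a minimal sketch. It assumes each voxel's time series is held as a plain Python list and that the global signal is the mean across voxels at each time point; the function name `global_negative_index` and this data layout are hypothetical and are not taken from the AFNI or FSL pipelines.

```python
def global_negative_index(voxel_series):
    """Percentage of voxels whose time series correlates negatively
    with the global signal.

    voxel_series: list of per-voxel time series (equal lengths).
    Assumption: the global signal is the mean across voxels per time point.
    """
    n_vox = len(voxel_series)
    n_t = len(voxel_series[0])
    global_signal = [sum(v[t] for v in voxel_series) / n_vox for t in range(n_t)]
    g_mean = sum(global_signal) / n_t

    n_negative = 0
    for v in voxel_series:
        v_mean = sum(v) / n_t
        # Only the sign of the correlation matters here, and it equals the
        # sign of the covariance, so no normalisation is needed.
        cov = sum((a - v_mean) * (g - g_mean) for a, g in zip(v, global_signal))
        if cov < 0:
            n_negative += 1
    return 100.0 * n_negative / n_vox
```

In this toy form, a subject whose index falls below the 3% threshold mentioned above would be one in which almost every voxel moves with the global signal.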

### Functional connectivity analyses

Individual-level DMN functional connectivity maps were obtained using a seed-based approach with the REST toolbox58. We created a spherical region of interest (ROI) with a 4-mm radius centred at the left posterior cingulate cortex (MNI coordinates [−7, −43, 33]). This seed was previously determined as a core region of DMN48,59. Pearson’s correlations were then computed between the time-series of every voxel in the brain and the average time series of the seed ROI. The FC correlation maps were converted to z-score maps using Fisher’s r-to-z transformation.
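The seed-based computation described above (average the seed time series, correlate it with every voxel, then apply Fisher's r-to-z transformation) can be sketched as follows. This is an illustration with hypothetical function names and list-based data, not the REST toolbox implementation; the clipping constant `R_CLIP` is an assumption added so that a perfectly correlated voxel does not produce an infinite z value.

```python
import math

R_CLIP = 0.999999  # assumption: keep |r| < 1 so atanh stays finite


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant series: correlation undefined, treated as zero
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r):
    """Fisher r-to-z transform: z = atanh(r) = 0.5 * ln((1 + r) / (1 - r))."""
    r = max(-R_CLIP, min(R_CLIP, r))
    return math.atanh(r)


def seed_fc_map(seed_voxels, brain_voxels):
    """Return one Fisher z value per brain voxel.

    seed_voxels: time series of the voxels inside the seed ROI.
    brain_voxels: time series of every voxel to be correlated with the seed.
    """
    n_t = len(seed_voxels[0])
    seed_mean = [sum(v[t] for v in seed_voxels) / len(seed_voxels)
                 for t in range(n_t)]
    return [fisher_z(pearson(seed_mean, v)) for v in brain_voxels]
```

The transformation is applied because correlation coefficients are bounded and skewed near ±1, whereas the z values are approximately normally distributed and therefore better suited to the voxel-wise GLMs used later.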

### Statistical analyses

We analysed the demographic, clinical, and cognitive measures across groups via ANOVA or χ² tests using Statistical Package for Social Sciences (SPSS v. 23.0) software. The results were reported at a significance level of p < 0.05.

#### Associations between brain structure/functional measures and memory impairment

In the first step, to identify region-specific WM changes underlying memory deficit in patients, we built voxel-wise general linear models (GLMs) with the skeletonized FW and FAT images as the dependent variables separately using FSL. In each model, the memory domain z-score was the independent variable of interest, with age, gender, handedness and ethnicity as covariates. Regions were examined for statistical significance using threshold-free cluster enhancement (TFCE) and permutation-based non-parametric testing (FSL Randomise). Results were family-wise error (FWE) corrected at p < 0.05.

To examine the association between GMV and memory function among the aMCI and AD patients, we built the voxel-wise GLMs using the SPM12 toolbox, with a threshold at p < 0.05, FWE corrected. To examine whether and how FC within the DMN related to memory performance across the aMCI and AD patients, we built voxel-wise GLMs using the REST toolbox58. Analysis was restricted to the DMN based on a predefined group-level mask derived from an independent group of healthy control subjects55. The results were reported at a height threshold of p < 0.01 and a cluster threshold of p < 0.05 with Gaussian random field (GRF) correction58. We then extracted the mean values of brain structural/functional measures from the resulting significant regions for further statistical analyses.

#### Sparse varying coefficient (SVC) modelling of severity-dependent associations between brain measures and memory impairment

In reality, the differential pathophysiologies in GM and WM might interact with each other to influence memory in AD18. Furthermore, there are no firm boundaries between the various clinical stages1. Therefore, in the second step, we employed the SVC model31,32 to integrate all structural and functional measures derived from the previous screening step as predictors in the same model to evaluate their relative contribution to and severity-dependent (CDR sum-of-boxes (CDR-SB) as a measure of dementia severity) impact on memory, which provides a more comprehensive and nuanced picture. Specifically, we tested whether and how the associations of brain function/structures with memory were dependent on dementia severity using memory z-scores as the dependent variable:

$${y}_{i}({t}_{k})=\sum _{j=1}^{p}{\beta }_{j}({t}_{k}){x}_{ij}({t}_{k})+{\varepsilon }_{i}({t}_{k}),$$

where $y_i(t_k)$ represents the memory z-scores for subject $i$ ($i = 1, 2, \ldots, n$) at the dementia severity $t_k$, measured by CDR-SB. $x_{ij}(t_k)$ is the $j$th ($j = 1, 2, \ldots, p$) predictor of subject $i$ at CDR-SB $t_k$. $\beta_j(t_k)$ is the estimated coefficient function depending on CDR-SB $t_k$ for each predictor. $\varepsilon_i(t_k)$ represents the independent and identically distributed random errors at $t_k$.

For predictors $x_{ij}(t_k)$, we extracted the mean values from the previously identified candidate regions of interest (i.e., FW, FAT, FC-DMN, and GMV from mPFC, PCC, hippocampus (HIP)). All predictors were put in the same model with age, gender, handedness, and ethnicity included as nuisance variables. Each predictor was standardized to have zero mean and equal variance across observations. To simultaneously achieve regression model fitting and predictor variable selection, we applied the least absolute shrinkage and selection operator (LASSO)60 to estimate $\beta_j(t_k)$ by minimizing the following penalized least squares function:

$$\frac{1}{2n}\sum _{i=1}^{n}\sum _{k=1}^{K}{[{y}_{i}({t}_{k})-\sum _{j=1}^{p}{x}_{ij}({t}_{k}){\beta }_{j}({t}_{k})]}^{2}+\lambda \sum _{j=1}^{p}\sqrt{{\int }^{}{\beta }_{j}^{2}(t)dt},$$

where λ is the sparsity penalty tuning parameter chosen by a five-fold cross-validation method. The LASSO algorithm performs variable selection by penalizing the sum of the absolute magnitudes of the coefficients; in the functional form above, this is the sum of the L2 norms of the coefficient functions, which can shrink an entire coefficient function to zero. SVC modelling with the LASSO algorithm was specifically designed for feature selection problems with small sample sizes31. We approximated each coefficient function $\beta_j$ using linear combinations of the B-spline basis (number of basis functions L = 4).
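To make the basis expansion and the penalty term concrete, the sketch below evaluates a coefficient function as a weighted sum of four basis functions and computes the penalty from the objective above by numerical integration. It rests on stated assumptions: the text gives only L = 4, so the knot placement is assumed here to be clamped with no interior knots over the CDR-SB range of 0 to 18 (in which case the B-spline basis coincides with the cubic Bernstein polynomials); the weights `gamma` and all function names are hypothetical and do not reproduce the in-house R scripts.

```python
import math

CDR_SB_MAX = 18.0  # CDR-SB ranges from 0 to 18


def basis(t, n_basis=4):
    """Evaluate n_basis clamped B-spline basis functions without interior knots.

    Assumption: with no interior knots these equal the Bernstein polynomials
    of degree n_basis - 1 on the rescaled severity u = t / CDR_SB_MAX.
    """
    u = t / CDR_SB_MAX
    d = n_basis - 1
    return [math.comb(d, l) * u ** l * (1 - u) ** (d - l) for l in range(n_basis)]


def beta(t, gamma):
    """Coefficient function: beta_j(t) = sum over l of gamma[l] * B_l(t)."""
    return sum(g * b for g, b in zip(gamma, basis(t, len(gamma))))


def group_penalty(gammas, lam, n_grid=200):
    """lam * sum_j sqrt(integral of beta_j(t)^2 dt), using the trapezoidal rule."""
    h = CDR_SB_MAX / n_grid
    total = 0.0
    for gamma in gammas:
        vals = [beta(k * h, gamma) ** 2 for k in range(n_grid + 1)]
        integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
        total += math.sqrt(integral)
    return lam * total
```

Because the penalty acts on the norm of each whole coefficient function, a predictor whose spline weights are all zero contributes nothing to the penalty and is dropped from the model across the entire severity range, which is how selection and curve fitting happen in one step.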

Our SVC model offers several advantages over a traditional linear regression model: (i) it does not assume that the association of the brain measures with memory remains constant over disease progression and thus considers each beta coefficient (the association of brain function or structure with memory) as a non-linear function of a continuous variable of dementia severity (i.e., CDR-SB); (ii) feature selection with the LASSO sparsity penalty chooses the most important predictors while eliminating the contributions of the less important predictors; and (iii) rather than analysing brain measures in separate models, all variables are entered as predictors in the same multivariate model.

To assess the stability of these beta coefficients, we calculated the means and standard errors of the severity-dependent coefficients estimated from 100 replicates. We reported the brain measures that were selected in all 100 repetitions of SVC modelling. SVC modelling was performed using in-house R scripts based on Daye and colleagues31.
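The stability summary described above can be sketched as follows for a single severity value. Two assumptions are made for illustration: each replicate is represented as a dictionary from predictor name to its estimated coefficient, with unselected predictors simply absent, and the standard error is taken as the sample standard deviation divided by the square root of the number of replicates (the text does not state which definition was used). The function name `stable_predictors` is hypothetical.

```python
import math


def stable_predictors(replicates):
    """Return {name: (mean, standard_error)} for predictors selected in
    every replicate.

    replicates: list (length >= 2) of dicts mapping predictor name to the
    coefficient estimated at one severity value; a predictor missing from
    a dict was not selected in that replicate.
    """
    n = len(replicates)
    names = set(replicates[0])
    for rep in replicates[1:]:
        names &= set(rep)  # keep only predictors present in all replicates

    summary = {}
    for name in sorted(names):
        values = [rep[name] for rep in replicates]
        mean = sum(values) / n
        var = sum((v - mean) ** 2 for v in values) / (n - 1)  # sample variance
        summary[name] = (mean, math.sqrt(var / n))  # assumed SE definition
    return summary
```

Applying the same summary at each point of a CDR-SB grid would give a mean curve with an error band per predictor.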