## Abstract

With the explosive growth of biomarker data in Alzheimer’s disease (AD) clinical trials, numerous mathematical models have been developed to characterize disease-relevant biomarker trajectories over time. While some of these models are purely empiric, others are causal, built upon various hypotheses of AD pathophysiology, a complex and incompletely understood area of research. One of the most challenging problems in computational causal modeling is using a purely data-driven approach to derive the model’s parameters and the mathematical model itself, without any prior hypothesis bias. In this paper, we develop an innovative data-driven modeling approach to build and parameterize a causal model to characterize the trajectories of AD biomarkers. This approach integrates causal model learning, population parameterization, parameter sensitivity analysis, and personalized prediction. By applying this integrated approach to a large multicenter database of AD biomarkers, the Alzheimer’s Disease Neuroimaging Initiative, several causal models for different AD stages are revealed. In addition, personalized models for each subject are calibrated and provide accurate predictions of future cognitive status.

## Introduction

Among the top 10 causes of death in the United States, Alzheimer’s disease (AD) is the only condition without a viable treatment to cure or prevent it, or even significantly slow its progression^{1}. Failure to develop a successful disease-modifying therapy for AD, despite large investments of public and private resources, is rooted in its complexity^{2,3,4}. For instance, signaling pathway analyses of AD pathophysiology have implicated over 30 metabolic pathways and over 1000 chemical species^{4}. Our incomplete understanding of how these mechanisms vary and interact at an individual level to create a clinically and biologically heterogeneous phenotype has resulted in an attempt to treat patients with varying underlying pathophysiology in a similar fashion^{5,6}. Thus, failure to characterize and subtype AD at an individual level has represented a major roadblock in the development of effective therapeutic strategies to slow or halt AD progression. Recent biological classification of AD, based on imaging and cerebrospinal fluid (CSF) biomarkers, represents a major step toward the future development of personalized prognoses and therapeutic strategies^{7,8}. The increasing availability of such data in large cohorts of subjects has made possible the development and testing of rigorous quantitative models of AD pathophysiology. For example, the Alzheimer’s Disease Neuroimaging Initiative (ADNI), a multicenter, prospective, naturalistic study that began in 2003, comprises four sequential studies (ADNI-1, ADNI-GO, ADNI-2, and ADNI-3), which followed subjects for up to 15 years using genetic, blood- and CSF-based, imaging, and cognitive biomarkers. The abundance of data from this and similar multinational biomarker studies in AD will require a rigorous quantitative data-driven modeling approach to analyze, integrate and interpret data at the level of the individual, where it can have maximum clinical impact.

Several mathematical models of AD progression have been developed recently. For example, one mathematical model includes a cellular biologic system of neurons, glia, macrophages, amyloid-*β* aggregation, and tau to simulate and validate at a cellular level the mechanisms underlying the failure of several drugs in recent clinical trials, and to suggest alternative approaches^{9}. Moreover, a mathematical modeling approach has also been used to describe the key AD clinical biomarkers, including pathologic hallmark biomarkers (beta-amyloid and tau), neuronal loss biomarkers, and cognitive impairment^{10}. This model was parameterized and tested to successfully simulate the natural history scenarios of three subtypes of AD presented in^{11}: (1) early-onset autosomal dominant AD, (2) late-onset amyloid-first AD, and (3) late-onset tau-first AD.

Although these mathematical models bring new insights into AD progression and enable simulation of therapeutics, the current models are built upon a priori hypotheses of the AD pathophysiological network, which is still an open area of research^{12}. In fact, there are dozens of pathophysiological pathways implicated in AD by systems biologists, and our understanding of these networks and their interactions remains incomplete^{13}. Moreover, there has been limited work on mechanistic modeling of clinically measurable AD biomarkers. Most research to date on the key AD biomarkers has been observational or correlational. Such modeling approaches do not benefit from the tools of a more integrative systems approach that addresses disease mechanism^{14,15}.

Computational data-driven modeling approaches have already achieved success in analyzing multi-dimensional clinical data in diseases such as cancer^{16,17} and cardiovascular disease^{18,19}. Such data-driven approaches employ mathematical models for patient populations using clinical, omics, and biomarker data, as well as powerful and new means to personalize such models based on individual data, yielding personal risk profiles. These data-driven modeling approaches can simulate complex systems, helping to elucidate complex physiological interactions and optimize personalized prevention and treatment strategies. Examples of such work include statistical approaches, such as Bayesian generalized linear models^{20}, Bayesian hierarchical models^{21,22}, and those based on Markov chain Monte Carlo simulations^{23,24,25,26} to analyze genome sequencing and biomarker dynamics. Recently, machine learning techniques, such as deep recurrent neural networks, have been used to predict AD progression^{27}.

In this paper, we propose to develop a computational data-driven modeling framework to predict AD biomarker progression. We propose a methodology to construct data-driven causal models at a group and individual patient level. This method does not depend on any specific hypothesis of AD progression and extracts the causal model completely from the empirical data. More specifically, we derive the causal model based on clinical biomarkers in the ADNI dataset. In this data-driven modeling approach, the causal model is learned from four biomarkers (amyloid-beta pathology, total-tau pathology, hippocampal volume, and cognitive decline) to describe AD progression. Moreover, we incorporate a disease progression score (DPS) in the causal model^{28} to unify AD progression for different subjects since the onset age and rates of progression may markedly vary within and across the different subject classes in ADNI.

## Results

We elaborate on the effectiveness of the proposed data-driven causal model here. First, we construct a population-based causal model that describes the biomarker dynamics for all eligible subjects in ADNI-1, including normal controls. By fitting the population parameters to the ADNI dataset, the population model describes the transition of AD biomarkers between three different disease stages: cognitively normal (CN), late mild cognitive impairment (LMCI), and Alzheimer’s disease (AD). Second, we derive a population model for LMCI and AD subjects only. Third, we analyze the Sobol sensitivity^{29,30} of the parameter space of the population model, which quantifies the contribution of each model parameter to the variance of the model output. Based on the sensitivity analysis results, we finally construct a personalized model for each subject and provide personalized biomarker predictions for subjects who have more than four longitudinal biomarker data points.

### A population model

We construct a causal model by fitting biomarkers of all subjects across the ADNI dataset. Since the causal model is a dynamic system expressed as ordinary differential equations (ODEs), we require at least two longitudinal data points for each subject. More specifically, we remove patients who do not provide at least two measurements for any one of the four biomarkers. The histograms in Fig. 1(a) summarize the available biomarker data in the ADNI dataset.

### Algorithm 1

Population model calibration algorithm to compute the population parameters *w*^{(1)} and DPS parameters (***α***, ***β***). See details in Methods section.

**Input** \({{{\boldsymbol{y}}}}={\{{y}_{ijk}\}}_{ijk}\), \({{{\boldsymbol{t}}}}={\{{t}_{ij}\}}_{ij}\).

**Initialize** *α*^{0}, *β*^{0}, and *w*^{0}.

1: **for** *l*=1 to *L* **do**

2: **for** *k* ∈ {*A*, *T*, *N*, *C*} **do** ⊳ Population parameter calibration

3: \({{{{\boldsymbol{w}}}}}_{k}^{l}={{{\mbox{argmin}}}}_{{{{\boldsymbol{{w}}}_{k}}}}{\sum }_{(i,j)\in {{{{\mathcal{I}}}}}_{k}}{\left({y}_{ijk}-{f}_{k}\left({\alpha }_{i}^{l}{t}_{ij}+{\beta }_{i}^{l};{{{{\boldsymbol{w}}}}}_{k}^{l-1}\right)\right)}^{2}.\)

4: \({\sigma }_{k}^{l}=\frac{1}{| {{{{\mathcal{I}}}}}_{k}-2I-4| }{\sum }_{(i,j)\in {{{{\mathcal{I}}}}}_{k}}{\left({y}_{ijk}-{f}_{k}\left({\alpha }_{i}^{l}{t}_{ij}+{\beta }_{i}^{l};{{{{\boldsymbol{w}}}}}_{k}^{l-1}\right)\right)}^{2}.\)

5: **end** **for**

6:

7: **for** *i*=1 to *I* **do** ⊳ Update DPS parameters

8: \(({\alpha }_{i}^{l},\,{\beta }_{i}^{l})={{{\mbox{argmin}}}}_{{\alpha }_{i},{\beta }_{i}}{\sum }_{(j,k)\in {{{{\mathcal{I}}}}}_{i}}\frac{1}{{\sigma }_{k}^{l}}{\left({y}_{ijk}-{f}_{k}\left({\alpha }_{i}^{l}{t}_{ij}+{\beta }_{i}^{l};{{{{\boldsymbol{w}}}}}_{k}^{l}\right)\right)}^{2}.\)

9: **end for**

10: **end for**

**Output** *w*^{L} as the population parameter *w*^{(1)}, *α*^{L}, *β*^{L}.

In Algorithm 1, the initial value of *α*_{i} is randomly chosen in (0, 4), and the initial value of *β*_{i} is chosen so that − 10 ≤ *s*_{i}(*t*) ≤ 20 holds for all measurements. We then obtain the population model in terms of the fitted DPS, shown in Fig. 1(b). The population model (black solid line) is learned on three different disease stages, namely CN (orange circles), LMCI (green circles), and AD (blue hexagons). The gray area is the confidence interval of the population model. More specifically, we sample the population parameters, *w*^{(1)}, from the posterior distribution (given by the simulation study) and run the model with the same initial condition 1000 times. The 95% confidence interval at every time point is then plotted. The simulation study and diagnostic plots corresponding to the population model are shown in the Supplementary Materials. From this figure, we can separate biomarkers into three stages according to the population model. In particular, CN and AD patients correspond to *s* < 0 and *s* > 0, respectively, while LMCI patients are located around *s* = 0. Moreover, the first three biomarkers (*A*_{β}, *τ*, and *N*) start at steady states when *s* < 0 (CN), change gradually around *s* = 0 (LMCI), and finally approach another steady state (AD). Unlike the other biomarkers, ADAS continues to grow, which means that cognitive symptoms get worse as AD progresses.
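The confidence band described above can be sketched as a simple Monte Carlo loop. This is a minimal illustration only: the logistic curve, the Gaussian sampling distribution, and the names `toy_trajectory` and `confidence_band` are assumptions standing in for the causal ODE model and for the posterior obtained in the simulation study.

```python
import math
import random

def toy_trajectory(s, rate):
    """Stand-in biomarker curve (assumed logistic); not the paper's ODE model."""
    return 1.0 / (1.0 + math.exp(-rate * s))

def confidence_band(grid, rate_mean, rate_sd, n_runs=1000, level=0.95, seed=0):
    """Pointwise band: sample a parameter, rerun the model, take percentiles."""
    rng = random.Random(seed)
    # Hypothetical Gaussian "posterior" for the single rate parameter.
    rates = [rng.gauss(rate_mean, rate_sd) for _ in range(n_runs)]
    lo_idx = int((1.0 - level) / 2.0 * (n_runs - 1))
    hi_idx = int((1.0 + level) / 2.0 * (n_runs - 1))
    band = []
    for s in grid:
        # Sort the simulated values at this DPS and read off the percentiles.
        values = sorted(toy_trajectory(s, r) for r in rates)
        band.append((values[lo_idx], values[hi_idx]))
    return band

grid = [-10 + 0.5 * j for j in range(61)]   # DPS grid on [-10, 20]
band = confidence_band(grid, rate_mean=0.5, rate_sd=0.05)
```

Each entry of `band` holds the lower and upper 95% limits at one DPS value. With a real posterior, the sampled object would be the full parameter vector rather than a single rate.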

We also compare the population model with the sigmoid function fitting (black solid line in Fig. 2). First, the population model provides relatively smooth transitions from one stage to another, while the sigmoid function fitting gives more abrupt changes for *A*_{β} at *s* ≈ 4. Second, the population model follows the biomarker cascade theory, in which *τ* rises after *A*_{β} starts decreasing, *N* increases after *τ*, and *C* rises after *N*. However, the sigmoid function fitting makes *A*_{β} and *τ* change after *s* = 0, while *N* and *C* change at *s* ≈ −3 and *s* ≈ −10.

Since the CN group might not follow the same disease trajectory, we next derive a population model based on the LMCI and AD groups only. In order to better identify the biomarker dynamics among LMCI and AD group patients, we fix the parameters of DPS, (***α***, ***β***), that we obtained before and only update the causal model parameters, ***w***, by using Algorithm 1. Figure 1(c) shows the population model based on LMCI and AD groups.

The transitions for different biomarkers shown in Fig. 1(c) are similar to Fig. 1(b). But Fig. 1(c) advances the onset of changes since the LMCI and AD groups are prone to suffer from cognitive decline earlier. We summarize the parameters of the above-mentioned causal models in Table 1, which corresponds to the results given in Fig. 1(b) & (c).

### Sensitivity analysis

The quasi-Monte Carlo method is applied to compute sensitivity indices. For more details about Sobol sensitivity analysis, please refer to^{29,30}. By taking *C*(0) as the output, Fig. 3(a) lists the nine most sensitive parameters in terms of the first-order and total-order sensitivity indices. We see that the weight with greater first-order impact (*S*_{m} > 0.4) is associated with *A*_{β}(⋅).

The right panel of Fig. 3(a) shows the second-order interaction between pairs of parameters. We see that the parameters associated with \({A}_{\beta }^{2}\) are always positively related to other terms. While the *A*_{β} term with parameter *w*_{A1} is mostly positively related to other terms, the *A*_{β} term with parameter *w*_{T3} is negatively related to other parameters except those associated with \({A}_{\beta }^{2}\). Compared to the first-order sensitivity contributions, however, the second-order ones contribute only slightly.

Figure 3(b, c) shows the dynamics of the sensitivities with respect to DPS. The first-order sensitivity value of *w*_{A1} decreases as DPS increases, which implies that the effect of *A*_{β} on cognitive decline switches from linear to nonlinear in later-stage disease. At the same time, the first-order sensitivity values of some other parameters increase gradually, with a notable increase of *w*_{C3} and *w*_{C5}. The second-order sensitivities between different parameters eventually converge to zero; thus, the interactions among different parameters weaken as the biomarkers reach equilibrium. Based on the results shown in Fig. 3(a), we select *w*_{A1}, *w*_{A2}, *w*_{T4}, *w*_{T5}, *w*_{N4}, *w*_{N5}, *w*_{C3}, and *w*_{C5} as the most sensitive parameters for personalization by setting the threshold, TOL, to 0.01 in Algorithm 2.

### Personalized model and biomarker prediction

### Algorithm 2

Personalized model calibration algorithm. The personalized parameters are initialized by the population model. The personalized models are applied for subjects who meet the requirement denoted as *i* ∈ Ω.

**Input** longitudinal biomarker data {*y*_{ijk}} at {*t*_{ij}} with *i* ∈ Ω;

**Input** the DPS parameter values (*α*_{i}, *β*_{i}) for each subject *i* ∈ Ω;

**Input** the population parameter values *w*^{(1)} (***w*** for simplicity);

**Input** sensitivity threshold, TOL.

1: **for** *m*=1 to 21 **do** ⊳ First order sensitivity.

2: \({{{{\rm{S}}}}}_{m}(z)=\frac{{{{{\rm{Var}}}}}_{{w}_{m}}\left[{{{{\rm{E}}}}}_{{{{\rm{{w}}}_{ \sim m}}}}(z| {w}_{m})\right]}{{{{\rm{Var}}}}(z)}.\)

3: **if** *S*_{m}(*z*)≥ TOL **then**

4: set *w*_{m} as a personalized parameter and denote it as \({w}_{m}^{(2)}\); **else**

5: keep *w*_{m} as a population parameter.

6: **end** **if**

7: **end** **for**

8:

9: **for** *i*=1 to ∣Ω∣ **do** ⊳ Personalized model calibration.

10: **for** *k* ∈ {*A*, *T*, *N*, *C*} **do**

11: Denote the personalized parameters in *k*-th equation as \({{{{\boldsymbol{{w}}}_{k}}}}^{(2)}\).

12: ⊳ Select parameters to calibrate.

13: \({{{{\boldsymbol{{w}}}_{k}}}}^{(2)}={\arg \min }_{{{{{\boldsymbol{{w}}}_{k}}}}^{(2)}}\mathop{\sum }\limits_{j=1}^{M-1}{\left({\hat{y}}_{ijk}-{f}_{k}\left({\alpha }_{i}{t}_{ij}+{\beta }_{i};{{{{\boldsymbol{{w}}}_{k}}}}^{(2)}\right)\right)}^{2}.\)

14: \(P{A}_{ik}=\frac{{\hat{y}}_{iMk}-{f}_{k}\left({\alpha }_{i}{t}_{(iM)}+{\beta }_{i};{{{{\boldsymbol{{w}}}_{k}}}}^{(2)}\right)}{{\hat{y}}_{iMk}}\times 100 \% .\)

15: ⊳ Compute prediction accuracy.

16: **end for**

17: **end for**

**Output** *PA*_{ik} for *i* ∈ Ω and *k* ∈ {*A*, *T*, *N*, *C*}.

Next, we build personalized models and provide biomarker prediction for subjects whose data satisfies the following two criteria: (1) There are at least four measurements for each biomarker; (2) Each biomarker measurement changes monotonically with respect to DPS. Based on the first-order sensitivity analysis results shown in Fig. 3(a), we chose the eight most sensitive parameters as personalized parameters by choosing TOL = 0.01 in Algorithm 2. For each subject, we denote the biomarker data as \(\hat{{{{\boldsymbol{y}}}}}({s}_{i})={[{\hat{{{{\boldsymbol{A}}}}}}_{\beta }({s}_{i})\hat{{{{\boldsymbol{\tau }}}}}({s}_{i})\hat{{{{\boldsymbol{N}}}}}({s}_{i})\hat{{{{\boldsymbol{C}}}}}({s}_{i})]}^{T}\) (*i* = 1, ⋯ , *M*), fit the sensitive personalized parameters of the population model *w*^{(1)} by using the first *M* − 1 data points, and test the prediction accuracy on the last data point by \(\frac{\hat{{{{\boldsymbol{y}}}}}({s}_{M})-{{{\boldsymbol{y}}}}({s}_{M})}{\hat{{{{\boldsymbol{y}}}}}({s}_{M})}\times 100 \%\). A detailed procedure is outlined in Algorithm 2.
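The hold-out evaluation described above can be sketched as follows. The helper names and the example visits are hypothetical. The signed relative error follows the expression in step 14 of Algorithm 2, while reading "accuracy" as 100% minus the absolute relative error is an assumption made here for illustration.

```python
def hold_out_last(visits):
    """Split one subject's visits into M-1 training points and 1 test point."""
    if len(visits) < 2:
        raise ValueError("need at least two visits")
    return visits[:-1], visits[-1]

def relative_error_percent(observed, predicted):
    """Signed relative error (observed - predicted) / observed, in percent."""
    return (observed - predicted) / observed * 100.0

def accuracy_percent(observed, predicted):
    """Assumed reading of 'accuracy': 100% minus the absolute relative error."""
    return 100.0 - abs(relative_error_percent(observed, predicted))

# Hypothetical (DPS, ADAS13) visits for one subject; the values are made up.
visits = [(-1.0, 14.0), (0.5, 17.0), (2.0, 19.0), (3.5, 20.0)]
train, test = hold_out_last(visits)
predicted_score = 19.0   # placeholder for the personalized model output at s_M
error = relative_error_percent(test[1], predicted_score)
```

In practice `predicted_score` would come from the personalized model calibrated on `train`; a constant is used here so the arithmetic stays visible.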

Figure 4 shows the biomarker trajectories of the personalized model with training (blue) and testing (red) data for one subject (pseudo ID = 18). We also compare the personalized model with the sigmoid function fitting and find that the personalized model provides better prediction accuracy. In fact, the prediction accuracies given by the personalized model are 97.3% (*A*_{β}), 95.9% (*τ*), 98.4% (*N*), and 95.1% (*C*), whereas those given by the sigmoid function fitting are 95.5% (*A*_{β}), 90.8% (*τ*), 95.7% (*N*), and 63.4% (*C*). Since the sigmoid function fitting predicts by using the longitudinal information of the current biomarker only, it provides a less accurate cognitive score.

Furthermore, we build personalized models for the CN and LMCI groups (there are not enough data points in the AD group) with different numbers of longitudinal data points and summarize the predictive results in Tables 2–3. The tables indicate that our personalized models provide higher predictive accuracy than the sigmoid function fitting. Moreover, the accuracy of predicting biomarker dynamics increases as the number of biomarker data points increases.

## Discussion

Different from the existing pathophysiological AD network which is based on a priori assumptions about biomarker trajectories, this work develops a data-driven causal modeling approach informed by AD clinical biomarker data and demonstrates both population and personalized models. The proposed population model traces the general biomarker dynamics for all patient data without any specific assumptions regarding the form of the model and enables personalized AD risk prediction via incorporating historical clinical data such as CSF protein and imaging biomarkers as well as cognitive scores. By introducing a DPS for each subject, we calibrate and scale AD biomarker progression across the ADNI population and derive population parameters. We also compare the proposed data-driven modeling approach to an empirical fitting approach with a sigmoid function fitting and conclude that the proposed causal model is able to better capture disease progression with a smoother transition over time. Moreover, this causal model allows us to explore the underlying cascade relationship among biomarkers, while the empirical sigmoid function approach considers each biomarker as an independent term. The population model not only provides a means to classify different stages of AD progression for each biomarker, but also lays the foundation for personalized modeling.

Before constructing the personalized model, we performed a sensitivity analysis for the population parameters. From a clinical standpoint, the sensitivity analysis provides insights on AD progression in terms of which parameters play the greatest role in disease progression, and when during the disease course they are most relevant. From a computational standpoint, the sensitivity analysis aids the subsequent personalized parameter selection. Based on the sensitivity analysis, we see that change in cognition is driven primarily by first-order effects and is time-dependent. Initially, the greatest effects are by amyloid, represented by *w*_{A1}, and to a lesser extent tau and neuronal vulnerability to tau, represented by *w*_{N4} and *w*_{N5}, respectively. The amyloid parameter *w*_{A1} is most sensitive when the disease starts (DPS = 0), and the sensitivity diminishes as DPS increases. On the other hand, the sensitivity of parameters related to *N* and *C*, namely *w*_{C3} and *w*_{C5}, increases significantly as the disease progresses. Thus, the sensitivity analysis suggests that at the early stage of AD, cognitive decline is driven by *A*_{β} levels, and this sensitivity decreases linearly as the disease progresses, whereas at the later stages, cognitive decline is driven mainly by downstream effects, including the level of neuronal degeneration, represented by *w*_{C3}, and the interaction of cognition and neuronal degeneration, represented by *w*_{C5}. These results are consistent with prior observational studies based on ADNI and other longitudinal cohorts, which suggest that cognitive decline is driven primarily by high amyloid levels at earlier disease stages and by neurodegeneration at later stages^{31}.

Sensitivity analysis also provides key insights in terms of personalized parameter selection. The paucity of longitudinal biomarker data and the relatively large number of model parameters can easily lead to overfitting for personalized models. Based on the sensitivity analysis results, we chose the eight most sensitive population parameters as personalized parameters and set the rest of the parameters at the mean population parameter values. In this case, calibration of personalized parameters based on sparse longitudinal biomarker data for each patient avoids the overfitting issue and provides a high-precision personalized prediction for each subject, as outlined in the Results section.

Limitations of this work include sampling bias. Because the ADNI dataset is a research cohort from academic clinics, only one-third of ADNI subjects agreed to provide CSF biomarkers. Thus we need to replicate these findings using data from more general practice settings in the future. Despite these limitations, this model advances our understanding of the complexity of AD biomarker pathophysiology over that of current biomarker models which have primarily been independent and ad hoc in nature, with inherent assumptions regarding the shape of individual biomarker trajectories. Our current approach is integrative and based on the cascade mechanism, yet without assumptions regarding the exact mathematical form of the individual biomarker models or the resulting shape of the biomarker trajectories. In the future, we intend to extend the current approach to the spatiotemporal domain by utilizing longitudinal imaging data to determine mechanisms driving the spread of pathology in time and space.

## Methods

We propose a pathophysiology- and data-driven modeling approach to construct a causal model of AD clinical biomarkers. We construct a causal model from the serial clinical biomarker measures across 819 subjects from the ADNI-1 dataset with mild AD (N = 192), late mild cognitive impairment (LMCI, N = 398), and normal cognition (N = 229) (more details are shown in Table 4). We use PseudoIDs instead of RIDs to link across all clinical biomarker data belonging to a patient. The CSF proteins measured in ADNI are the following: A-Beta 42 and phosphorylated tau 181 (p-tau 181)^{32,33}. These measures were obtained through serial spinal taps on subjects at approximately two-year intervals. Of note, A-Beta in the CSF goes down, and total and phosphorylated tau go up, as the disease progresses. Hippocampal volume, a measure of neurodegeneration, was measured through volumetric analysis of serial MRI images obtained at approximately one-year intervals. It goes down as the disease progresses. Finally, cognitive decline was measured through a pencil-and-paper neuropsychological test, the thirteen-item Alzheimer’s Disease Cognitive Assessment Scale (ADAS13). This test measures function in several cognitive domains affected by AD, including memory, language, and praxis, and is the de facto primary outcome measure in AD clinical trials. It goes up as the disease progresses.

After constructing and calibrating the population model with data across all ADNI subjects, we then personalize the parameters of the model using each patient’s longitudinal data to provide a personalized prediction of biomarker trajectories. The overall modeling approach is outlined in Fig. 5, and each step is elaborated in the following subsections.

### The data-driven causal model learning via ADNI dataset

Four AD biomarkers are key factors in AD diagnosis and in monitoring AD progression: amyloid-beta *A*_{β}, tau *τ*, neuronal degeneration *N*, and cognitive decline *C*. Amyloid-beta is the main component of amyloid plaques and is considered to be an early event of the pathological cascade of AD. Amyloid production leads to downstream tau phosphorylation, causing the formation of neurofibrillary tangles and neuropil threads. Tau is a microtubule-associated protein, which is very common in neurons of the central nervous system. Both amyloid-beta and tau phosphorylation contribute to neuronal degeneration and cognitive decline.

To describe the cascade relationship among the above-mentioned four biomarkers of AD progression, we consider a canonical system of ODEs to describe their relations. The amyloid-dependent cascade is initiated by amyloid-beta pathology *A*_{β}, and mediated via tau *τ*. Neuron degeneration *N* starts with the rise of tau *τ*, and in turn, leads to the initiation of cognitive decline *C*. According to the above description, we consider the causal model as the system of ODEs:

where ***ℓ*** = (*ℓ*_{1}, *ℓ*_{2}), ∣***ℓ***∣ = ∣*ℓ*_{1}∣ + ∣*ℓ*_{2}∣, and *m* is the degree of the model. We choose the polynomial basis function in the initialized ODE model, namely,

We then learn the causal model parameters in (1) by using ADNI data. More specifically, we use CSF amyloid-beta 1-42 (*A*_{β}), CSF total tau (*τ*), the ratio of hippocampal volume to whole-brain volume on MRI (*N*), and the Alzheimer’s Disease Assessment Scale-cognitive (*C*) to calibrate *A*_{β}, *τ*, *N*, and *C*, respectively in the causal model. In order to denoise longitudinal data for different subjects, we applied a sigmoid interpolation for each biomarker. Moreover, because AD has a different time of onset and rate of progression for different subjects, we employ DPS^{28} to unify the time scale across subjects in the causal model.

### Disease progression scores

For different subjects in ADNI, the onset of disease and rate of progression are different within and among the subject classes of CN, LMCI, and AD. To fit the causal model for all subjects in the ADNI-1 study, we standardize the longitudinal measurements among patients by employing the DPS^{28}. In particular, we define DPS *s*_{i}(*t*) as a linear function of the patient’s age *t* for each patient:

\({s}_{i}(t)={\alpha }_{i}t+{\beta }_{i},\)

where *i* = 1, 2, ⋯ , *I* is the patient index, *α*_{i} is the rate of AD progression, and *β*_{i} is the age of AD onset.
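As a minimal sketch, assuming the linear form *s*_{i}(*t*) = *α*_{i}*t* + *β*_{i} that appears as the argument of *f*_{k} in Algorithm 1, the mapping from age to DPS can be written as below. The function name and the (*α*_{i}, *β*_{i}) values are hypothetical and serve only to show how subjects of different ages can land on the same point of the common disease time scale.

```python
def disease_progression_score(age, alpha, beta):
    """Linear DPS s_i(t) = alpha_i * t + beta_i for one subject."""
    return alpha * age + beta

# Hypothetical (alpha_i, beta_i) pairs; the values are illustrative only.
subjects = {
    "slow_progressor": (0.5, -35.0),
    "fast_progressor": (1.0, -80.0),
}
scores = {
    "slow_progressor": disease_progression_score(74.0, *subjects["slow_progressor"]),
    "fast_progressor": disease_progression_score(82.0, *subjects["fast_progressor"]),
}
```

Here a 74-year-old with a slow rate and an 82-year-old with a fast rate receive the same score, which is the sense in which the DPS unifies the time scale across subjects.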

### The sigmoid function fitting

We fit each biomarker data in ADNI to a sigmoid function. Specifically, each biomarker is parameterized by four parameters \({{{{\boldsymbol{\theta }}}}}_{k}={[{a}_{k},{b}_{k},{c}_{k},{d}_{k}]}^{T}\):

where *a*_{k} is a magnitude scale of the function, *b*_{k} is a slope coefficient, and *c*_{k} and *d*_{k} determine function positions. Here we take *g*_{1}(*s*) = *A*_{β}(*s*), *g*_{2}(*s*) = *τ*(*s*), *g*_{3}(*s*) = *N*(*s*), *g*_{4}(*s*) = *C*(*s*) and denote \({{{\boldsymbol{g}}}}={({g}_{1},{g}_{2},{g}_{3},{g}_{4})}^{T}\).
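For illustration, one common four-parameter logistic form that matches the roles described for *a*_{k}, *b*_{k}, *c*_{k}, and *d*_{k} is *g*_{k}(*s*) = *a*_{k} / (1 + exp(−*b*_{k}(*s* − *c*_{k}))) + *d*_{k}. This specific form is an assumption, as are the function names and the parameter values in the sketch below.

```python
import math

def sigmoid4(s, a, b, c, d):
    """Assumed form: a / (1 + exp(-b * (s - c))) + d."""
    return a / (1.0 + math.exp(-b * (s - c))) + d

def sigmoid4_slope(s, a, b, c, d):
    """Analytic derivative dg/ds of the assumed form."""
    u = 1.0 / (1.0 + math.exp(-b * (s - c)))
    return a * b * u * (1.0 - u)

# Hypothetical parameters for one biomarker, evaluated on the DPS grid.
theta = (2.0, 0.5, 3.0, 1.0)                 # (a_k, b_k, c_k, d_k)
grid = [-10 + 0.5 * j for j in range(61)]    # uniform grid on [-10, 20]
curve = [sigmoid4(s, *theta) for s in grid]
slopes = [sigmoid4_slope(s, *theta) for s in grid]
```

Under this assumed form, *c*_{k} is the DPS at which the curve is halfway between its two plateaus and *d*_{k} is the lower plateau. The smooth values and slopes on the grid are the kind of denoised inputs a derivative-based regression can use.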

Next, we apply the sparse learning to reveal the causal model in (1) which is re-written as

By taking uniform grid points \({\{{s}_{i}\}}_{i = 1}^{M}\) on *s* ∈ [−10, 20], we denote

where *ℓ*_{1}, ⋯ , *ℓ*_{n} are in the set of ∣*ℓ*∣ ≤ *m*. By expanding

we learn the causal model via the following Lasso regression, namely,

where ∥*w*∥_{1} enforces the sparsity.
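The sparse-regression step can be sketched on a toy problem. The example below is not the authors' implementation: it assumes the objective (1/2M)∥y − Φw∥² + λ∥w∥_{1}, uses a plain cyclic coordinate descent written for this illustration, and replaces the biomarker data with a logistic trajectory whose derivative is known exactly, so that the recovered weights can be checked by eye.

```python
import math

def soft_threshold(x, lam):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def lasso_coordinate_descent(features, targets, lam, n_sweeps=20000):
    """Minimize (1/2M)||y - Phi w||^2 + lam * ||w||_1 by cyclic updates."""
    m, p = len(features), len(features[0])
    # Precompute the Gram matrix and Phi^T y (both scaled by 1/M).
    gram = [[sum(row[j] * row[k] for row in features) / m for k in range(p)]
            for j in range(p)]
    corr = [sum(row[j] * y for row, y in zip(features, targets)) / m
            for j in range(p)]
    w = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = corr[j] - sum(gram[j][k] * w[k] for k in range(p) if k != j)
            w[j] = soft_threshold(rho, lam) / gram[j][j]
    return w

# Toy data: a logistic trajectory x(s), whose exact derivative is x * (1 - x).
grid = [-10 + 0.5 * j for j in range(61)]
x = [1.0 / (1.0 + math.exp(-(s - 5.0))) for s in grid]
dx_ds = [xi * (1.0 - xi) for xi in x]

# Polynomial library of degree <= 2 in one variable: [1, x, x^2].
library = [[1.0, xi, xi * xi] for xi in x]
weights = lasso_coordinate_descent(library, dx_ds, lam=1e-7)
```

With a small penalty, the weights come out close to (0, 1, −1), that is, d*x*/d*s* ≈ *x* − *x*². The polynomial columns are strongly correlated along a sigmoidal trajectory, so many sweeps are used; a production solver would normally be preferred.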

Here we keep the polynomial degree consistent among all the variables in the causal model and choose *m* = 4 with *λ* = 10^{−7} in (5). By performing Lasso, we find that the result is consistent with the causal model when *m* = 2 but different from the one with *m* = 1, which indicates that the optimal choice for the causal model is *m* = 2. The general causal model of ODEs describing the progression of AD biomarkers is then summarized below (all rights to the in-silico model belong to the authors and it cannot be used for any commercial purpose without permission):

with an initial condition *A*_{β}(−10) = *y*_{0} and *τ*(−10) = *N*(−10) = *C*(−10) = 0, where *y*_{0} is also a parameter that we consider as a small positive value to initiate the cascade.
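To make the structure of such a cascade concrete, the sketch below integrates a degree-2 polynomial system with the stated initial condition. It is a toy under explicit assumptions: each equation is taken to depend on its own variable and the one immediately upstream, the term ordering (1, own, own², upstream, upstream², own × upstream) is one arrangement consistent with the parameter names used in the sensitivity analysis, the variables are treated as normalized pathology levels that increase with disease, and every coefficient value is hypothetical rather than taken from Table 1.

```python
def poly2(w, own, upstream):
    """Degree-2 polynomial in (own, upstream); the term order is an assumption."""
    return (w[0] + w[1] * own + w[2] * own * own
            + w[3] * upstream + w[4] * upstream * upstream
            + w[5] * own * upstream)

def rhs(state, W):
    """Cascade right-hand side: A drives tau, tau drives N, N drives C."""
    A, T, N, C = state
    dA = W["A"][0] + W["A"][1] * A + W["A"][2] * A * A
    return (dA, poly2(W["T"], T, A), poly2(W["N"], N, T), poly2(W["C"], C, N))

def rk4_solve(W, y0, s_start=-10.0, s_end=20.0, n_steps=600):
    """Integrate with classical RK4 from A(-10) = y0 and tau = N = C = 0."""
    h = (s_end - s_start) / n_steps
    state = (y0, 0.0, 0.0, 0.0)
    path = [(s_start, state)]
    for i in range(n_steps):
        k1 = rhs(state, W)
        k2 = rhs(tuple(x + 0.5 * h * k for x, k in zip(state, k1)), W)
        k3 = rhs(tuple(x + 0.5 * h * k for x, k in zip(state, k2)), W)
        k4 = rhs(tuple(x + h * k for x, k in zip(state, k3)), W)
        state = tuple(x + h * (a + 2 * b + 2 * c + d) / 6.0
                      for x, a, b, c, d in zip(state, k1, k2, k3, k4))
        path.append((s_start + (i + 1) * h, state))
    return path

def first_crossing(path, index, level):
    """First DPS value at which component `index` reaches `level`, else None."""
    for s, state in path:
        if state[index] >= level:
            return s
    return None

# Hypothetical coefficients chosen only to give a bounded, ordered cascade.
W = {
    "A": (0.0, 0.5, -0.5),                    # logistic growth of A
    "T": (0.0, -0.5, 0.0, 0.0, 0.5, 0.0),     # tau relaxes toward A^2
    "N": (0.0, -0.3, 0.0, 0.3, 0.0, 0.0),     # N relaxes toward tau
    "C": (0.0, 0.0, 0.0, 0.2, 0.0, 0.0),      # C keeps growing with N
}
path = rk4_solve(W, y0=0.01)
```

Calling `first_crossing(path, k, 0.5)` for the first three components returns increasing DPS values, which mirrors the cascade ordering discussed in the Results. The fourth component keeps rising rather than saturating.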

### Population model calibration

First, we calibrate the learned causal model by using the ADNI dataset and rewrite (6) as the following population model

where \({{{\boldsymbol{w}}}}=\{{w}_{A,\ell }^{(1)},{w}_{T,\ell }^{(1)},{w}_{N,\ell }^{(1)},{w}_{C,\ell }^{(1)}\}\) denote the population parameters. We also denote *f*_{1}(*s*) = *A*_{β}(*s*), *f*_{2}(*s*) = *τ*(*s*), *f*_{3}(*s*) = *N*(*s*), and *f*_{4}(*s*) = *C*(*s*) with the initial conditions *f*_{1}(−10) = *y*_{0}, *f*_{2}(−10) = *f*_{3}(−10) = *f*_{4}(−10) = 0. Then the population parameters are calibrated based on the ADNI dataset by minimizing the sum of squared differences between the data and the solution of the causal model, namely

\({{\boldsymbol{w}}}_{k}={\mbox{argmin}}_{{{\boldsymbol{w}}}_{k}}{\sum }_{(i,j)\in {{\mathcal{I}}}_{k}}{\left({y}_{ijk}-{f}_{k}\left({\alpha }_{i}{t}_{ij}+{\beta }_{i};{{\boldsymbol{w}}}_{k}\right)\right)}^{2},\)

where *y*_{ijk} is the *k*-th biomarker data for *i*-th patient at *j*-th visit and \({{{{\mathcal{I}}}}}_{k}\) is the set of (*i*, *j*) for *k*-th biomarker.

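Since the biomarkers for each patient generally increase or decrease monotonically, we consider fitting DPS as a least-squares linear regression problem, namely,

\(({\alpha }_{i},\,{\beta }_{i})={\mbox{argmin}}_{{\alpha }_{i},{\beta }_{i}}{\sum }_{(j,k)\in {{\mathcal{I}}}_{i}}\frac{1}{{\sigma }_{k}}{\left({y}_{ijk}-{f}_{k}\left({\alpha }_{i}{t}_{ij}+{\beta }_{i};{{\boldsymbol{w}}}_{k}\right)\right)}^{2},\)

where \({{{{\mathcal{I}}}}}_{i}\) is the set of (*j*, *k*) for the *i*-th patient and *σ*_{k} is the sum of squared errors with respect to biomarker *k*, normalized as in step 4 of Algorithm 1.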

The detailed procedure to fit the parameters is shown in Algorithm 1. The optimization solver employs the Levenberg-Marquardt method^{34}, which can avoid getting stuck in a local minimum.

### Sensitivity analysis

We assume that the parameters in the population model, \({{{{\boldsymbol{w}}}}}^{(1)}=[{w}_{A0}^{(1)},\,{w}_{A1}^{(1)},\,\cdots \,,\,{w}_{m}^{(1)},\,\cdots \,,\,{w}_{C4}^{(1)},\,{w}_{C5}^{(1)}]\in {{\mathbb{R}}}^{21}\), are independent and identically distributed inputs, where *m* is the index of inputs. For sensitivity analysis, we omit the superscript of the parameters later for simplicity. The range of each input is 90–110% of their values shown in Table 1.

Then we perform Sobol sensitivity analysis, which is also called variance-based sensitivity analysis and is developed from the analysis of variance. As a global sensitivity analysis method, it analyzes the effects of each input by decomposing the variance of the output of the population model into fractions attributed to the inputs. In this paper, we perform both first-order and second-order sensitivity analyses of the parameters. In particular, the first-order sensitivity index measures the contribution of a single input *w*_{m} alone to the variance of the model output *z*, which is calculated by:

\({{\rm{S}}}_{m}(z)=\frac{{{\rm{Var}}}_{{w}_{m}}\left[{{\rm{E}}}_{{w}_{ \sim m}}(z| {w}_{m})\right]}{{\rm{Var}}(z)},\)

where \({w}_{ \sim m}=\left[{w}_{A1},\,\cdots \,,\,{w}_{m-1},\,{w}_{m+1},\,\cdots \,,\,{w}_{C5}\right]\) includes all inputs except *w*_{m}. Next, the second-order sensitivity with respect to *m* and *n* is measured by attributing the variance of the output to the first-order effects of inputs *m* and *n* together with their second-order interaction:

Then we measure the total-order sensitivity index, which is calculated by attributing the variance of the output to the first-order effect of an input together with its second-order and all higher-order interactions.

When the sensitivity value is positive, the corresponding parameter is positively correlated with the model output. If the value is negative, they are negatively correlated. The absolute value of parameter sensitivities represents the degree of influence on the model output. If the sensitivity value is closer to 0, changing this parameter will have less influence on the model output. Based on the sensitivity values and the number of biomarker measurements, we determine the personalized parameters to fit the longitudinal data points for each patient and keep the remaining parameters the same as the population parameter values. This can avoid overfitting when providing the personalized prediction for each subject.
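One way to turn this selection step into code is sketched below. The specific rule is an assumption made for illustration (rank parameters by absolute sensitivity and free at most half as many parameters as there are measurements); the paper does not state its exact criterion here. The function names, the `params_per_point` ratio, and all numeric values are hypothetical.

```python
def select_personalized(sensitivity, n_measurements, params_per_point=0.5):
    """Pick the most influential parameters, capped by the amount of data."""
    # Assumed budget rule: at most params_per_point free parameters per measurement.
    budget = max(1, int(n_measurements * params_per_point))
    ranked = sorted(sensitivity, key=lambda name: abs(sensitivity[name]), reverse=True)
    return ranked[:budget]

def personalize(population, fitted, free_names):
    """Keep population values except for the patient-specific free parameters."""
    return {name: (fitted[name] if name in free_names else value)
            for name, value in population.items()}

# Hypothetical sensitivity values and population parameters.
sensitivity = {"w_A0": 0.42, "w_A1": 0.03, "w_C4": -0.18, "w_C5": 0.27}
population = {"w_A0": 0.80, "w_A1": 1.50, "w_C4": 0.10, "w_C5": -0.25}

free = select_personalized(sensitivity, n_measurements=4)
patient = personalize(population, {"w_A0": 0.86, "w_C5": -0.23}, free)
print(free, patient)
```

Capping the number of free parameters by the number of available measurements is what limits overfitting in this sketch; the remaining parameters stay at their population values.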

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

## Data availability

The ADNI dataset is publicly accessible via http://adni.loni.usc.edu^{35}.

## Code availability

The sensitivity analysis code is available at http://salib.readthedocs.io/en/latest/. The simulation study code is available at https://www.pymc.io/welcome. The non-linear optimizer can be found at https://github.com/jjhartmann/Levenberg-Marquardt. Code for Algorithms 1 and 2 is included in the Supplementary Information.

## References

1. Cortes-Canteli, M. & Iadecola, C. Alzheimer’s disease and vascular aging: JACC focus seminar. *J. Am. College Cardiol.* **75**, 942–951 (2020).
2. Batool, A., Kamal, M. A., Rizvi, S. & Rashid, S. Topical discoveries on multi-target approach to manage Alzheimer’s disease. *Curr Drug Metab.* **19**, 704–713 (2018).
3. Bertram, L., McQueen, M. B., Mullin, K., Blacker, D. & Tanzi, R. E. Systematic meta-analyses of Alzheimer disease genetic association studies: the AlzGene database. *Nat. Genet* **39**, 17–23 (2007).
4. Lane, C. A., Hardy, J. & Schott, J. M. Alzheimer’s disease. *Eur. J. Neurol.* **25**, 59–70 (2018).
5. Aliev, G. et al. Alzheimer’s disease–future therapy based on dendrimers. *Curr. Neuropharmacol.* **17**, 288–294 (2019).
6. Milne, R. et al. At, with and beyond risk: expectations of living with the possibility of future dementia. *Soc. Health Illness* **40**, 969–987 (2018).
7. Sperling, R. A. et al. Toward defining the preclinical stages of Alzheimer’s disease: recommendations from the National Institute on Aging-Alzheimer’s Association workgroups on diagnostic guidelines for Alzheimer’s disease. *Alzheimers Dement* **7**, 280–292 (2011).
8. Jack, C. R. et al. NIA-AA research framework: toward a biological definition of Alzheimer’s disease. *Alzheimers Dement* **14**, 535–562 (2018).
9. Hao, W. & Friedman, A. Mathematical model on Alzheimer’s disease. *BMC Syst Biol* **10**, 108 (2016).
10. Petrella, J. R., Hao, W., Rao, A. & Doraiswamy, P. M. Computational causal modeling of the dynamic biomarker cascade in Alzheimer’s disease. *Comput. Math. Methods Med.* **2019**, https://doi.org/10.1155/2019/6216530 (2019).
11. Jack, C. R. & Holtzman, D. M. Biomarker modeling of Alzheimer’s disease. *Neuron* **80**, 1347–1358 (2013).
12. Abeysinghe, A. A. D. T., Deshapriya, R. D. U. S. & Udawatte, C. Alzheimer’s disease; a review of the pathophysiological basis and therapeutic interventions. *Life Sci.* **256**, 117996 (2020).
13. Guo, T., Korman, D., Baker, S. L., Landau, S. M. & Jagust, W. J. Longitudinal cognitive and biomarker measurements support a unidirectional pathway in Alzheimer’s disease pathophysiology. *Biol. Psychiatry* **89**, 786–794 (2021).
14. Myszczynska, M. A. et al. Applications of machine learning to diagnosis and treatment of neurodegenerative diseases. *Nat. Rev. Neurol.* **16**, 440–456 (2020).
15. Iturria-Medina, Y., Carbonell, F. M., Sotero, R. C., Chouinard-Decorte, F. & Evans, A. C. Multifactorial causal model of brain (dis)organization and therapeutic intervention: Application to Alzheimer’s disease. *Neuroimage* **152**, 60–77 (2017).
16. Friedman, A. & Hao, W. The role of exosomes in pancreatic cancer microenvironment. *Bull. Math. Biol.* **80**, 1111–1133 (2018).
17. Budithi, A., Su, S., Kirshtein, A. & Shahriyari, L. Data driven mathematical model of FOLFIRI treatment for colon cancer. *Cancers* **13**, https://doi.org/10.3390/cancers13112632 (2021).
18. Hao, W. et al. A mathematical model of aortic aneurysm formation. *PLoS One* **12**, e0170807 (2017).
19. Friedman, A. & Hao, W. A mathematical model of atherosclerosis with reverse cholesterol transport and associated risk factors. *Bull. Math. Biol.* **77**, 758–781 (2015).
20. Wang, X. et al. A Bayesian framework for generalized linear mixed modeling identifies new candidate loci for late-onset Alzheimer’s disease. *Genetics* **209**, 51–64 (2018).
21. Sun, N. et al. Multi-modal latent factor exploration of atrophy, cognitive and tau heterogeneity in Alzheimer’s disease. *Neuroimage* **201**, 116043 (2019).
22. Schäfer, A. et al. Bayesian physics-based modeling of tau propagation in Alzheimer’s disease. *Front. Physiol.* 1081, https://doi.org/10.3389/fphys.2021.702975 (2021).
23. Iddi, S. et al. Estimating the evolution of disease in the Parkinson’s Progression Markers Initiative. *Neurodegenerative Dis.* **18**, 173–190 (2018).
24. Iddi, S. et al. Predicting the course of Alzheimer’s progression. *Brain Informatics* **6**, 1–18 (2019).
25. Li, D. et al. The relative efficiency of time-to-progression and continuous measures of cognition in presymptomatic Alzheimer’s disease. *Alzheimer’s & Dement.* **5**, 308–318 (2019).
26. Li, D., Iddi, S., Thompson, W. K., Donohue, M. C. & Alzheimer’s Disease Neuroimaging Initiative. Bayesian latent time joint mixed effect models for multicohort longitudinal data. *Stat. Methods Med. Res.* **28**, 835–845 (2019).
27. Marinescu, R. V. et al. Predicting Alzheimer’s disease progression: Results from the TADPOLE challenge: Neuroimaging: Neuroimaging predictors of cognitive decline. *Alzheimer’s Dement.* **16**, e039538 (2020).
28. Jedynak, B. M. et al. A computational neurodegenerative disease progression score: method and results with the Alzheimer’s Disease Neuroimaging Initiative cohort. *Neuroimage* **63**, 1478–1486 (2012).
29. Sobol, I. M. Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. *Math. Comput. Simul.* **55**, 271–280 (2001).
30. Zhang, S., Ponce, J., Zhang, Z., Lin, G. & Karniadakis, G. An integrated framework for building trustworthy data-driven epidemiological models: Application to the COVID-19 outbreak in New York City. *PLOS Comput. Biol.* **17**, 1–29 (2021).
31. Jack, C. R. et al. Serial PIB and MRI in normal, mild cognitive impairment and Alzheimer’s disease: implications for sequence of pathological events in Alzheimer’s disease. *Brain* **132**, 1355–1365 (2009).
32. Shaw, L. M. et al. Qualification of the analytical and clinical performance of CSF biomarker analyses in ADNI. *Acta Neuropathol* **121**, 597–609 (2011).
33. Shaw, L. M. PENN biomarker core of the Alzheimer’s Disease Neuroimaging Initiative. *Neurosignals* **16**, 19–23 (2008).
34. Levenberg, K. A method for the solution of certain non-linear problems in least squares. *Quart. Appl. Math.* **2**, 164–168 (1944).
35. Weiner, M. W. et al. The Alzheimer’s Disease Neuroimaging Initiative: a review of papers published since its inception. *Alzheimer’s Dement.* **9**, e111–e194 (2013).

## Acknowledgements

G.L. and H.Z. were supported in part by NSF (DMS-1555072, DMS-1736364, DMS-2053746, and DMS-2134209) and DOE DE-SC0021142. J.R.P. was supported in part by NSF DMS-2052676. W.H. was supported in part by NSF DMS-2052685. P.M.D.’s work on this project is supported by the NIA, Karen L Wrenn Trust and Steve Aoki Fund.

## Funding

Data collection was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904, Michael Weiner, PI) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through contributions from the following: AbbVie; Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (http://www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California. ADNI investigators contributed to the design and implementation of the ADNI database and/or provided data but did not participate in the analysis or writing of this report. A complete listing of ADNI investigators can be found at http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgment_List.pdf.

## Author information

### Contributions

J.R.P. conceived the idea for in-silico modeling of AD biomarkers. P.M.D. initiated the collaboration among J.R.P., P.M.D., and W.H. to further develop this idea. W.H. developed the data-driven modeling idea. J.P. completed the idea by including DPS in consultation with P.M.D. and W.H. P.M.D. provided the idea to apply this model separately in CN, MCI, and AD. J.R.P. and P.M.D. provided data access and clinical constructs. G.L. and W.H. supervised the work. H.Z. implemented the code. All authors contributed to the discussions leading to the perspective presented. All authors contributed to the editing and shaping of the manuscript at various stages of preparation. All authors read and approved the final version.

## Ethics declarations

### Competing interests

The authors declare no Competing Non-Financial Interests but the following Competing Financial Interests: P.M.D. is a co-inventor on patents for the diagnosis or treatment of Alzheimer disease. P.M.D. owns shares in several biotechnology companies whose products are not discussed here. P.M.D. has received grants from NIH, DARPA, DOD, ONR, Bausch, Avanir, Avid, Cure Alzheimer’s Fund, Karen L. Wrenn Trust, and Steve Aoki Foundation, and advisory fees from Apollo, Brain Forum, Clearview, Lumos, Neuroglee, Otsuka, Verily, Vitakey, Sermo, Lilly, Vivly, and Transposon.

## Additional information

**Publisher’s note** Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Rights and permissions

**Open Access** This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

## About this article

### Cite this article

Zheng, H., Petrella, J.R., Doraiswamy, P.M. *et al.* Data-driven causal model discovery and personalized prediction in Alzheimer's disease.
*npj Digit. Med.* **5**, 137 (2022). https://doi.org/10.1038/s41746-022-00632-7
