The preclinical amyloid sensitive composite to determine subtle cognitive differences in preclinical Alzheimer’s disease

Recently, the focus of Alzheimer’s disease (AD) research has shifted from the clinical stage to the preclinical stage. We therefore aimed to develop a cognitive composite score that can detect the subtle cognitive differences between amyloid positive (Aβ+) and amyloid negative (Aβ−) status in cognitively normal (CN) participants. A total of 423 CN participants with Aβ positron emission tomography images were recruited. The multiple-indicators multiple-causes model found latent mean differences between the Aβ+ and Aβ− groups in the domains of verbal memory, visual memory, and executive functions. The multivariate analysis of covariance (MANCOVA) showed that the Aβ+ group performed worse in tests related to verbal and visual delayed recall, semantic verbal fluency, and inhibition of cognitive interference within the three cognitive domains. The Preclinical Amyloid Sensitive Composite (PASC) model that we developed using the result of the MANCOVA and the MMSE presented a good fit with the data. The accuracy of the PASC score when applied with age, sex, education, and APOE ε4 for distinguishing between Aβ+ and Aβ− was adequate (AUC = 0.764; 95% CI = 0.667–0.860) in the external validation set (N = 179). We conclude that the PASC can eventually contribute to facilitating more prevention trials in preclinical AD.

www.nature.com/scientificreports/
Although Aβ PET has the advantage of detecting Aβ positivity early, it is usually challenging to obtain PET data from a large number of participants because of the high cost and safety concerns. Nevertheless, the importance of Aβ biomarkers has kept rising as several prevention trials targeting Aβ are currently being conducted in preclinical AD. Thus, it would be pragmatic and essential for clinicians to predict who might be at high risk of being Aβ+ without the help of neuroimaging techniques. We therefore need to investigate the distinct neuropsychological features of CN individuals with elevated Aβ, which may help clinicians predict preclinical AD, reduce screen failures, and monitor the therapeutic efficacy of prevention.
Several previous studies have attempted to characterize the distinct neuropsychological features of Aβ+ CN individuals, but the results have been inconsistent. Multiple studies have reported that a high amyloid burden in CN adults is associated with poorer performance in episodic memory 8,9. On the other hand, a large lifespan study of CN adults found an association of amyloid deposition with executive functions, but not with memory 10. This discrepancy may have arisen because few previous studies considered the effects of measurement error, although measurement error is always possible in psychometrics 11. One study did account for measurement error, using structural equation modeling (SEM) to examine the associations between amyloid burden or white matter hyperintensity and cognition 9. However, there are still few studies that consider measurement error when studying biomarkers and cognition together, and more evidence for this methodology is needed. Accordingly, we used the multiple-indicators multiple-causes (MIMIC) model in our study to examine Aβ-related cognitive functions. Using the principle of factor analysis, the MIMIC model can control for measurement error when estimating latent values. That is, MIMIC may enable a composite model to sensitively detect subtle cognitive differences between Aβ+ and Aβ− CN elderly individuals. To our knowledge, however, no study of Aβ+ CN individuals has yet applied this factor structure method to develop a cognitive composite model for identifying preclinical AD in the elderly population.
In the present study, we aimed to determine if there are any distinct cognitive domains and cognitive measures of Aβ+ CN using the MIMIC model, which may yield lower background noise in measurements. We also developed the Preclinical Amyloid Sensitive Composite (PASC) model, a composite model that precisely distinguishes cognitive differences between Aβ+ and Aβ− in CN individuals. Then, we computed the PASC score employing the PASC model we developed. Finally, we validated the PASC score along with age, sex, education, and APOE ε4 to distinguish between Aβ+ and Aβ− CN in an external validation set. Considering that subtle deficits in episodic memory and executive functions appear to be critical in preclinical AD due to their strong association with AD progression 12 , we hypothesized that the PASC score consisting of memory and executive functions might differentiate Aβ+ CN from Aβ− CN with adequate accuracy.

Results
Demographic characteristics of participants. The demographic and neuropsychological characteristics of the study participants are presented in Table 1. The overall mean age of the participants was 69.9 years. Among the 423 participants, 75 were Aβ+ (17.7%). The frequency of APOE ε4 carriers was 24.6%. The Aβ+ group was significantly older than the Aβ− group (71.5 ± 6.8 years vs. 69.5 ± 8.4 years, p < 0.05). The Aβ+ group also displayed a higher percentage of APOE ε4 carriers compared with the Aβ− group (58.0% vs. 17.5%, p < 0.001). However, the two groups did not significantly differ in education level and proportion of female participants.
MIMIC model for latent mean analysis. The confirmatory factor analysis (CFA) model was successfully validated to control measurement errors. Accordingly, error covariance was added between the residual variances associated with the Seoul Verbal Learning Test-Elderly's version (SVLT-E) immediate and delayed recalls and the Rey-Osterrieth Complex Figure Test (RCFT) immediate and delayed recalls. The CFA model with added error covariance fit the data well (χ² = 212.181, df = 78, p < 0.001; RMSEA = 0.064; CFI = 0.957; TLI = 0.942; SRMR = 0.056). All factor loadings in the model were significant, ranging from 0.49 to 0.89. Next, the latent mean difference between Aβ+ and Aβ− was tested for each cognitive domain. The latent mean model fit the data well (χ² = 359.481, df = 128, p < 0.001; RMSEA = 0.065; CFI = 0.944; TLI = 0.919; SRMR = 0.048). The differences between the Aβ+ and Aβ− groups in attention, visuospatial function, and language function were not significant, but the latent means in the Aβ+ group were significantly lower than those in the Aβ− group in the three domains of verbal memory, visual memory, and executive functions (Table 2).
MANCOVA. Based on the results above, further statistical analyses were conducted for each neuropsychological test within the episodic memory and executive function domains. Multivariate analysis of covariance (MANCOVA) was used to compare the scores of these tests between the Aβ+ and Aβ− groups while controlling for sex, education, and age. The results of the MANCOVA are shown in Table 3. A few neuropsychological subtests showed meaningful differences between the groups. We set the level of significance at 0.1. Regarding episodic memory, the SVLT-E delayed recall showed a difference in score between the Aβ+ and Aβ− groups (F(1, 418) = 3.666, p = 0.056). For the RCFT, the Aβ+ group performed worse not only on the delayed recall (F(1, 418) = 4.036, p = 0.045) but also on the immediate recall (F(1, 418) = 2.898, p = 0.089). However, because of the extremely high correlation between the two subtests (r = 0.935), we considered it reasonable to use only one subtest in our composite model. Based on clinical and statistical significance, the RCFT delayed recall was favored over the RCFT immediate recall for the PASC.
Table notes: **p < 0.05 between Aβ− and Aβ+ in both sets. a Values are presented as mean (standard deviation) or number (%). b The independent sample t-test was used for continuous variables, and the chi-square test was used for categorical variables. c Analysis of covariance was conducted to see the difference in test scores of each group. Age, education, and sex were adjusted as covariates in the analysis. d APOE ε4 genotyping: development set N = 395; validation set N = 167.
All factor loadings in the model were significant, ranging from 0.56 to 0.73 (Fig. 1). The MIMIC model was used to ensure that the PASC distinguished between Aβ+ and Aβ− (Fig. 2).
Our MIMIC model for the PASC fit the data well (χ² = 56.526, df = 21, p < 0.001; RMSEA = 0.063; CFI = 0.955; TLI = 0.936; SRMR = 0.036). The result showed that the latent mean in the Aβ+ group was significantly lower than that in the Aβ− group (t = -2.340, p = 0.019) (Table 4).
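As a reading aid, the reported RMSEA values can be checked against the reported χ² statistics with the usual point-estimate formula, RMSEA = sqrt(max(χ² − df, 0) / (df · (N − 1))). The sketch below is only an illustration: it assumes N = 423 for all three models and the N − 1 convention (some software divides by N instead, which does not change these values at three decimals). The function name `rmsea` is ours, not the authors'.

```python
import math

def rmsea(chi2: float, df: int, n: int) -> float:
    """Point estimate of RMSEA from a model chi-square, its df, and sample size n."""
    # The non-centrality estimate is floored at zero, so chi2 <= df gives RMSEA = 0.
    noncentrality = max(chi2 - df, 0.0)
    return math.sqrt(noncentrality / (df * (n - 1)))

# Chi-square and df values as reported in the Results; N = 423 is assumed for each model.
reported = {
    "CFA model": (212.181, 78),
    "latent mean model": (359.481, 128),
    "PASC MIMIC model": (56.526, 21),
}
for name, (chi2, df) in reported.items():
    print(f"{name}: RMSEA = {rmsea(chi2, df, 423):.3f}")
```

Under these assumptions the three values round to 0.064, 0.065, and 0.063, in agreement with the fit indices reported above.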
Calculation of the PASC score. To create the composite score, we performed a principal component analysis (PCA) on the z-scores of the 5 tests. As a result, the following composite equation was generated:
CI = 0.704-0.837) (Fig. 3a, Table 5). The sensitivity (71%) and specificity (73.5%) were optimal for distinguishing between Aβ+ and Aβ−, with a Youden index of 0.445. The sensitivity and specificity were also optimal at a Youden index of 0.402 (sensitivity = 70.4%; specificity = 69.8%) (Fig. 3b, Table 5). The results of the ROC curve analysis in the external validation set were comparable to those in the development set (Table 5).
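The Youden index quoted above is defined as J = sensitivity + specificity − 1, and the optimal cutoff on a ROC curve is the one that maximizes J. A minimal sketch that recomputes the two reported values from the reported sensitivities and specificities (the helper name `youden_index` is illustrative only):

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic for one cutoff on a ROC curve."""
    return sensitivity + specificity - 1.0

# Sensitivity/specificity pairs as reported in the text, expressed as proportions.
j_first = youden_index(0.710, 0.735)
j_second = youden_index(0.704, 0.698)
print(f"{j_first:.3f} {j_second:.3f}")
```

Both recomputed values (0.445 and 0.402) match the Youden indices given in the text.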

Discussion
We investigated the distinct neuropsychological features of Aβ+ CN elderly individuals in a carefully phenotyped CN cohort that underwent detailed neuropsychological tests, MRI, and amyloid PET scans with standardized protocols. There were several significant neuropsychological findings in this study. First, the MIMIC model found latent mean differences between the Aβ+ and Aβ− groups in the domains of verbal memory, visual memory, and executive functions. Second, MANCOVA showed that the Aβ+ group performed worse in the SVLT-E delayed recall, the RCFT delayed recall, the K-CWST color reading, and the COWAT animal naming within the three cognitive domains. Third, the PASC model that we developed using the result of the MANCOVA and the MMSE presented a good fit with the data. Finally, the accuracy of the PASC score when applied with age, sex, education, and APOE ε4 for distinguishing between Aβ+ and Aβ− was adequate (AUC = 0.764; 95% CI = 0.667-0.860) in the external validation set (N = 179). Our results therefore suggest that the PASC might contribute to decreasing financial loss due to screen failures in preclinical AD clinical trials and, subsequently, to facilitating more prevention trials.
The demographic profile of our participants was very close to that previously reported for Asian populations. The CN Aβ+ percentage in Asian countries is known to be lower than that in western countries. The percentage of CN amyloid positivity in the Asian population ranged between 18 and 25% according to the Korean Brain Aging Study for the Elderly Diagnosis and Prediction of Alzheimer's disease (KBASE) and Japanese ADNI (J-ADNI) 13,14. In contrast, the Aβ positivity rate in the western population, represented by ADNI, was reported to range from approximately 25 to 45% 15,16. Our study exhibited approximately 18% Aβ positivity in the 423 CN individuals, which was in line with that in the Asian population.
The discrepancy between our results and that of the western society may be explained by the differences in the frequency of APOE ε4 and the age of the study participants. Our cohort seemed to have a lower percentage of APOE ε4 (23%) than that reported by ADNI (27%) 17 . Moreover, the younger age of our cohort (mean, 69.9 years) compared to that of the ADNI CN individuals (mean, 75.8 years) may have affected the lower rate of amyloid positivity 17 . Regardless of these disparities, the APOE ε4 rate and the age of our cohort were still at comparable levels to J-ADNI's APOE ε4 rate (24%) and CN individuals' ages (mean, 67.9) 13 .
Our major finding was that the Aβ+ CN individuals showed lower performance in verbal memory, visual memory, and executive functions than Aβ− CN individuals, which was generally consistent with the findings of previous meta-analyses. In terms of memory, there has been a consensus that episodic memory has a strong association with Aβ burden 18-20. In our study, the delayed recall tasks of both the verbal and visual memory tests stood out in particular, as the performance difference between Aβ+ and Aβ− seemed to be more prominent there than in the immediate recall or recognition tasks. This result is not surprising because previous studies with CN or MCI participants also suggested that delayed recall may be a good predictor of Aβ positivity 21,22. Unlike episodic memory, the results regarding executive functions in the previous studies are not entirely consistent. A recent meta-analysis suggested a significant difference in executive function 19, while two others showed either a small effect size or a weak association with Aβ burden 18,20. This may be because the previous studies did not consider the effects of measurement errors that could affect the individual test scores. In contrast, by applying a factor analysis with latent variables, we controlled for the measurement errors in each test score to measure the corresponding cognitive function more precisely.
In the present study, we also developed the PASC that is sensitive to the subtle cognitive differences in CN based on amyloid positivity. The PASC comprises the SVLT-E delayed recall, the RCFT delayed recall, the COWAT animal naming, and the K-CWST color reading, which were found to be significantly different between the two CN groups from the three cognitive domains, and the K-MMSE. We included the K-MMSE in the PASC because the Mini-Mental State Exam (MMSE) is a practical neuropsychological test to examine individual cognitive function holistically 23 . Global cognition was previously reported to be associated with amyloidosis 7,24-26 , and has been considered to help with early identification of dementia risk and further cognitive decline 27 . Furthermore, other composite scores such as the Preclinical Alzheimer Cognitive Composite (PACC) and the Alzheimer's Prevention Initiative Composite Cognitive Test Score (APCC) include the MMSE for global functioning and orientation status [28][29][30] . In this regard, we included the MMSE for the convenience of harmonization in the future international collaboration. In fact, the PASC may seem similar to the PACC 29 . However, the PACC has been mainly applied to track cognitive changes in preclinical AD over time 25,31 , whereas the PASC investigated the cognitive difference between Aβ+ CN and Aβ− CN cross-sectionally.
Although AD pathology progression involves the deterioration of multiple cognitive domains, there are a few benefits to observe cognitive differences in CN individuals with a single composite that is a unidimensional outcome. First, it allows comprehensive yet precise cognitive assessment particularly in preclinical AD. Currently, the MMSE 23 and the Clinical Dementia Rating (CDR) 32 are commonly used to assess individual cognitive function holistically. However, they often display ceiling effects in CN individuals 33,34 . Therefore, they are not quite sensitive measures for CN individuals. Furthermore, the ratings of the CDR primarily rely on clinicians' judgments following patient and caregiver interviews. In other words, bias is rarely avoidable in the CDR. As a result, there is a need for a novel and reliable measure to holistically assess cognitive function specific to preclinical AD, and we expect that the PASC can meet the need. Another advantage of obtaining a unidimensional composite is that it induces a more precise result in the outcome measurement. Compared to using multi-outcomes, using a single outcome usually yields lower background noise in the measurement, which derives a lower risk of Type-I error 11,28 . Therefore, applying a single primary outcome has better reliability and sensitivity especially in terms of detecting subtle cognitive differences in preclinical AD.
The major strength of our study is that we considered measurement errors in the test scores when we conducted the analyses. Another strength is the large sample size of the CN cohort who underwent amyloid PET. In spite of these strengths, there are a few limitations to our study. First, the participants underwent PET with different ligands. The variety of the tracers may have affected the visual reads of amyloid deposition. However, this limitation is somewhat alleviated by the high correlations among the different ligands 35,36. Second, we did not explore the clinical effects of the PASC. Future studies examining the relationship of the PASC with other biomarkers, such as tau or cortical atrophy, are recommended. Third, our study used a dichotomized variable of amyloid burden. The dichotomization of amyloid deposition has been repeatedly questioned, as several studies have shown longitudinal cognitive decline related to subthreshold amyloid in Aβ− CN individuals 37-39. In future studies, a continuous measure of amyloid burden may be used to address the issue of subthreshold amyloid. Longitudinal studies using the PASC may also be needed to examine its clinical applicability in this regard.
Our study created the PASC which is a sensitive cognitive composite score for Aβ+ in CN elderly individuals, subsequent to investigating some distinct cognitive features of Aβ+ in CN elderly individuals. The PASC, which employed significant tests in episodic memory and executive functions, along with the global cognitive measure of the K-MMSE, showed adequate accuracy when it was applied with age, sex, education, and APOE ε4. Therefore, we expect the PASC to be applied potentially into diverse forms of studies such as trial ready registries 40,41 and to contribute to decreasing financial loss due to screen failures and facilitating more prevention trials subsequently. Moreover, given that the cognitive tests that reflect the characteristics of early preclinical AD are anticipated to reflect the later cognitive change, we expect the PASC to be used for monitoring of disease progression or therapeutic efficacy.

Methods
Study participants. A total of 423 CN participants were recruited from September 2015 to December 2018 at the Samsung Medical Center in Seoul, South Korea. All the participants met the following criteria to be qualified as CN: (a) a K-MMSE score of 24 or above, or above -1.5 standard deviations (SD) from the age-, sex-, and education-adjusted norms if the education period was less than 9 years; (b) above -1 SD from the age-, sex-, and education-adjusted norms on the delayed recall of the SVLT-E; (c) above -2 SD from the age-, sex-, and education-adjusted norms on the Korean version of the Boston Naming Test (K-BNT), the RCFT copy, and the K-CWST color reading; and (d) an absence of other neurological disorders. The screenings were conducted by trained clinicians and neuropsychologists. Brain MRI confirmed the absence of structural lesions, including territorial cerebral infarction, brain tumors, hippocampal sclerosis, vascular malformation, and cerebral amyloid angiopathy (CAA).
The external validation sample involved 91 CN participants who were recruited from December 2018 to April 2020 at the Samsung Medical Center and 88 CN participants who were recruited from May 2017 to April 2020 at Gangnam Severance Hospital. None of the participants in the external validation sample was included in the original study sample.
Written informed consent was obtained from each participant. This study was approved by the Institutional Review Board at the Samsung Medical Center. All methods were implemented in accordance with the approved guidelines.
Visual assessment was done by three experienced raters (two nuclear medicine doctors and one neurologist) who were blinded to patient information, and the assessment was dichotomized as Aβ+ or Aβ− using visual reads. The visual assessments for 18F-florbetaben PET, 18F-flutemetamol PET, and 18F-florbetapir PET were performed with the scoring system that was used in the previous studies 42-46. Inter-rater agreement was excellent for both
Statistical analyses. Demographic characteristics were compared between the Aβ+ and Aβ− groups using the independent sample t-test for continuous variables and the chi-square test for categorical variables. CFA was conducted to validate the structure of the five cognitive domains. CFA is one of the multiple forms of SEM, which confirms whether a pre-specified factor structure fits the data well 11. We validated the CFA model for the neuropsychological test battery to control for measurement errors. The tests included in each cognitive domain were the same as those described earlier, and the language domain consisted of a single test score. The subtests within the SVLT-E and within the RCFT in the memory domain were each measured using the same method. Therefore, it was considered acceptable to add an error covariance between the residual variances associated with the SVLT-E immediate and delayed recalls and the RCFT immediate and delayed recalls. Since our factor structure included both reflective and causal indicators, we used the MIMIC model to compare the latent means in the cognitive domains between the Aβ+ and Aβ− groups (Fig. 4).
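For readers unfamiliar with the MIMIC approach, a generic textbook form of the model can be written as a measurement equation plus a structural equation. The notation below is an illustrative assumption rather than the authors' exact specification: $y_{ij}$ is the score of participant $i$ on test $j$, $\eta_i$ is the latent cognitive factor, $x_i$ is a dummy variable coded 1 for Aβ+ and 0 for Aβ−, and any additional covariates or error covariances the authors used are omitted.

```latex
\begin{aligned}
y_{ij} &= \nu_j + \lambda_j \,\eta_i + \varepsilon_{ij}
  && \text{(reflective indicators: test scores load on the latent factor)} \\
\eta_i &= \gamma \, x_i + \zeta_i
  && \text{(causal indicator: group membership predicts the latent factor)}
\end{aligned}
```

Under this coding, $\gamma$ is the difference in latent means between the Aβ+ and Aβ− groups, and the residuals $\varepsilon_{ij}$ absorb test-specific measurement error, which is why the group comparison is made on the latent factor rather than on the observed scores.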
MANCOVA was performed to see if any neuropsychological tests showed a significant difference between the two groups. Since the measurement errors were not treated in the MANCOVA, we deliberately set the cutoff for significance to be less conservative in order to increase the power and reduce the risk of type II errors. Thus, the tests with p-value < 0.1 were selected to be included in the composite model. The MIMIC model was repeated to identify whether these tests were sensitive to differences between Aβ+ and Aβ− in CN elderlies as a composite. For the PASC score equation, the PCA was used to obtain the weight for each test score. Accuracy, sensitivity, and specificity of the PASC score combined with age, sex, education, and APOE ε4 for distinguishing between Aβ+ and Aβ− were tested by ROC analysis.
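The AUC reported from a ROC analysis can be read as the probability that a randomly chosen Aβ+ participant receives a higher predicted score than a randomly chosen Aβ− participant, with ties counted as one half. The sketch below computes it that way in plain Python. The scores are made-up numbers, and the function name `auc_from_scores` is hypothetical; the text does not specify how the authors combined the PASC score with the covariates, so no such model is shown.

```python
def auc_from_scores(scores_pos, scores_neg):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical predicted scores (higher = more likely amyloid positive).
amyloid_positive = [0.9, 0.8, 0.4]
amyloid_negative = [0.7, 0.3, 0.2, 0.4]
print(auc_from_scores(amyloid_positive, amyloid_negative))
```

The pairwise loop is quadratic in sample size, which is acceptable for cohorts of a few hundred participants; a rank-based formula would be preferable for much larger data.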
The CFA and the MIMIC models of the PASC were validated in the cross-validation sample. The accuracy for distinguishing between Aβ+ and Aβ− of the PASC score combined with age, sex, education, and APOE ε4 was investigated by the ROC curve analysis in the external validation sample.
The raw score of each test was used in the statistical analyses for development of the PASC model. Z-scores were used to compute the PASC score. The K-TMT-E part B was log-transformed to improve the accuracy of the estimates because of its large range (0-300) and non-normality. Multiple imputation and full information maximum likelihood estimation were used to treat missing values.
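The scoring step (standardize each test, then combine the z-scores with weights) can be sketched as follows. Everything concrete here is an assumption for illustration: the raw scores are invented, only three tests are shown, the weights are equal placeholders rather than the PCA-derived weights, z-scores are taken against the sample itself, and the `+ 1` offset in the log transform is our own guard against `log(0)`.

```python
import math
from statistics import mean, stdev

def z_scores(values):
    """Standardize raw scores against the sample mean and sample SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def weighted_composite(raw_scores, weights):
    """Weighted sum of per-test z-scores, one composite value per participant."""
    z = {test: z_scores(vals) for test, vals in raw_scores.items()}
    n = len(next(iter(z.values())))
    return [sum(weights[t] * z[t][i] for t in z) for i in range(n)]

# Hypothetical raw scores for four participants on three tests.
raw_scores = {
    "svlt_delayed": [8, 5, 10, 7],
    "rcft_delayed": [18.0, 12.5, 22.0, 15.5],
    "k_mmse": [29, 27, 30, 28],
}
# Placeholder equal weights; the PCA-derived weights are not reproduced here.
weights = {test: 1.0 for test in raw_scores}
composite = weighted_composite(raw_scores, weights)

# Log-transforming a skewed completion time (seconds) before modeling.
# The +1 offset is an assumption that avoids log(0) for a range starting at 0.
tmt_b_log = [math.log(t + 1) for t in [45, 80, 300, 120]]
```

Because each z-score column has mean zero, the composite also has mean zero in the sample used for standardization, so scores are interpreted relative to that reference group.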
IBM SPSS (version 25.0, SPSS Statistics/IBM Corp, Armonk NY, USA) was used for the statistical analyses. For comparisons of latent means between the groups, maximum likelihood estimation was performed in Mplus (version 8.0) 50. Because the normality assumption was violated, bias-corrected bootstrapping was also performed.