Genetic data and cognitively-defined late-onset Alzheimer’s disease subgroups

Categorizing people with late-onset Alzheimer’s disease into biologically coherent subgroups is important for personalized medicine. We evaluated data from five studies (total n=4 050, of whom 2 431 had genome-wide single nucleotide polymorphism (SNP) data). We assigned people to cognitively-defined subgroups on the basis of relative performance in memory, executive functioning, visuospatial functioning, and language at the time of Alzheimer’s disease diagnosis. We compared genotype frequencies for each subgroup to those from cognitively normal elderly controls. We focused on APOE and on SNPs with p<10⁻⁵ and odds ratios (ORs) more extreme than those previously reported for Alzheimer’s disease (<0.77 or >1.30). There was substantial variation across studies in the proportions of people in each subgroup. In each study, higher proportions of people with isolated substantial relative memory impairment had ≥1 APOE ε4 allele than any other subgroup (overall p=1.5 × 10⁻²⁷). Across subgroups, there were 33 novel suggestive loci across the genome with p<10⁻⁵ and an extreme OR compared to controls, of which none had statistical evidence of heterogeneity and 30 had ORs in the same direction across all datasets. These data support the biological coherence of cognitively-defined subgroups and nominate novel genetic loci.


Introduction
Clinical heterogeneity is common among people with late-onset Alzheimer's disease; see 1 for a review. Categorizing people with a condition into biologically coherent subgroups is an important personalized medicine strategy. 2 This strategy is particularly recommended for neurodegenerative conditions. 3 Once biologically coherent subgroups are identified, further investigations may elucidate subgroup-specific treatments.
Genetic data may be useful for determining whether a proposed categorization strategy results in biologically coherent subgroups; see the Box.
We have developed an approach for categorizing people with late-onset Alzheimer's disease based on relative performance across cognitive domains. We determine each person's average performance at diagnosis across memory, executive functioning, language, and visuospatial ability, and consider relative impairments in each domain from that average. We previously evaluated one study's data and showed that our strategy identified a subgroup with higher degrees of amyloid angiopathy and higher proportions with ≥1 APOE ε4 allele. 4 Here we evaluate data from five studies with people with late-onset Alzheimer's disease 5 and cognitively normal elderly controls. 6 We used modern psychometric approaches to co-calibrate cognitive scores. We used scores to identify subgroups.
We used genetic data to determine whether our categorization identifies biologically coherent subgroups.

Study design and participants
We used data from the Adult Changes in Thought (ACT) study, the Alzheimer's Disease Neuroimaging Initiative (ADNI), the Rush Memory and Aging Project (MAP) and Religious Orders Study (ROS), and the University of Pittsburgh Alzheimer Disease Research Center (PITT). Each study has published widely, and their genetic data are included in large analyses of late-onset Alzheimer's disease. 6,7 All five studies use the same research criteria to define clinical Alzheimer's disease. 5 Three studies (ACT, MAP, and ROS) are prospective cohort studies that enroll cognitively normal elderly individuals and follow them to identify incident dementia cases. For these, we analyzed cognitive data from the visit with the incident clinical Alzheimer's disease diagnosis. Two studies (ADNI and PITT) are clinic-based research cohort studies. For those studies we analyzed cognitive data from the first study visit for people with prevalent Alzheimer's disease; we limited inclusion to those with Clinical Dementia Rating Scale 8 of 0.5 or 1, indicating mild Alzheimer's disease. For people from ADNI or PITT who did not initially have Alzheimer's disease, we analyzed cognitive data from the incident Alzheimer's disease visit.
In each study, we included people diagnosed with late-onset Alzheimer's disease as defined by research criteria. 5 We used data from everyone for all analyses other than genetic analyses; we limited those to self-reported whites. For genetic analyses we also used data from self-reported white cognitively normal elderly controls from each study. Details on those cohorts are included in reports from the parent studies and in Supplemental Text 6 (derived from Lambert et al. 6 ).

Cognitive data procedures
Staff from each study administered a comprehensive neuropsychological test battery that included assessment of memory, executive functioning, language, and visuospatial functioning. We obtained granular ("item-level") data from each parent study. Each stimulus administered to a participant was deemed an "item". As outlined in our previous paper 2 , every item administered in each study was considered by our panel of experts (JM, ET, AJS). Our panel designated each item as primarily a measure of memory, executive functioning, language, visuospatial functioning, or none of these. They also assigned items to theory-driven subdomains.
We carefully considered items where the same stimulus was administered to participants across different studies to identify "anchor items" that could anchor metrics across studies. We reviewed response categories recorded by each study for these overlapping items to ensure that consistent scoring was used across studies.
We identified anchor items as those overlapping items with identical scoring across studies. We used bifactor models in Mplus 9 to co-calibrate separate scores for memory, executive functioning, language, and visuospatial functioning.
Details of item assignments, psychometric analyses in each study, and co-calibration methodology across studies are provided in Supplemental Texts 1, 2, and 5. Code is available from the authors upon request.
We used the ACT sample of people with incident Alzheimer's disease as our reference population for the purpose of scaling domain scores; ACT was our largest sample from a prospective cohort study of people with late-onset Alzheimer's disease. We z-transformed scores from other studies to ACT-defined metrics for each domain.
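The rescaling step can be illustrated with a minimal sketch. It assumes that "z-transforming to ACT-defined metrics" means expressing each co-calibrated score in units of the reference sample's mean and standard deviation; the function name and the example values are hypothetical, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def scale_to_reference(scores, reference_scores):
    """Express scores in units of the reference sample's mean and SD.

    Assumption: the sample SD (n - 1 denominator) of the reference
    group defines the metric; the source does not specify this.
    """
    ref_mean = mean(reference_scores)
    ref_sd = stdev(reference_scores)
    return [(s - ref_mean) / ref_sd for s in scores]


# Hypothetical co-calibrated memory scores (not real study data).
reference_memory = [0.0, 2.0, 4.0]   # stands in for the ACT reference sample
other_study_memory = [3.0, 0.0]      # stands in for another study

scaled = scale_to_reference(other_study_memory, reference_memory)
```

After this step a value of 0 corresponds to the reference sample's average and a value of -1 to one reference standard deviation below it, which is what makes a fixed threshold in standard deviation units meaningful across studies.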

Assignment to subgroups
We assigned people to subgroups as we have done previously. 4 Briefly, for each person we determined their average score across memory, executive functioning, language, and visuospatial functioning. We determined differences between each domain score and this average score. We identified domains with substantial relative impairments as those with relative impairments at least as large as 0.80 standard deviation units as explained in Supplemental Text 3. We categorized people by the number of domains with substantial relative impairments (0, 1, or ≥2) and further categorized those with substantial relative impairments in a single domain by the domain with a substantial relative impairment. This approach results in six potential subgroups: those with no domain with a substantial relative impairment; those with an isolated substantial relative impairment in one of four domains (e.g., isolated substantial relative memory impairment, isolated substantial relative language impairment, etc.), and those with multiple domains with substantial relative impairments.
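The assignment rule above can be sketched in a few lines. This is an illustration rather than the authors' code: it assumes domain scores are already on a common standard-deviation metric, that an impairment means a domain score at least 0.80 units below the person's own four-domain average, and that the subgroup labels below are acceptable shorthand.

```python
DOMAINS = ("memory", "executive", "language", "visuospatial")
THRESHOLD = 0.80  # "substantial" relative impairment, in SD units


def assign_subgroup(scores, threshold=THRESHOLD):
    """Assign one person to a cognitively-defined subgroup.

    scores: dict mapping each domain name to a scaled score.
    Assumption: lower scores mean worse performance.
    """
    average = sum(scores[d] for d in DOMAINS) / len(DOMAINS)
    # Domains that fall at least `threshold` below the person's own average.
    impaired = [d for d in DOMAINS if average - scores[d] >= threshold]
    if not impaired:
        return "no domain"
    if len(impaired) == 1:
        return "isolated " + impaired[0]
    return "multiple domains"


# Hypothetical people (not real study data).
person_a = {"memory": -1.5, "executive": 0.0, "language": 0.0, "visuospatial": 0.1}
person_b = {"memory": -2.0, "executive": 1.0, "language": -2.0, "visuospatial": 1.0}
person_c = {"memory": 0.0, "executive": 0.0, "language": 0.0, "visuospatial": 0.0}

labels = [assign_subgroup(p) for p in (person_a, person_b, person_c)]
```

Because the comparison is against each person's own average, overall severity shifts all four scores together and does not by itself move someone into a different subgroup.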

Statistical analyses
As in our previous study, 4 we compared the proportion of people with late-onset Alzheimer's disease in each subgroup with at least one APOE ε4 allele. For other genetic analyses we combined data from ROS and MAP, as has been done many times previously, and evaluated data separately in four genetic datasets. Each dataset was imputed using IMPUTE2 and samples of European ancestry from the 2012 build of the 1000 Genomes project. We excluded SNPs with R² or information scores <0.5, and SNPs with a minor allele frequency <3%. Further details are provided in Supplemental Text 6 and in Lambert et al. 6 We used KING-Robust 10 to obtain study-specific principal components to account for population stratification. We used logistic regression in PLINK v1.9 11 to evaluate associations at each SNP for each cognitively-defined subgroup. Cognitively normal elderly controls from each study were the comparison group for all of these analyses. We included covariates for age, sex, and principal components. We used METAL 12 for meta-analysis.
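To make the meta-analysis step concrete, the following sketch pools per-dataset log odds ratios with inverse-variance weights and computes Cochran's Q as a heterogeneity statistic. METAL offers both sample-size-weighted and standard-error-weighted schemes, and the text does not state which was used, so the inverse-variance choice here is an assumption; the function name and the input values are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effect pooling of per-dataset log odds ratios.

    Returns (pooled_beta, pooled_se, two_sided_p, cochran_q).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    cochran_q = sum(w * (b - pooled_beta) ** 2 for w, b in zip(weights, betas))
    return pooled_beta, pooled_se, p_value, cochran_q


# Hypothetical log-ORs and standard errors for one SNP in four datasets.
betas = [0.35, 0.30, 0.40, 0.28]
ses = [0.12, 0.10, 0.15, 0.11]

pooled_beta, pooled_se, p_value, cochran_q = fixed_effect_meta(betas, ses)
pooled_or = math.exp(pooled_beta)
```

Under the null hypothesis of no heterogeneity, Q is compared with a chi-square distribution with (number of datasets - 1) degrees of freedom.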
IGAP's most extreme odds ratio (OR) outside of chromosome 19 was for rs11218343 associated with SORL1, which had an OR of 0.77 reported in the Stage 1 and 2 meta-analysis. 6 We focused attention on SNPs where meta-analysis ORs for any cognitively-defined subgroup were <0.77 (or, equivalently, ORs >[1/0.77], which is >1.30). As presented in the Box, more extreme ORs in a single subgroup, with strong replication across genetic datasets, would represent strong support of biologically coherent categorization.
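The two thresholds combine into a simple filter, sketched here with hypothetical meta-analysis results. The constant and function names are illustrative; only the 0.77 cut-off, its reciprocal, and the p<10⁻⁵ criterion come from the text.

```python
OR_LOW = 0.77             # most extreme IGAP OR outside chromosome 19
OR_HIGH = 1.0 / OR_LOW    # equivalent threshold for risk alleles, about 1.30
P_SUGGESTIVE = 1e-5


def is_extreme_suggestive(odds_ratio, p_value):
    """True when a SNP has p < 1e-5 and an OR more extreme than 0.77 or 1/0.77."""
    extreme = odds_ratio < OR_LOW or odds_ratio > OR_HIGH
    return p_value < P_SUGGESTIVE and extreme


# Hypothetical (SNP label, OR, p value) rows, not results from the study.
results = [
    ("snp_a", 1.45, 3e-6),   # extreme OR and suggestive p: kept
    ("snp_b", 1.20, 1e-7),   # small p but OR not extreme: dropped
    ("snp_c", 0.70, 1e-4),   # extreme OR but p too large: dropped
]
kept = [name for name, odds_ratio, p in results if is_extreme_suggestive(odds_ratio, p)]
```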
For genetic loci previously identified as associated with risk of late-onset Alzheimer's disease, we used the methods described in 13 applied to IGAP's previously reported ORs and confidence intervals to determine significance of subgroup associations compared to IGAP.
We used genetic data from all cognitively normal elderly controls and all people with Alzheimer's disease to generate Alzheimer's disease and subgroup genetic risk scores. We used 1) IGAP SNPs and effect sizes to generate Alzheimer's disease risk scores, and 2) our results to generate risk scores for each of the five subgroups.
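A common way to build such a score is a weighted sum of allele dosages, with the log odds ratio of each SNP as its weight. The sketch below assumes that construction, which the text does not spell out; the function name, the second SNP label, its OR, and the dosages are hypothetical. The OR of 0.77 for rs11218343 is the IGAP value quoted above.

```python
import math


def genetic_risk_score(dosages, odds_ratios):
    """Weighted sum of effect-allele dosages, weighted by log(OR).

    dosages: dict of SNP id -> effect-allele dosage (0 to 2).
    odds_ratios: dict of SNP id -> OR for the same effect allele.
    """
    return sum(dosages[snp] * math.log(or_value)
               for snp, or_value in odds_ratios.items())


# rs11218343 OR is from IGAP; "snp_b" and its OR are hypothetical.
odds_ratios = {"rs11218343": 0.77, "snp_b": 1.50}
person_dosages = {"rs11218343": 2.0, "snp_b": 1.0}

score = genetic_risk_score(person_dosages, odds_ratios)
```

The same function would serve for a subgroup score by swapping in that subgroup's SNPs and effect sizes.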
To evaluate these risk scores, we used logistic regression with Alzheimer's disease case vs. control status as the outcome, and included terms for age and sex. We compared a model with just the addition of the IGAP genetic risk score to a model that incorporated that score plus the five subgroup risk scores. Finally, we compared area under the receiver operating characteristic (ROC) curves for the model with the IGAP risk score to a model that did not include that term but included terms for the five subgroup risk scores. Further details are shown in Supplemental Text 12.
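The area under the ROC curve equals the probability that a randomly chosen case receives a higher predicted value than a randomly chosen control, with ties counted as one half. The sketch below computes it that way from hypothetical predicted probabilities; it illustrates the quantity being compared and does not reproduce the logistic models or the significance test for the difference between curves.

```python
def roc_auc(predictions, labels):
    """Area under the ROC curve via pairwise case-control comparisons.

    predictions: predicted probabilities (or any risk score).
    labels: 1 for cases, 0 for controls.
    """
    cases = [p for p, y in zip(predictions, labels) if y == 1]
    controls = [p for p, y in zip(predictions, labels) if y == 0]
    wins = 0.0
    for case_value in cases:
        for control_value in controls:
            if case_value > control_value:
                wins += 1.0
            elif case_value == control_value:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical fitted probabilities from one model (not study results).
predictions = [0.10, 0.40, 0.35, 0.80]
labels = [0, 0, 1, 1]

auc = roc_auc(predictions, labels)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from controls.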

Data sharing
Co-calibrated scores for each domain are available from the parent studies. GWAS meta-analysis summary statistics for each subgroup will be available on the National Institute on Aging Genetics of Alzheimer's Disease Storage Site (NIAGADS; www.niagads.org).

Role of the funding source
The funders of the study had no role in study design, data collection, data analysis, data interpretation, or writing of the report. All authors had full access to all the data in the study. The corresponding author had final responsibility for the decision to submit the publication.

Results
In all, there were 4 050 people with sufficient cognitive data to be classified into a subgroup. Demographic characteristics and average cognitive domain scores by study are shown in Table 1. Participants in the prospective cohort studies (ACT, MAP, and ROS) were older on average than those from ADNI and PITT. Most participants in each study self-reported white race (90% in ACT to 96% in MAP).
There was some variation in cognitive performance across studies. The most notable differences from ACT (our reference for scaling) were for executive functioning in ADNI and ROS (average 0.8 units higher), and for language in MAP (average 0.8 units lower).
Demographic characteristics of people in each subgroup were similar to those for people with Alzheimer's disease overall (Supplemental Table 29). The proportion who were female ranged from 51% for isolated substantial relative executive functioning impairment to 63% of those with isolated substantial relative visuospatial impairment. Mean age at diagnosis ranged from 79 for those with isolated substantial relative visuospatial or memory impairment to 82 for those with isolated substantial relative language impairment. Mean years of education did not vary substantially across subgroups.
There were 3 701 people with APOE genotype data (Table 2). We published APOE results from ACT; 4 the proportion of those with isolated substantial memory impairment with ≥1 APOE ε4 allele was 12% higher than overall in that study. This finding was consistent across all five studies. Overall, the proportion of people with ≥1 APOE ε4 allele was 15 percentage points higher among those with isolated substantial memory impairment (65%) compared with the entire sample (50%). The differences in proportions with ≥1 APOE ε4 allele were highly significant (p=1.5 × 10⁻²⁷).
[...] Table 2a and 2b. All ORs were in the same direction for all these SNPs except rs28715896 on chromosome 2 near ERBB4, rs61835453 on chromosome 10, and rs365521 on chromosome 17. Heterogeneity p values did not suggest heterogeneity for any of these SNPs. No SNP outside the APOE region reached p<5×10⁻⁸, the traditional level of genome-wide significance (Figure 2).
The area under the ROC curve for the model with age, sex, and the IGAP genetic risk score was 0.60 (95% CI 0.58, 0.61), while for the model with age, sex, and five subgroup genetic risk scores, the area under the ROC curve was 0.62 (95% CI 0.61, 0.64). This difference was statistically significant (χ² (df=1) = 11.15, p=0.0008). Further analyses are described in Supplemental Text 12.

Discussion
We used modern psychometric methods to co-calibrate cognitive data to generate domain scores across five different studies of older adults with research-quality clinical Alzheimer's disease diagnoses. We obtained scores on the same metric, so scores from different studies were directly comparable to each other. The proportion of people with Alzheimer's disease in each study categorized in each subgroup varied across studies (Figure 1). We used genetic data to determine whether our categorization scheme resulted in biologically coherent subgroups. Top genetic associations from each subgroup were consistent across studies, suggesting our findings were not due to idiosyncrasies of any particular study. Gene scores for subgroups performed better in predicting case vs. control status than gene scores derived from Alzheimer's disease case-control analyses.
APOE genotype was significantly different across subgroups. We previously showed in ACT that more people with isolated substantial relative memory impairment had at least one APOE ε4 allele than people in other subgroups. 4 We robustly replicated that finding here (Table 2). Associations between APOE ε4 alleles and memory impairment among people with Alzheimer's disease have been previously noted. 14
Outside of chromosome 19, which was dominated by APOE-related signals for all subgroups, we identified 33 loci with p<10⁻⁵ for at least one subgroup. All of these had ORs <0.77 or >1.30 (Figure 2). There were consistent findings across datasets for nearly all these loci (Supplemental Text 7-11).
Gene scores from IGAP SNPs explained a modest amount of risk for case-control status. Adding gene scores from our cognitively defined subgroups improved prediction of Alzheimer's disease. Furthermore, in a head-to-head comparison, gene scores for cognitively defined subgroups did a better job predicting Alzheimer's disease status than did gene scores from IGAP SNPs.
These data provide strong support for the biological coherence of subgroups produced by our categorization scheme. Each subgroup we analyzed has extreme ORs at novel SNPs that were consistent across multiple independent samples. Even with the relatively small sample sizes from these studies, the large effect sizes at common SNPs produced p values that are close to genome-wide significance.
Others have used different data sources to categorize people with Alzheimer's disease. Sweet and colleagues compared people with Alzheimer's disease who developed psychosis to those who did not. 15 They identified a few interesting loci, but effect sizes were much smaller than those reported here (Supplemental Text 13).
We were part of a consortium evaluating rates of decline among people with Alzheimer's disease. 16 The evidence in support of rates of decline, as an organizing characteristic among people with Alzheimer's disease, is not nearly as strong as that shown here for cognitively-defined subgroups.
Others have used cluster analysis approaches applied to neuropsychological 17,18 or imaging 19,20 data to categorize people with Alzheimer's disease. There are very important distinctions between those approaches and the approach adopted here.
In cluster analysis, the computer maximizes some distance across groups in a way that may not make clinical or biological sense. Disease severity is an important consideration; see 21 for a nice discussion.
Our approach began with theory and focused exclusively on cognitive data. An early paper considered differences between memory and executive functioning among people with Alzheimer's disease. 22 Differences between these scores enable memory to serve as something of a proxy for disease severity. This framework is useful for considering dysexecutive Alzheimer's disease. [23][24][25][26] We have extended that framework to incorporate additional cognitive domains. The field has increasingly emphasized the importance of Alzheimer's disease variants including primary progressive aphasia (PPA) and posterior cortical atrophy (PCA); 27 these rare subtypes are described as typically having early onset. Clinical descriptions of the cognitive patterns of these variants emphasize relative deficits between language (PPA) or visuospatial functioning (PCA) and other domains. We thus incorporate average performance across domains, and differences from that average, to more fully capture the range of clinical heterogeneity described in late-onset Alzheimer's disease. 1
Our results should be considered in light of the limitations of our study. Data evaluated here are from studies with well-educated people of European ancestry. It will be important to replicate this approach among people with diverse genetic backgrounds. While we combined data from five large studies, the resulting subgroups were underpowered to reach genome-wide significance, and one subgroup (isolated substantial relative executive functioning impairment) was too small to analyze at all. It will be important to incorporate additional data sets to see whether novel suggestive loci reach genome-wide significance, and to identify additional loci. We used a large threshold of 0.80 SD to characterize "substantial" relative impairments, which may be too conservative. Our categorization approach relies exclusively on cognitive data. A better approach might also incorporate imaging and/or fluid biomarkers.
In conclusion, genome-wide genetic data enabled us to determine that a cognitively-defined categorization scheme produced biologically coherent subgroups of people with Alzheimer's disease. This is an important result on the road towards personalized medicine.

Box. Schematic representation of incoherent vs. coherent subgrouping
The large group at the top represents a heterogeneous group of individuals. A strategy is applied to categorize individuals into subgroups. For a precision medicine approach to work, the categorization should reduce heterogeneity. In the lower left figure, the method did not reduce heterogeneity, and thus we refer to this as an incoherent subgrouping strategy. In contrast, the lower right figure was produced by a different method that resulted in relatively homogeneous subgroups; this method would represent a coherent subgrouping strategy.
For incoherent subgroup comparisons with controls, top genetic hits and effect sizes would not be expected to differ from those observed in the entire group. Further, for a given incoherent subgroup, spurious genetic associations at a locus would not be expected to replicate in that subgroup in other datasets. In contrast, for coherent subgroup comparisons with controls, there is improved potential for identification of novel loci, and effect sizes could be stronger than those seen for the original ungrouped data. Further, replication of subgroup associations in other datasets would occur more often than expected by chance.
Genetic data may serve a useful role in determining whether a categorization strategy produces biologically coherent subgroups.