
# New insights into the genetic etiology of Alzheimer’s disease and related dementias

## Abstract

Characterization of the genetic landscape of Alzheimer’s disease (AD) and related dementias (ADD) provides a unique opportunity for a better understanding of the associated pathophysiological processes. We performed a two-stage genome-wide association study totaling 111,326 clinically diagnosed/‘proxy’ AD cases and 677,663 controls. We found 75 risk loci, of which 42 were new at the time of analysis. Pathway enrichment analyses confirmed the involvement of amyloid/tau pathways and highlighted microglia implication. Gene prioritization in the new loci identified 31 genes that were suggestive of new genetically associated processes, including the tumor necrosis factor alpha pathway through the linear ubiquitin chain assembly complex. We also built a new genetic risk score associated with the risk of future AD/dementia or progression from mild cognitive impairment to AD/dementia. The improvement in prediction led to a 1.6- to 1.9-fold increase in AD risk from the lowest to the highest decile, in addition to effects of age and the APOE ε4 allele.

## Main

AD is the most common form of dementia. The heritability is high, estimated to be between 60% and 80% (ref. 1). This strong genetic component provides an opportunity to determine the pathophysiological processes in AD and to identify new biological features, new prognostic/diagnostic markers and new therapeutic targets through translational genomics. Characterizing the genetic risk factors in AD is therefore a major objective; with the advent of high-throughput genomic techniques, a large number of putative AD-associated loci/genes have been reported (ref. 2). However, much of the underlying heritability remains unexplained. Hence, increasing the sample size of genome-wide association studies (GWASs) is an obvious solution that has already been used to characterize new genetic risk factors in other common, complex diseases (e.g., diabetes).

## Genetic overlap with other neurodegenerative diseases

We tested the association of the lead variants within our new loci with the risk of developing other neurodegenerative diseases or AD-related disorders (Supplementary Fig. 33 and Supplementary Tables 10–12). We also performed more precise colocalization analyses (using the coloc R package, https://cran.r-project.org/web/packages/coloc/index.html) for five loci known to be associated with Parkinson’s disease (IDUA and CTSB), types of frontotemporal dementia (TMEM106B and GRN) and amyotrophic lateral sclerosis (TNIP1) (Supplementary Tables 13 and 14). The IDUA signal for Parkinson’s disease was independent of the signal in ADD (coloc posterior probability (PP)3 = 99.9%), but we were not able to determine whether the CTSB signals colocalized. The TMEM106B and GRN signals in frontotemporal lobar degeneration with TAR DNA-binding protein (TDP-43) inclusions (frontotemporal lobar degeneration TDP) probably share causal variants with ADD (coloc PP4 = 99.8% and coloc PP4 = 80.1%, respectively). Lastly, we were not able to determine whether the TNIP1 signals colocalized for ADD and amyotrophic lateral sclerosis.

## Pathway analyses

Next, we sought to perform a pathway enrichment analysis on the stage I association results to gain better biological understanding of this newly expanded genetic landscape for ADD. Ninety-three gene sets were still statistically significant after correction for multiple testing (q ≤ 0.05; Methods and Supplementary Table 15). As described previously, the most significant gene sets are related to amyloid and tau5; other significant gene sets are related to lipids, endocytosis and immunity (including macrophage and microglial cell activation). When restricting this analysis to the meta-analysis based on the clinically diagnosed AD cases, 54 gene sets were significant (q ≤ 0.05). Of these 54 gene sets, 33 reached q ≤ 0.05 in the stage I analysis and all reached P ≤ 0.05. This indicates that the inclusion of proxy-ADD cases does not cause disease-relevant biological information to be missed and underlines the additional power of this type of analysis.

We next performed a single-cell expression enrichment analysis by using the average gene expression per nucleus (Av. Exp.) data in the human Allen Brain Atlas (49,495 nuclei from 8 human brains). Only the microglial expression reached a high level of significance (P = 1.7 × 10−8; Supplementary Table 16); greater expression corresponded to a more significant association with ADD. After adjusting for microglial Av. Exp., the remaining associations became nonsignificant; this indicates that microglial Av. Exp. drives all the other cell-type associations. These results were observed whatever the brain region studied (Supplementary Table 16). A similar result was observed using a mouse single-cell dataset14 (Supplementary Table 17 and Supplementary Note).

Lastly, we looked at whether the relationship between an elevated microglia Av. Exp. and a genetic association with the ADD risk was specific to particular biological processes (Supplementary Table 18) by analyzing the interaction between microglia Av. Exp. and pathway membership in MAGMA15. Of the five most significant interaction signals (q ≤ 10−3), two were directly associated with endocytosis processes (GO:0006898 and GO:0031623); this suggested a functional relationship between microglia and endocytosis, which is known to be involved in phagocytosis (Supplementary Table 18). It is noteworthy that we also detected an interaction between GO:1902991 (regulation of amyloid precursor protein (APP) catabolic process) and the gene expression level in microglia (q = 1.4 × 10−3; Supplementary Table 18). Because these data suggest a functional relationship between microglia and APP/amyloid beta (Aβ) peptide pathways, this observation reinforces the likely involvement of microglial endocytosis in AD, a mechanism that is also strongly involved in APP metabolism16. Of note, there are overall similarities in the interaction effects of human and mouse microglia expression with genes in biological pathways of relevance to the AD genetic risk (Supplementary Table 18 and Supplementary Note).

## Gene prioritization

We next attempted to identify the genes most likely to be responsible for the association signal with ADD at each new locus. To this end, we studied the downstream effects of ADD-associated variants on molecular phenotypes (i.e., expression, splicing, protein expression, methylation and histone acetylation) in various cis-quantitative trait locus (cis-QTL) catalogues from AD-relevant tissues, cell types and brain regions. We investigated the genetic colocalization between association signals for the ADD risk and those for the molecular phenotypes and the association between the ADD risk and these phenotypes by integrating cis-QTL information into our ADD GWAS. Moreover, we considered the lead variant annotation (the allele frequency, protein-altering effects and nearest protein-coding gene) and a genome-wide, high-content short interfering RNA screen for APP metabolism17. Based on this evidence, we developed a systematic gene prioritization strategy that yielded a total weighted score of between 0 and 100 for each gene (Supplementary Fig. 34 and Supplementary Note). This score was used to compare and prioritize genes in the new loci within 1 Mb upstream and 1 Mb downstream of the lead variants. Genes either were ranked as tier 1 (greater likelihood of being the causal risk gene responsible for the ADD signal) or tier 2 (lower likelihood and the absence of a minimum level of evidence as a causal risk gene) or were not ranked.

From all newly identified loci, this gene prioritization yielded 31 tier 1 genes and 24 tier 2 genes. The 55 prioritized genes, the details of the analyses and the supporting evidence are summarized in Fig. 2a and the Supplementary Note (Supplementary Tables 19–30 and Supplementary Figs. 35–45). Among the 31 tier 1 genes, 25 were the only prioritized gene in their respective locus. For the remaining 6 tier 1 genes, we also found tier 2 genes in their respective locus. We also identified five loci containing several tier 2 prioritized genes. In one of these loci, locus 39 (L39), the tier 2 prioritized gene LILRB2 had strong additional support from published literature (Supplementary Note). In five loci (L10, L12, L13, L14 and L32), our prioritization score did not identify sufficient molecular evidence to prioritize genes, with the exception of being the nearest gene. Finally, we excluded the complex IGH cluster (L27) from gene prioritization analyses due to the genomic complexity of this telomeric locus as a consequence of known fusion events18.

We highlight two examples, L18 and L23. In L18, the lead variant, rs76928645 (MAF = 10%), is intergenic and is located more than 100 kb downstream or upstream of the two nearest protein-coding genes (SEC61G and EGFR, respectively). Our gene prioritization analyses suggested that EGFR was the only risk gene (Fig. 3). We found that both the lead variant (rs76928645) and the other nearby variants in linkage disequilibrium (LD) are significant expression QTLs (eQTLs) for regulating EGFR expression downstream. The eQTL signals in brain strongly colocalized with the GWAS signal (with eQTL coloc PP4s of 98.3% in the temporal cortex (TCX) and 99.5% in the dorsolateral prefrontal cortex (DLPFC)). Accordingly, the fine-mapped expression transcriptome-wide association study (eTWAS) associations (Fine-mapping Of CaUsal gene Sets (FOCUS) posterior inclusion probability (PIP) = 1; eTWAS P = 6.9 × 10−9, eTWAS Z = + 5.8 in the TCX; eTWAS P = 3.1 × 10−11, eTWAS Z = + 6.6 in the DLPFC) indicated that genetic downregulation of EGFR expression is associated with a lower ADD risk (Fig. 3; Supplementary Tables 22, 24 and 26; and Supplementary Figs. 36a, 39 and 41).

In L23, we observed numerous eQTL-GWAS and methylation QTL (mQTL)-GWAS hits for TSPAN14 that support the hypothesis that increased brain expression of TSPAN14 is associated with increased ADD risk. We also identified several splice junctions in TSPAN14 whose genetic regulation signals in lymphoblastoid cell lines (LCLs) and brain colocalized with the ADD association signal. These splice junctions were also associated with ADD risk (Fig. 4, Supplementary Tables 22–28 and Supplementary Figs. 36–42 and 44c). As three of these splice junctions were related to new complex cryptic splicing events that were predicted to result in two cryptic exons not previously described in known TSPAN14 transcripts (based on GENCODE v38), we designed a long-read single-molecule (Nanopore) sequencing experiment (Supplementary Note) on a total of 93 complementary DNA (cDNA) samples derived from LCLs, frontal cortex and hippocampus, and this experiment validated both cryptic exons (Fig. 4). All three of the validated cryptic splicing events occur within the ADAM10-interacting domain of TSPAN14. Cryptic exon 1 is at least 45 bp long, and cryptic exon 2 is 118 bp long.

Lastly, we used STRING v11 (ref. 19) to analyze protein–protein interaction for (1) previously known AD genes from GWASs, (2) our prioritized new genes (tier 1 in Fig. 2a and Supplementary Table 20) and (3) a combination of the two (Supplementary Note). The largest networks contained 14, 8 and 30 proteins, respectively (Supplementary Fig. 46). These networks were larger than would be expected by chance (respectively, P < 2 × 10−5, P = 2.8 × 10−3 and P < 2 × 10−5 based on comparison with 50,000 randomly simulated protein lists matched for the number of proteins and the total number of interactions for each protein). Notably, the number of interactions between our prioritized genes and previously known genes is also significantly greater than would be expected (P < 1 × 10−4), indicating that the newly prioritized genes are biologically relevant in AD. No such enrichment (P = 0.88) was observed for the remaining genes in the new loci, again highlighting the value of our prioritization approach.
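The comparison against 50,000 simulated protein lists implies an empirical P value, whose smallest reportable bound is 1/50,000 = 2 × 10−5. The following is a minimal sketch of that calculation, assuming the P value is the fraction of simulated networks at least as large as the observed one; the function name and this counting convention are illustrative assumptions, not the authors' code.

```python
def empirical_p(observed, simulated):
    """Empirical one-sided P value from a list of simulated statistics.

    Assumption (not stated in the text): P is the fraction of simulated
    values that are >= the observed value. When no simulated value reaches
    the observed one, only an upper bound (P < 1/n) can be reported.
    Returns (p_or_bound, is_upper_bound).
    """
    n = len(simulated)
    hits = sum(1 for s in simulated if s >= observed)
    if hits == 0:
        return 1.0 / n, True  # report as "P < 1/n"
    return hits / n, False


# Hypothetical example: an observed network of 30 proteins that no
# simulated list matches, out of 50,000 simulations.
bound, is_bound = empirical_p(30, [5] * 50_000)
```

With 50,000 simulations and no simulated value reaching the observed one, the bound is 2 × 10−5, which matches the form in which the two smallest P values are reported.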

We next performed a pathway enrichment analysis of the tier 1 genes using STRING. We found that several gene sets linked to the immune system remained statistically significant after correction for multiple testing (Fig. 2b and Supplementary Table 31), especially regulation of the tumor necrosis factor (TNF)-mediated signaling pathway (GO:0010803). We report the potential genetic implication of the linear ubiquitin chain assembly complex (LUBAC), which is a major regulator of the aforementioned signaling pathway20. Two of the LUBAC’s three components are encoded by the new tier 1 prioritized genes SHARPIN and RBCK1, and the complex’s function is directly regulated by OTULIN (also a new tier 1 prioritized gene).

## GRS

We next looked at whether the genetic ADD burden (as measured by a genetic risk score (GRS)) generated from our genome-wide significant variants (n = 83, excluding APOE; Supplementary Table 32) might influence the rate of conversion to AD in (1) individuals from several prospective, population-based cohorts and (2) patients with mild cognitive impairment (MCI) in prospective memory clinic studies (Supplementary Table 33). We used Cox regression models to assess the association after adjustment for age at baseline, sex, the number of APOE-ε4 and APOE-ε2 alleles, and genetic principal components (PCs).
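As an illustration of how such a score is typically built, the sketch below computes a weighted GRS as the sum of risk-allele dosages multiplied by log odds ratios. The exact weighting and scaling used in the study are described in the Supplementary Note and are not reproduced here; dividing by the mean weight, so that one unit corresponds to one "average risk allele", is an assumption suggested by the way hazard ratios are reported.

```python
import math


def genetic_risk_score(dosages, odds_ratios):
    """Weighted GRS expressed in units of 'average risk alleles'.

    dosages:      risk-allele dosages (0..2) for one individual, one per variant
    odds_ratios:  per-variant odds ratios, all oriented to the risk allele (> 1)

    Assumption: weights are log(OR), and the weighted sum is divided by the
    mean weight so that one unit equals one allele of average effect.
    """
    if len(dosages) != len(odds_ratios):
        raise ValueError("one dosage per variant is required")
    weights = [math.log(o) for o in odds_ratios]
    raw = sum(d * w for d, w in zip(dosages, weights))
    mean_weight = sum(weights) / len(weights)
    return raw / mean_weight


# Hypothetical individual carrying one risk allele at each of two variants
# with identical effect sizes: the score equals the allele count.
score = genetic_risk_score([1, 1], [1.1, 1.1])
```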

In population-based cohorts with clinically diagnosed AD cases, the GRS was significantly associated with conversion to AD; this was shown in a fixed-effect meta-analysis (hazard ratio (HR) (95%CI) per average risk allele = 1.076 (1.064–1.088), P = 9.2 × 10−40; Fig. 5 and Supplementary Table 34). Likewise, the GRS was significantly associated with AD conversion in patients with MCI (HR = 1.056 (1.040–1.072), P = 2.8 × 10−12; Fig. 5 and Supplementary Table 35). Furthermore, we found that the GRS association increased significantly when the new variants discovered in the present study were added to the previously described variants (Supplementary Table 36) for both population-based studies (HR = 1.052 (1.037–1.068), P = 1.5 × 10−11) and MCI cohorts (HR = 1.034 (1.013–1.055), P = 1.4 × 10−3).

Importantly, the results of our meta-analysis suggest that the risk of conversion to AD rises with the number of risk alleles from non-APOE risk variants in the GRS by 1.9-fold in population-based cohorts (HR = 1.93 (1.75–2.13); Fig. 5) and 1.6-fold in MCI cohorts (HR = 1.63 (1.42–1.87); Fig. 6) on top of effects of age and the APOE ε4 allele. These observations result from the comparison of hypothetical individuals with a GRS value at the first decile of the distribution versus those with a GRS value at the ninth decile (Fig. 6). With regard to APOE, carrying an additional APOE-ε4 allele was associated with a slightly higher increase in the AD risk in population-based cohorts (HR = 2.19 (2.03–2.37)) and MCI cohorts (HR = 1.90 (1.73–2.07)). There was no interaction between the GRS and the number of APOE-ε4 alleles (Supplementary Table 37).
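The per-allele and decile-to-decile hazard ratios can be related by a back-of-the-envelope check. Under a Cox model that is log-linear in the GRS, the hazard ratio between two GRS values equals the per-unit hazard ratio raised to the power of their difference. The sketch below inverts that relation; it is an illustrative consistency check under this assumption, not the procedure the authors used to obtain the reported estimates.

```python
import math


def implied_allele_gap(hr_between_deciles, hr_per_allele):
    """Difference in GRS (in average risk alleles) implied by two hazard ratios.

    Assumes log(HR) is linear in the GRS, so that
    hr_between_deciles = hr_per_allele ** gap.
    """
    return math.log(hr_between_deciles) / math.log(hr_per_allele)


# Reported values: population-based cohorts, then MCI cohorts.
population_gap = implied_allele_gap(1.93, 1.076)
mci_gap = implied_allele_gap(1.63, 1.056)
```

Both settings give a gap of roughly nine average risk alleles between the first and ninth deciles, so the two sets of hazard ratios are mutually consistent under the log-linear assumption.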

In an MCI cohort setting, this effect of the GRS corresponds to a median AD conversion probability within 3 years of 21.9% in patients with a GRS below the first decile (range, 4.1–34.9%) and 37.5% (range, 10.8–56.2%) in patients with a GRS above the ninth decile. There was a consistent increase in probability between these deciles in all cohorts (median (range), 13.8% (6.6–25.0%); Supplementary Table 38).

To better define the GRS discriminative ability regarding AD conversion, we assessed the improvements in three indices of predictive performance after adding the GRS to a Cox model containing age, sex, PCs and the number of APOE-ε4 and APOE-ε2 alleles as covariates (Supplementary Tables 34 and 35). We found a small but consistent increase in the discrimination between AD converters and nonconverters, as indicated by the concordance index (C-index) in population-based cohorts (ΔC-index at 5 years, fixed effects = 0.002 (0.0004–0.004)) and MCI cohorts (ΔC-index at 3 years, fixed effects = 0.007 (0.001–0.012)). This finding was further supported by small-to-moderate increases in the continuous NRI (net reclassification improvement) index in population-based cohorts (NRI at 5 years, fixed effects = 0.248 (0.159–0.336)) and MCI cohorts (NRI at 3 years, fixed effects = 0.232 (0.140–0.325)); this indicates that the risk assignment is more appropriate to individuals when the GRS is taken into account21. Furthermore, an increase in the index of prediction accuracy (IPA) was observed in all of the population-based cohorts (average ΔIPA at 5 years, fixed effects = 0.29% (0.23%–0.35%)) and all but one of the MCI cohorts (average ΔIPA at 3 years, fixed effects = 1.53% (1.31%–1.76%)), indicating an overall improvement in predictive performance. As expected, the amount of improvement in this index varied greatly from one cohort to another, given its dependency on incidence rates. The value of adding the new genetic variants was emphasized by the fact that effect sizes (as measured by the indices of predictive ability) were lower when only previously known AD risk variants were included in the GRS (Supplementary Table 39).

The results were similar when we (1) computed indices for other follow-up time points, (2) applied a random-effects meta-analysis, (3) considered conversion to all-cause dementia as the outcome and (4) excluded the Framingham Heart Study (FHS), as it was part of stage II of the GWAS from which ORs for GRS computation were extracted (Supplementary Tables 34–44 and Supplementary Fig. 47).

## Discussion

Our meta-analysis combined a large, new case–control study with previous GWASs. We identified 75 independent loci for ADD; 33 had been reported previously, and 42 correspond to new signals at the time of this analysis. The prioritized genes and their potential impact on the pathophysiology of AD are described in the Supplementary Note.

Our pathway enrichment analyses removed ambiguities concerning the involvement of tau-binding proteins and APP/Aβ peptide metabolism in late-onset AD processes at a much higher level than had been described previously5. It is noteworthy that new genetic risk factors are often first evaluated in the context of known pathways; many new research approaches were developed to systematically characterize putative links among APP metabolism, tau function and ADD genetic risk factors22,23. This approach can lead to circular reasoning and thus artificial enrichment in specific processes. However, we implicate ADAM17, a gene whose protein product, like ADAM10, is known to carry α-secretase activity (ref. 24). This observation suggests that the nonamyloidogenic pathway for APP metabolism might be deregulated in AD. In addition to APP, we also identified six highly plausible prioritized (tier 1) genes (ICA1L, DGKQ, ICA1, DOC2A, WDR81 and LIME1) that are likely to modulate the metabolism of APP.

These pathway enrichment analyses also confirmed the involvement of innate immunity and microglial activation in ADD (Supplementary Table 15). Our single-cell expression enrichment analysis also highlighted genes expressed in microglia (Supplementary Tables 16 and 17). Indeed, three of our prioritized (tier 1) genes (RHOH, BLNK and SIGLEC11) and two of our tier 2 genes (LILRB2 and RASGEF1C) appeared to be mainly expressed in microglia (>90% relative to the total expression summed across cell types; Fig. 2a and Supplementary Table 45). Importantly, SIGLEC11 and LILRB2 have already been linked to Aβ peptides/amyloid plaques25,26.

Here, we also provide genetic evidence of the LUBAC’s potential implication in ADD. Two of the LUBAC’s three components are encoded by SHARPIN and RBCK1, and the LUBAC is regulated by OTULIN; all three genes were found to be high-confidence, prioritized risk genes in our study. The LUBAC is the only E3 ligase known to form linear ubiquitin chains de novo through ubiquitin’s N-terminal methionine. The complex has mostly been studied in the context of inflammation, innate immunity and defense against intracellular pathogens. For instance, the LUBAC is reportedly essential for NLRP3 inflammasome activation27 and thus acts as a key innate immune regulator28. In turn, the NLRP3 inflammasome is essential for the development and progression of Aβ pathology in mice29 and may drive tau pathology through Aβ-induced microglial activation30. The LUBAC is also reportedly involved in autophagy, and linear ubiquitin chain modifications of TDP-43-positive neuronal cytoplasmic inclusions have been described as potential inducers of autophagic clearance31. Lastly, the LUBAC has been studied as a regulator of TNF-α signaling in particular20.

Interestingly, the TNF-α signaling pathway was also flagged by other genetic findings in our study (Supplementary Fig. 48). For example, ADAM17 (also known as TNF-α-converting enzyme) is of pivotal importance in the activation of TNF-α signaling32. The gene product of TNIP1 (TNF-α-induced protein 3-interacting protein 1) is involved in the inhibition of the TNF-α signaling pathway and nuclear factor κB activation/translocation33. An additional TNF-α-related signal was found at SPPL2A (one of the 33 confirmed loci). The protein encoded by SPPL2A is involved in noncanonical shedding of TNF-α34, and PGRN (progranulin, the product of GRN) has been described as a TNF receptor ligand and an antagonist of TNF-α signaling35. Several lines of evidence have linked the inhibition of TNF-α signaling with reduction of both Aβ and tau pathologies in vivo36,37. Although a potential inflammatory connection has been suggested for TNF-α through the activation of the NLRP3 inflammasome38, the TNF-α signaling pathway is also involved in many other physiological functions (e.g., synaptic plasticity in neurons) and pathophysiological processes (e.g., synapse loss) in the brain39. Furthermore, the involvement of the TNF-α signaling pathway and the LUBAC might be important in cell types other than microglia in AD. It is important to note that six of our prioritized (tier 1) genes (ICA1L, EGFR, RITA1, MYO15A, LIME1 and APP) are expressed at a low level in microglia (<10%, relative to the total expression summed across cell types; Supplementary Table 45), emphasizing that ADD results from complex crosstalk between different cell types in the brain23,40. It is also noteworthy that the EGFR pathway is known to interact with the TNF-α signaling pathway41, which suggests interplay between the two signaling pathways during ADD development.

A better understanding of the etiology of ADD might also result from the observation that the risks of developing ADD and frontotemporal dementia are associated with the same causal variants in GRN and TMEM106B. This association might be due to the misclassification of clinical diagnosis of AD and the presence of proxy-ADD cases in the UKBB. However, GRN and TMEM106B have also been linked to brain health and many other neurodegenerative diseases. For instance, GRN and TMEM106B are reportedly potential genetic risk factors for differential aging in the cerebral cortex42 and cognitive impairment in amyotrophic lateral sclerosis43 and Parkinson’s disease44,45. Lastly, both GRN and TMEM106B have already been associated with neuropathological features of AD46,47,48. Taken as a whole, these data may thus emphasize a potential continuum between neurodegenerative diseases in which common pathological mechanisms are driven by GRN and TMEM106B. Interestingly, both GRN and TMEM106B are reported to be involved in defective endosome/lysosome trafficking/function49,50, a defect that is also observed in AD.

In conclusion, we have validated 33 previous loci, doubled the total number of genetic loci associated with the ADD risk, expanded our current knowledge of the pathophysiology of ADD, identified new opportunities for the development of GRSs and gene-specific treatments and opened up a pathway to translational genomics and personalized medicine.

## Methods

### Samples

All of our stage I meta-analysis samples came from the following consortia/datasets: EADB, GR@ACE, EADI, GERAD/PERADES, DemGene, Bonn, the Rotterdam study, the CCHS study, NxC and the UKBB. In the UKBB, individuals who did not report dementia or any family history of dementia were used as controls; the analysis included 2,447 diagnosed cases, 46,828 proxy cases of dementia and 338,440 controls. All individuals included in stage I are of European ancestry; demographic data on these case–control studies are summarized in Supplementary Table 1, and more detailed descriptions are available in the Supplementary Note. Stage II samples are from the ADGC, CHARGE and FinnGen consortia (Supplementary Table 1 and Supplementary Note) and are described in detail elsewhere5,6,9,10,54,55,56. Written informed consent was obtained from study participants or, for those with substantial cognitive impairment, a caregiver, legal guardian or other proxy. Study protocols for all cohorts were reviewed and approved by the appropriate institutional review boards.

### Quality control and imputation

A standard quality control was performed on variants and samples from all datasets individually. The samples were then imputed with the TOPMed reference panel57,58. The Haplotype Reference Consortium (HRC) panel59 was also used for some datasets (Supplementary Table 2). For the UKBB, we used the provided imputed data generated from a combination of the 1000 Genomes, HRC and UK10K reference panels (Supplementary Note).

### Stage I analyses

Tests of the association between clinical or proxy-ADD status and autosomal genetic variants were conducted separately in each dataset by using logistic regression and an additive genetic model, as implemented in SNPTEST 2.5.4-beta3 (ref. 60) or PLINK v1.90 (ref. 4). However, a logistic mixed model (as implemented in SAIGE v0.36.4 (ref. 61)) was considered for the UKBB data. We analyzed the genotype probabilities in SNPTEST (using the newml method) and dosages in PLINK and SAIGE. Analyses were adjusted for PCs and genotyping centers, when necessary (Supplementary Table 2). For the UKBB dataset, only variants with a MAF above 0.01% and a minor allele count (MAC) above 3 were analyzed, and effect sizes and standard errors were corrected by a factor of two, because proxy cases were analyzed7. This approach is appropriate for variants with a moderate-to-high frequency and a small effect size. For all datasets, we filtered out duplicated variants and variants with (1) missing data on the effect size, standard error or P value; (2) an absolute effect size above 5; (3) an imputation quality below 0.3; and (4) a value below 20 for the product of the MAC and the imputation quality (MAC-info score). For datasets not imputed with the TOPMed reference panel, we also excluded (1) variants for which conversion of position or alleles from the GRCh37 assembly to the GRCh38 assembly was not possible or problematic or (2) variants with very large difference of frequency between the TOPMed reference panel and the reference panels used to perform imputation.
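The per-dataset variant filters and the proxy-case correction described above can be sketched as follows. This is an illustrative reading of the stated thresholds, not the authors' pipeline; the function and argument names are hypothetical, and interpreting "corrected by a factor of two" as multiplying both the effect size and its standard error by two is an assumption based on common practice for proxy-case GWAS.

```python
def correct_proxy_effect(beta, se, factor=2.0):
    """Rescale a proxy-case effect size and its standard error.

    Assumption: the correction multiplies both quantities by `factor`,
    which leaves the Z score (beta / se) unchanged.
    """
    return beta * factor, se * factor


def passes_stage1_filters(beta, se, p_value, info, mac):
    """Apply the four stated exclusion criteria to one variant.

    (1) missing effect size, standard error or P value
    (2) absolute effect size above 5
    (3) imputation quality below 0.3
    (4) MAC-info score (MAC x imputation quality) below 20
    """
    if beta is None or se is None or p_value is None:
        return False
    if abs(beta) > 5:
        return False
    if info < 0.3:
        return False
    if mac * info < 20:
        return False
    return True


# Hypothetical variants: the second fails because 30 * 0.5 = 15 < 20.
kept = passes_stage1_filters(0.10, 0.02, 1e-6, info=0.9, mac=30)
dropped = passes_stage1_filters(0.10, 0.02, 1e-6, info=0.5, mac=30)
```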

Results were then combined across studies in a fixed-effect meta-analysis with an inverse-variance weighted approach, as implemented in METAL v2011-03-25 software62. We filtered out (1) variants with a heterogeneity P value below 5 × 10−8, (2) variants analyzed in less than 20% of the total number of cases and (3) variants with frequency amplitude above 0.4 (defined as the difference between the maximum and minimum frequencies across all the studies). We also excluded variants not analyzed in the EADB-TOPMed dataset.
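For readers unfamiliar with the method, the sketch below shows the standard inverse-variance weighted fixed-effect estimator together with the frequency-amplitude filter defined above. It illustrates the general technique rather than METAL's implementation; the function names are hypothetical.

```python
import math


def ivw_fixed_effect(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis.

    Each study is weighted by 1 / se^2. Returns the pooled effect, its
    standard error, the Z score and a two-sided normal P value.
    """
    weights = [1.0 / (se * se) for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided
    return beta, se, z, p_value


def frequency_amplitude(frequencies):
    """Difference between the maximum and minimum allele frequency across studies."""
    return max(frequencies) - min(frequencies)


# Hypothetical variant seen in two studies with identical estimates:
# the pooled effect is unchanged and the standard error shrinks by sqrt(2).
beta, se, z, p_value = ivw_fixed_effect([0.1, 0.1], [0.02, 0.02])
exclude = frequency_amplitude([0.10, 0.55]) > 0.4
```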

The genomic inflation factor lambda was computed with the GenABEL 1.8-0 R package63 and a median approach after exclusion of the APOE region (44–46 Mb on chromosome 19 in GRCh38). The LD score regression intercept was computed with LDSC v1.0.1 software using the ‘baselineLD’ LD scores built from 1000 Genomes phase 3 (ref. 64). The analysis was restricted to HapMap 3 variants and excluded multiallelic variants, variants without an rs ID and variants in the APOE region.
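The median approach to the genomic inflation factor divides the median of the observed 1-degree-of-freedom chi-square statistics by the median of the null distribution (about 0.4549). A minimal sketch is given below, assuming Z scores are available for variants outside the APOE region; it illustrates the calculation and is not the GenABEL code.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(z_scores):
    """Genomic inflation factor (lambda) by the median approach.

    z_scores: association Z scores, with the APOE region already excluded.
    """
    chi2 = [z * z for z in z_scores]
    return statistics.median(chi2) / CHI2_1DF_MEDIAN


# Sanity check: Z scores whose squares sit at the null median give lambda = 1.
null_like = genomic_inflation([0.6744897501960817, -0.6744897501960817, 0.6744897501960817])
```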

### Definition of associated loci

A region of ±500 kb was defined around each variant with a stage I P value below 1 × 10−5. These regions were then merged (using bedtools v2.27.0 software; https://bedtools.readthedocs.io/en/latest/) to define nonoverlapping regions. The region corresponding to the APOE locus was excluded. We then used the PLINK clumping procedure to define independent hits in each region. An iterative clumping procedure was applied to all variants with a stage I P value below 1 × 10−5, starting with the variant with the lowest P value (referred to as the index variant). Variants with a stage I P value below 1 × 10−5, located within 500 kb of this index variant and in LD with the index variant (r² above 0.001) were assigned to the index variant’s clump. The clumping procedure was then applied until all the variants had been clumped. LD in the EADB-TOPMed dataset was computed using high-quality (probability ≥0.8) imputed genotypes.
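The two steps of the locus definition (merging ±500 kb windows, then iterative clumping) can be sketched in plain Python. This is a simplified illustration of the logic, not a substitute for bedtools or PLINK; the data structures and the `r2` lookup function are hypothetical, and the clumping function assumes all variants lie in one region of a single chromosome.

```python
def merge_windows(positions, flank=500_000):
    """Merge +/- flank windows around (chrom, pos) hits into nonoverlapping regions."""
    intervals = sorted((c, max(0, p - flank), p + flank) for c, p in positions)
    merged = []
    for chrom, start, end in intervals:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            merged[-1][2] = max(merged[-1][2], end)  # extend the current region
        else:
            merged.append([chrom, start, end])
    return [tuple(m) for m in merged]


def clump(variants, r2, window=500_000, r2_min=0.001):
    """Iterative clumping within one region.

    variants: dict mapping variant id -> (position, stage I P value)
    r2:       function (id_a, id_b) -> LD r-squared (hypothetical lookup)
    Returns a dict mapping each index variant to the members of its clump.
    """
    remaining = sorted(variants, key=lambda v: variants[v][1])  # lowest P first
    clumps = {}
    while remaining:
        index = remaining.pop(0)
        members = [
            v for v in remaining
            if abs(variants[v][0] - variants[index][0]) <= window
            and r2(index, v) > r2_min
        ]
        clumps[index] = members
        remaining = [v for v in remaining if v not in members]
    return clumps


# Hypothetical region: B is in LD with A, C is independent of both.
ld = {frozenset(("A", "B")): 0.5}
example = clump(
    {"A": (100, 1e-9), "B": (200, 1e-6), "C": (300, 1e-7)},
    r2=lambda a, b: ld.get(frozenset((a, b)), 0.0),
)
```

In this toy example, A becomes the first index variant and absorbs B, and C then forms its own clump.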

### Stage II analyses

Variants with a stage I P value below 1 × 10−5 were followed up (Supplementary Note). Results were combined across all stage I and II studies in a fixed-effect meta-analysis with an inverse variance weighted approach, as implemented in METAL. In each clump, we then reported the variants with positive follow-up results (i.e., the same direction of effect in stage I and stage II, and a stage II P value below 0.05) and the lowest P value in the meta-analysis. Those variants were considered to be associated at the genome-wide significance level if they had a P value below 5 × 10−8 in the stage I and II meta-analysis. However, we excluded the chr6:32657066:G:A variant, because its frequency amplitude was high.
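The reporting rule for each clump can be expressed compactly. The sketch below is an illustrative reading of the criteria stated above (same direction of effect, stage II P value below 0.05, lowest meta-analysis P value, genome-wide threshold of 5 × 10−8); the dictionary keys are hypothetical.

```python
GENOME_WIDE_P = 5e-8


def has_positive_follow_up(beta_stage1, beta_stage2, p_stage2):
    """Same direction of effect in both stages and stage II P value below 0.05."""
    return beta_stage1 * beta_stage2 > 0 and p_stage2 < 0.05


def select_lead_variant(clump_variants):
    """Pick the reported variant for one clump.

    clump_variants: list of dicts with keys 'id', 'beta1', 'beta2', 'p2', 'p_meta'.
    Returns (variant id, is_genome_wide_significant), or None when no variant
    in the clump has a positive follow-up.
    """
    candidates = [
        v for v in clump_variants
        if has_positive_follow_up(v["beta1"], v["beta2"], v["p2"])
    ]
    if not candidates:
        return None
    lead = min(candidates, key=lambda v: v["p_meta"])
    return lead["id"], lead["p_meta"] < GENOME_WIDE_P


# Hypothetical clump: v1 has the lowest meta-analysis P value but flips
# direction in stage II, so v2 is reported instead.
lead = select_lead_variant([
    {"id": "v1", "beta1": 0.08, "beta2": -0.01, "p2": 0.60, "p_meta": 1e-9},
    {"id": "v2", "beta1": 0.07, "beta2": 0.06, "p2": 0.001, "p_meta": 3e-9},
])
```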

### Pathway analysis

A total of 10,271 gene sets were considered for analysis (Supplementary Note). Gene set enrichment analyses were performed in MAGMA v1.08 (refs. 65,66), with correction for the number of variants in each gene, LD between variants and LD between genes. LD was computed from the EADB-TOPMed dataset using high-quality (probability ≥0.9) imputed genotypes. The measure of pathway enrichment was the MAGMA ‘competitive’ test (in which the association statistic for genes in the pathway is compared with those for all other protein-coding genes), as recommended by De Leeuw et al.67. We used the ‘mean’ test statistic, which uses the sum of −log(variant P value) across all genes. The primary analysis assigned variants to genes if they lay within the gene boundaries, although a secondary analysis used a window of 35 kb upstream and 10 kb downstream to assign variants to genes (as in Kunkle et al.5). The primary analysis included all variants with an imputation quality above 0.8. We used q values68 to account for multiple testing.

### Expression in various cell types

The expression of genes was assigned to specific cell classes of the adult brain, as described previously69. Briefly, middle temporal gyrus single-nucleus transcriptomes from the Allen Brain Atlas dataset (49,555 total nuclei derived from 8 human tissue donors aged 24–66 years) were used to annotate and select six main cell classes using Seurat 3.1.1 (ref. 70): glutamatergic neurons, GABAergic neurons, astrocytes, oligodendrocytes, microglia and endothelial cells. Enrichment analyses were performed by using the mean gene expression per nucleus for each cell type relative to the total expression summed across cell types as a quantitative covariate in a MAGMA gene property analysis.
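The covariate described above (mean expression per nucleus in one cell type relative to the total summed across cell types) can be computed as in the following sketch. The input values are invented for illustration; only the ratio itself comes from the text.

```python
def relative_expression(mean_expr_by_cell_type):
    """Share of a gene's expression attributable to each cell type.

    mean_expr_by_cell_type: dict mapping cell type -> mean expression per nucleus.
    Returns a dict of proportions summing to 1, or all zeros when the gene
    is not expressed in any cell type.
    """
    total = sum(mean_expr_by_cell_type.values())
    if total == 0:
        return {cell_type: 0.0 for cell_type in mean_expr_by_cell_type}
    return {
        cell_type: value / total
        for cell_type, value in mean_expr_by_cell_type.items()
    }


# Hypothetical gene expressed mostly in microglia.
shares = relative_expression({
    "glutamatergic neurons": 0.2,
    "GABAergic neurons": 0.2,
    "astrocytes": 0.2,
    "oligodendrocytes": 0.2,
    "microglia": 9.0,
    "endothelial cells": 0.2,
})
mainly_microglial = shares["microglia"] > 0.90
```

The same proportion underlies the ">90%" and "<10%" microglial expression thresholds quoted in the Discussion.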

### Functional interpretation of GWAS signals and gene prioritization

To prioritize candidate genes in the new loci, we systematically searched for evidence for these genes in seven different domains: (1) variant annotation, (2) eQTL-GWAS integration, (3) sQTL-GWAS integration, (4) protein QTL (pQTL)-GWAS integration, (5) mQTL-GWAS integration, (6) histone acetylation QTL (haQTL)-GWAS integration and (7) APP metabolism. On the basis of this evidence, we then defined a gene prioritization score of between 0 and 100 for each candidate gene (Supplementary Fig. 34). Detailed information on the domains, categories (e.g., the tissue or cell type for QTL-GWAS integration domains) and subcategories (for the type of evidence) is given in Supplementary Table 19. A brief summary of how evidence was assessed in each domain is provided below, together with a detailed description of the gene prioritization strategy.

#### Candidate genes

We considered protein-coding candidate genes within a ±1-Mb window of the new lead variants. The genes in overlapping loci (i.e., L28, L30 and L37) were assigned to their respective loci based on proximity to the lead variants, and the distal genes were not considered for gene prioritization in the investigated loci. Moreover, we did not perform gene prioritization in the complex IGH gene cluster locus (L27), as this telomeric region contains complex splicing events (spanning a high number of IGH genes) that probably result from known fusion events18.

#### The variant annotation domain

In this domain, we determined whether the candidate gene was the nearest protein-coding gene to the lead variant and/or whether the lead variant was a rare variant (MAF < 1%) and/or protein-altering variant of the investigated candidate gene.

#### APP metabolism domain

We assessed the functional impact of gene underexpression on APP metabolism for all candidate genes based on a genome-wide high-content short interfering RNA screen17 (Supplementary Note).

#### Gene prioritization score

We computed a gene prioritization score for each candidate gene as the weighted sum of the evidence identified in the seven domains. We specified a weight for each type of evidence, as detailed in Supplementary Table 19. For the molecular QTL-GWAS integration domains, we gave more weight to replicated hits (i.e., evidence in several datasets) than to single hits. We also gave more weight to hits observed in brain (the bulk brain and microglia datasets) than to hits observed in other tissues/cell types (LCLs, monocytes, macrophages and blood). To avoid score inflation, several specific rules were applied: (1) for the results of sQTL- and mQTL-based analyses, multiple splice junctions or CpGs annotated for the same gene were aggregated prior to weighting because these data are correlated; (2) if we observed a fine-mapped eTWAS association for a gene, its other significant (but not fine-mapped) eTWAS associations were not considered; (3) for genes having several significant CpGs (prior to aggregation) in MetaMeth analyses, the associated CpGs with a low (<75th percentile) blood–brain methylation correlation estimate were not considered if the gene also had associated CpGs with a high (≥75th percentile) blood–brain methylation correlation estimate.
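A weighted sum with the aggregation described in rule (1) might look like the sketch below. The evidence-type labels and weights in the usage note are hypothetical placeholders (the actual weights are those of Supplementary Table 19), and rules (2) and (3) are not reproduced.

```python
def prioritization_score(hits, weights):
    """Weighted sum of evidence for one candidate gene.

    hits: iterable of (evidence_type, feature_id) pairs, e.g. one pair per
    significant splice junction or CpG. weights: evidence_type -> weight
    (hypothetical values in this sketch).
    """
    # Rule (1): several features supporting the same evidence type are
    # aggregated, so that evidence type contributes its weight only once.
    evidence_types = {evidence_type for evidence_type, _ in hits}
    return sum(weights[evidence_type] for evidence_type in evidence_types)
```

For instance, with hypothetical weights `{"sQTL_brain": 2.0, "nearest_gene": 1.0}`, a gene supported by two brain splice junctions and by being the nearest gene would score 3.0 rather than 5.0.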

#### Gene prioritization strategy

After obtaining a total weighted score per gene, we ranked genes per locus according to their prioritization scores and compared the relative score differences between the highest ranked gene and other genes in the investigated locus. If this relative difference was at least 20% and the gene prioritization score for the highest ranked gene was ≥4, then we classified this gene as a tier 1 prioritized gene in the investigated locus (i.e., a greater likelihood of being the true risk gene responsible for the ADD signal). If this absolute threshold was not met, then the highest ranked gene was classified as a tier 2 prioritized gene (i.e., a lower level of confidence and absence of the minimum level of evidence for a true risk gene). Furthermore, other genes in a locus harboring a tier 1 gene were classified as tier 2 prioritized genes if the relative score difference versus the highest ranked (tier 1) gene was between 20% and 50%. Lastly, when the relative score difference between the highest ranked gene and other genes in the same locus was <20%, then both the highest ranked gene and all genes with a score difference <20% were classified as tier 2 prioritized genes in the investigated locus; based on the current evidence, it is difficult to prioritize two or more similarly scored genes. The gene prioritization strategy is summarized in Supplementary Fig. 34. Detailed descriptions and discussions of prioritized genes and tier levels in each investigated new locus can be found in the Supplementary Note.
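The tiering rules can be written as a small decision procedure. In this sketch the relative score difference is assumed to be (top score − gene score) / top score, a locus containing a single scored gene is assumed to be judged on the absolute threshold alone, and the function name is hypothetical; the definitive rules are those given in Supplementary Fig. 34.

```python
def classify_tiers(scores, min_score=4.0, low=0.20, high=0.50):
    """Assign tier 1 / tier 2 labels to the genes of one locus.

    scores: gene -> prioritization score. Genes that are not prioritized
    are absent from the returned dict.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked or ranked[0][1] <= 0:
        return {}
    top_gene, top = ranked[0]
    # Assumed definition of the relative difference versus the top gene.
    rel = {gene: (top - score) / top for gene, score in ranked[1:]}

    tiers = {}
    close = [gene for gene, diff in rel.items() if diff < low]
    if close:
        # Similarly scored genes cannot be separated: all become tier 2.
        tiers[top_gene] = 2
        for gene in close:
            tiers[gene] = 2
    elif top >= min_score:
        tiers[top_gene] = 1
        for gene, diff in rel.items():
            if low <= diff <= high:
                tiers[gene] = 2
    else:
        # Clear leader, but below the minimum level of evidence.
        tiers[top_gene] = 2
    return tiers
```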

#### GRS analysis

Eight longitudinal MCI cohorts and seven population-based studies were included in the analysis and are fully described in the Supplementary Note and Supplementary Table 33. The GRSs were calculated as previously described90. Briefly, we considered variants with genome-wide significant evidence of association with ADD in our study. We did not include any APOE variants in the GRS. Variants were directly genotyped or imputed (R² ≥ 0.3). Imputation was performed using the HRC panel59 for subcohorts from the Rotterdam study and the TOPMed panel for the other cohorts57. For HRC-imputed data, LD proxies were considered for variants that were not available in this reference panel. The GRS was calculated as the weighted average of the number of risk-increasing alleles for each variant, using dosages. Weights were based on the respective log(OR) obtained in stage II. The GRS was then multiplied by the number of included variants. Thus, the HR measured the effect of carrying one additional average risk allele.
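As an illustration of this scaling, the sketch below computes the GRS for one individual. It assumes that dosages are coded for the effect allele, that protective alleles (OR < 1) are flipped so that every variant counts risk-increasing alleles, and that the weighted average is normalized by the sum of the weights; the function name is hypothetical and the exact procedure is the one described in ref. 90.

```python
import math


def genetic_risk_score(dosages, odds_ratios):
    """Weighted average of risk-allele dosages, multiplied by the variant count.

    dosages: effect-allele dosages (0 to 2) for one individual.
    odds_ratios: stage II odds ratios for the same effect alleles.
    """
    numerator = 0.0
    denominator = 0.0
    for dosage, odds_ratio in zip(dosages, odds_ratios):
        weight = math.log(odds_ratio)
        if weight < 0:
            # Count the other allele so that the weight is risk-increasing.
            dosage = 2.0 - dosage
            weight = -weight
        numerator += weight * dosage
        denominator += weight
    return len(dosages) * numerator / denominator
```

With this scaling, an individual carrying exactly one risk allele at every variant has a GRS equal to the number of variants, which is why a one-unit increase can be read as one additional average risk allele.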

To assess whether the new variants in this study contribute to the risk of conversion to AD (in addition to known AD genes), we calculated two GRSs: one based solely on variants known before this study (GRSknown, n = 39; Table 1) and another based on variants identified in the present study (GRSnovel, n = 44; Table 2). These GRSs were calculated in the same way as the GRS encompassing all the variants.

The association between the GRS and the risk of progression to dementia in individuals from population-based cohorts or patients with MCI from memory clinics was tested statistically using Cox proportional hazards models. The models were adjusted for age, sex, the first four PCs (to correct for potential population stratification) and the number of APOE-ε4 and APOE-ε2 alleles (assuming an additive effect). In the FHS study, the generation was used as an additional covariate. In the 3C study, the analysis was adjusted for age, sex, the number of APOE alleles, the first two PCs and center. The PCs used were generated for each cohort, using the same variants as in the case/control study’s PC analysis. The number of APOE-ε4 alleles was obtained from direct genotyping or, if missing, the genotypes (with probability >0.8) derived from the TOPMed imputations. The interaction between the GRS and the number of APOE-ε4 alleles was tested on the multiplicative scale. In the primary analysis, conversion to AD was used as the outcome (conversions to non-AD dementias were coded as being censored at the time of conversion), but analyses were repeated using all-cause dementia as the outcome.

To quantify the predictive performance of the GRS for conversion to dementia, we computed three indices, each measuring a different aspect of predictive performance in our prospective, longitudinal cohort studies91: the continuous version of the C-index92,93, the continuous NRI94 and the IPA95 (Supplementary Note). For all indices, we provide point estimates and 95% CIs.

In the main analysis, indices were computed at the time point for which all cohorts in a specific setting (i.e., population-based studies or memory clinics, respectively) provided follow-up observations (i.e., 5 years for population-based cohorts and 3 years for MCI cohorts). In a sensitivity analysis, indices for longer or shorter follow-up periods were also derived (i.e., 3 years and 10 years for population-based cohorts and 5 years for MCI cohorts). Standard errors for indices were derived by non-parametric bootstrapping with 1,000 samples.
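Non-parametric bootstrapping of a standard error can be sketched generically as follows. The function is a hypothetical helper: `statistic` stands for any index computed from a list of individual records (the C-index, NRI and IPA computations themselves are not reproduced), and the fixed seed is only there to make the example reproducible.

```python
import random
from statistics import stdev


def bootstrap_se(records, statistic, n_boot=1000, seed=0):
    """Standard error of `statistic` from resampling records with replacement."""
    rng = random.Random(seed)
    n = len(records)
    estimates = []
    for _ in range(n_boot):
        # Draw a resample of the same size as the original data.
        resample = [records[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    # The spread of the bootstrap estimates approximates the standard error.
    return stdev(estimates)
```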

To determine the average effect of the GRS across the various cohorts examined, individual cohort results were subjected to both inverse-variance weighted meta-analyses (primary analyses) and random effects meta-analysis (Supplementary Note). To facilitate comparisons of results for different time points, cohorts with longer follow-up periods were meta-analyzed separately. Furthermore, two memory clinic cohorts with a limited sample size (N < 50) were excluded to assess their impact on the final meta-analysis results. Meta-analyses were performed using the ‘metafor’ (3.0.2) R package96.

To further illustrate the clinical relevance of the GRS, we pooled computed GRSs across four population-based cohorts (3C, AgeCoDe, VITA and MAS) and computed deciles of the GRS distribution for use as a common reference for all cohorts. We then computed the increase in risk when augmenting the GRS value from the first decile (GRS = 50.76) to the ninth decile (GRS = 59.74) of the distribution. To represent this risk increase in the HR, we rescaled the HR derived from our meta-analysis results using the equation $$e^{\log(\mathrm{HR}) \times \left(\mathrm{GRS}_{\text{9th decile}} - \mathrm{GRS}_{\text{1st decile}}\right)}$$. Importantly, this approach yields exactly the same results as transforming the GRS so that a one unit increment corresponds to the increase from the lowest decile to the highest decile.
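The rescaling amounts to raising the per-allele HR to the power of the GRS difference between the two deciles. A minimal sketch, in which the function name is hypothetical and the default decile values are those quoted above:

```python
import math


def rescale_hr(hr_per_allele, grs_low=50.76, grs_high=59.74):
    """HR for moving from the first-decile GRS to the ninth-decile GRS."""
    return math.exp(math.log(hr_per_allele) * (grs_high - grs_low))
```

Any per-allele HR passed to this function (for example, a purely illustrative value of 1.05) is compounded over the 8.98-unit difference between the two reference GRS values.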

Furthermore, we approximated the probability of conversion to AD at 3 and 5 years in memory clinic patients with MCI by using Cox models implemented in the ‘predictCox’ function from the ‘riskRegression’ (2020.12.8) R package97. We did not derive AD conversion probabilities for two cohorts with very small sample sizes (N < 50). Predicted AD conversion probabilities were derived and averaged for all patients in each of the groups formed by the decile of the GRS distribution in each cohort. The difference between the groups with the highest and lowest GRSs was computed in each cohort. We report the median (range) results in each group formed by the GRS deciles.
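The per-cohort averaging step can be sketched as below. The predicted probabilities are taken as given inputs (in the study they come from the Cox models), the function name is hypothetical, and the decile cut points use Python's default quantile method, which may differ slightly from the method used in the original R analysis.

```python
from bisect import bisect_right
from statistics import mean, quantiles


def risk_by_grs_decile(grs, predicted_risk):
    """Average predicted conversion probability within each GRS decile group.

    Returns (averages, difference), where averages maps the group index
    (0 = lowest GRS, 9 = highest GRS) to the mean predicted probability and
    difference is the highest-group mean minus the lowest-group mean.
    """
    cuts = quantiles(grs, n=10)  # nine cut points within the cohort
    groups = {}
    for score, risk in zip(grs, predicted_risk):
        groups.setdefault(bisect_right(cuts, score), []).append(risk)
    averages = {index: mean(values) for index, values in sorted(groups.items())}
    difference = averages[max(averages)] - averages[min(averages)]
    return averages, difference
```

Applying this function to each cohort separately and then taking the median and range of the per-group averages across cohorts mirrors the summary described above.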

### Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

## Data Availability

Genome-wide summary statistics have been deposited to the European Bioinformatics Institute GWAS Catalog (https://www.ebi.ac.uk/gwas/) under accession no. GCST90027158.

The significant eQTLs/sQTLs mapped and eTWAS/sTWAS functional reference panel weights generated for this study (in AD-relevant bulk brain regions from AMP-AD cohorts and in LCLs from the EADB Belgian cohort) are publicly available at https://doi.org/10.5281/zenodo.5745927 and https://doi.org/10.5281/zenodo.5745929.

Anonymized aligned reads of the amplicon-based long-read Nanopore cDNA sequencing experiment conducted for the TSPAN14 splicing analysis are available through the European Nucleotide Archive under accession PRJEB49234.

Moreover, the following data used in the gene prioritization are publicly available:

MayoRNAseq whole-genome sequencing variant call formats (WGS VCFs) (https://www.synapse.org/#!Synapse:syn11724002);

ROSMAP WGS VCFs (https://www.synapse.org/#!Synapse:syn11724057);

MSBB WGS VCFs (https://www.synapse.org/#!Synapse:syn11723899);

eQTLGen (https://www.eqtlgen.org/);

eQTL Catalogue database (https://www.ebi.ac.uk/eqtl/);

Brain xQTL serve (http://mostafavilab.stat.ubc.ca/xqtl/);

GTEx v8 eQTL and sQTL catalogs (https://www.gtexportal.org/);

GTEx v8 expression and splicing prediction models (http://predictdb.org/);

MiGA eQTLs (https://doi.org/10.5281/zenodo.4118605);

MiGA sQTLs (https://doi.org/10.5281/zenodo.4118403);

MiGA meta-analysis (https://doi.org/10.5281/zenodo.4118676); and

Wingo et al.84 pQTL data (https://www.synapse.org/#!Synapse:syn23627957).

## Code availability

We used publicly available software for all analyses. The software packages are listed in the Supplementary Note with their appropriate citations and/or URLs.

## References

1. Gatz, M. et al. Role of genes and environments for explaining Alzheimer disease. Arch. Gen. Psychiatry 63, 168–174 (2006).

2. Bellenguez, C., Grenier-Boley, B. & Lambert, J. C. Genetics of Alzheimer’s disease: where we are, and where we are going. Curr. Opin. Neurobiol. 61, 40–48 (2020).

3. Jansen, I. E. et al. Genome-wide meta-analysis identifies new loci and functional pathways influencing Alzheimer’s disease risk. Nat. Genet. 51, 404–413 (2019).

4. Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).

5. Kunkle, B. W. et al. Genetic meta-analysis of diagnosed Alzheimer’s disease identifies new risk loci and implicates Aβ, tau, immunity and lipid processing. Nat. Genet. 51, 414–430 (2019).

6. Lambert, J. C. et al. Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer’s disease. Nat. Genet. 45, 1452–1458 (2013).

7. Liu, J. Z., Erlich, Y. & Pickrell, J. K. Case-control association mapping by proxy using family history of disease. Nat. Genet. 49, 325–331 (2017).

8. Marioni, R. E. et al. GWAS on family history of Alzheimer’s disease. Transl. Psychiatry 8, 99 (2018).

9. Sims, R. et al. Rare coding variants in PLCG2, ABI3, and TREM2 implicate microglial-mediated innate immunity in Alzheimer’s disease. Nat. Genet. 49, 1373–1384 (2017).

10. Jun, G. et al. A novel Alzheimer disease locus located near the gene encoding tau protein. Mol. Psychiatry 21, 108–117 (2016).

11. Schwartzentruber, J. et al. Genome-wide meta-analysis, fine-mapping and integrative prioritization implicate new Alzheimer’s disease risk genes. Nat. Genet. 53, 392–402 (2021).

12. de Rojas, I. et al. Common variants in Alzheimer’s disease and risk stratification by polygenic risk scores. Nat. Commun. 12, 3417 (2021).

13. Wightman, D. P. et al. A genome-wide association study with 1,126,563 individuals identifies new risk loci for Alzheimer’s disease. Nat. Genet. 53, 1276–1282 (2021).

14. Skene, N. G. et al. Genetic identification of brain cell types underlying schizophrenia. Nat. Genet. 50, 825–833 (2018).

15. de Leeuw, C. A., Stringer, S., Dekkers, I. A., Heskes, T. & Posthuma, D. Conditional and interaction gene-set analysis reveals novel functional pathways for blood pressure. Nat. Commun. 9, 3768 (2018).

16. Haass, C., Kaether, C., Thinakaran, G. & Sisodia, S. Trafficking and proteolytic processing of APP. Cold Spring Harb. Perspect. Med. 2, a006270 (2012).

17. Chapuis, J. et al. Genome-wide, high-content siRNA screening identifies the Alzheimer’s genetic risk factor FERMT2 as a major modulator of APP metabolism. Acta Neuropathol. 133, 955–966 (2017).

18. Cleynen, A. et al. Expressed fusion gene landscape and its impact in multiple myeloma. Nat. Commun. 8, 1893 (2017).

19. Szklarczyk, D. et al. STRING v11: Protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 47, D607–D613 (2019).

20. Spit, M., Rieser, E. & Walczak, H. Linear ubiquitination at a glance. J. Cell Sci. 132, jcs208512 (2019).

21. Pencina, M. J., D’Agostino, R. B., Pencina, K. M., Janssens, A. C. J. W. & Greenland, P. Interpreting incremental value of markers added to risk prediction models. Am. J. Epidemiol. 176, 473–481 (2012).

22. Dourlen, P., Chapuis, J. & Lambert, J.-C. Using high-throughput animal or cell-based models to functionally characterize GWAS signals. Curr. Genet. Med. Rep. 6, 107–115 (2018).

23. Dourlen, P., Kilinc, D., Malmanche, N., Chapuis, J. & Lambert, J. C. The new genetic landscape of Alzheimer’s disease: from amyloid cascade to genetically driven synaptic failure hypothesis? Acta Neuropathol. 138, 221–236 (2019).

24. Deuss, M., Reiss, K. & Hartmann, D. Part-time α-secretases: the functional biology of ADAM 9, 10 and 17. Curr. Alzheimer Res. 5, 187–201 (2008).

25. Kim, T. et al. Human LilrB2 is a β-amyloid receptor and its murine homolog PirB regulates synaptic plasticity in an Alzheimer’s model. Science 341, 1399–1404 (2013).

26. Salminen, A. & Kaarniranta, K. Siglec receptors and hiding plaques in Alzheimer’s disease. J. Mol. Med. 87, 697–701 (2009).

27. Rodgers, M. A. et al. The linear ubiquitin assembly complex (LUBAC) is essential for NLRP3 inflammasome activation. J. Exp. Med. 211, 1333–1347 (2014).

28. Iwai, K. LUBAC-mediated linear ubiquitination: a crucial regulator of immune signaling. Proc. Jpn. Acad. Ser. B Phys. Biol. Sci. 97, 120–133 (2021).

29. Venegas, C. et al. Microglia-derived ASC specks cross-seed amyloid-β in Alzheimer’s disease. Nature 552, 355–361 (2017).

30. Ising, C. et al. NLRP3 inflammasome activation drives tau pathology. Nature 575, 669–673 (2019).

31. Nakayama, Y. et al. Linear polyubiquitin chain modification of TDP-43-positive neuronal cytoplasmic inclusions in amyotrophic lateral sclerosis. J. Neuropathol. Exp. Neurol. 79, 256–265 (2020).

32. Black, R. A. et al. A metalloproteinase disintegrin that releases tumour-necrosis factor-α from cells. Nature 385, 729–733 (1997).

33. Verstrepen, L., Carpentier, I., Verhelst, K. & Beyaert, R. ABINs: A20 binding inhibitors of NF-κB and apoptosis signaling. Biochem. Pharmacol. 78, 105–114 (2009).

34. Spitz, C. et al. Non-canonical shedding of TNFα by SPPL2a is determined by the conformational flexibility of its transmembrane helix. iScience 23, 101775 (2020).

35. Tang, W. et al. The growth factor progranulin binds to TNF receptors and is therapeutic against inflammatory arthritis in mice. Science 332, 478–484 (2011).

36. He, P. et al. Deletion of tumor necrosis factor death receptor inhibits amyloid β generation and prevents learning and memory deficits in Alzheimer’s mice. J. Cell Biol. 178, 829–841 (2007).

37. Shi, J. Q. et al. Anti-TNF-α reduces amyloid plaques and tau phosphorylation and induces CD11c-positive dendritic-like cell in the APP/PS1 transgenic mouse brains. Brain Res. 1368, 239–247 (2011).

38. Bezbradica, J. S., Coll, R. C. & Schroder, K. Sterile signals generate weaker and delayed macrophage NLRP3 inflammasome responses relative to microbial signals. Cell. Mol. Immunol. 14, 118–126 (2017).

39. Decourt, B., Lahiri, D. K. & Sabbagh, M. N. Targeting tumor necrosis factor alpha for Alzheimer’s disease. Curr. Alzheimer Res. 14, 412–425 (2016).

40. De Strooper, B. & Karran, E. The cellular phase of Alzheimer’s disease. Cell 164, 603–615 (2016).

41. Gong, K. et al. TNF-driven adaptive response mediates resistance to EGFR inhibition in lung cancer. J. Clin. Invest. 128, 2500–2518 (2018).

42. Rhinn, H. & Abeliovich, A. Differential aging analysis in human cerebral cortex identifies variants in TMEM106B and GRN that regulate aging phenotypes. Cell Syst. 4, 404–415 (2017).

43. Vass, R. et al. Risk genotypes at TMEM106B are associated with cognitive impairment in amyotrophic lateral sclerosis. Acta Neuropathol. 121, 373–380 (2011).

44. Baizabal-Carvallo, J. F. & Jankovic, J. Parkinsonism, movement disorders and genetics in frontotemporal dementia. Nat. Rev. Neurol. 12, 175–185 (2016).

45. Tropea, T. F. et al. TMEM106B effect on cognition in Parkinson disease and frontotemporal dementia. Ann. Neurol. 85, 801–811 (2019).

46. Mendsaikhan, A., Tooyama, I. & Walker, D. G. Microglial progranulin: involvement in Alzheimer’s disease and neurodegenerative diseases. Cells 8, 230 (2019).

47. Li, Z. et al. The TMEM106B FTLD-protective variant, rs1990621, is also associated with increased neuronal proportion. Acta Neuropathol. 139, 45–61 (2020).

48. Yang, H. S. et al. Genetics of gene expression in the aging human brain reveal TDP-43 proteinopathy pathophysiology. Neuron 107, 496–508.e6 (2020).

49. Paushter, D. H., Du, H., Feng, T. & Hu, F. The lysosomal function of progranulin, a guardian against neurodegeneration. Acta Neuropathol. 136, 1–17 (2018).

50. Feng, T., Lacrampe, A. & Hu, F. Physiological and pathological functions of TMEM106B: a gene associated with brain aging and multiple brain disorders. Acta Neuropathol. 141, 327–339 (2021).

51. Lacour, A. et al. Genome-wide significant risk factors for Alzheimer’s disease: role in progression to dementia due to Alzheimer’s disease among subjects with mild cognitive impairment. Mol. Psychiatry 22, 153–160 (2017).

52. Zhang, Q. et al. Risk prediction of late-onset Alzheimer’s disease implies an oligogenic architecture. Nat. Commun. 11, 1–11 (2020).

53. Holstege, H. et al. Exome sequencing identifies novel AD-associated genes. medRxiv 18, 24 (2020).

54. Psaty, B. M. et al. Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium design of prospective meta-analyses of genome-wide association studies from 5 cohorts. Circulation: Cardiovasc. Genet. 2, 73–80 (2009).

55. Naj, A. C. et al. Common variants at MS4A4/MS4A6E, CD2AP, CD33 and EPHA1 are associated with late-onset Alzheimer’s disease. Nat. Genet. 43, 436–441 (2011).

56. Jun, G. et al. Meta-analysis confirms CR1, CLU, and PICALM as Alzheimer disease risk loci and reveals interactions with APOE genotypes. Arch. Neurol. 67, 1473–1484 (2010).

57. Taliun, D. et al. Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature 590, 290–299 (2021).

58. Das, S. et al. Next-generation genotype imputation service and methods. Nat. Genet. 48, 1284–1287 (2016).

59. McCarthy, S. et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat. Genet. 48, 1279–1283 (2016).

60. Marchini, J., Howie, B., Myers, S., McVean, G. & Donnelly, P. A new multipoint method for genome-wide association studies by imputation of genotypes. Nat. Genet. 39, 906–913 (2007).

61. Zhou, W. et al. Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nat. Genet. 50, 1335–1341 (2018).

62. Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).

63. Aulchenko, Y. S., Ripke, S., Isaacs, A. & van Duijn, C. M. GenABEL: an R library for genome-wide association analysis. Bioinformatics 23, 1294–1296 (2007).

64. Bulik-Sullivan, B. et al. LD score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).

65. de Leeuw, C. A., Mooij, J. M., Heskes, T. & Posthuma, D. MAGMA: generalized gene-set analysis of GWAS data. PLoS Comput. Biol. 11, e1004219 (2015).

66. Yurko, R., Roeder, K., Devlin, B. & G’Sell, M. H‐MAGMA, inheriting a shaky statistical foundation, yields excess false positives. Ann. Hum. Genet. 85, 97–100 (2021).

67. De Leeuw, C. A., Neale, B. M., Heskes, T. & Posthuma, D. The statistical properties of gene-set analysis. Nat. Rev. Genet. 17, 353–364 (2016).

68. Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. U. S. A. 100, 9440–9445 (2003).

69. Coelho, D. M., da Cruz Carvalho, L. I., Melo-de-Farias, A. R., Lambert, J.-C. & Costa, M. R. Differential transcript usage unravels gene expression alterations in Alzheimer’s disease human brains. NPJ Aging Mech. Dis. 7, 2 (2021).

70. Butler, A., Hoffman, P., Smibert, P., Papalexi, E. & Satija, R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol. 36, 411–420 (2018).

71. Allen, M. et al. Human whole genome genotype and transcriptome data for Alzheimer’s and other neurodegenerative diseases. Sci. Data 3, 160089 (2016).

72. Mostafavi, S. et al. A molecular network of the aging human brain provides insights into the pathology and cognitive decline of Alzheimer’s disease. Nat. Neurosci. 21, 811–819 (2018).

73. Bennett, D. A. et al. Religious orders study and rush memory and aging project. J. Alzheimer’s Dis. 64, S161–S189 (2018).

74. Wang, M. et al. The Mount Sinai cohort of large-scale genomic, transcriptomic and proteomic data in Alzheimer’s disease. Sci. Data 5, 180185 (2018).

75. GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).

76. De, K. et al. Atlas of genetic effects in human microglia transcriptome across brain regions, aging and disease pathologies. Preprint at bioRxiv https://doi.org/10.1101/2020.10.27.356113 (2020).

77. Alasoo, K. et al. Shared genetic effects on chromatin and gene expression indicate a role for enhancer priming in immune response. Nat. Genet. 50, 424–431 (2018).

78. Nédélec, Y. et al. Genetic ancestry and natural selection drive population differences in immune responses to pathogens. Cell 167, 657–669.e21 (2016).

79. Chen, L. et al. Genetic drivers of epigenetic and transcriptional variation in human immune cells. Cell 167, 1398–1414.e24 (2016).

80. Momozawa, Y. et al. IBD risk loci are enriched in multigenic regulatory modules encompassing putative causative genes. Nat. Commun. 9, 2427 (2018).

81. Fairfax, B. P. et al. Innate immune activity conditions the effect of regulatory variants upon monocyte gene expression. Science 343, 1246949 (2014).

82. Quach, H. et al. Genetic adaptation and neandertal admixture shaped the immune system of human populations. Cell 167, 643–656.e17 (2016).

83. Kerimov, N. et al. eQTL catalogue: a compendium of uniformly processed human gene expression and splicing QTLs. Preprint at bioRxiv https://doi.org/10.1101/2020.01.29.924266 (2020).

84. Wingo, A. P. et al. Integrating human brain proteomes with genome-wide association data implicates new proteins in Alzheimer’s disease pathogenesis. Nat. Genet. 53, 143–146 (2021).

85. Ng, B. et al. An xQTL map integrates the genetic architecture of the human brain’s transcriptome and epigenome. Nat. Neurosci. 20, 1418–1426 (2017).

86. Barbeira, A. N. et al. Exploiting the GTEx resources to decipher the mechanisms at GWAS loci. Genome Biol. 22, 49 (2021).

87. Edgar, R. D., Jones, M. J., Meaney, M. J., Turecki, G. & Kobor, M. S. BECon: a tool for interpreting DNA methylation findings from blood in the context of brain. Transl. Psychiatry 7, e1187 (2017).

88. Freytag, V. et al. Genetic estimators of DNA methylation provide insights into the molecular basis of polygenic traits. Transl. Psychiatry 8, 31 (2018).

89. Barbeira, A. N. et al. Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics. Nat. Commun. 9, 1825 (2018).

90. Chouraki, V. et al. Evaluation of a genetic risk score to improve risk prediction for Alzheimer’s disease. J. Alzheimers Dis. 53, 921–932 (2016).

91. Steyerberg, E. W. et al. Assessing the performance of prediction models: a framework for traditional and novel measures. Epidemiology 21, 128–138 (2010).

92. Harrell, F. E., Califf, R. M., Pryor, D. B., Lee, K. L. & Rosati, R. A. Evaluating the yield of medical tests. JAMA 247, 2543–2546 (1982).

93. Gerds, T. A., Kattan, M. W., Schumacher, M. & Yu, C. Estimating a time-dependent concordance index for survival prediction models with covariate dependent censoring. Stat. Med. 32, 2173–2184 (2013).

94. Pencina, M. J., D’Agostino, R. B. & Steyerberg, E. W. Extensions of net reclassification improvement calculations to measure usefulness of new biomarkers. Stat. Med. 30, 11–21 (2011).

95. Kattan, M. W. & Gerds, T. A. The index of prediction accuracy: an intuitive measure useful for evaluating risk prediction models. Diagnostic Progn. Res. 2, 7 (2018).

96. Viechtbauer, W. Conducting meta-analyses in R with the metafor package. J. Stat. Softw. 36, 1–48 (2010).

97. Ozenne, B., Sørensen, A. L., Scheike, T., Torp-Pedersen, C. & Gerds, T. riskRegression: predicting the risk of an event using Cox regression models. R. J. 9, 440–460 (2017).