Introduction

Alzheimer’s disease (AD) is the most prevalent form of dementia, characterised by neurodegeneration and a progressive decline in cognitive ability1,2. The disorder is of increasing global public health importance, with wide-ranging adverse social and economic impacts on sufferers, their families, and society at large1. Over 82 million people by 2030, and about 152 million by 2050, are projected to suffer from AD1,2. While AD has no known curative treatment, and its pathogenesis is yet to be clearly understood, a comprehensive assessment of its shared genetics with other diseases (comorbidities) can provide a deeper understanding of its underlying biological mechanisms and enhance potential therapy development efforts.

Several studies have reported a pattern of co-occurrence of dementia (and AD in particular) with certain gastrointestinal tract (GIT) disorders, microbiota, dysbiosis or medications commonly used in the treatment of peptic ulcer disease (PUD)3,4,5,6,7,8,9,10. For example, an observational study reported more than twice the odds of dementia in individuals with gastritis (adjusted odds ratio [AOR]: 2.42, P < 0.001, 95% confidence interval [CI]: 1.68–3.49)3. Another observational study found a significant association between regular use of proton-pump inhibitors (PPIs; medications for gastritis-duodenitis, gastroesophageal reflux disease [GERD] or PUD) and increased risk of incident dementia (hazard ratio [HR]: 1.44 [95% CI, 1.36–1.52]; P < 0.001)4. Similarly, lansoprazole (a PPI) was reported to promote amyloid-beta (Aβ) production5, the accumulation of which is central to one of the core hypotheses for the development of AD11. More recently, a longitudinal study reported more than a sixfold increased risk of AD in individuals with inflammatory bowel disease (IBD) [HR: 6.19, 95% CI: 3.31–11.57], predicting over fivefold increased incidence across all forms of dementia7.

The available evidence thus suggests comorbidity or some form of association between AD and GIT disorders, although it is not clear whether GIT traits are risk factors for AD or vice versa. Regardless, these findings agree with the concept of the ‘gut–brain’ axis or the ‘gastric mucosa–brain’ relationship, which has been implicated between GIT-related traits and central nervous system (CNS) disorders including depression and Parkinson’s disease12,13,14,15,16,17. A relationship between AD and GIT disorders, or their comorbidity, can worsen the quality of life of sufferers while contributing to increased healthcare costs.

Despite the increasing number of studies reporting an association between AD and GIT traits, the biological mechanism(s) underlying this potential association remains unclear. Moreover, contrasting evidence exists7,18,19, leading to a longstanding debate on the potential links of GIT traits to the risk of AD15,18,19,20. Large-scale genome-wide association studies (GWAS), identifying an increasing number of single-nucleotide polymorphisms (SNPs), genes, and susceptibility loci, have been conducted separately for AD and a range of GIT traits21,22,23,24. Findings from these GWAS provide compelling evidence for the roles of genetics in the aetiologies of AD and GIT disorders including GERD, PUD, PGM (a combination of disease diagnosis of PUD and/or GERD and/or corresponding medications and treatments; a potential proxy for PUD or GERD), gastritis-duodenitis, irritable bowel syndrome (IBS), diverticular disease, and IBD21,22,23,24. However, to the best of our knowledge, no study has leveraged the possible pleiotropy between AD and GIT disorders as a basis for discovering their shared SNPs, genes and/or susceptibility loci.

In this study, we analyse well-powered GWAS summary data to comprehensively assess the genetic relationship and potential causal association between AD and GIT disorders. We demonstrate a positive significant genetic overlap and correlation between AD and GERD, PUD, PGM, IBS, gastritis-duodenitis, and diverticular disease. Also, in a cross-trait GWAS meta-analysis, we identify many loci shared by AD and GIT disorders. Causality assessment reveals no evidence for a significant causal association between AD and GIT disorders. However, we identify shared genes reaching genome-wide significance for AD and GIT disorders in gene-based association analyses. Lastly, pathway-based analyses show significant enrichment of lipid metabolism, autoimmunity, lipase inhibitors, PD-1 signalling and statin mechanisms, among others, for AD and GIT traits.

Results

Figure 1 presents a schematic workflow for this study. Briefly, we performed three broad levels of analyses—SNP-level, gene-level, and pathway-based analyses. First, we used the linkage disequilibrium score regression (LDSC)25 to estimate the genetic correlation between AD and GIT traits, and the ‘SNP effect concordance analysis’ (SECA)26 method for concordance in SNP risk effect assessment. Second, to identify SNPs and susceptibility loci shared by AD and GIT disorders, we carried out GWAS meta-analyses. We also applied the pairwise GWAS (colocalisation) method27 to identify independent genomic loci with shared genetic influence on AD and GIT disorders. Third, using the Mendelian randomisation (MR)28 and the Latent Causal Variable (LCV)29 methods, we assessed potential (and partial) causal associations between AD and GIT disorders. Lastly, we performed gene and pathway-based analyses to identify shared genes reaching genome-wide significance and biological pathways for AD and GIT disorders. The largest publicly available AD summary statistics and GIT summary data from research consortia or public repositories were utilised for analysis (Table 1 and Supplementary Data 1).

Fig. 1: Study design and workflow: examining the shared genetics and causality of GIT disorders with the risk of Alzheimer’s disease.

GWAS genome-wide association studies, SNP single-nucleotide polymorphism, SECA SNP effect concordance analysis, LDSC linkage disequilibrium score regression, LCV latent causal variable, MAGMA multi-marker analysis of genomic annotation, MR Mendelian randomisation, MR-PRESSO Mendelian randomisation pleiotropy residual sum and outlier, KEGG Kyoto Encyclopedia of Genes and Genomes.

Table 1 Summary of GWAS data sets analysed.

Genetic correlation between AD and GIT disorders

We assessed and quantified the SNP-level genetic correlation between AD and GIT disorders using the LDSC25 analysis method. The apolipoprotein E (APOE) region has a large effect on the risk of AD; hence, we excluded APOE and the 500 kilobase (kb) flanking region (hg19, 19:44,909,039–45,912,650) from the AD GWAS. We also excluded SNPs in the 26 to 36 megabase region of chromosome six from the data given the complex LD structure in the human major histocompatibility complex (MHC). Notably, in analyses both with and without the APOE region, LDSC reveals a significant genetic correlation between AD and GIT traits (Table 2). Genetic covariance intercept estimates were not significantly different from zero (Supplementary Data 2), indicating no sample overlap between our AD and GIT GWAS.
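
The region exclusions described above can be sketched as a simple coordinate filter. This is a minimal illustration rather than the pipeline used in the study: the `Snp` record, the helper names and the example SNP identifiers are hypothetical, and only the hg19 coordinates are taken from the text.

```python
from typing import List, NamedTuple


class Snp(NamedTuple):
    """Hypothetical minimal SNP record (not a format used in the paper)."""
    rsid: str
    chrom: int
    pos: int  # hg19 base-pair position


# Regions excluded before the genetic correlation analysis (hg19 coordinates
# as stated in the text).
EXCLUDED_REGIONS = [
    (19, 44_909_039, 45_912_650),  # APOE and its flanking region
    (6, 26_000_000, 36_000_000),   # MHC, 26 to 36 megabases on chromosome 6
]


def in_excluded_region(snp: Snp) -> bool:
    """Return True if the SNP lies inside any excluded interval (inclusive)."""
    return any(
        snp.chrom == chrom and start <= snp.pos <= end
        for chrom, start, end in EXCLUDED_REGIONS
    )


def drop_excluded(snps: List[Snp]) -> List[Snp]:
    """Keep only SNPs outside the excluded regions."""
    return [snp for snp in snps if not in_excluded_region(snp)]


# Made-up example records, for illustration only.
example = [
    Snp("snp_a", 19, 45_411_941),  # inside the APOE window
    Snp("snp_b", 6, 30_000_000),   # inside the MHC window
    Snp("snp_c", 1, 1_000_000),    # retained
]
kept = drop_excluded(example)
print([snp.rsid for snp in kept])
```

Running the analysis both with and without the first interval corresponds to the "with and without the APOE region" comparison reported in Table 2.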

Table 2 Genetic correlation between AD and GIT disorders.

We found a positive and significant genetic correlation (rg) of AD (excluding APOE region) with GERD (rg = 0.25, P = 8.19 × 10−18), PUD (rg = 0.28, P = 3.70 × 10−7), PGM (rg = 0.22, P = 2.38 × 10−14), gastritis-duodenitis (rg = 0.24, P = 2.40 × 10−8), IBS (rg = 0.19, P = 1.10 × 10−4), and diverticular disease (rg = 0.15, P = 2.97 × 10−5). However, we found no evidence of a significant genetic correlation between AD and IBD (rg = 0.07, P = 9.94 × 10−2) [Table 2], which may be because of the relatively small cases and sample size of the IBD GWAS (Table 1 and Supplementary Data 1). Our estimates of effective sample size (Supplementary Data 1) suggest the IBD GWAS was underpowered compared to other GIT data sets. We reproduced a pattern of a positive and significant genetic correlation between AD21 and the replication set of GIT traits with or without the APOE region, except for IBD (Supplementary Data 3).
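
To illustrate why a case-control GWAS with few cases can be underpowered despite a large total sample, the sketch below uses one common definition of effective sample size, N_eff = 4 / (1/N_cases + 1/N_controls). Whether the estimates in Supplementary Data 1 were computed with exactly this formula is an assumption, and the counts in the example are invented.

```python
def effective_sample_size(n_cases: int, n_controls: int) -> float:
    """Effective sample size of a case-control study.

    Uses the common definition 4 / (1/n_cases + 1/n_controls); this is an
    assumed formula, not one confirmed by the text.
    """
    if n_cases <= 0 or n_controls <= 0:
        raise ValueError("case and control counts must be positive")
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)


# Two hypothetical studies with the same total sample size of 100,000.
balanced = effective_sample_size(50_000, 50_000)
unbalanced = effective_sample_size(1_000, 99_000)
print(round(balanced), round(unbalanced))
```

Under this definition, a strongly unbalanced design is limited mainly by its number of cases, which matches the reasoning given for the IBD GWAS.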

SNP effect concordance analysis (SECA) results

Using the SECA method26, we assessed the directions of SNP-level genetic overlap between AD and GIT disorders. We provide a more comprehensive description of SECA in the Methods section. Briefly, SECA performs a bidirectional analysis, assessing concordance in the direction of the effect of AD-associated SNPs (data set 1) on each of the GIT disorders (data set 2) and vice versa. First, we conducted two rounds of P-value informed LD clumping (first clumping: --clump-r2 0.1, --clump-kb 1000; second clumping: --clump-r2 0.1, --clump-kb 10000) using PLINK 1.90 (ref. 30). SECA subsequently assesses (using Fisher’s test) the presence of excess SNPs in which the direction of effects is concordant across 144 subsets of data set 1 (AD GWAS) and data set 2 (each of the GIT traits GWAS).
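
The concordance test can be sketched as follows. This is a simplified illustration, not the SECA implementation: it assumes the 144 subsets arise from a 12 × 12 grid of P-value thresholds (the specific thresholds below are an assumption), it uses a one-sided Fisher's exact test for excess concordance, and all function names and data are hypothetical.

```python
from math import comb

# Assumed grid: 12 thresholds per data set gives 12 x 12 = 144 SNP subsets.
THRESHOLDS = [0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]


def concordance_table(effect_pairs):
    """Count effect directions for (beta1, beta2) pairs.

    Returns (a, b, c, d): both positive, positive/non-positive,
    non-positive/positive, both non-positive.
    """
    a = b = c = d = 0
    for beta1, beta2 in effect_pairs:
        if beta1 > 0 and beta2 > 0:
            a += 1
        elif beta1 > 0:
            b += 1
        elif beta2 > 0:
            c += 1
        else:
            d += 1
    return a, b, c, d


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact P-value for the table [[a, b], [c, d]]."""
    row1, col1, total = a + b, a + c, a + b + c + d
    tail = sum(
        comb(row1, k) * comb(total - row1, col1 - k)
        for k in range(a, min(row1, col1) + 1)
    )
    return tail / comb(total, col1)


def count_concordant_subsets(snps, alpha=0.05):
    """Count subsets with odds ratio > 1 and P < alpha.

    snps: list of (p1, beta1, p2, beta2) for independent (clumped) SNPs.
    """
    hits = 0
    for t1 in THRESHOLDS:
        for t2 in THRESHOLDS:
            pairs = [(b1, b2) for p1, b1, p2, b2 in snps if p1 <= t1 and p2 <= t2]
            a, b, c, d = concordance_table(pairs)
            # a * d > b * c is equivalent to an odds ratio above 1.
            if a * d > b * c and fisher_exact_greater(a, b, c, d) < alpha:
                hits += 1
    return hits


# Invented, perfectly concordant data: 20 risk-increasing and 20
# risk-decreasing SNPs in both data sets.
demo = [(0.005, 0.1, 0.005, 0.2)] * 20 + [(0.005, -0.1, 0.005, -0.2)] * 20
print(count_concordant_subsets(demo))
```

The permutation step that yields the empirical P-values is not shown; it would repeat the count on data sets with shuffled effect directions and compare the observed count against that null distribution.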

We found a positive and significant concordance of SNP risk effect across the AD (data set 1) and each of the GIT GWAS (data set 2) including IBD (Table 3). For example, of the total 144 SNP subsets tested with AD as data set 1 (Table 3), all 144 (for GERD, PGM and gastritis-duodenitis), 139 (PUD), 133 (IBS), 130 (diverticulosis) and 42 (IBD) produced Fisher’s exact tests with at least nominally significant effect concordance (odds ratio [OR] > 1 and P < 0.05). The empirical P values (Ppermuted) for the significant associations, adjusting for the 144 SNP subsets tested (using permutations of 1000 replicates), range from 0.001 to 0.018 (Table 3). These results are significantly more than expected by chance, supporting evidence of genetic overlap between AD and the GIT traits.

Table 3 SECA results: primary test for concordant SNP effects.

By changing the direction of the analysis (in a bidirectional assessment), we tested each of the GIT traits as data set 1 against AD as data set 2 (Table 3). The results indicate evidence of a strong genetic overlap between AD and GERD, PUD, PGM, gastritis-duodenitis, IBS and diverticulosis. The results also suggest (except for IBD) that SNPs that are strongly associated with AD influence the named GIT traits and vice versa. Overall, findings in SECA are largely consistent with those of LDSC, except in the case of IBD, highlighting how SECA both differs from LDSC (in its capacity for a bidirectional assessment) and complements it. Notably, and like LDSC, SECA found a significant association between AD and GIT traits with or without the APOE region (Table 3 and Supplementary Data 4). Further, replication analyses in SECA produced findings largely consistent with those of LDSC (Supplementary Data 5 and 6).

SNPs and loci shared by AD and GIT disorders

Leveraging the significant genetic overlap and correlation as well as the substantial GWAS sample sizes, we performed cross-disorder meta-analyses of AD with GERD and PUD. The GWAS for PGM has many cases and an overall large sample size (Table 1) and is strongly correlated with GERD (rg = 0.99, P = 0.000) and PUD (rg = 0.76, P = 4.41 × 10−101) [Supplementary Data 7]; hence, we also utilised it in a meta-analysis with AD. We aimed at identifying SNPs and loci which were not genome-wide significant in the individual AD or GIT disorder GWAS (i.e., 5 × 10−8 < PGWAS-data < 0.05) but reached the status (Pmeta-analysis < 5 × 10−8) following a meta-analysis. We additionally identified SNPs and loci which were already established (PGWAS-data < 5 × 10−8) in AD (sentinel AD SNPs/loci), but which, following GWAS meta-analyses, were similarly associated with a GIT disorder, and vice versa. Briefly, our GWAS meta-analyses identified shared SNPs and susceptibility loci, some of which are putatively novel for AD or GIT disorders.
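
The selection logic above can be sketched with a fixed-effect, inverse-variance-weighted meta-analysis of one SNP across two GWAS. The weighting scheme is a standard choice and an assumption about the exact model used; the function names and the effect sizes in the example are hypothetical.

```python
from math import erfc, sqrt

GENOME_WIDE = 5e-8  # genome-wide significance threshold used in the text
NOMINAL = 0.05      # upper bound on the single-GWAS P-value


def two_sided_p(z: float) -> float:
    """Two-sided P-value for a standard normal test statistic."""
    return erfc(abs(z) / sqrt(2.0))


def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect estimate, SE and P-value."""
    weights = [1.0 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = sqrt(1.0 / sum(weights))
    return beta, se, two_sided_p(beta / se)


def newly_significant(p_gwas_1: float, p_gwas_2: float, p_meta: float) -> bool:
    """5e-8 < P_GWAS < 0.05 in each input GWAS, and P_meta < 5e-8.

    Applying the single-GWAS bounds to both data sets is an assumption.
    """
    sub_threshold = all(GENOME_WIDE < p < NOMINAL for p in (p_gwas_1, p_gwas_2))
    return sub_threshold and p_meta < GENOME_WIDE


# Invented SNP with the same log odds ratio and standard error in both GWAS.
betas, ses = [0.05, 0.05], [0.01, 0.01]
p_single = [two_sided_p(b / s) for b, s in zip(betas, ses)]
beta_meta, se_meta, p_meta = fixed_effect_meta(betas, ses)
print(p_single, p_meta, newly_significant(p_single[0], p_single[1], p_meta))
```

In this example each GWAS alone gives z = 5, which falls short of genome-wide significance, while the pooled estimate has a smaller standard error and crosses the threshold.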

First, a meta-analysis of AD and GERD identified a total of 119 SNPs reaching genome-wide significant association (Pmeta-analysis < 5 × 10−8, Supplementary Data 8), from which we characterised seven independent (r2 < 0.1) genomic loci: 1p31.3, 1q31.1, 3p21.31, 6p21.32, 17q21.32, 17q21.33 and 19q13.32 (Table 4). Many SNPs reaching genome-wide significance in these loci were not genome-wide significant in the individual AD and GIT GWAS we analysed but reached the status in the cross-trait meta-analyses (Table 4). Given this premise (that is, PGWAS-data > 5 × 10−8 > Pmeta-analysis), the observation that some of the identified loci are known for AD or GIT traits (from other studies) provides support for our cross-trait analysis findings. Specifically, two of the identified loci (1p31.3 [near PDE4B] and 3p21.31 [near SEMA3F]) were not previously genome-wide significant for AD (to our knowledge), indicating they are putatively novel for the disorder. Similarly, three of the seven loci (17q21.32 [ZNF652], 17q21.33 [PHB], and 19q13.32 [TOMM40, APOC2, KLC3, ERCC2]) are putatively novel for GERD, given we have no evidence they were previously genome-wide significant for the disorder. A locus at 1q31.1 (near BRINP3) was putatively novel for both AD and GERD at the time of our analysis but has now been reported in a recent GERD multi-trait analysis31, providing support for our finding. The remaining locus, 6p21.32 (near genes HLA-DQA2 and HLA-DRA), is known for both AD32 and GIT disorders (IBD33, ulcerative colitis34 and Crohn’s disease33) and now, in our study, GERD.

Table 4 Genome-wide significant independent SNPs and loci for AD and GIT disorders.

An additional 175 independent SNPs at 121 loci reached a genome-wide suggestive association (Pmeta-analysis < 1 × 10−5, Supplementary Data 9), replicating some of the genome-wide significant loci, including 1p31.3 (PDE4B, lead SNP: rs2840677) and 1q31.1 (BRINP3, rs10753964) for AD and GERD. Also, some of the well-established (sentinel) loci for AD in our GWAS showed evidence of association with GERD (Supplementary Data 10) at 8p21.2 (near genes PTK2B and CHRNA2, rs28834970). Other AD sentinel loci shared with GERD include 19q13.32 (near NECTIN2, lead SNP: rs12980613) and 19q13.32 (near KLC3, rs77988534) [Supplementary Data 10]. Known (sentinel) GERD loci were similarly associated with AD, as summarised in Supplementary Data 10.

Second, following a meta-analysis of AD and PUD GWAS, a total of 22 SNPs, at six genomic loci, reached genome-wide significance (Pmeta-analysis < 5 × 10−8, Supplementary Data 11). The loci identified here are 2q37.1, 6p21.32, 8p21.1, 17p13.2, 19q13.32 and 19q13.41 (Table 4). Of the loci found in the AD and GERD meta-analysis, four were replicated in the AD and PUD meta-analysis. Two of these four loci, 19q13.32 (near BCL3, rs28363848) and 6p21.32 (HLA-DRA, rs9270599), were replicated at a genome-wide level of significance. The remaining two loci (HYAL2, 3p21.31, P(FE) = 5.24 × 10−3, rs709210; and PDE4B, 1p31.3, P(FE) = 2.94 × 10−4, rs6695557; Supplementary Data 12) were replicated at the 7.14 × 10−3 level. In addition to 6p21.32 (HLA-DRA, rs9270599), two of the identified loci, at 8p21.1 (near SCARA3) and 2q37.1 (near ATG16L1), have been reported for AD (SCARA3, ref. 35; ATG16L1, refs. 21,32,36) and GIT traits (SCARA3: gastric or stomach ulcer37; ATG16L1: IBD38, ulcerative colitis and Crohn’s disease33,39). Supplementary Data 13 presents 24 independent SNPs, at 21 genomic loci, reaching genome-wide suggestive association (Pmeta-analysis < 1 × 10−5) for AD and PUD.

Third, given its large sample size and strong genetic correlation with GERD and PUD, we performed a meta-analysis of PGM with AD thereby identifying 42 SNPs (Supplementary Data 14) at seven independent loci (Table 4) reaching a genome-wide significance level. This analysis replicated, at a genome-wide level (Pmeta-analysis < 5 × 10−8), five of the seven genome-wide loci found in the AD and GERD meta-analysis including 1p31.3, 3p21.31, 6p21.32, 17q21.33 and 19q13.32. Additional loci found in the AD and PGM meta-analysis such as 16q22.1 and 1q32.2 were at least genome-wide suggestive (Pmeta-analysis < 1 × 10−5) in the AD and GERD analysis, supporting their involvement in the disorders. An additional 23 SNPs, at three loci, were genome-wide suggestive (Pmeta-analysis < 1 × 10−5) in the AD and PGM meta-analysis (Supplementary Data 15). Of these, the rs33998678 SNP (16q22.1, IL34) is in strong LD (r2 = 0.91) with a genome-wide significant locus found in the AD vs PGM analysis (rs34644948, at 16q22.1, MTSS2, Table 4), providing more support for its involvement in AD and GIT traits (GERD and PUD). Similarly, the rs663576 SNP (at 17q21.32, PHOSPHO1) is moderately correlated (r2 = 0.41) with a genome-wide significant SNP (rs2584662 at 17q21.33, PHB, Table 4), identified in the meta-analysis. This locus (17q21.33) was found in AD and GERD meta-analysis (SNP rs2584662 near PHB), supporting its involvement in AD and the GIT traits. Supplementary Data 10 summarises the sentinel AD loci associated with PGM and vice versa.

Association of identified loci with other traits

Seven loci reached a genome-wide significance in the meta-analysis of AD and GERD GWAS; most of these loci were replicated in the AD vs PUD and/or AD vs PGM meta-analysis. We queried each of the associated loci for pleiotropic associations with other traits using the GWAS catalogue (https://www.ebi.ac.uk/gwas) and the Open Targets Genetics (https://genetics.opentargets.org) platforms. For three of the loci—1p31.3 (near PDE4B), 3p21.31 (near SEMA3F), and 1q31.1 (near BRINP3)—we have no evidence of their previous association with AD, at a genome-wide level (P < 5 × 10−8). However, and potentially supportive of our findings, the loci have been reported for AD-related phenotypes such as cognitive traits.

For example, PDE4B has pleiotropic associations with intelligence40, educational attainment41, and sleep-related traits such as insomnia42. The locus is also known for other disorders including major depression, stress disorders, schizophrenia, and multiple sclerosis43 (putative comorbidities of AD44,45), among other traits. The loci harbouring SEMA3F and BRINP3 have similarly been reported for intelligence (SEMA3F, ref. 46), general cognitive ability (SEMA3F, ref. 40), educational attainment (SEMA3F, ref. 47; BRINP3, ref. 41), insomnia (SEMA3F and BRINP3, ref. 42) and BMI (SEMA3F and BRINP3). Sex hormone-binding globulin levels48 and multi-site chronic pain are some of the traits that have also been linked with SEMA3F. Interestingly, BMI, cognitive traits such as intelligence and cognitive performance, and even sleep-related traits have been associated with GERD31. Taken together, these observations suggest that GERD may share genetic links with certain AD-related phenotypes, including cognitive and sleep-related traits, which further supports the relationship between AD and GERD.

Further, our analysis consistently identified and replicated the 19q13.32 locus (mapped genes: TOMM40, APOC2, KLC3, ERCC2, BCL3, and CD33) as shared by AD and GIT disorders. While this locus is well known for AD, it has also been linked with GIT traits including IBD49 (SYMPK, lead SNP: rs16980051, GRCh37: 19:46,345,886), and gut microbiota50, thus, highlighting an association of AD with not only GIT disorders, but also the gut microbiome. This premise is important given previous evidence of genetic links between dysbiosis, neurological (AD, for instance) and GIT disorders15,22,51,52, and may underscore the need for a renewed focus on the genetics of gut-brain connection (including the gut microbiome) to better understand the underlying mechanisms of AD. Similar to other identified loci, the 19q13.32 locus also displays pleiotropic association with many AD-related phenotypes: intelligence53, cognitive impairment test score54, t-tau and beta-amyloid 1–42 measurements, hippocampal atrophy rate, memory performance, and educational attainment41. Supplementary Data 16 and 17 summarise other traits previously reported for loci at 6p21.32 (near HLA-DRA) and 17q21.32 (near ZNF652 and PHB).

Shared genomic regions identified in GWAS-PW analysis

Using a colocalization analysis in GWAS-PW27, we assessed shared genomic regions between AD and each of GERD and PGM (Supplementary Data 18). The results of this analysis confirm all the loci identified in the meta-analyses (except on chromosome 3) are shared by AD and the respective GIT traits (model 4 posterior probability [PPA 4] > 0.9, Supplementary Data 18). While the findings also suggest that the causal variants might be different (PPA 3 < 0.5), we note that when variants in a locus are in strong LD, which may be the case in this study, GWAS-PW is limited in its ability to correctly distinguish model 3 (PPA 3) from model 4 (PPA 4)27. Additional shared genomic regions, in chromosomes 1, 6, 16, 17 and 19 having PPA 4 > 0.90 were identified for AD and the GIT traits (Supplementary Data 18). Also, we identified a locus on chromosome 17, having PPA 3 > 0.80, and implicating the SNP rs2526380 (17q22, near TSPOAP1) in both AD and GERD. The posterior probability that this SNP is a causal variant for both AD and GERD under model 327 is high at 0.99 (Supplementary Data 18).

Results of causal association analysis between AD and GIT disorders

We assessed the potential causal relationship between AD (as the outcome variable) and GERD (as the exposure variable) using the two-sample MR method. We found no evidence of a causal relationship between AD and GERD, irrespective of the direction of the analysis (AD or GERD as the outcome or exposure variable) [Table 5]. For sensitivity testing, we implemented three additional models of MR analysis—MR-Egger, weighted median, and the MR-PRESSO (Mendelian Randomization Pleiotropy RESidual Sum and Outlier). Results from these methods agree with those of the Inverse Variance Weighted (IVW) model supporting a lack of evidence for a causal association between AD and GERD (Table 5 and Supplementary Data 19). We carried out further MR analysis assessing AD against each of PUD, PGM, IBS, diverticular disease, and IBD, and vice versa. Findings similarly reveal no evidence for a causal relationship between AD and each of the GIT disorders assessed (Supplementary Data 19).
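
For orientation, the Inverse Variance Weighted (IVW) model referred to above can be sketched from per-SNP summary statistics. This is the standard fixed-effect IVW estimator under the usual instrumental-variable assumptions, not the software used in the study; the function name and the instrument effects in the example are hypothetical.

```python
from math import erfc, sqrt


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate, standard error and two-sided P.

    Each list holds one entry per genetic instrument (SNP): its effect on
    the exposure, its effect on the outcome, and the standard error of the
    outcome effect.
    """
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)
    )
    estimate = numerator / denominator
    se = sqrt(1.0 / denominator)
    p_value = erfc(abs(estimate / se) / sqrt(2.0))
    return estimate, se, p_value


# Invented instruments whose outcome effects are exactly half their
# exposure effects, so the estimate is 0.5 by construction.
bx = [0.10, 0.20, 0.15]
by = [0.05, 0.10, 0.075]
se_y = [0.02, 0.02, 0.02]
estimate, se, p_value = ivw_estimate(bx, by, se_y)
print(estimate, se, p_value)
```

Sensitivity models such as MR-Egger, weighted median and MR-PRESSO relax different parts of the assumptions behind this estimator, which is why agreement among them is read as supporting the IVW conclusion.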

Table 5 Summary of MR analysis results for AD and GIT disorders.

We also used the Latent Causal Variable (LCV) approach29 to test for a causal relationship between AD and each of the GIT disorders. The results of LCV suggest a partial causal influence of gastritis-duodenitis (genetic causal proportion [GCP] = −0.69, P = 0.0026), on AD (Table 6). The result was in the reverse direction for diverticular disease (GCP = 0.23, P = 0.000272), suggesting AD may partially cause diverticular disease. Using another set of GWAS (Table 6), we tested the reproducibility of the partial causal association results for gastritis-duodenitis and diverticular disease, neither of which was reproduced, hence, the need for the findings to be further assessed in future studies. Conversely, we found a significant association between AD and lansoprazole use (GCP = −0.38, P = 0.001129).

Table 6 Partial causality assessment using the Latent Causal Variable approach.

Gene-based association analysis

Using SNPs that overlapped AD and GERD GWAS, we performed gene-based analyses in MAGMA (implemented in the FUMA55 platform), thereby identifying a total of 18,929 protein-coding genes for each of the traits. Applying a threshold P-value of 2.64 × 10−6 (0.05/18,929; Bonferroni correction for testing 18,929 genes), we identified 64 genome-wide significant (Pgene < 2.64 × 10−6) genes for AD (Supplementary Data 20), 44 for GERD (Supplementary Data 21) and 75 for PGM (Supplementary Data 22). Using the Fisher’s Combined P-value (FCP) method, a total of 46 genome-wide significant (PFCP < 2.64 × 10−6) genes shared by AD and GERD were identified (Supplementary Data 23), 10 of which were not previously significant in our AD or GERD GWAS at the Pgene < 2.64 × 10−6 threshold, adjusting for multiple testing (Table 7), but are in known AD or GIT trait loci. It is noteworthy that some of the identified AD and GERD shared genes are in chromosomal locations found in our meta-analysis, including 1p31.3 (PDE4B), 3p21.31 (SEMA3F, HYAL2, IP6K1), 6p21.32 (HLA-DRA) and 19q13.32 (Supplementary Data 23). Combining P-values by weighting based on sample size (the weighted Stouffer's method) produced a similar pattern of results (as the FCP) for AD and GERD (Supplementary Data 24). We also replicated a similar pattern of findings in gene-based analysis (and FCP) using the AD and the PGM GWAS (Table 7 and Supplementary Data 25).
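
The threshold and the two P-value combination methods named above can be sketched as follows. The Bonferroni arithmetic comes from the text; the closed form for Fisher's method with two P-values and the square-root-of-sample-size weights for Stouffer's method are standard textbook choices and an assumption about the exact implementation. The gene-level P-values and sample sizes in the example are invented.

```python
from math import log, sqrt
from statistics import NormalDist

N_GENES = 18_929
BONFERRONI = 0.05 / N_GENES  # about 2.64e-6, as stated in the text


def fisher_combined_p(p1: float, p2: float) -> float:
    """Fisher's combined P-value for two independent P-values.

    -2 * (ln p1 + ln p2) follows a chi-square distribution with 4 degrees
    of freedom, whose upper tail simplifies to p1 * p2 * (1 - ln(p1 * p2)).
    """
    product = p1 * p2
    return product * (1.0 - log(product))


def weighted_stouffer_p(p_values, sample_sizes) -> float:
    """Weighted Stouffer combination with weights sqrt(sample size).

    P-values are treated as one-sided and must lie strictly between 0 and 1.
    """
    normal = NormalDist()
    weights = [sqrt(n) for n in sample_sizes]
    z_scores = [-normal.inv_cdf(p) for p in p_values]
    z = sum(w * zi for w, zi in zip(weights, z_scores)) / sqrt(
        sum(w ** 2 for w in weights)
    )
    return normal.cdf(-z)


# An invented gene with P = 1e-4 in each trait: not significant alone,
# but significant after combination.
p_ad, p_gerd = 1e-4, 1e-4
p_fcp = fisher_combined_p(p_ad, p_gerd)
p_stouffer = weighted_stouffer_p([p_ad, p_gerd], [400_000, 450_000])
print(BONFERRONI, p_fcp, p_stouffer)
```

This shows how a gene can fall short of the gene-wide threshold in each trait separately yet pass it once evidence from both traits is combined, which is the situation described for the genes in Table 7.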

Table 7 Shared genes reaching genome-wide significance for AD and GIT traits.

Biological pathways and mechanisms shared by AD and GIT disorders

We performed pathway-based functional enrichment analyses in the g:Profiler platform56 to functionally interpret genes overlapping AD and GIT disorders and gain biological insight from their commonalities. First, we investigated genes overlapping AD and GERD (at Pgene < 0.05, FCP < 0.02) and identified several biological pathways that were overrepresented (Fig. 2 and Supplementary Data 26), implying they have a role in the mechanisms underlying both AD and GERD. Pathways related to membrane trafficking and to the metabolism, alteration, lowering or inhibition of lipids were significantly enriched (Supplementary Data 26). These included plasma lipoprotein assembly, remodelling, and clearance (Padjusted = 2.01 × 10−3), cholesterol metabolism (Padjusted = 4.99 × 10−2), plasma lipoprotein assembly (Padjusted = 3.45 × 10−5), and triglyceride-rich plasma lipoprotein particle (Padjusted = 5.23 × 10−9), among others. Also, lipase inhibitors (Padjusted = 6.08 × 10−3) and the statin (3-hydroxy-3-methylglutaryl-coenzyme A reductase inhibitors) pathway (Padjusted = 3.99 × 10−2) were significantly enriched for AD and GERD (Supplementary Data 27), suggesting mechanisms of these medications may find therapeutic application in AD and GIT disorders.

Fig. 2: Clusters of significantly enriched biological pathways for AD and GERD.

a KEGG: Kyoto Encyclopedia of Genes and Genomes pathways: intestinal immune network (allograft rejection, intestinal immune network for IgA production, type 1 diabetes mellitus, systemic lupus erythematosus, antigen processing and presentation, graft-versus-host disease, asthma), and cholesterol metabolism (cholesterol metabolism). b Gene Ontology: Cellular Components: side membrane vesicle (lumenal side of membrane, MHC class II protein complex, integral component of lumenal side of endoplasmic reticulum [ER] membrane, clathrin-coated endocytic vesicle membrane, late endosome, ER to Golgi transport vesicle membrane, coated vesicle membrane, lumenal side of ER membrane, MHC protein complex, COPII-coated ER to Golgi transport vesicle, transport vesicle membrane, late endosome membrane), and plasma lipoprotein particle (chylomicron, very low-density lipoprotein [VLDL] particle, triglyceride-rich plasma lipoprotein particle, plasma lipoprotein particle, lipoprotein particle, LDL lipoprotein particle). c Gene Ontology: Molecular Function: peptide antigen binding (peptide binding, peptide antigen binding, MHC class II receptor activity) and lipase inhibitor activity (lipase inhibitor activity). d Gene Ontology: Biological Pathway: lipoprotein particle clearance (phospholipid efflux, VLDL particle clearance, regulation of plasma lipoprotein particle levels, plasma lipoprotein particle clearance, chylomicron remnant clearance, regulation of lipid catabolic process, regulation of VLDL particle clearance, protein-lipid complex assembly, plasma lipoprotein particle organisation, regulation of phospholipid catabolic process, VLDL particle assembly, regulation of lipid localisation, glycolipid catabolic process, triglyceride-rich lipoprotein particle clearance, high density lipoprotein particle remodelling), receptor signalling pathway (T cell receptor signalling pathway, interferon-gamma-mediated signalling pathway, antigen receptor-mediated signalling pathway), membrane adhesion cell (cell-cell adhesion via plasma membrane adhesion molecules, homophilic cell adhesion via plasma membrane adhesion molecules), and negative regulation type (negative regulation of type I interferon production). e Reactome, Wiki pathway and Transcription Factor Binding site: assembly clearance plasma (statin pathway, NR1H2 and NR1H3-mediated signalling, plasma lipoprotein assembly, remodelling, and clearance, plasma lipoprotein clearance, NR1H3 and NR1H2 regulated gene expression linked to cholesterol transport and efflux, VLDL assembly, VLDL clearance, plasma lipoprotein assembly), interferon-gamma signalling (PD-1 signalling, generation of second messenger molecules, interferon-gamma signalling, phosphorylation of CD3 and TCR zeta chains, translocation of ZAP-70 to immunological synapse), Factor: ZNF2 motif, and ZNF582 motif. Supplementary Data 26 provides additional details about these biological pathways. AD Alzheimer’s disease, GERD gastroesophageal reflux disease.

Pathways related to the immune system were also overrepresented for both AD and GERD as evidenced by the identification of immune or autoimmune-related disorders such as asthma (Padjusted = 3.53 × 10−3), systemic lupus erythematosus (Padjusted = 7.88 × 10−3), and type I diabetes mellitus (Padjusted = 2.47 × 10−2). Other immune-related pathways identified include the intestinal immune network for IgA production (Padjusted = 4.07 × 10−2), programmed cell death protein 1 (PD-1) signalling (Padjusted = 5.24 × 10−3), translocation of ZAP-70 to immunological synapse (Padjusted = 2.44 × 10−3) and interferon-gamma signalling pathways (Padjusted = 2.45 × 10−2) [Supplementary Data 26].

Following enrichment mapping and auto-annotation, the identified biological pathways were clustered into six themes of biological mechanisms, namely: ‘lipoprotein particle clearance,’ ‘receptor signalling pathway,’ ‘side membrane vesicle and cell adhesion,’ ‘peptide antigen binding,’ ‘intestinal immune network,’ and ‘interferon-gamma signalling’ (Fig. 2). Moreover, a pathway-based analysis using genes that overlapped AD and PGM GWAS (at Pgene < 0.05) replicated some of the pathways identified for AD and GERD, including ‘plasma lipoprotein assembly, remodelling, and clearance’ (Padjusted = 3.01 × 10−4), ‘peptide antigen binding’ (Padjusted = 2.28 × 10−3), and ‘triglyceride-rich plasma lipoprotein particle’ (Padjusted = 6.60 × 10−8) [Supplementary Data 27]. Also, we performed pathway-based analysis separately for GERD and AD GWAS, the full results of which are presented in Supplementary Data 28 and 29, respectively.

Discussion

We present the first comprehensive assessment (to the best of our knowledge) of the shared genetics of AD with GIT disorders by analysing large-scale GWAS summary data using multiple statistical genetic approaches. Consistent with previous conventional observational studies3,4,5,6,7,8,9, our findings confirm a risk-increasing relationship between AD and GIT disorders and provide insights into their underlying biological mechanisms. In contrast to the positive genetic correlation between AD and other GIT disorders, LDSC found no significant genetic correlation between AD and IBD, which may be due to the relatively small number of cases and sample size of the IBD GWAS. Based on the effective sample size estimates, the IBD GWAS is underpowered compared to other GIT data sets. Supporting this premise, SECA revealed a significant association of AD (as data set 1) with IBD (as data set 2), but not the other way around. The AD GWAS has a larger sample size, providing a more robust association on which to condition (select independent) SNPs for concordance analysis, which may explain why the significant association was not bidirectional, unlike the case for other GIT traits. Future studies, nonetheless, need to confirm this relationship as more powerful IBD GWAS become available.

Evidence of significant genetic overlap and correlation reflects not only shared genetic aetiologies (biological pleiotropy) but also suggests a possible causal association between AD and the GIT traits (vertical pleiotropy). Using LCV, we detected a partial causal association between AD and gastritis-duodenitis, lansoprazole, and diverticular disease. However, this partial causal association was not evident in reproducibility testing. The inconclusive LCV findings should be cautiously interpreted, and a reassessment of the results, in future studies, is warranted. Conversely, all MR analyses provided no evidence for a significant causal relationship between AD and GIT traits, indicating that shared genetics and common biological pathways may best explain the association between AD and these GIT disorders.

We performed GWAS meta-analysis, thereby identifying seven shared loci reaching genome-wide significance for AD and GERD. The loci, including 1p31.3 (PDE4B), 3p21.31 (SEMA3F, HYAL2), 6p21.32 (HLA-DRA), and 19q13.32 (TOMM40, APOC2, ERCC2, BCL3, and KLC3), were replicated in AD vs PUD and AD vs PGM meta-analyses and largely reinforced in colocalisation (GWAS-PW) as well as gene-based association analyses. Notably, the independent SNP rs12058296 (1p31.3) mapped to the phosphodiesterase 4B (PDE4B) gene. Inhibition of PDE4B (or its subtypes) has shown promise for inflammatory diseases57,58,59,60. Indeed, recent evidence supports the potent anti-inflammatory, pro-cognitive, neuro-regenerative, and memory-enhancing properties of PDE4 inhibitors (PDE4B, in particular61), making them plausible therapeutic targets for AD59,60 and GIT disorders58. Other identified independent genome-wide significant SNPs and loci mapped to genes including CD46, SEMA3F, HLA-DRA, MTSS2, PHB, and TOMM40. The CD46 gene encodes a complement regulator which is bactericidal to Helicobacter pylori (H. pylori)62 and was also recently identified for AD in a transcriptome analysis63, making it a plausible candidate in both AD and GIT disorders.

In pathway-based analyses, we identified biological pathways significantly enriched for genes overlapping AD and GIT disorder (GERD and PUD) GWAS. Notably, lipid-related and autoimmune pathways were overrepresented. There is a close link between autoimmunity and lipid abnormalities64, and consistent with previous studies65,66,67,68,69, our findings highlight the importance of lipid homoeostasis in AD and GIT traits. In AD, for example, hypercholesterolaemia is believed to increase the permeability of the blood-brain barrier system, facilitating the entry of peripheral cholesterol into the CNS, and resulting in abnormal cholesterol metabolism in the brain65,66. Amyloidogenesis, alteration of the amyloid precursor protein degradation, accumulation of Aβ, and subsequent cognitive impairment have all been linked with elevated cholesterol in the brain66,70,71,72. Similarly, while the exact roles of lipids in GIT disorders are unclear, H. pylori is believed to cause or worsen abnormal serum lipid profiles through chronic inflammatory processes, and eradication of the infection enhances lipid homoeostasis68,69.

The mechanisms of association between AD and lipid dysregulation relate to the ‘gut–brain axis’, alterations in GIT microbiota and the immune system10,66. Moreover, lipid dysregulation is central to the interplay of AD, gut microbiota, and GIT disorders10,66, thus, suggesting the therapeutic potential of lipid-lowering medications such as lipase inhibitors and statins (identified in our study) in AD and GIT disorders. Lipase inhibitors (orlistat) prevent intestinal dietary lipid absorption, and lower total plasma triglycerides and cholesterol levels73,74, making them a preferred pharmacological treatment for obesity73. The connection between AD, lipid dysregulation, dysbiosis and the ‘gut-brain axis’10,66, may, thus, support the potential utility of lipase inhibitors in AD. Lipases, including monoacylglycerol, diacylglycerol, and lipoprotein lipases are involved in AD pathology, and can also effectively be inhibited by orlistat74. Similarly, statins possess anti-inflammatory, immune-modulating and gastroprotective properties75,76, and their active use significantly reduced PUD risk76 as well as enhanced H. pylori eradication77. Statins also improve cognitive ability and reduce neurodegeneration risks, making them potentially beneficial in AD78,79. However, there is evidence suggesting a paradoxical predisposition to reversible dementia for statins78,79. While this finding has been challenged78, it may highlight a need to identify AD patients for whom statins will be beneficial, consistent with the model of personalised health.

Our findings have implications for practice and further studies. First, results highlighting lipid-related mechanisms support the roles of abnormal lipid profiles in the aetiologies of the disorders, and these profiles may be potential biomarkers for AD and GIT disorders (or their comorbidity). Second, our findings underscore the importance of lipid homoeostasis. Diet is an effective preventive and non-pharmacological approach to managing hyperlipidaemia, which is consistent with the findings of this study. Indeed, adherence to a ‘Mediterranean’ diet (low in lipids) is recognised as beneficial both in AD80 and GIT disorders81. Thus, a recommendation for healthy diets, early in life, may form part of the lifestyle modifications for preventing AD and GIT disorders. The clinical utility of these recommendations will need to be further investigated and validated. Third, our study identifies lipase inhibitor and statin pathways in the mechanisms of AD and GIT disorders, which may be a potential therapeutic avenue to explore in the disorders. We hypothesise that individuals with comorbid AD and GIT traits may gain benefits from these therapies. There is a need to test this hypothesis using appropriate study designs, including randomised controlled trials. Fourth, our study implicates PDE4B, and given the evidence in the literature58,59,60,61, we propose that treatment targeted at its inhibition may be promising in comorbid AD and GIT traits. Lastly, while our findings do not necessarily indicate that AD and GIT disorders will always co-occur, they support their shared biology; thus, early detection of AD may benefit from probing impaired cognition in GIT disorders.

The use of multiple, complementary statistical genetic approaches enables a comprehensive analysis of the genetic associations between AD and GIT disorders and is a major strength of this study. Also, we analysed well-powered GWAS data, meaning our findings are generally not affected by small sample size, possible reverse causality, or confounders that conventional observational studies often suffer from. Nonetheless, our study has limitations that should be considered alongside the present findings. First, the GWAS for AD combined clinically diagnosed cases of AD with proxies (AD-by-proxy: individuals whose parents were diagnosed with AD). Given the high correlation between the GWAS with and without the ‘AD-by-proxy’ cases21, we argue, as did others21, that combining them is valid, especially for sample size improvement, which is critical to ensuring adequately powered GWAS analysis. Second, analyses were restricted to participants of mainly European ancestry; thus, findings may not be generalisable to other ancestries. Third, the GIT trait GWAS were combinations of several data sources: primary care, hospital admission, medication use, and self-reported records. While there is a potential for misdiagnosis or inaccuracy in self-reported data, their use is well justified given the correlation in effect sizes of the data with other sources22. Moreover, additional data from other sources including ICD-10 were utilised with consistent results across these GWAS.

In conclusion, this study provides genetic insights into the long-standing debate and the observed relationship of AD with GIT disorders, implicating shared genetic susceptibility. Our findings support a significant risk-increasing (but non-causal) genetic association between AD and GIT traits (GERD, PUD, PGM, gastritis-duodenitis, IBS, and diverticular disease). Also, we identified genomic regions and genes shared by AD and GIT disorders that may potentially be targeted for further investigation, particularly the PDE4B gene (or its subtypes), which has shown promise in inflammatory diseases57,58,59,60. Our study also underscores the importance of lipid homoeostasis and the potential relevance of statins and lipase inhibitors in AD, GIT disorders or their comorbidity. To our knowledge, this is the first comprehensive study to assess these relationships using statistical genetic approaches. Overall, these findings advance our understanding of the genetic architecture of AD, GIT disorders, and their observed co-occurring relationship.

Methods

GWAS summary statistics

The GWAS data utilised in the present study are summarised in Table 1 with further cohort-specific details, including effective sample size estimates, provided in Supplementary Data 1. The data were sourced from popular GWAS databases, repositories, and large research consortia/groups. The GWAS summary data for ‘clinically diagnosed AD and AD-by-proxy’21 (the largest publicly available AD GWAS) were used as our AD GWAS data. This GWAS has a large sample size (cases = 71,880, controls = 383,378, sample size [N] = 455,258) and, thus, increased power for detecting genetic variants of small to moderate effect sizes. More specific details about the data have been published21. GIT traits including PUD (cases = 16,666, controls = 439,661, N = 456,327), IBS (cases = 28,518, controls = 426,803, N = 455,321), and IBD (cases = 7045, controls = 449,282, N = 456,327) were assessed against AD. The GWAS for the traits were obtained from the recently published GIT GWAS22 and other sources located through the GWAS Atlas24 (Supplementary Data 1). Clinically, PUD medications are indicated in GERD and gastritis. Accordingly, a GWAS combining diagnoses of PUD and/or GERD and/or medications commonly used for these disorders (PGM) has been conducted22, potentially identifying people with PUD or GERD. This GWAS has a large sample size (cases = 90,175, controls = 366,152, N = 456,327), and as was the case in the original publication22, we utilised the data for analysis in the present study as a proxy for PUD or GERD. These GIT GWAS were well characterised and, where possible, validated as described in the original publication22.

Additionally, we utilised a well-characterised GWAS for GERD (cases = 71,522, controls = 261,079, N = 332,601), which combined data sets from the UK Biobank and the QSKIN study23. Gastritis-duodenitis (cases = 28,941, controls = 378,124, N = 407,065) and diverticular disease (cases = 27,311, controls = 334,783, N = 362,094) GWAS from the Lee Lab (https://www.leelabsg.org/resources) were also used in this study. Where available, we used additional GWAS summary data sourced from public repositories (Table 1 and Supplementary Data 1) to replicate our genetic overlap and correlation (LDSC and SECA) findings. A comprehensive description of the quality control procedures for each of the GWAS data sets and their analysis is available through the corresponding publications (Table 1 and Supplementary Data 1). Our preliminary analysis indicates that there is no significant sample overlap between the AD GWAS and each of the GIT GWAS assessed in this study (Supplementary Data 2), making bias from sample overlap unlikely.
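The effective sample sizes referred to above can be approximated from the reported case and control counts. The sketch below assumes the commonly used formula N_eff = 4 / (1/N_cases + 1/N_controls); this formula is an assumption here, and the values in Supplementary Data 1 may have been derived differently.

```python
def effective_sample_size(cases: int, controls: int) -> float:
    """Approximate effective sample size of a case-control GWAS.

    Assumed formula (not confirmed by the text): 4 / (1/cases + 1/controls).
    """
    return 4.0 / (1.0 / cases + 1.0 / controls)


# Case and control counts as reported in the text above.
gwas = {
    "AD": (71_880, 383_378),
    "PUD": (16_666, 439_661),
    "IBS": (28_518, 426_803),
    "IBD": (7_045, 449_282),
    "PGM": (90_175, 366_152),
    "GERD": (71_522, 261_079),
    "Gastritis-duodenitis": (28_941, 378_124),
    "Diverticular disease": (27_311, 334_783),
}

n_eff = {trait: effective_sample_size(*counts) for trait, counts in gwas.items()}

# List traits from least to most powered under this approximation.
for trait, value in sorted(n_eff.items(), key=lambda item: item[1]):
    print(f"{trait}: N_eff ~ {value:,.0f}")
```

Under this approximation the IBD data set has the smallest effective sample size of the traits listed, which is in line with the remark in the Discussion that the IBD GWAS is comparatively underpowered.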

Linkage disequilibrium score regression analysis (LDSC)

We assessed and quantified SNP-level genetic correlation between AD and GIT disorders using the LDSC25 analysis method (https://github.com/bulik/ldsc/wiki/Heritability-and-Genetic-Correlation). LDSC assesses and distinguishes the contributions of polygenicity, sample overlaps, and population stratification to the heritability and genetic correlation between traits25. In the present study, we performed LDSC analysis using the standalone version of the software and by following the procedures provided by the program developer (https://github.com/bulik/ldsc). The apolipoprotein E (APOE) region has a large effect on the risk of AD; hence, we excluded APOE and the 500 kilobase (kb) flanking region (hg19, 19:44,909,039–45,912,650) from the AD GWAS for this analysis. We also excluded SNPs in the 26–36 megabase region of chromosome six from the data given the complex LD structure in the human major histocompatibility complex (MHC). To assess possible sample overlap between AD GWAS and each of the GIT GWAS, we performed LDSC correlation analysis with the genetic covariance intercept unconstrained. The result of this analysis indicates that the estimated genetic covariance intercepts were not significantly different from zero (Supplementary Data 2), indicating no significant sample overlap between our AD and GIT GWAS. Thus, we constrained the intercept in the reported genetic correlation analysis. We applied Bonferroni adjustment for testing the effects of seven GIT traits on AD (0.05/7 = 7.1 × 10−3), and all genetic correlation results surviving this adjustment were considered significant while those having P < 0.05 were regarded as nominally significant.
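The region exclusions and the significance thresholds described above can be illustrated with a short sketch. It is not the LDSC software itself; the `Snp` type, the function names and the toy SNP identifiers are hypothetical, while the coordinates and the 0.05/7 threshold are taken from the text.

```python
from typing import NamedTuple


class Snp(NamedTuple):
    rsid: str
    chrom: int
    pos: int  # base-pair position (hg19)


# Regions excluded before LDSC, as stated in the text (hg19).
APOE_REGION = (19, 44_909_039, 45_912_650)  # APOE plus 500 kb flanks
MHC_REGION = (6, 26_000_000, 36_000_000)    # 26-36 Mb on chromosome 6


def in_region(snp: Snp, region: tuple) -> bool:
    chrom, start, end = region
    return snp.chrom == chrom and start <= snp.pos <= end


def filter_snps(snps, exclude_apoe: bool):
    """Drop MHC SNPs always; drop APOE-region SNPs when requested (AD GWAS)."""
    kept = []
    for snp in snps:
        if in_region(snp, MHC_REGION):
            continue
        if exclude_apoe and in_region(snp, APOE_REGION):
            continue
        kept.append(snp)
    return kept


def classify(p_value: float, n_tests: int = 7, alpha: float = 0.05) -> str:
    """Label a genetic correlation P value using the Bonferroni rule in the text."""
    if p_value < alpha / n_tests:
        return "significant"
    if p_value < alpha:
        return "nominally significant"
    return "not significant"


# Hypothetical toy SNPs, for illustration only.
toy = [
    Snp("snp_apoe", 19, 45_411_941),
    Snp("snp_mhc", 6, 32_000_000),
    Snp("snp_kept", 1, 66_000_000),
]
print([s.rsid for s in filter_snps(toy, exclude_apoe=True)])
print(classify(0.005), "|", classify(0.03), "|", classify(0.2))
```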

SNP effect concordance analysis (SECA)

We used the standalone version of the SECA software pipeline to perform SNP-level genetic overlap assessment and statistical tests between AD and GIT disorders. A detailed description of the SECA software and methods has been published26. Briefly, SECA accepts a pair of GWAS data (data set 1 and data set 2) as input and performs a range of analyses to assess concordance in effect direction between a pair of traits (AD and GIT disorders in the present study). First, we carried out quality control to exclude all non-rsID(s) and duplicate variants in data set 1 and to align SNP effects to the same effect allele across data set 1 and data set 2. Second, we performed two rounds of P-value informed LD clumping in data set 1 (first clumping: --clump-r2 0.1, --clump-kb 1000; second clumping: --clump-r2 0.1, --clump-kb 10000) using PLINK (version 1.90)30.

Third, SECA partitions independent SNPs resulting from LD clumping into 12 subsets of SNPs according to the P value for data set 1 as follows: P1 ≤ (0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0). SECA subsequently performs Fisher’s exact tests to assess the presence of excess SNPs in which the direction of effects is concordant across data set 1 and data set 2 (that is, for the corresponding P value derived 12 subsets of SNPs associated in data set 2, P2). Hence, a total of 144 SNP subsets (a 12 by 12 matrix from data set 1 and data set 2) were assessed for SNP effect concordance. SECA calculates permuted P value for the number of significant associations with adjustment for testing 144 associations (based on permutations of 1000 replicates).
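The 12 by 12 concordance grid described above can be sketched as follows. This is an illustration rather than the SECA implementation: it assumes that each cell is tested with a one-sided Fisher's exact test on the 2 x 2 table of effect directions, it omits the permutation step, and the toy SNP values are hypothetical.

```python
from math import comb

# P-value thresholds used to partition SNPs, as listed in the text.
THRESHOLDS = (0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0)


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact P for the table [[a, b], [c, d]] (excess in a)."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / comb(n, col1)


def direction_table(snps, t1: float, t2: float):
    """Count effect-direction combinations among SNPs with P1 <= t1 and P2 <= t2."""
    a = b = c = d = 0
    for beta1, p1, beta2, p2 in snps:
        if p1 > t1 or p2 > t2:
            continue
        if beta1 > 0 and beta2 > 0:
            a += 1  # concordant, both risk-increasing
        elif beta1 > 0:
            b += 1  # discordant
        elif beta2 > 0:
            c += 1  # discordant
        else:
            d += 1  # concordant, both risk-decreasing
    return a, b, c, d


def concordance_grid(snps):
    """Fisher's exact P for each of the 144 (t1, t2) threshold pairs."""
    return {
        (t1, t2): fisher_exact_greater(*direction_table(snps, t1, t2))
        for t1 in THRESHOLDS
        for t2 in THRESHOLDS
    }


# Hypothetical SNPs as (beta in data set 1, P1, beta in data set 2, P2).
toy_snps = [(0.05, 0.005, 0.03, 0.005)] * 10 + [(-0.04, 0.005, -0.02, 0.005)] * 10
grid = concordance_grid(toy_snps)
n_nominal = sum(p < 0.05 for p in grid.values())
print(f"{n_nominal} of {len(grid)} SNP subsets show nominal excess concordance")
```

In SECA the count of nominally significant subsets is then compared against a permutation distribution, which this sketch does not reproduce.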

In the present study, we first assessed AD GWAS as data set 1 and each of the GIT disorders as data set 2. For comparison, we also assessed each of the GIT disorders as data set 1 against AD as data set 2. Thus, using SECA, we assessed the effects of AD-associated SNPs on each of the GIT disorders and vice versa. Since SECA is conditioned on data set 1, the bi-directional assessment is an important analysis step to account for instances where SNPs that are strongly associated with AD do not affect GIT traits and vice versa. Further, the bi-directional analysis (which is not possible with LDSC, for example) enables the assessment of whether the observed genetic overlap is driven primarily by only one of the traits or by both, thereby improving our understanding of their association.

GWAS cross-traits meta-analysis

GWAS meta-analysis pools the results of GWAS data, thereby increasing the sample sizes and augmenting the detection of genetic variants with small to moderate effect sizes. In the present study, we used the GWAS meta-analysis method of pooling AD GWAS with each of the GIT traits (cross-disorder or cross-trait meta-analysis). We used two models of meta-analysis: the Fixed Effect (FE), and the modified Random Effect (RE2)82 models. The FE model estimates the FE P-value using the inverse‐variance weighted method, which assumes that the AD and each of the GIT disorders’ GWAS are assessing the same (fixed) effect. The presence of effect heterogeneity is a limitation of the model. On the other hand, by estimating P-values using the modified random effects, the RE2 model82 allows for differences in SNP effects and the method is powerful in the presence of SNP effect heterogeneity.
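As an illustration of the FE model, the sketch below pools one SNP's effect estimates from two GWAS using inverse-variance weights and reports a two-sided P value together with Cochran's Q. The RE2 model is not reproduced here, and the effect sizes and standard errors are hypothetical.

```python
from math import erfc, sqrt


def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis for a single SNP."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = sqrt(1.0 / total)
    z = beta / se
    p = erfc(abs(z) / sqrt(2.0))  # two-sided normal P value
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))  # Cochran's Q
    return beta, se, z, p, q


# Hypothetical SNP with the same effect in the AD and GIT GWAS.
beta, se, z, p, q = fixed_effect_meta(betas=[0.10, 0.10], ses=[0.02, 0.02])
p_single = erfc(abs(0.10 / 0.02) / sqrt(2.0))

print(f"single-study P = {p_single:.2e}, meta-analysis P = {p:.2e}, Q = {q:.3f}")
```

In this toy case the SNP falls short of genome-wide significance (P < 5 × 10−8) in either study alone but passes it after pooling, which is the situation cross-trait meta-analysis is designed to detect. A large Q would indicate effect heterogeneity, where the FE assumption is less appropriate.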

Genomic loci characterisation

Using the outputs of our cross-trait meta-analyses for AD and each of the GIT disorders, we carried out downstream analyses, including functional annotation of SNPs and genomic loci characterisation, in line with previous studies13,55,83,84. Briefly, SNPs that were not genome-wide significant in the individual AD and GIT disorder GWAS, but which reached genome-wide significance following the meta-analysis, were identified. From these, we characterised independent SNPs at r2 < 0.6, and lead SNPs at r2 < 0.1. We defined the genomic locus as the region within 250 kb of each lead SNP. We assigned lead SNPs within this region to the same locus, meaning two or more lead SNPs may be present in one locus. We performed these downstream analyses using the Functional Mapping and Annotation (FUMA) software (an online platform)55. We subsequently queried identified loci in the GWAS catalogue (https://www.ebi.ac.uk/gwas) and Open Targets Genetics (https://genetics.opentargets.org) to assess their previous identification for AD, GIT disorders or other traits.
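The locus definition above can be sketched as a simple position-based grouping. This is a simplification under stated assumptions: FUMA also uses LD information when merging, whereas the hypothetical `group_into_loci` helper below only merges lead SNPs on the same chromosome that lie within 250 kb of the previous lead SNP. The SNP identifiers and positions are invented.

```python
def group_into_loci(lead_snps, window: int = 250_000):
    """Group lead SNPs, given as (chrom, pos, rsid), into loci by distance alone."""
    loci = []
    for chrom, pos, rsid in sorted(lead_snps):
        last = loci[-1] if loci else None
        if last and last["chrom"] == chrom and pos - last["leads"][-1][0] <= window:
            # Close enough to the previous lead SNP: same locus.
            last["leads"].append((pos, rsid))
        else:
            loci.append({"chrom": chrom, "leads": [(pos, rsid)]})
    return loci


# Hypothetical lead SNPs.
leads = [
    (1, 1_000_000, "lead_a"),
    (1, 1_200_000, "lead_b"),  # 200 kb from lead_a, merged into the same locus
    (1, 2_000_000, "lead_c"),
    (2, 1_100_000, "lead_d"),
]
loci = group_into_loci(leads)
for locus in loci:
    names = ", ".join(rsid for _, rsid in locus["leads"])
    print(f"chr{locus['chrom']}: {names}")
```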

Pairwise GWAS analysis

We performed a co-localisation analysis utilising the pairwise GWAS (GWAS-PW) method27 to further assess the regions in the genome shared by AD and GIT disorders. Briefly, GWAS-PW software implements the Bayesian pleiotropy association test and identifies genomic regions that influence a pair of correlated traits27. We used this method to assess whether the loci reaching genome-wide significance in our GWAS meta-analyses were truly shared by AD and the GIT disorders. Also, we investigated other shared genomic regions which may not have been found in the GWAS meta-analysis. We combined the summary data for AD with the data for each of the GIT disorders and estimated the posterior probability of association (PPA) of a genomic region using the GWAS-PW software. We modelled four PPAs: (i) that a genomic region is associated with AD only (PPA-1), (ii) that a genomic region is associated with the GIT trait only (PPA-2), (iii) that a genomic region is associated with both AD and the GIT trait and the causal variant is the same (PPA-3), and (iv) that a specific genomic region is associated with both AD and the GIT trait but through separate causal variants (PPA-4)27.
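To make the four PPAs concrete, the sketch below turns regional Bayes factors into posterior probabilities by weighting each model with a prior and normalising. It assumes a null model (no association) alongside the four models and takes the priors as given; GWAS-PW estimates them from the data, which is not reproduced here. All numerical inputs are hypothetical.

```python
from math import exp, log


def posterior_probabilities(log_bfs, priors):
    """Posterior probabilities for the null model and models 1-4 in one region.

    log_bfs: natural-log Bayes factors of models 1-4 versus the null.
    priors:  prior probabilities (null, model 1, ..., model 4), each > 0.
    """
    log_terms = [log(priors[0])]
    log_terms += [log(prior) + lbf for prior, lbf in zip(priors[1:], log_bfs)]
    largest = max(log_terms)  # subtract the maximum for numerical stability
    weights = [exp(term - largest) for term in log_terms]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical region with strong support for a shared causal variant (model 3).
ppa = posterior_probabilities(
    log_bfs=[2.0, 1.0, 15.0, 3.0],
    priors=[0.90, 0.03, 0.03, 0.02, 0.02],
)
for label, value in zip(["null", "PPA-1", "PPA-2", "PPA-3", "PPA-4"], ppa):
    print(f"{label}: {value:.4f}")
```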

Causal relationship assessment

Using MR28 analysis methods, we assessed the causal association between AD and each of the GIT disorders in this study. Mimicking randomised controlled trials (RCTs), MR analysis incorporates genetics into epidemiological study designs to assess causality28. The method is based on the principle of instrumental variables and underpinned by three primary assumptions. The first is the relevance assumption, which requires that the chosen instruments are robustly associated with the exposure variable85. The second is the independence assumption, which states that the instruments must not be associated with confounders of the exposure-outcome relationship85. The last is the exclusion restriction assumption, which demands that the instruments influence the outcome only through their relationship with the exposure variable85.

In the present study, we used the two-sample MR method (https://mrcieu.github.io/TwoSampleMR/articles/introduction.html) for a bidirectional association assessment between AD and each of the GIT disorders. In the first round of analysis (AD as exposure variable), independent (r2 < 0.001) genome-wide significant SNPs (P < 5 × 10−8) associated with AD were utilised as instrumental variables (IVs) and assessed against each of the GIT disorders’ GWAS (outcome variables) analysed in this study. This analysis assesses whether genetic predisposition to AD is causally associated with any of the GIT traits included in the present study. Reversing the direction of analysis, independent SNPs robustly associated with each of the GIT disorders’ GWAS (exposure variables) were similarly utilised as IVs and assessed against AD (as the outcome variable). In this instance, we assessed the potential causal effects of GIT traits on AD.

We used the inverse variance weighted (IVW) model of MR as the primary method for causal association assessment, and for validity testing, we performed a heterogeneity test (Cochran’s Q-test), a ‘leave-one-out’ analysis, a horizontal pleiotropy check (MR-Egger intercept) and individual SNP MR analyses. Also, we used other MR analysis models including the MR-Egger, weighted median86,87, and the ‘Mendelian randomisation pleiotropy residual sum and outlier’ (MR-PRESSO)88 methods for sensitivity testing. The MR-Egger and weighted median models operate under weaker assumptions of MR and are designed to provide valid causal estimates even when horizontal pleiotropy is present in all (MR-Egger) or as much as 50% (weighted median) of selected IVs86,87. Conversely, the MR-PRESSO method can detect and correct for horizontal pleiotropy by excluding outlier IVs, thereby improving the validity of causal estimates88. All MR analyses were performed in R (4.0.2).
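The IVW estimate, Cochran's Q and the leave-one-out check can be sketched in a few lines. The analyses in this study were run in R; the Python below is only an illustration using a fixed-effect IVW with first-order weights, so its standard errors may differ from those of the R implementation. The instrument effect sizes are hypothetical.

```python
from math import sqrt


def ivw_estimate(beta_exp, beta_out, se_out):
    """Fixed-effect IVW causal estimate from per-SNP Wald ratios."""
    weights = [bx ** 2 / so ** 2 for bx, so in zip(beta_exp, se_out)]
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]  # Wald ratios
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    se = sqrt(1.0 / total)
    # Cochran's Q: weighted spread of the Wald ratios around the IVW estimate.
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    return estimate, se, q


def leave_one_out(beta_exp, beta_out, se_out):
    """IVW estimate recomputed with each instrument removed in turn."""
    estimates = []
    for skip in range(len(beta_exp)):
        keep = [i for i in range(len(beta_exp)) if i != skip]
        est, _, _ = ivw_estimate(
            [beta_exp[i] for i in keep],
            [beta_out[i] for i in keep],
            [se_out[i] for i in keep],
        )
        estimates.append(est)
    return estimates


# Hypothetical instruments: SNP-exposure and SNP-outcome effects.
beta_exp = [0.10, 0.20, 0.30]
beta_out = [0.05, 0.10, 0.15]
se_out = [0.01, 0.01, 0.01]

estimate, se, q = ivw_estimate(beta_exp, beta_out, se_out)
print(f"IVW estimate = {estimate:.3f} (SE {se:.3f}), Q = {q:.3f}")
print("leave-one-out:", [round(e, 3) for e in leave_one_out(beta_exp, beta_out, se_out)])
```

A leave-one-out estimate that shifts markedly when one SNP is dropped points to that SNP driving the result, and a Q that is large relative to its degrees of freedom (number of instruments minus one) suggests heterogeneity.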

We performed an additional assessment of the causal or partial causal association between AD and each of the GIT disorders using the Latent Causal Variable (LCV) method29. LCV estimates the genetic causality proportion (GCP), which ranges from −1 to 1: a value close to 1 indicates a potential causal association between two traits in the forward direction, and a value close to −1 indicates one in the backward direction29. LCV corrects for heritability and genetic correlation between traits and is not limited by sample overlap29. This analysis was performed in the online platform of the Genetics of Complex Traits (CTG) virtual laboratory (https://vl.genoma.io/analyses/lcv)29,89.

Gene-based association analysis

We performed gene-based association analyses to identify genome-wide significant genes shared by both AD and each of the GIT disorders assessed in this study. This analysis complements the SNP-based studies. However, beyond the SNP level, gene-based association analysis provides greater power for identifying genetic risk variants since it aggregates the effects of multiple SNPs, and it is generally not limited by small effect sizes or correlations among SNPs. Moreover, genes are more closely related to biology than SNPs, meaning gene-level analysis can provide better insights into the underlying biological mechanisms of complex traits.

In the present study, we carried out gene-based association analysis separately for AD and GERD using the multi-marker analysis of genomic annotation (MAGMA) software, implemented in the FUMA (https://fuma.ctglab.nl/)55 platform. We set the gene boundary window to ±0 kb (that is, no extension beyond the gene), and to ensure that equivalent gene-based tests were performed, we restricted the analysis of each trait to the SNPs overlapping the AD and GERD GWAS. Following a similar procedure, we also performed gene-based analysis using SNPs overlapping AD and PGM GWAS.

Based on the results of the gene-based analysis, we identified genome-wide significant genes for each of the traits—AD, GERD and PGM—at an adjusted P value of 2.64 × 10−6 (0.05/18929: Bonferroni adjustment for testing 18,929 genes). Further, to identify genes shared by AD and each of GERD and PGM, we extracted their overlapping genes at gene P value <0.1 (Pgene < 0.1). We combined the respective P values for AD and the GIT traits using Fisher’s Combined P-value (FCP) method and thereafter identified shared genes reaching genome-wide significance for AD and each of GERD and PGM in the FCP analyses.
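The gene-level filtering and combination steps above can be sketched as follows. For two P values, Fisher's statistic −2(ln p1 + ln p2) follows a chi-square distribution with 4 degrees of freedom under the null, whose tail probability has the closed form p1·p2·(1 − ln(p1·p2)); the sketch uses that identity. The gene names and P values are hypothetical.

```python
from math import exp, log

N_GENES = 18_929
GENOME_WIDE_THRESHOLD = 0.05 / N_GENES  # Bonferroni threshold from the text


def fisher_combined_p(p1: float, p2: float) -> float:
    """Fisher's combined P value for two independent P values."""
    if not (0.0 < p1 <= 1.0 and 0.0 < p2 <= 1.0):
        raise ValueError("P values must lie in (0, 1]")
    log_product = log(p1) + log(p2)
    # Chi-square (4 df) tail probability of -2 * log_product.
    return exp(log_product) * (1.0 - log_product)


def shared_genes(gene_p_ad, gene_p_git, candidate_cutoff: float = 0.1):
    """Genes with P < cutoff in both traits whose FCP is genome-wide significant."""
    shared = {}
    for gene in gene_p_ad.keys() & gene_p_git.keys():
        p_ad, p_git = gene_p_ad[gene], gene_p_git[gene]
        if p_ad < candidate_cutoff and p_git < candidate_cutoff:
            fcp = fisher_combined_p(p_ad, p_git)
            if fcp < GENOME_WIDE_THRESHOLD:
                shared[gene] = fcp
    return shared


# Hypothetical gene-based P values.
p_ad = {"GENE_A": 1e-4, "GENE_B": 0.05, "GENE_C": 1e-12}
p_git = {"GENE_A": 1e-4, "GENE_B": 0.05, "GENE_C": 0.5}

print(f"threshold = {GENOME_WIDE_THRESHOLD:.2e}")
print(shared_genes(p_ad, p_git))
```

Here GENE_C is excluded despite its strong AD signal because its GIT P value exceeds the candidate cutoff, so a gene cannot be labelled as shared on the strength of one trait alone.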

Pathway-based functional enrichment analysis

For a better understanding of the potential biological mechanisms underlying AD and GIT disorders or their comorbidity, we carried out pathway-based functional enrichment analyses using the online platform of the g:GOst tool in the g:Profiler software56. The g:GOst tool performs analysis on the list of user-inputted genes and queries relevant databases including Gene Ontology, Human Protein Atlas, WikiPathways, Human Phenotype Ontology, CORUM, Kyoto Encyclopedia of Genes and Genomes (KEGG), and Reactome. This analysis enables us to functionally interpret genes overlapping AD and GIT disorders. We included genes that were overlapping between AD and each of GERD and PGM at Pgene < 0.05 (FCP < 0.02) in this analysis, and followed established protocols90. Functional category term sizes were restricted to between 5 and 350 genes, as recommended in the protocol90. For multiple testing corrections, we applied the default g:SCS algorithm recommended in the protocol90 and reported the significantly enriched biological pathways at a multiple testing adjusted P value (Padjusted) < 0.05.
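The logic of an over-representation test with a term-size filter can be sketched as below. This is not g:GOst: it assumes a hypergeometric test and, as a loudly labelled stand-in, uses a Bonferroni correction in place of the g:SCS algorithm, which is not reproduced here. The gene identifiers and term names are hypothetical.

```python
from math import comb


def hypergeom_tail(k: int, population: int, term_size: int, query_size: int) -> float:
    """P(overlap >= k) when drawing query_size genes from the population."""
    upper = min(term_size, query_size)
    tail = sum(
        comb(term_size, i) * comb(population - term_size, query_size - i)
        for i in range(k, upper + 1)
    )
    return tail / comb(population, query_size)


def enrich(query_genes, terms, background, min_size=5, max_size=350, alpha=0.05):
    """Over-representation test per term, Bonferroni-adjusted (stand-in for g:SCS)."""
    universe = set(background)
    query = set(query_genes) & universe
    eligible = {name: set(genes) & universe for name, genes in terms.items()}
    # Term-size filter as described in the text (5 to 350 genes).
    eligible = {n: g for n, g in eligible.items() if min_size <= len(g) <= max_size}
    results = []
    for name, genes in eligible.items():
        overlap = len(query & genes)
        p = hypergeom_tail(overlap, len(universe), len(genes), len(query))
        p_adjusted = min(1.0, p * len(eligible))
        if p_adjusted < alpha:
            results.append((name, overlap, p_adjusted))
    return sorted(results, key=lambda row: row[2])


# Hypothetical background, terms and query gene list.
background = [f"g{i}" for i in range(1000)]
terms = {
    "lipid_term": [f"g{i}" for i in range(10)],
    "tiny_term": [f"g{i}" for i in range(3)],         # dropped: fewer than 5 genes
    "huge_term": [f"g{i}" for i in range(400)],       # dropped: more than 350 genes
    "unrelated_term": [f"g{i}" for i in range(500, 520)],
}
query = [f"g{i}" for i in range(8)]

results = enrich(query, terms, background)
for name, overlap, p_adjusted in results:
    print(f"{name}: overlap = {overlap}, adjusted P = {p_adjusted:.2e}")
```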

Statistics and reproducibility

We performed statistical analysis mainly in the Unix environment and the R (https://www.r-project.org/) software. Additional software including Python (https://www.python.org/), PLINK (https://www.cog-genomics.org/plink/) and online platforms (CTG virtual lab: https://vl.genoma.io/updates, g:Profiler: https://biit.cs.ut.ee/gprofiler, and FUMA: https://fuma.ctglab.nl) were utilised. Adjustment for multiple testing was carried out using the Bonferroni approach in LDSC, gene-based and meta-analyses. In g:Profiler, we applied the recommended inbuilt g:SCS algorithm for multiple testing corrections. To test the reproducibility of the AD and GIT associations, we analysed additional available GIT GWAS data.

Ethics approval and consent to participate

This study is a secondary analysis of existing GWAS summary data from public repositories, and international research consortia. Specific and relevant ethics approval for each of the data utilised is presented in the associated publications described in the section for GWAS summary data. No additional ethics approval is required for the conduct of the present study.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.