Genome-wide analysis of 53,400 people with irritable bowel syndrome highlights shared genetic pathways with mood and anxiety disorders

Irritable bowel syndrome (IBS) results from disordered brain–gut interactions. Identifying susceptibility genes could highlight the underlying pathophysiological mechanisms. We designed a digestive health questionnaire for UK Biobank and combined identified cases with IBS with independent cohorts. We conducted a genome-wide association study with 53,400 cases and 433,201 controls and replicated significant associations in a 23andMe panel (205,252 cases and 1,384,055 controls). Our study identified and confirmed six genetic susceptibility loci for IBS. Implicated genes included NCAM1, CADM2, PHF2/FAM120A, DOCK9, CKAP2/TPTE2P3 and BAG6. The first four are associated with mood and anxiety disorders, expressed in the nervous system, or both. Mirroring this, we also found strong genome-wide correlation between the risk of IBS and anxiety, neuroticism and depression (rg > 0.5). Additional analyses suggested this arises due to shared pathogenic pathways rather than, for example, anxiety causing abdominal symptoms. Implicated mechanisms require further exploration to help understand the altered brain–gut interactions underlying IBS.

Regarding comorbidities, as documented previously, the rates of appendicectomy, cholecystectomy and hysterectomy were all increased in IBS (Supplementary Table 5), as were the rates of atopic disease (Table 1). Anxiety and depression were each approximately twice as common (Table 1 and Supplementary Table 4); 34.3% of cases reported treatment for anxiety compared with 16.1% of controls. This effect was more prominent in individuals medically diagnosed with IBS.
The prevalences of functional constipation and functional diarrhea (that is, bowel disturbance without abdominal pain or discomfort; see 'Definitions of IBS cases' in the Supplementary Note) were 6.4% and 11.7%, respectively (Table 1). Somatic symptoms (PHQ-12) and treatment for anxiety or depression were less strongly associated with functional diarrhea than with IBS-D (excess OR and 95% CI = 1.24 (1.22-1.26) per PHQ-12 unit and 1.58 (1.46-1.72), respectively), with similar effects for functional constipation and IBS-C (Table 1).

Genetics.
We identified six independent IBS susceptibility loci at genome-wide significance (P < 5 × 10⁻⁸) in a discovery cohort totaling 53,400 cases and 433,201 controls (Fig. 2a and Supplementary Fig. 5). This resulted from pooling IBS cases across all case definitions to maximize power, in a meta-analysis of data from UKB (40,548 cases and 293,220 controls; Supplementary Tables 1 and 2) and the international collaborative Bellygenes initiative (12,852 cases and 139,981 controls; Methods and Supplementary Table 9). Using data from an independent panel from 23andMe (Supplementary Note), all six loci were replicated at Bonferroni significance (P < 0.0083) with the same direction of effect (Table 2). All were found on autosomal chromosomes (none on the X chromosome) and conferred modest ORs < 1.05. Three out of six loci also had reported associations with mood and anxiety disorders and related phenotypes [21][22][23][24][25].
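The cohort totals and the replication threshold quoted above can be checked with simple arithmetic. The sketch below assumes that the Bonferroni threshold is the conventional 0.05 divided by the six loci tested; the variable names are illustrative only.

```python
# Cohort sizes as reported in the text.
ukb = {"cases": 40_548, "controls": 293_220}
bellygenes = {"cases": 12_852, "controls": 139_981}

# Discovery cohort = UKB + Bellygenes.
discovery = {key: ukb[key] + bellygenes[key] for key in ukb}

# Assumption: Bonferroni correction of alpha = 0.05 across the six
# discovery loci taken forward to replication.
n_loci = 6
alpha = 0.05
bonferroni_threshold = alpha / n_loci

print(discovery)
print(f"Bonferroni threshold: {bonferroni_threshold:.4f}")
```

Under that assumption the pooled counts match the stated discovery cohort and the threshold rounds to the reported P < 0.0083.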
We undertook genetic fine-mapping to establish plausible causal variants (Supplementary Fig. 5) and used several techniques to identify candidate causal genes within IBS risk loci (Supplementary Table 10; see 'Gene mapping' in the Supplementary Note). Among the genes implicated (Table 2) were two encoding neural adhesion molecules: neural cell adhesion molecule 1 (NCAM1) and cell adhesion molecule 2 (CADM2). When tissues were ranked according to enrichment for risk gene expression (Supplementary Fig. 6), the brain came top of the list (LDSC applied to specifically expressed genes 26: coefficient = 8.32 × 10⁻¹⁰, s.e.m. = 4.5 × 10⁻¹⁰, P = 0.03). However, this result was not statistically significant after correcting for multiple testing, which may in part reflect lack of power due to low SNP heritability. Using expression colocalization analysis as a separate method to implicate specific gene-tissue combinations, we found evidence that the six IBS-associated variants regulate gene expression across a number of tissues, with many genes particularly expressed in the brain (Fig. 2b).
One association mapped to the major histocompatibility complex (MHC) class 3 region close to BAG cochaperone 6 (BAG6). The signal is not driven by human leukocyte antigen (HLA) alleles and is independent of known MHC associations with ulcerative colitis, celiac disease or microscopic colitis (Supplementary Fig. 7 and Supplementary Tables 11 and 12) (refs. [27][28][29][30]). It is also independent of lead variants for neuroticism at this locus (highest r² = 0.51) 23.
Eight additional loci, of which five were replicated in the 23andMe data, showed genome-wide significant association with various IBS definitions (Methods) but not with the whole discovery cohort (Supplementary Fig. 8 and Supplementary Tables 13 and 14). These require further study. The female-specific signal identified previously 14 for unprompted self-reported IBS in the UKB was also observed in our female-specific analysis of unprompted self-reported data but was not detected in female-specific analyses of any other case definitions from UKB or the Bellygenes initiative, nor replicated in the 23andMe unstratified analyses of both sexes (Supplementary Table 15), possibly suggesting that survey-specific factors play a role. Specific candidate gene associations previously reviewed in the literature 31,32 also did not show significant evidence of association after multiple testing correction (all P > 0.015).
LDSC estimated a modest but significant genome-wide SNP heritability for IBS of 5.77% (s.e.m. = 0.35%) in the discovery cohort, with no evidence of population stratification (LDSC intercept = 0.9951, s.e.m. = 0.007). This was consistent across case definitions within UKB (h² range of 5.42-7.71%), with similar values seen in the Bellygenes (h² = 3.14%, s.e.m. = 0.74%) and 23andMe cohorts (h² = 5.39%, s.e.m. = 0.02%).
Fig. 1 | Diagnostic modalities and comorbidities of IBS. a, Venn diagram of overlap between UKB IBS cases by different diagnostic modality, split by DHQ respondents and nonrespondents. The areas and numbers indicate the sample size. Most participants with current symptoms (DHQ Rome III, yellow) did not report being diagnosed with IBS either when listing medical conditions unprompted at UKB enrollment (unprompted self-report, green) or when asked specifically about a previous IBS diagnosis when completing the DHQ (DHQ self-report, blue). Conversely, many participants previously diagnosed with IBS, even those formally recorded during a hospital admission (hospital ICD-10, red), did not have symptoms sufficient for Rome III criteria IBS diagnosis at the time of their DHQ response. b, Among individuals experiencing IBS symptoms (DHQ Rome III positive), those previously diagnosed by a clinician had greater symptom severity, with an increase in the number of IBS diagnostic modalities (connected dots, middle; top: sample size is shown) being associated with an increase in symptom severity score (IBS-SSS, bottom). Distributions are colored by the number of diagnoses and the groups shown are mutually exclusive. For post-hoc statistics, see Supplementary Table 3. c, Severity of different somatic symptoms in the past three months among digestively healthy controls and IBS cases (classified as mild, moderate and severe based on IBS-SSS). Mean scores for PHQ-12 items ranked from 0 (not bothered at all) to 2 (bothered a lot) are shown. Pooled refers to all UKB cases in the discovery cohort. d, As above, for symptoms of anxiety in the last two weeks, measured using average scores for GAD-7 items ranked from 0 (never bothered) to 3 (bothered nearly every day).
IBS-C showed weak genetic correlation with functional constipation, as did IBS-D with functional diarrhea (Supplementary Fig. 9). IBS-C and IBS-D correlated with each other but there were no cross-correlations, that is, IBS-C did not correlate with functional diarrhea. Heritability for the IBS subtypes was comparable with IBS overall; IBS subtypes showed similar genetic correlation with mental health and personality traits (Supplementary Table 16).
We assessed the overlap between susceptibility to IBS and susceptibility to 751 other traits and diseases listed in the LD Hub 33. The strongest correlations in genome-wide risk were with mood and anxiety disorders and related phenotypes, including anxiety (rg = 0.58, s.e.m. = 0.10), neuroticism (rg = 0.54, s.e.m. = 0.04), depression (rg = 0.53, s.e.m. = 0.05) and insomnia (rg = 0.42, s.e.m. = 0.05) 33. That is, across the genome, alleles that predisposed to IBS tended also to predispose to mood and anxiety disorders. The correlations were consistent regardless of the mode of diagnosis of anxiety or depression (Supplementary Fig. 10) (refs. 34,35). We calculated phenotypic correlations for these traits on a comparable liability scale (Fig. 3 and Supplementary Table 16). Mostly, the phenotypic and genotypic correlations mirrored each other, although genetic correlations were often larger. Notably, other digestive diseases presenting with similar symptoms, including celiac disease (rg = 0.03, s.e.m. = 0.08, P = 0.69) and Crohn's disease (rg = 0.08, s.e.m. = 0.04, P = 0.06), were not genetically correlated with IBS.
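To give a sense of how far the strongest genetic correlations lie from zero, the sketch below applies a standard two-sided z-test (estimate divided by its standard error) to the values quoted above. It assumes a normal approximation for the estimate; because the inputs are rounded as printed, the resulting z-scores and P values are approximate and need not match any published figure exactly. The function name is illustrative.

```python
import math

def rg_z_test(rg, se):
    """Two-sided z-test of the null hypothesis rg = 0 (normal approximation)."""
    z = rg / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p

# Point estimates and standard errors as quoted in the text (rounded).
reported = {
    "anxiety": (0.58, 0.10),
    "neuroticism": (0.54, 0.04),
    "depression": (0.53, 0.05),
    "insomnia": (0.42, 0.05),
}

results = {trait: rg_z_test(rg, se) for trait, (rg, se) in reported.items()}
for trait, (z, p) in results.items():
    print(f"{trait}: z = {z:.1f}, P = {p:.1e}")
```

Each of the four estimates sits more than five standard errors from zero under this approximation.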
We also ran higher-specificity (IBS cases meeting at least 2 of the 4 UKB case definitions, 11,201 cases and 293,220 controls) and high-severity (IBS-SSS > 300, 4,296 cases and 72,356 controls) analyses in UKB. The former produced no new associations. The latter, while being more heritable (liability scale h² = 0.42, s.e.m. = 0.05, Cochran's Q = 51.7, P = 6.31 × 10⁻¹³ compared with the discovery cohort IBS), produced one association (rs9947289, P = 2.80 × 10⁻⁸) that did not replicate (P = 0.57 in the 23andMe data; Supplementary Table 13). Both of these phenotypes recapitulated the same genetic correlation with mood and anxiety disorders as found in the discovery cohort (Supplementary Fig. 11).
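The heterogeneity test quoted above can be illustrated with a two-estimate Cochran's Q, which for two independent estimates reduces to the squared difference divided by the sum of the squared standard errors, referred to a chi-squared distribution with one degree of freedom. The sketch below is an assumption about how such a comparison could be made, not a description of the exact procedure used: it plugs in the high-severity h² from this paragraph and the discovery cohort h² (5.77%, s.e.m. = 0.35%) reported earlier, which may not be on the same scale as the values actually compared.

```python
import math

def cochran_q_two(est_a, se_a, est_b, se_b):
    """Cochran's Q for two independent estimates (1 degree of freedom)."""
    q = (est_a - est_b) ** 2 / (se_a ** 2 + se_b ** 2)
    # Survival function of a chi-squared variable with 1 d.f.
    p = math.erfc(math.sqrt(q / 2))
    return q, p

# Values quoted in the text; treating them as directly comparable
# is an assumption made for illustration.
h2_severe, se_severe = 0.42, 0.05
h2_discovery, se_discovery = 0.0577, 0.0035

q, p = cochran_q_two(h2_severe, se_severe, h2_discovery, se_discovery)
print(f"Q = {q:.1f}, P = {p:.2e}")
```

With these rounded inputs Q comes out at roughly 52, in the neighborhood of the reported 51.7, and the P value is of the same order as the one reported.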
To explore the role of shared genetic risk versus direct phenotypic overlap, we compared the genome-wide association study (GWAS) results for IBS having removed participants with anxiety to the GWAS results for anxiety having removed participants with IBS (for anxiety definitions, see Supplementary Tables 17 and 18). The genetic correlation between IBS and anxiety attenuated but remained strong (rg = 0.31, s.e.m. = 0.06; Supplementary Fig. 12). We next used bidirectional Mendelian randomization 36 with an independent anxiety GWAS 37, as well as genome-wide latent variable Mendelian randomization 38, to explore directionality. Multiple models could explain our data (Supplementary Table 19) but they were best explained by shared genetic risk pathways rather than causal effects between the two traits. Similar complex causal relationships were evident between IBS and mental health and personality traits other than anxiety (Supplementary Table 19).
Digestively healthy controls and functional constipation as well as diarrhea groups are shown for reference. Gastrointestinal symptoms are captured by the IBS-SSS (range 0-500), while somatic symptoms are captured by the (modified) PHQ-12 (range 0-22). The GAD-7 score captures symptoms of anxiety (range 0-21). The single asterisk marks significant differences from the control group after adjusting for age, sex, DHQ participation and (Bonferroni) multiple testing at P < 0.05/108 (two-sided logistic regression test). Age and sex differences were not tested.

Discussion
The importance of this study lies in its scale and therefore the robustness of its genetic results. We have identified replicable genetic associations for IBS, providing new biological insights, while demonstrating that overall its heritability is modest. Two observations are particularly striking: the genetic overlap between IBS and mood and anxiety disorders and the lack of signals implicating genes expressed specifically in the gut or overlapping other intestinal disorders. Our findings suggest that, with respect to the genetically determined risk for IBS, neuronal pathways play a dominant role. Increasing abdominal symptom severity correlated with increasing PHQ-12 somatic symptom scores, particularly for the domains of tiredness, back pain, limb pain and headache (Fig. 1). Multifocal pain suggests either poor coping skills, perhaps relating to psychological comorbidity, or visceral hypersensitivity from aberrant antinociceptive mechanisms 39. By contrast, the painless bowel disorders, functional constipation and functional diarrhea, were less strongly associated with raised PHQ-12 scores or psychological comorbidity. IBS showed the strongest genome-wide overlap with psychological traits: anxiety, neuroticism, depression and schizophrenia (Fig. 3). GAD-7 anxiety scores correlated with IBS severity (Fig. 1) and 34.3% of cases with IBS had sought or been offered treatment for anxiety versus 16.1% of controls (Table 1). Although the phenotypic correlation was strong, the genetic correlation appeared quantitatively even greater (Fig. 3). Furthermore, this genetic correlation between IBS and anxiety persisted even after eliminating data from individuals with phenotypic overlap (that is, between GWAS for 'IBS excluding anxiety' and 'anxiety excluding IBS'; Supplementary Fig. 12). Thus, their co-occurrence probably reflects shared etiologic pathways between IBS and anxiety rather than one condition simply causing the other. This conclusion was supported by the Mendelian randomization analysis.
Four out of six of the confirmed IBS loci implicated genes influencing mood or anxiety disorders, genes expressed in the nervous system or both. These include NCAM1 (also associated with neuroticism, anxiety, mood disorders and anorexia nervosa) 23,25,40 , CADM2 (also associated with neuroticism, anxiety and cannabis use) 21,22 , PHD finger protein 2 (PHF2)/family with sequence similarity 120A (FAM120A) (also associated with neuroticism, depression and autism) 23,24 and dedicator of cytokinesis 9 (DOCK9). Brain expression of NCAM1 and CADM2 was implicated in our colocalization analysis (Fig. 2b and Supplementary Table 10): both regulate neural circuit formation and influence changes in white matter microstructure found in both mood disorders and IBS 25,41,42 . PHF2 and DOCK9 also play key roles in brain development 43,44 . Of note, NCAM1, PHF2 and DOCK9 are also expressed in the rich network of nerve fibers and ganglia of the gut, while CADM2 is not 45 . Predominant brain expression, combined with the coassociation of IBS with several psychological traits, perhaps most strongly implicates the central nervous system as the site where these gene variants exert their action. However, the genetic variants may also be acting peripherally for the subset expressed in the enteric nervous system, which shares many neurotransmitters, signaling pathways and anatomical properties as well as rich communication with the brain.
The MHC signal is independent of known HLA associations with ulcerative colitis and celiac disease; in fact, it localizes to BAG6 (Supplementary Fig. 7 and Supplementary Tables 10-12). BAG6 is known to chaperone misfolded proteins, regulate membrane protein dynamics and affect diverse processes from apoptosis to antigen presentation 46,47 . Functional exploration of BAG6 may yield new IBS pathophysiological insights unconnected to the nervous system. IBS genome-wide SNP heritability was just 5.8% (s.e.m. < 0.01) in the European ancestry population in this study and the effect sizes of our susceptibility loci were modest (OR < 1.05). Earlier genetic studies of IBS were underpowered to detect such small effects. By comparison, SNP heritability estimates for Crohn's disease, ulcerative colitis and anxiety are 41%, 23% and 26%, respectively 48,49 . Previous IBS heritability estimates, from family and twin studies, varied widely at 0-57% (ref. 11 ). Our results indicate that the genetic contribution to IBS heritability is modest and imply that additional environmental factors, including dysbiosis, diet, stress and learned behaviors, all potentially shared within families, play a more prominent role.
Regarding dysbiosis, we noted increased childhood exposure to antibiotics among IBS cases (20.0%) versus controls (9.6%). While there are clearly biases inherent to recall of events from childhood, this result is corroborated by previous studies specifically set up to address this question 50 . Interestingly, we saw the same association with anxiety (18.4%). Among possible explanations, childhood antibiotics might increase the risk of IBS (and perhaps anxiety) by embedding a dysbiotic gut flora and disturbing the balance of short-chain fatty acid metabolites known to influence microglial development and mood 50,51 . Equally, anxiety in late adulthood might influence recall of childhood antibiotic exposure, and familial anxiety might lead parents to take their offspring to the doctor repeatedly for minor ailments, resulting in recurrent antibiotic exposure. While enteric infection can alter the baseline gut microbiota and trigger PI-IBS, in the UKB PI-IBS closely mirrored 'conventional' IBS in terms of symptom severity, frequency of family history and association with psychological traits, suggesting that the infectious 'seed' falls on fertile ground to trigger IBS in predisposed individuals.
One question is whether the neuronal emphasis of our results derives from our strategy of combining multiple IBS definitions to increase statistical power, including pooling 'opposite' subtypes (for example, IBS-C and IBS-D), that is, whether gut-specific effects might be lost in the pooling such that the brain remains the common link between these. However, the heritability of IBS subtypes is comparable with IBS overall; IBS-C and IBS-D share approximately 50% of their genetic susceptibility and each of the subgroups also individually genetically correlates with mental health and personality traits (Supplementary Table 16). Furthermore, subtype GWAS identified only one significant signal in IBS-C and none in IBS-D, suggesting an absence of strong subtype-specific, possibly gut-focused genetic effects. Aside from the pooling strategy, justified by our LDSC analysis (Methods), other potential weaknesses include: the use of Rome III criteria instead of the more restricted Rome IV criteria, since the former were the standard at the time of study design 52; the fact that, for nearly half of cases in the UKB cohort, the IBS diagnosis was based on Rome III symptoms reported via the DHQ rather than on medical review; and the limited age range and ancestry of UKB. However, we believe that the replication in the independent 23andMe panel of all loci identified at genome-wide significance in the discovery panel validates both the findings and the approach taken.
Table 2 | Variants associated with IBS, their effect measured in the discovery cohort and P values for association in the discovery cohort, the replication cohort and the meta-analysis of these two
Our GWAS and the results of our polygenic analyses provide important new insights. Individual loci identified by the GWAS implicate new target genes within previously under-researched pathways (for example, neuronal adhesion). Mendelian randomization and genome-wide correlation analyses demonstrate shared genetic risk pathways between anxiety and IBS that are independent of the comorbidity between these two traits. This may point toward a mechanistic rationale for the efficacy of psychoactive medications and behavioral therapies and suggest that more attention should be paid to identifying new therapeutics that target neuronal function. We anticipate that future research will build on our discoveries, both by investigating the target genes identified and exploring the shared genetic risk across traits to improve our understanding of the disordered brain-gut interactions that characterize IBS.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41588-021-00950-8.
Participants with conditions such as celiac disease, inflammatory bowel disease or previous intestinal resectional surgery that could result in IBS-like symptoms were excluded from both cases and digestively healthy controls to avoid signal contamination. For detailed case and control inclusion and exclusion criteria, see Supplementary Tables 1 and 2, respectively. To maximize sample size, cases from the 4 UKB groups were pooled (n = 40,548). This approach was supported by demonstrating high genetic correlations between them using LDSC 48 following a separate GWAS on each (minimum pairwise rg = 0.70, s.e.m. = 0.06; Supplementary Fig. 13) and by previous literature on the consistency of genetic results obtained from different diagnostic definitions in UKB 16.
We then meta-analyzed IBS GWAS data from UKB (40,548 cases and 293,220 controls) and the Bellygenes initiative (12,852 cases and 139,981 controls; Supplementary Table 9), an international collaboration studying IBS genetics based on electronic medical records, specialist diagnoses from tertiary clinics and questionnaire data (including Rome III criteria) across multiple cohorts, having again demonstrated high genetic correlation between them (rg = 0.998, s.e.m. = 0.129). This produced a total discovery cohort of 53,400 cases and 433,201 controls. Evidence of replication was sought in a large 23andMe dataset (Supplementary Note). 23andMe cases (n = 205,252) self-reported being diagnosed or treated for IBS while controls (n = 1,384,055) did not.
Analyses of IBS subtypes were conducted solely using UKB DHQ data based on standard definitions of IBS-C, IBS-D, IBS-M and IBS-U according to the frequency of hard or lumpy stools versus loose, mushy or watery stools. Functional constipation and functional diarrhea cases were identified similarly, and with the same exclusions as for IBS cases, but (in contrast to the Rome III definition of IBS) needed to have responded 'never' when asked about the frequency of abdominal pain in the last three months. Likewise, analyses of IBS severity (using the IBS-SSS) and associated somatic symptoms (using the PHQ-12) were restricted to DHQ respondents. Anxiety and depression were identified among UKB participants based on previously surveyed responses to GAD-7 anxiety and PHQ-9 depression questionnaires, self-report of diagnosis with depression or anxiety/panic attack, diagnostic codes for major depression and phobic or generalized anxiety disorder in electronic healthcare records or reporting of treatment being sought or offered for these conditions in our DHQ (Supplementary Note).

Statistical analysis.
Association between IBS and nongenetic risk factors, including risk factors assayed by recall from the DHQ, was tested using logistic regression conditioning on age and sex (Supplementary Note, 'Nongenetic associations').
Standard genetic quality control was carried out to remove samples with poor genotype quality and variants with poor genotyping or imputation performance. Only participants of European ancestry were included in the discovery dataset due to the limited number of non-European ancestry participants. GWAS were conducted using a linear mixed model (BOLT-LMM v.2.3.2) 53 to control for population stratification and relatedness. Meta-analysis of GWAS summary statistics was carried out using METAL (March 2011 release) 54 . The UKB GWAS was stratified into DHQ respondents and nonrespondents, with results meta-analyzed to avoid genetic confounding with questionnaire response (Supplementary Fig. 14).
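The text states only that stratified GWAS results were meta-analyzed with METAL; the weighting scheme is not given here. As an illustration of the general idea, the sketch below applies fixed-effect inverse-variance weighting, one common way to combine per-stratum effect estimates for a single variant. The effect sizes and standard errors are hypothetical and chosen only to show the arithmetic; the function name is illustrative.

```python
import math

def ivw_meta(estimates):
    """Fixed-effect inverse-variance-weighted meta-analysis.

    `estimates` is a list of (beta, standard_error) pairs, one per stratum.
    Returns the pooled beta, its standard error, the z-score and a
    two-sided P value under a normal approximation.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p

# Hypothetical per-stratum estimates for one variant, for example
# questionnaire respondents and nonrespondents (not real data).
strata = [(0.040, 0.008), (0.030, 0.015)]

beta, se, z, p = ivw_meta(strata)
print(f"pooled beta = {beta:.4f}, s.e. = {se:.4f}, z = {z:.2f}, P = {p:.2e}")
```

The pooled estimate lies closer to the more precise stratum, and its standard error is smaller than that of either stratum alone, which is the reason for meta-analyzing strata rather than analyzing them separately.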