Cross-tissue transcriptome-wide association studies identify susceptibility genes shared between schizophrenia and inflammatory bowel disease

Genetic correlations and an increased incidence of psychiatric disorders in inflammatory-bowel disease have been reported, but shared molecular mechanisms are unknown. We performed cross-tissue and multiple-gene conditioned transcriptome-wide association studies for 23 tissues of the gut-brain-axis using genome-wide association studies data sets (total 180,592 patients) for Crohn’s disease, ulcerative colitis, primary sclerosing cholangitis, schizophrenia, bipolar disorder, major depressive disorder and attention-deficit/hyperactivity disorder. We identified NR5A2, SATB2, and PPP3CA (encoding a target for calcineurin inhibitors in refractory ulcerative colitis) as shared susceptibility genes with transcriptome-wide significance both for Crohn’s disease, ulcerative colitis and schizophrenia, largely explaining fine-mapped association signals at nearby genome-wide association study susceptibility loci. Analysis of bulk and single-cell RNA-sequencing data showed that PPP3CA expression was strongest in neurons and in enteroendocrine and Paneth-like cells of the ileum, colon, and rectum, indicating a possible link to the gut-brain-axis. PPP3CA together with three further suggestive loci can be linked to calcineurin-related signaling pathways such as NFAT activation or Wnt.

T he direct link (gut-brain-axis [GBA]) of intestinal dysfunction and inflammation with brain development and mental illness is one of the most current topics in biomedical research [1][2][3] . Gastrointestinal symptoms have been observed in patients with neurobehavioral, neurodevelopmental, and mental diseases, including attention-deficit/hyperactivity disorder (ADHD) 4 , schizophrenia (SCZ) 5 , and major depressive disorder (MDD) 6 . Vice versa, psychiatric comorbidity, including depression, anxiety, bipolar disorder (BD), and SCZ, is known to be more prevalent in inflammatory bowel disease (IBD) patients [7][8][9] . Given the polygenic nature of most psychiatric and chronic inflammatory diseases, some of the (potentially shared) phenotypic variation in disease risk is expected to be explained by expression quantitative trait loci (eQTL) active in disease-relevant tissues of the GBA 10 . A significant shared proportion of genomic correlation was detected between inflammatory (Crohn's disease [CD], ulcerative colitis [UC]), and psychiatric traits (SCZ, BD) using LD score regression (LDSC) 11 . However, it remains unclear, whether the correlation is due to a few loci or is distributed across the genome.
In this study, we performed cross-tissue transcriptome-wide association studies (TWAS) for 23 tissues of the GBA using genome-wide association study (GWAS) summary statistics from ten case-control GWAS data sets (total 180,592 patients and 290,737 nonoverlapping controls, Supplementary Data 1; Methods) of three clinically related inflammatory diseases of the gastrointestinal tract (CD, UC and primary sclerosing cholangitis (PSC) because PSC patients suffer from a highly increased frequency (62-83%) of IBD called PSC-IBD 12 ) and four diseases of the mind and brain (SCZ, MDD, BD, and ADHD) for which moderate to strong genetic correlation (r g > 0.3) at the genomewide level has been described between chronic inflammatory diseases 13 and psychiatric diseases 14 , respectively. The TWAS approach can be viewed as a transcriptome-wide screening method to test for gene-disease associations from GWAS data sets 15 . By using a recently published principled cross-tissue TWAS method (UTMOST) 16 -simultaneously training expression imputation models across multiple disease-relevant tissues and performing cross-tissue gene-level conditioned association tests-in combination with GWAS/TWAS conditional analysis for cross-phenotype studies we were able to address several types of confounding factors that have previously compromised the validity of TWAS studies for complex diseases 15,17 : (1) poor TWAS prediction accuracy due to imputation from reference expression panels containing tissues not relevant to disease, (2) a bias in the number of transcriptome-wide significant gene-disease associations due to different sample sizes of reference data sets in expression imputation models 18 , (3) an excess of TWAS association signals at loci with multiple TWAS signals due to correlated expression among genes of the same locus, and (4) an inaccurate determination of association boundaries for TWAS signals based on SNP information.
The objectives of this cross-tissue, cross-phenotype TWAS were to (i) evaluate the effectiveness of cross-tissue imputation models using ten GWAS consortium studies of psychiatric and inflammatory disease; (ii) identify transcriptome-wide significant genedisease associations in GBA tissues for each disease and localize the most strongly associated genes through cross-tissue and multiplegene-conditioned fine-mapping analyses; (iii) quantify the overall component of (shared) genetically regulated expressional changes (ρ E ) for immune-mediated and psychiatric diseases in relation to genome-wide genetic correlation (ρ G ); (iv) identify comorbidities and causal relationships using population-based administrative health data from the entire Danish population and by performing Mendelian Randomization (MR) analyses; (v) identify susceptibility genes and pathways shared between psychiatric disorders and inflammatory diseases of the gastrointestinal tract using bulk RNA-Seq data of different developmental stages and single-cell RNAsequencing (scRNA-seq) expression data for tissues of the GBA.

Results
Cross-tissue transcriptome-wide association studies (TWAS) for seven diseases of the GBA. The UTMOST framework was used to perform cross-tissue TWAS for 17,290 genes and 23 selected disease-relevant tissues, which had been identified as disease-relevant for diseases of the GBA in a previous study of Finucane and colleagues 19 (Methods), using transcriptome-wide and genome-wide reference data of 4043 samples provided by consortia projects GTEx 20 , STARNET 21 , and BLUEPRINT 22 ( Fig. 1 and Supplementary Data 2). A schematic overview of the study workflow is shown in Supplementary Fig. 1. Gene expression imputation models were used using GWAS summary statistics data of ten consortium meta-analysis studies comprising 180,592 independent cases of European ancestry (CD: 21 23 . The cross-tissue GBJ gene test quantifies gene-disease associations across all tissues of the GBA (separately for each disease), adjusts the correlation between TWAS association statistics for individual tissues, and also provides (compared with standard univariate meta-analysis approaches) a substantial increase in statistical power in situations where eQTL effects are not unique to a single tissue (Methods). The GBJ test identified a total of 586 (277), 345 (179), 214 (79), 361 (104), 38 (21), 77 (10), and 42 (8) gene-disease associations (number of loci with size of ±1 Mb) with transcriptome-wide significance (P GBJ_marginal < 0.05/15,587 = 3.20 × 10 −6 ; Supplementary Fig. 2; corrected for a maximum of 15,587 successfully tested and GBJmeta-analyzed genes) for CD, UC, PSC, SCZ, MDD, BD, and ADHD, respectively (Supplementary Data 3). To evaluate a possible bias in the number of transcriptome-wide significant gene-disease associations due to different sample sizes of reference data sets, we further conducted TWAS analyses using expression imputation models provided by S-PrediXcan 24 and FUSION 25 (Methods). We observed on average only a moderate positive correlation between reference sample size and the number of significant genes per tissue in our cross-tissue TWAS results (average Spearman's ρ UTMOST = 0.46; Supplementary Data 4 and Supplementary Fig. 3). This suggests that cross-tissue TWAS imputation models can effectively select predictive cis eQTL variants (within ±1 Mb of the transcription start/end site of a gene) shared by multiple tissues, whereas tissue-specific eQTLs with strong effects were retained.
Cross-tissue and multiple-gene-conditioned fine-mapping analyses at loci with multiple-gene-disease association signals. P GBJ_marginal values from cross-tissue TWAS analyses (Supplementary Data 3) showed weak to moderate inflation of GBJ association statistics (λ 1000 in the range [0.94; 1.05]; Supplementary Data 5). One reason for this may be co-regulation of multiple genes by the same eQTL, as well as LD between eQTLs with nonzero prediction weights, which may lead to correlated predicted expression between nearby genes and thus multiple correlated TWAS hits per locus 15 , analogous to the situation in a GWAS study where an association signal typically spans a large number of highly significant SNPs in high LD with the causal SNP 26 . In order to distinguish between causal and marker gene-disease associations at significant TWAS loci, we performed (i) multiple-gene-conditioned single-tissue analysis (Single-tissue conditional ) followed by (ii) multiple-geneconditioned cross-tissue analysis (GBJ conditional ) (Methods) with nearby genes within range of ±1 Mb around transcriptome-wide significant genes from marginal results (genes highlighted as red dots in the Manhattan plots in Supplementary Fig. 2). This accounted for potential cross-tissue co-regulation and LD between eQTL SNPs, and furthermore, the marginal association result of each gene was conditioned based on all genes within the ±1 Mb region (avoiding specification of fixed locus boundaries based on LD blocks from genotype data). This analysis yielded a total of 215 (80), 152 (50), 150 (22), 89 (53), 9 (5), 5 (5), 10 (6) GBJ gene-disease associations (separate loci of ±1 Mb) with transcriptome-wide significance (P GBJ conditional < 3.20 × 10 −6 ) for CD, UC, PSC, SCZ, MDD, BD, ADHD, respectively (Supplementary Data 3 and Supplementary Data 6, Methods), thus representing final transcriptome-wide significant cross-tissue multiple-geneconditioned TWAS gene-disease association results (Table 1 and Comorbidity analysis and trait correlation at the genetic level and the level of predicted gene expressions. Using populationbased administrative health records from the Danish National Patient Registry (DNPR), we conducted a retrospective cohort study for the period 1994 to 2018 by examining directed diagnosis pairs of 7,191,385 people from the entire Danish population (Methods). We confirmed statistically significant comorbidity (false discovery P FDR < 0.05) between our four psychiatric and three immunological phenotypes for nine psychiatric-immunological pairs out of 42 disease pairs (Supplementary Data 8) with an increased incidence of SCZ (relative risk RR = 1.15), BP (RR = 1.28) and depression (RR = 1.52) in CD, depression in UC (RR = 1.34) and PSC (RR = 1.42), CD in SCZ (RR = 1.12), BD (RR = 1.29) and depression (RR = 1.57), UC (RR = 1.37) and PSC (RR = 1.57) in depression. We further applied pairwise genetic correlation analysis with LDSC as described in Tylee et al. 27 (Methods) to complement TWAS analyses with up-to-date values of genetic correlations (Supplementary Data 9 and Supplementary Fig. 4).
Next, to examine pairwise trait correlations at the level of predicted (genetically regulated) gene expression (ρ E ) and to compare ρ E with genome-wide genetic correlations (ρ G ), we calculated Spearman's correlations between TWAS gene effect estimates for all pairs of diseases by using the gene-wise Z-scores from the analyses Single-tissue marginal and GBJ marginal and a correlation-pruned list of 1825 co-regulation-and eQTLindependent genes previously used in Mendelian randomization studies 28 (to avoid measuring correlated predicted expression on the transcriptome-wide scale) (Methods). As an example (Singletissue marginal ; Supplementary Data 10), the strongest positive correlation across psychiatric and immune phenotypes was observed for CD4 + T-cells (ρ E = 0.11 for CD and BP and  Fig. 1 Selection of 23 tissue and cell types to perform transcriptome-wide association analyses (TWAS) for gut-brain axis (GBA) diseases. We selected 23 tissues and cell types previously found to contribute to the heritability of psychiatric or immune-mediated disease (Methods). Cross-tissue TWAS training and imputation models were used using transcriptome-wide and genome-wide reference data from 4043 reference samples from 23 tissues and cell types (Supplementary Data 2) provided by consortium projects GTEx, STARNET, and BLUEPRINT (Methods): gastrointestinal and liver tissues (n = 9), brain tissues (n = 10), whole blood (n = 1), and immune cell types (CD14 + monocytes, CD16 + neutrophils, CD4 + T-cells; n = 3). Different colors represent biological samples from different tissue classes. The size of the circles corresponds to the number of reference samples of the respective tissue. ρ E = 0.14 for CD and SCZ) ( Supplementary Fig. 5). On the crosstissue level across all 23 tissues (GBJ marginal ; Supplementary Data 10), the strongest positive correlations across psychiatric and immune phenotypes were observed for pairs UC and BP (ρ E = 0.15), CD and BP (ρ E = 0.14), UC and SCZ (ρ E = 0.13), and CD and SCZ (ρ E = 0.11) (Fig. 2b). This differs slightly from the genetic level for pairs UC and BP (ρ G = 0.07), CD and BP (ρ G = 0.11), but may explain part of the genome-wide genetic correlations (ρ G ) of UC and SCZ (ρ G = 0.13), and CD and SCZ (ρ G = 0.11), suggesting that part of the effects of genetic variants shared between SCZ and immune phenotypes are likely to be mediated by gene expression.
Mendelian randomization analysis for psychiatric and immune phenotypes. Mendelian randomization (MR) is a commonly used method to assess a causal influence of one trait (exposure) on another (outcome). To identify putative pairs of immune-mediated and psychiatric diseases that may be related in a causal manner (i.e., vertical pleiotropy 29 ), we conducted MR analysis for transcriptome-wide significant lead genes (i.e., our instruments with P conditional < 3.20 × 10 −6 and filtered for the most-significant genes from Single-tissue conditional within ±1 Mb regions) using multi-gene TWAS MR analysis (including HEIDI-outlier filtering that removes instruments with strong putative horizontal pleiotropic effects, where one gene has independent effects on multiple traits) as implemented in GSMR 30 (Methods). After Bonferroni correction for 171 pairwise tests (P GSMR < 2.92 × 10 −4 ), we identified CD (P GSMR = 9.58 × 10 −5 ; OR = 1.10, 95% CI 1.05-1.15) and UC (P GSMR = 8.40 × 10 −6 ; OR = 1.17, 95% CI 1.09-1.26) as a putative risk factor (exposure) for SCZ in tissues "Colon_-Transverse" and "Esophagus_Gastroesophageal_Junction", respectively (Supplementary Data 11). HEIDI-filtering identified CSNK1A1 and SGSM3 as potential horizontally pleiotropic genes, suggesting a pleiotropic function in both diseases.
Identification of gene-disease associations shared between psychiatric and immune phenotypes. After identifying pairs of immune-mediated and psychiatric diseases as putative correlated traits on the transcriptome-wide level, we sought to identify susceptibility genes shared between psychiatric and immune phenotypes from the list of all transcriptome-wide significant genes from both analyses Single-tissue conditional and GBJ conditional ( . In summary, these three genes met the transcriptome-wide significance threshold of 3.20 × 10 −6 in the same brain or non-brain tissue for at least one immune disease and one psychiatric disease, where at least one of the diseases must also show another transcriptome-wide significant signal for another brain or non-brain tissue (brain tissue if the first signal occurred in a non-brain tissue or vice versa), with substantial gene expression heritability estimates h med 2 (i.e., heritability mediated by the cis genetic component of gene expression levels) across the 23 tissues (h med-min 2 , h med-max 2 ) of (13.4%, 18.2%) for NR5A2, 152 [50] 150 [22] 89 [53] 9 [5] 5 [5] 10 [6] n unique genes in total from single tissues

(4)
For the selection of 23 tissues, see *significant gene-disease associations in at least two of 23 tissues. **significant gene-disease associations in at least two tissues including the brain and non-brain tissues (excluding cross-tissue associations for immune and gastro/liver tissues only, respectively, see Fig. 1); in curved brackets: number of gene-disease associations outside the extended major histocompatibility complex (MHC; chr6:25-34 Mb).
(6.9%, 24.8%) for SATB2, and (5.0%, 10.9%) for PPP3CA (Table 2; Methods). Thus, these genes met the most stringent criteria for shared susceptibility genes, as they were shared between psychiatric and immune phenotypes in the same tissues and across the brain and non-brain tissues and were therefore prioritized below. We observed that an increase (decrease) in predicted expression for NR5A2 and SATB2 (PPP3CA) was associated with an increased risk of SCZ, CD, and/or UC (Table 2). In addition, four other gene candidates (INO80E, SF3B1, SGSM3, and ZC3H7B) from the secondary analysis (GBJ conditional ) met the transcriptome-wide significance threshold and showed associations across both brain and intestinal tissues, but each in different tissues for psychiatric and immune phenotypes, implying that for these four genes the most significant tissues differ between psychiatric and immune phenotypes (Supplementary . This suggests that those four genes may be involved in SCZ and CD/UC, but not necessarily in the same tissue. We will show below by gene set enrichment analysis that INO80E, SGSM3, and ZC3H7B are likely to be proxies for other gene-disease associations at the respective loci. GWAS/TWAS conditional analysis to describe the GWAS contribution to the TWAS result. To estimate the contribution of the three shared same-tissue susceptibility genes NR5A2, SATB2, and PPP3CA relative to the established GWAS signals as defined by the recent consortia fine-mapping studies for SCZ 31 and CD/UC 32,33 , we performed a joint GWAS/TWAS finemapping analysis across the 23 tissues of the GBA using the GWAS consortium summary statistics listed in Supplementary Data 1. For NR5A2, SATB2, and PPP3CA, respectively, we performed a multi-SNP-based conditional and joint association analysis using GCTA COJO 34 to condition GWAS summary statistics on independent eQTL effects of the three genes (Methods). We examined how much of the GWAS fine-mapping signal remained after the signal of the TWAS gene eQTLs was removed. We demonstrated the applicability of our approach by examining and replicating established eQTL associations of GWAS signals as reported in the fine-mapping GWAS studies for CD/UC  Increased genetically regulated expression of NR5A2 in the hypothalamus and colon transversum was associated with increased risk for SCZ, CD, and UC. We found that the TWAS association of NR5A2 at 1q32.1 in the hypothalamus and colon transversum (Fig. 3a) was driven by the nearby (but nonoverlapping) established SCZ and CD/UC GWAS signals located 104 and 654 kilobases (kb) upstream of NR5A2, respectively. This GWAS locus was not assigned a gene in the CD/UC consortium finemapping study 33 but was assigned an overlap with an epigenetic mark for immune cells (Immune_H3K4me1). Given the functional evidence for NR5A2 in attenuating inflammatory damage in murine and human intestinal organoids 35 (see also Supplementary Note 3) we propose NR5A2 as the best candidate gene for the GWAS locus 1q32.1 with a possible effect on the hypothalamic-pituitary-adrenal axis (HPA axis) 10 .
Increased genetically regulated expression of SATB2 in frontal cortex BA9 and colon sigmoideum was associated with increased risk of SCZ and UC. The TWAS association of SATB2 at 2q33.1 (Fig. 3b) is driven by the nearby (nonoverlapping) established SCZ and UC GWAS signals located 67 and 425 kilobases (kb) downstream and upstream of SATB2, respectively. Because multiple TWAS eQTLs were identified for SATB2 in the 95% credible set of the UC GWAS peak (consisting of intergenic variants between LOC101927619 and SATB2 33 ), it is convincing that the GWAS/TWAS fine-mapping analysis revealed a possible explanation of the GWAS signal by eQTLs at this locus. SATB2 as a potential susceptibility gene for UC and SCZ has already been implicated by other studies (Supplementary Note 3).
Decreased genetically regulated expression of PPP3CA in the putamen basal ganglia and colon transversum was associated with increased risk of SCZ and CD. We found that the TWAS association of PPP3CA at 4q24 (Fig. 3c) is driven by the nearby (nonoverlapping) established SCZ and CD GWAS signals located 280 and 384 kilobases (kb) upstream of PPP3CA, respectively. GWAS fine-mapping analysis 33 assigned the 95% credible set of fine-mapped SNP variants (n = 170 SNPs) to the gene BANK1, which is closest to the lead GWAS SNP variant at 4q24. Because PPP3CA encodes a subunit of calcineurin (Supplementary Note 3) that has been associated with SCZ and is also a known drug target (due to a direct physical interaction with FKBP Prolyl Isomerase 1 A (FKBP1A) 36 , also known as FKBP12) for the treatment of UC via calcineurin inhibitors, we propose PPP3CA as the best candidate gene for the GWAS association signal at 4q24.
Analysis of bulk RNA-Seq data of different developmental stages and scRNA-seq expression data for tissues of the GBA. Having tested the three gene-disease associations for plausibility at the GWAS level in the previous section, we hypothesized that high, tissue-or cell-type-specific expression would be indicative of a particular function in the tissue of interest, and that expression of a gene could also occur only in a small subset of cells of only a particular tissue or at a particular human developmental stage, which is hypothesized for SCZ 37 . We used publicly available non-diseased bulk tissue 38,39 and scRNA-Seq data sets 40,41 (Methods) to look up in which tissues and at which developmental stage of human tissues and cells the shared susceptibility genes are generally expressed, and found that the observed patterns matched the existing knowledge on the genes (Supplementary Note 3). It is also possible that an adverse event for each disease separately; for diseases listed in the "Diseases primary tissue(s)" column). These three genes are shared between psychiatric and immune phenotypes in the same tissues and across the brain and non-brain tissues (Fig. 3). Increased predicted expression (indicated by a positive Z-score) of NR5A2 and SATB2 is associated with increased risk SCZ and CD as well as SCZ and UC. Decreased Z statistic, Z-score from analysis Single-tissueconditional and/or GBJconditional for primary tissue(s), the plus/minus sign indicates increased/ decreased predicted expression of these genes to be associated with increased disease risk, TWAS in utero can interfere with normal brain development and create a brain vulnerability that, in an individual already at risk (genetic predisposition), can lead to SCZ later in life 37 . We, therefore, examined the expression of our three candidate genes in developmental data from brain 38 and intestine 39 , since it is conceivable that a susceptibility gene is expressed during fetal development but is no longer expressed in adulthood. Again, NR5A2 was almost absent in bulk brain samples from different developmental stages, whereas SATB2 was strongly upregulated specifically in the fetal forebrain during midfetal development, and PPP3CA was weakly expressed in all developmental stages of the brain 38 (Fig. 4a). In the intestine, it was the opposite: NR5A2 was strongly expressed in the ileum and colon during the fetal and juvenile stages, whereas SATB2 was weakly expressed in the colon sigmoideum and not ileum, and PPP3CA almost not at all 39 (Fig. 4a). To achieve resolution at the single-cell level, we analyzed human single-nucleus and scRNA-seq data from different cell types of the adult brain 40 and gastrointestinal tract 41 , respectively (Fig. 4b).
In these data sets, NR5A2 is expressed in the ileal progenitor, stem, and transient amplifying (TA) cells, less strongly in the colon and rectum, and not in the brain, consistent with the bulk expression data. Consistent with the bulk expression from developmental stages of the intestine (Fig. 4a), SATB2 was strongly expressed in cell types of the colon and rectum and barely expressed in brain cells. PPP3CA expression was highly enriched in enteroendocrine and Paneth-like cells of the ileum, colon, and rectum despite weak expression in the bulk RNA-Seq data of ileum or colon, which may imply a specific role of PPP3CA in immune modulation in the gut for CD and UC; in the brain, PPP3CA is also expressed in neurons (Fig. 4b). Bulk tissue and scRNA-seq results for genes INO80E SF3B1, SGSM3, and ZC3H7B are shown in Supplementary Fig. 42 and discussed in Supplementary Note 5.
Calcineurin-dependent NFAT and Wnt signaling as shared signaling pathways for IBD and SCZ. Based on the statistically unambiguous gene prioritization from multiple-gene-conditioned fine-mapping analyses, it is likely that NR5A2, SATB2, and PPP3CA are shared susceptibility genes for both IBD and SCZ.  45 , which is also part of the Wnt pathway. Recent findings also suggest that SCZ and BD are characterized by abnormal Wnt gene expression and plasma protein levels, suggesting that suggest that drugs targeting the Wnt signaling pathway may play a role in the treatment of severe mental disorders 46 . Thus, the Wnt signaling pathway and NFAT activation are the most likely candidate pathways for IBD and SCZ, where risk is mediated by gene expression.

Discussion
By performing cross-tissue and multiple-gene conditioned transcriptome-wide association studies (TWAS) for 23 tissues of the GBA for three clinically related inflammatory diseases of the gastrointestinal tract (CD, UC, and PSC) and four diseases of the mind and brain (SCZ, MDD, BD, and ADHD) we identified susceptibility genes NR5A2, SATB2, and PPP3CA shared between IBD (CD/UC) and SCZ that are (genetically) expressed both in intestinal and brain tissues. We estimated the proportion of the genetically regulated component of expression (h med-min 2 , h med-max 2 ) with respect to total gene expression, measured across a maximum of 23 different tissues, to be (13.4%, 18.2%) for NR5A2, (6.9%, 24.8%) for SATB2, and (5.0%, 10.9%) for PPP3CA. These three genes could represent a molecular link between psychiatric and inflammatory traits and may contribute to the observed genetic correlations 11,47 and increased prevalence of SCZ in CD/UC patients at the population level [7][8][9] . The correlations between BD and CD/UC are enigmatic. Tylee et al. find BD and CD/UC correlated on the genetic level, but they cite a study in which BD-UC was not significant and we find Fig. 3 The transcriptome-wide significant gene-disease association signals of NR5A2 (1q32.1), SATB2 (2q33.1), and PPP3CA (4q24) for SCZ, CD, and UC largely explain fine-mapped GWAS association signals from nearby GWAS susceptibility loci. a The gene-disease association signal of NR5A2 is driven by the nearby but nonoverlapping established SCZ and CD/UC GWAS signals located 104 and 654 kilobases (kb) upstream of NR5A2, respectively ( Table 2). Increased genetically regulated expression of NR5A2 in the hypothalamus and colon transversum was associated with increased risk for schizophrenia (SCZ), Crohn's disease (CD), and ulcerative colitis (UC). b The association signal of SATB2 is driven by the nearby but nonoverlapping established SCZ and UC GWAS signals located 67 and 424 kilobases (kb) downstream and upstream of SATB2, respectively. Increased genetically regulated expression of SATB2 in frontal cortex BA9 and colon sigmoideum is associated with increased risk for SCZ and UC. Although 95% credible sets of variants most likely to be causal (red circles; upper part) differ between SCZ and UC, the distribution pattern of TWAS gene eQTLs of SATB2 (gray asterisks [upper part] and green transparent circles [middle part]) leads to the shared gene-disease association for SATB2 in both SCZ and UC. c The gene-disease association signal of PPP3CA is driven by the nearby (nonoverlapping) established SCZ and CD GWAS signals located 280 and 383 kilobases (kb) upstream of SATB2, respectively. Decreased genetically regulated expression of PPP3CA in putamen basal ganglia and colon transversum is associated with increased risk for SCZ and CD. Raw data of this figure can be found in Supplementary Data 16 no significance, too. Contrarily, we find BD and CD/UC correlated on the genetic expression level, but no single gene overlap hits. A possible explanation is that Tylee et al used a different dataset for BD. BD is closely related to SCZ and depending on the inclusion criteria, might appear more or less similar to SCZ. Our bioinformatic cross-tissue TWAS pipeline for selected tissues of GBA addressed typical technical drawbacks of previous TWAS studies and enabled conditional analysis of multiple disease associations at TWAS loci. In our opinion, TWAS analyses for CD, UC, and SCZ had the highest statistical power to detect shared TWAS susceptibility genes, as specifically for CD, UC, and SCZ a much higher proportion of expression-mediated heritability (h med 2 /h g 2 ) has recently been estimated across GTEx tissues than for other psychiatric disorders such as BD 48 , so our TWAS analyses for BD had lower statistical power. Since h med 2 /h g 2 can vary considerably for different traits or diseases, we recommend that future TWAS studies additionally estimate h med 2 /h g 2 (e.g., with the MESC software 48 ) to assess whether a TWAS can be expected to have a low or rather high number of gene-disease associations for the disease under study. Another key outcome of our GWAS/ TWAS conditional analysis for these three genes is that the expression-disease associations largely explain the previously finemapped GWAS association signals at nearby GWAS susceptibility loci for SCZ 31 , CD, and UC 32,33 , suggesting that risk is mediated by genetically regulated expression at those loci. Together with the results of our gene set enrichment pathway analysis, we showed it is likely that genetic variation mediated by gene expression in the Wnt signaling pathway and NFAT activation by calcineurin is associated with both SCZ and IBD. One limitation is that we could only use candidate genes from six TWAS loci for pathway analysis, so future, more powerful TWAS are expected to provide a more detailed picture here. Another limiting factor of our study is that the heritability of our three (conservatively) selected genedisease associations accounts for only a small fraction of the estimated h med 2 and may present only the tip of the iceberg. We avoided further multiple testing correction for all seven traits, as this would be counterproductive for TWAS screening after Bonferroni correction for the number of all gene-disease associations (although not statistically independent) and conditional analysis for nearby genes at suggestive loci. In addition, the proportion of shared h med 2 between diseases is difficult to quantify and requires the development of bivariate (for pairs of diseases) methods to estimate the shared h med 2 .
A very interesting observation is the gene-disease association with PPP3CA. PPP3CA encodes a subunit of calcineurin (the catalytic subunit Calcineurin A). Calcineurin is a Ca 2+/ calmodulin-regulated serine/threonine protein phosphatase 49 , which activates the transcription factor NFAT (nuclear factor of activated T-cells) and thus plays a key role in the regulation of the immune response (Fig. 5). After multiple-gene-conditioned finemapping analyses at loci with multiple-gene-disease association signals, in total, 18 out of 55 genes of the NFAT signaling pathway showed a gene-disease association with either IBD or SCZ, or both ( Fig. 5; Methods). Calcineurin inhibitors (e.g., cyclosporine A and tacrolimus) have long been used as immunosuppressants after solid organ transplantation and are used as first-line treatment in PSC patients who have undergone liver transplantation 50 . Calcineurin inhibitors are also being studied for their efficacy in a number of autoimmune diseases 51 . Recent clinical trials with tacrolimus have demonstrated the efficacy and safety of tacrolimus in treatment-resistant UC 52 and ulcerative proctitis (UC confined to the rectum) 53,54 . However, knockout of calcineurin in the forebrain of mice was found to be associated with symptoms of schizophrenia-like psychosis 55 , and recent experiments in rats have shown that calcineurin inhibitor therapy may be a risk factor for the development of neurobehavioral changes 56 . In humans, treatment with calcineurin inhibitors sometimes showed an increase of neuropsychiatric side effects 57,58 and tacrolimus has been reported to cause a relapse of schizophrenia 59 . Calcineurin-inhibiting immunosuppression used after solid organ transplantation have also been associated with multiple neuropsychiatric side effects 60 . Consistent with these results from clinical and functional studies, we found that decreased predicted expression of PPP3CA is associated with increased risk for SCZ (Table 2).
Our analysis of bulk RNA-Seq and scRNA-Seq data showed that PPP3CA expression is mainly restricted to enteroendocrine and Paneth-like cells of the ileum, colon, and rectum. The gut microbiota produces various metabolites (including short-chain fatty acids, secondary bile acids, and lipopolysaccharides) that modulate enteroendocrine cells and produce hormonal signals reflecting food intake, microbial composition, and epithelial integrity 61 . Paneth cells are secretory epithelial cells of the intestine that contain antibacterial proteins and alpha-defensins that are released into the intestinal lumen in response to a series of stimuli 62 , indicating a possible link to the GBA. PPP3CA is also expressed in the brain, there is evidence for a general implication in SCZ, and studies show more calcineurin immunoreactive neurons in the caudate nucleus of SCZ patients (Supplementary Note 3), which, together with the putamen where we found the TWAS gene-disease association, forms the striatum. Taken together with the literature (Supplementary Note 3), PPP3CA has probably at least two distinct functions relevant to SCZ or IBD, one in neuronal signal transduction in the striatum and the other in signal transduction for immune modulation in Paneth and enteroendocrine cells. NR5A2 (encoding human LRH-1) regulates intestinal glucocorticoid synthesis and is known to protect epithelial integrity and attenuate inflammatory damage in murine and human intestinal organoids, including those derived from IBD patients (Supplementary Note 3). Although NR5A2 appears to be nearly absent from the brain and blood based on the present data, there is evidence of its function in brain tissues (Supplementary Note 3), so we suspect that NR5A2 expression is restricted to less-studied regions of the brain and that it has separate functions in the gut and brain. SATB2 is a promising candidate for a GBA susceptibility gene because it is predominantly expressed in the colon sigmoideum and prefrontal cortex, consistent with our TWAS results in which SATB2 expression is associated with SCZ and UC but not with CD. SATB2 is also a prognostic marker in UC-associated colorectal cancer (Supplementary Note 3). In summary, gene expression analyses for NR5A2, SATB2, and PPP3CA using intestinal and brain tissue data from different developmental stages and from scRNAseq studies support the findings from previous human, mouse, and organoid studies and suggest a pleiotropic role for these genes as part of the GBA.

Methods
Consortium GWAS summary statistics. Crohn's disease (CD) and ulcerative colitis (UC) case and control cohorts (Supplementary Data 1) from 15 countries Fig. 4 Analysis of developmental and single-cell RNA-seq data suggest a cell-specific pleiotropic role of NR5A2, SATB2, and PPP3CA in SCZ, CD, and UC and showed that PPP3CA expression was strongest in neurons and enteroendocrine and Paneth-like cells of the ileum, colon, and rectum, indicating a possible link to the GBA. Visualizations of bulk expression of the candidate genes can be found in Supplementary Fig. 42. a Brain expression data from humans of different ages 38 confirm GTeX and BLUEPRINT tissue data and show that NR5A2 is almost absent in brain samples. SATB2 is strongly upregulated in the fetal forebrain in midfetal development. This suggests a developmental function that may predispose to disease when dysregulated. PPP3CA expression is strongest in the forebrain and increases with age. Expression in the liver is given for reference. PC, post conception. Data on expression in the developmental gut 39 show that NR5A2 is strongly expressed in the ileum and less so in the colon, whereas SATB2 is more abundant in the colon than in the ileum. This is consistent with the observation that SATB2 is significant for UC but not for CD (Table 2). PPP3CA expression is not evident from bulk RNA-Seq data of intestinal tissue. b NR5A2 is expressed in ileal progenitor, stem cells, and transient amplifying (TA) cells, less strongly in colon and rectum, and not in the brain, consistent with bulk RNA-seq results. SATB2 is strongly expressed in all cell types of the colon and rectum. PPP3CA, despite weak expression in the ileum or colon, is enriched in enteroendocrine and Paneth-like cells of the colon and rectum, supporting a role in immunomodulation. In the brain, PPP3CA is enriched in neurons. Overall, these expression data indicate that separate functions are likely in the gut and brain-based on specific expression patterns. NR5A2 expression was not found in the brain, but literature results make brain-specific function highly likely (Supplementary Note 3). Raw data of this figure can be found in Supplementary Data 17. (exPFC glutamatergic neurons from the PFC, exCA1/3 pyramidal neurons from the Hip CA region, GABA GABAergic interneurons, exDG granule neurons from the Hip dentate gyrus region, ASC astrocytes, NSC neuronal stem cells, MG microglia, ODC oligodendrocytes, OPC oligodendrocyte precursor cells, NSC neuronal stem cells, SMC smooth muscle cells, END endothelial cells).
across Europe, North America, and Australia have previously been described, quality controlled (QCed), and genotype imputed 13,32,63 . Primary sclerosing cholangitis (PSC) case and control cohorts from 14 countries in Europe and North America and have previously been described, QCed and imputed 13,64 . The number of overlapping (duplicate) GWAS samples for CD, UC, and PSC GWAS data sets were determined by identity-by-descent (IBD) analysis 13 . Because of a partial sample overlap in cases and controls between Immunochip and GWAS data sets for CD, UC, and PSC (Supplementary Data 1), the smaller sample-sized GWAS data sets for CD, UC, and PSC (data sets #2, #4, #6 in Supplementary Data 1) were used only for validation of TWAS association results from the TWAS discovery phase (see text below). Schizophrenia (SCZ), major depressive disorder (MDD), bipolar disorder (BD), and attention-deficit/hyperactivity disorder (ADHD) imputed GWAS summary statistics were used as described elsewhere [65][66][67][68] and were downloaded from https://www.med.unc.edu/pgc/download-results/.
Written, informed consent was obtained from all study participants, and the institutional ethical review committees of the participating centers approved all protocols.
Selection of tissues for TWAS of the GBA. Recently, a heritability enrichment analysis of specifically expressed genes across 205 tissues and cell types in combined with GWAS summary statistics of CD, UC, SCZ, MDD, BD, and ADHD and subsequent validation of enrichments results with chromatin data yielded statistically significant enrichment results for tissues of the central nervous system (CNS) for psychiatric diseases and for tissues from blood/immune cell types and the gastrointestinal tract for inflammatory bowel diseases 19 . Based on these results, we restricted our TWAS analyses using genome-wide and transcriptome-wide reference data of 4043 samples from brain, immune cell and the gastrointestinal tissues (Supplementary Data 2) available from consortia projects GTEx 20 (version V7, dbGaP accession code phs000424.v7.p2), STARNET 21 (dbGaP accession code phs001203.v1.p1), and BLUEPRINT 22 ftp://ftp.ebi.ac.uk/pub/databases/blueprint/). As described by Hu et al. 16 , biallelic SNPs with minor allele frequency (MAF) ≥0.01 were retained and normalized gene expression information was adjusted to correct for potentially confounding effects of sex, sequencing platform, and the three main principal components from principal component analysis (PCA) of the genomewide genotype data, and with an estimation of expression residuals (PEER) factors 69 .
UTMOST expression imputation models. To estimate the genetically regulated expression of each gene in the genome across 23 tissues of the GBA, we applied multivariate-response penalized regression models as implemented in UTMOST (https://github.com/Joker-Jerome/UTMOST) to predict cross-tissue gene expression from reference data sets. Briefly, a standard least-squares loss function was minimized, with an L1 Lasso penalty adaptively set based on sample sizes to select predictive SNP variables in a range of ±1 Mb around each gene and force shrinkage in effect size estimation (within-tissue effects), and further with a group-Lasso penalty set across multiple tissues on the effect sizes of each SNP to select eQTLs shared between tissues (cross-tissue effects) 16 . Such cross-tissue models for expression imputation favor eQTLs that are shared across tissues while preserving tissue-specific effects, and UTMOST's approximation procedure proposed for coordinate descent iteration handles incomplete data in the reference expression matrix.
Association testing and GBJ test. UTMOST was further used to test for genebased association at the level of gene expression regulation, taking into account cross-tissue regulatory effects. A univariate regression model was applied to test the association between predicted gene expression (i.e., the genetically regulated component of gene expression) and disease status for each tissue separately (although previously trained over several tissues at the same time). To summarize the tissue-specific results and test whether a gene is significant for at least one tissue, a generalized Berk-Jones (GBJ) test 23 was used to combine associations across individual tissues into a unified test of association. The GBJ is based on the absolute values of gene trait Z-score, allowing for directions of eQTL effects in the different tissues, and uses tissue covariance to correctly weight results across tissues and correct for multiple tissue testing. We used the Bonferroni correction on the number of genes tested to determine transcriptome-wide significance (P < 0.05/ 15,587 = 3.20 × 10 −6 ). For the CD, UC, and PSC phenotypes, it had to hold additionally that transcriptome-wide significant genes replicate with P < 0.05 in the smaller GWAS data sets #2, #4, #6 in Supplementary Data 1.
Comparison with other pre-trained single-tissue expression imputation models. The elastic net approach from S-PrediXcan 24 was used for expression imputation of 19 GTEx tissues (Supplementary Data 2) followed by gene-disease association testing using logistic regression for our ten study data sets (Supplementary Data 1). The precomputed reference based on GTEx v7 data was available at http://predictdb.org/. In addition, FUSION's 25 Bayesian linear mixed model was used for expression imputation and gene-disease association testing with precomputed weights derived from the same 19 GTEx tissues v7 (http://gusevlab.org/ projects/fusion/). The number of Single-tissue marginal significant genes (P < 3.20 x 10 −6 ) from UTMOST, FUSION, and S-PrediXcan for the 19 GTEx tissues and the 10 GWAS studies included in this study were compared with the number of reference samples used for gene expression imputation training. The number of reference samples used in expression imputation training was taken from the supplementary information or the databases of the respective publications; it is possible that the number of GTEx v7 reference samples differs for the different tools due to different pre-filtering analyses. For all tools, only the marginal, uncorrected, P values were considered for benchmark purposes. Spearman correlation and a simple linear regression model were computed separately for each of the ten studies. A paired Mann-Whitney U-test was applied to test for differences in slopes (β) between the three tools.
Cross-tissue multiple-locus-genes conditional analysis and GBJ test. We used the UTMOST multiple regression approach to perform multiple-locus-genes conditional cross-tissue analysis for each transcriptome-wide significant gene (from unconditioned analysis) in each of the ten studies separately to prioritize genedisease associations at the same locus within the TWAS locus boundaries of the Fig. 5 Calcineurin-dependent NFAT signaling as shared signaling pathway for IBD and SCZ. a 18 gene-disease associations with suggestive significance (P GBJ conditional < 0.05; i.e., after multiple-gene-conditioned fine-mapping analysis) are enriched in the "calcineurin-dependent NFAT signaling in leukocytes" pathway (comprising 55 genes; Methods), showing genetically altered gene expression for SCZ, CD, or UC compared to healthy controls. PPP3CA encodes the catalytic subunit calcineurin A; FKBP1A (also known as FKBP12) encodes a cis-trans prolyl isomerase that binds the immunosuppressant FK506 (tacrolimus). b The calcineurin inhibitor tacrolimus, used against various inflammatory diseases and as an immunosuppressant in organ-transplanted PSC-IBD patients, prevents NFAT signaling by binding to FKBP1A and causes inhibition of calcineurin. The reduced genetic expression of PPP3CA that we identified in intestinal and brain tissues causes an increased risk of SCZ and CD (Table 2). Raw data of this figure can be found in Supplementary Data 18.
±1 Mb range. The conditional analysis accounts for one of the main problems of the TWAS gene-based association test, namely the potential co-regulation of multiple genes by the same eQTLs or the presence of independent eQTLs across genes that are in linkage disequilibrium (LD) 15 . Again, the association statistics of the conditional analysis were combined across tissues using the generalized Berk-Jones score (GBJ) test described above. Again, we used the Bonferroni correction on the number of genes tested to determine transcriptome-wide significance (P < 0.05/15,587 = 3.20 × 10 −6 ).
Overlap of gene-disease associations with the boundaries of known susceptibility loci from genome-wide association studies (GWAS) for CD, UC, PSC, SCZ, MDD, BD, ADHD. For genomic position comparison, locus boundaries of established GWAS loci were adopted from the last fine-mapping GWAS studies for SCZ 31 and CD/UC 32,33 . A gene was labeled "Overlap with GWAS locus" in Table 2 if the gene had an overlap with an established GWAS locus. Gene coordinates (GRCh37, Jan 2020) were obtained using the Ensembl database and BioMart 70 .
Comorbidity analysis in the Danish National Patient Registry (DNPR). To determine significant co-occurrences (disease pairs) for the seven diseases under study, we screened an independent dataset covering ICD-10 diagnosis codes from 7,191,385 people of the entire Danish population in the period from 1994 to 2018 71 . The DNPR includes primary and secondary diagnoses coded according to the International Statistical Classification of Diseases 10th Revision (ICD-10) from all Danish hospitals. We identified patients diagnosed with one of the seven diseases, for example, CD. Subsequently, we identified a random control group of 20 controls, matched on same-sex and year of birth for each patient diagnosed with CD. Since some of the psychiatric disorders we studied have an average prevalence of more than 1% in the general population, which is especially true for depression and ADHD, we needed a sufficiently high number of controls per case to avoid a significant number of cases occurring by chance in the control group. However, the gain in statistical power, in general, decreases rapidly beyond 4 to 20 controls per case 72 . Therefore, we have selected 20 controls per case so that the calculation does not take too long and because additional controls would not result in a significant gain of statistical power.
The incidence of the six other diseases (X) was calculated for the patients with CD and the randomly matched control group and the strength of the association between the diseases was assessed by relative risk (RR) using equation 1. Here, A x is the number of CD patients also diagnosed with disease X, while A t is the total number of CD patients. Likewise, C x is the number of controls diagnosed with disease X and C t is the total number of control patients.
P values for all disease pairs were calculated using a two-sided Fisher exact test and corrected for multiple testing using the false discovery rate (FDR). This analysis is repeated for all seven diseases under study and A t , Ax , RR as well as FDR adjusted P values were reported in Supplementary Data 8.
Trait correlation at the level of GWAS summary statistics (LDSC). We applied LDSC 73 genome-wide correlation as described in Tylee et al. 11 . Shortly, we excluded the extended MHC region and filtered for HapMap3 SNPs with an INFO score of at least 0.9 and a MAF of at least 0.05. The summary stats were subsequently processed by the munge_sumstats.py program of the LDSC software package (https://github.com/bulik/ldsc), and heritabilities and genetic correlation were calculated on the liability scale using population prevalences (Supplementary Data 9).
Trait correlation at the level of predicted gene expressions. Genes are often coregulated, i.e. they either share the same eQTLs or eQTLs exist in strong LD. For this reason, we used a subset of 1825 co-regulated and eQTL-independent genes recently provided for Mendelian randomization studies 28 Fig. 2).
Mendelian randomization (MR) analysis at the level of predicted gene expressions. We performed multi-gene MR analysis at the level of predicted gene expressions using GSMR 30 (Generalized Summary-data-based Mendelian Randomization) to test for a causal association between one trait (exposure) on another (outcome) using summary-level TWAS gene association statistics. We used only standardized transcriptome-wide significant marginal UTMOST gene-disease associations as instruments (p marginal&conditional < 3.2e-6) that were likely independent of neighboring genes by enforcing a minimum distance of 1 Mb between the instrumental genes (see section Cross-tissue multiple-locus-genes conditional analysis). MR analysis was performed on our tissue-specific TWAS results. At least 10 significant, independent gene-disease associations were necessary as instruments to obtain reliable regression results 30 . Because the genetically regulated expression is (by definition) heritable and inferred from SNP-disease association effects, the MR approach from GSMR was performed with (statistically independent) TWAS genes. This satisfies the requirements of MR analysis according to Bowden et al. 74 . We further used the HEIDI ((heterogeneity in dependent instrument)-outlier test) 30 to detect and eliminate pleiotropic gene-disease associations (genetic instruments) that have apparent pleiotropic effects (horizontal pleiotropy, P HEIDI < 0.01) on both traits (exposure and outcome) under investigation.
Gene overlap analysis. To find potential genes mediating the gut-brain axis in the same or different tissues, we considered as candidates all genes that were transcriptome-wide significant for both one inflammatory and one psychiatric trait (i) in the same tissue or (ii) in the GBJ test, both marginally (P marginal < 3.20 × 10 −6 ) and conditionally (P conditional < 3.20 × 10 −6 ) significant. Option (ii) is more liberal and thus includes genes, which are significant in one tissue in a psychiatric trait but significant in an inflammatory trait in another tissue.
Gene expression heritability estimation. Because the heritability of gene expression of each gene was not given by UTMOST expression reference and training files, we extracted heritability estimates of all available tissues from our FUSION results (see section Comparison with other pre-trained single-tissue expression imputation models) by determining the lowest and the highest heritability value per gene, respectively, to give a kind of confidence range of the expected heritability. Briefly, for heritability estimation, the cis locus was defined as ±500 kb of gene boundaries. Samples were restricted to Europeans by principal components analysis (PCA). FUSION's heritability models were adjusted for 20 gene expression principal components from PCA, two genetic ancestry PCs, and local structural variation estimated from SNP array data of reference data sets 25 . Heritability estimation was performed using GCTA/GREML 75 .
GWAS/TWAS conditional analysis. If a significant TWAS disease gene is present for a given locus, we expect that the eQTLs of this gene explain a portion of the original GWAS signal seen in the input GWAS summary statistics, for example, in the presence of statistically causal protein-coding variants. To estimate the proportion of the GWAS signal explained by the eQTLs of a significant TWAS candidate gene, GWAS summary statistics were conditioned with GCTA COJO 76 , version 1.92.2) using a maximal set of noncolinear eQTL variants of a TWAS gene (of ±1 Mb flanking region around the gene) for each candidate gene of interest. The aim of this analysis was to qualitatively test whether the eQTLs of a transcriptomewide significant gene were a possible statistical explanation for the GWAS association signal. GCTA COJO uses the allele frequency, beta, standard error of the beta, the association P value, and the population size of the GWAS summary statistics as input. All eQTLs with a nonzero effect of a significant TWAS candidate gene were given to COJO using parameter --select, whose algorithm is designed to find noncolinear associated SNPs (MAF > 0.01) to subsequently condition on. The P value threshold for including disease-associated noncolinear eQTL sets was chosen 0.05; this relatively weak association significance threshold was chosen to additionally identify eQTLs from non-genome-wide significant regions. The resulting set of noncollinear eQTL SNPs was used to condition the GWAS summary data using the GCTA COJO parameter --cond via conditional linear regression analysis 34 . The resulting TWAS-conditioned GWAS summary statistic represents the minimal portion of the GWAS signal that is not covered by the eQTLs of the selected gene and thus could not have been included in the TWAS association. This may suggest that the TWAS gene, although significantly associated, may not be sufficient to fully explain the GWAS association at this locus. In summary, our complete UTMOST-and GCTA COJO-based analysis can answer three questions on the association of genetically regulated expression of a gene in a given tissue; (i) whether the eQTLs weighted from the UTMOST-trained models provide an explanation for a GWAS signal with ±1 Mb around a TWAS candidate gene (marginal UTMOST test), (ii) which TWAS gene is most likely to represent a GWAS signal given multiple associated TWAS genes (conditional UTMOST test), and (iii), whether there are SNP-disease associations in the GWAS data that cannot be explained by eQTLs in LD (COJO test), such that there may be a different eQTL gene, causal (protein-coding) variants, or epigenetic modifications at the locus that result in a GWAS association signal. Single-cell RNA-seq (scRNA-seq) data of healthy human small intestine, colon, and rectum biopsies, from two donors each, with a total of 14,537 cells were retrieved from GSE125970 41 and single-cell libraries from five male postmortem donors (three prefrontal cortex samples, four hippocampus) from the GTEx biobank (Habib et al.; 28846088) were retrieved via https://www.covid19cellatlas.org. Data were processed and log-normalized as described by Sungnak et al. 78 and visualized using the scanpy v1.6.0 package 79 (https://scanpy.readthedocs.io/en/stable/api/scanpy.pl.dotplot.html).
NFAT pathway visualization. The list of 55 genes of the term "calcineurindependent NFAT signaling role in lymphocytes" from the database BioPlanet 2019 36 was downloaded from https://maayanlab.cloud/Enrichr/#stats 80 (May 25, 2021) and used as input for GeneMania 81 with no genes added and all available physical interaction edges, pathway-based edges and colocalization edges plotted. The resulting network was exported and imported into Cytoscape 3.8.2 82 to adjust the visual style. We chose both marginal and conditional nominal significance (P GBJ conditional < 0.05) as the threshold for highlighting genes.
Reporting Summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.