Differential and spatial expression meta-analysis of genes identified in genome-wide association studies of depression

Major depressive disorder (MDD) is the most prevalent psychiatric disorder worldwide and affects individuals of all ages. It causes significant psychosocial impairments and is a major cause of disability. A recent consortium study identified 102 genetic variants and 269 genes associated with depression. To provide targets for future depression research, we prioritized these recently identified genes using expression data. We examined the differential expression of these genes in three studies that profiled gene expression of MDD cases and controls across multiple brain regions. In addition, we integrated anatomical expression information to determine which brain regions and transcriptomic cell types highly express the candidate genes. We highlight 12 of the 269 genes with the most consistent differential expression: MANEA, UBE2M, CKB, ITPR3, SPRY2, SAMD5, TMEM106B, ZC3H7B, LST1, ASXL3, ZNF184 and HSPA1A. The majority of these top genes were found to have sex-specific differential expression. We place greater emphasis on ZNF184 as it is the top gene in a more conservative analysis of the 269. Specifically, the differential expression of ZNF184 was strongest in subcortical regions in males and females. Anatomically, our results suggest the importance of the dorsal lateral geniculate nucleus, cholinergic, monoaminergic and enteric neurons. These findings provide a guide for targeted experiments to advance our understanding of the genetic underpinnings of depression.


Introduction
Major depressive disorder (MDD) is a leading cause of disability and a large contributor to morbidity and mortality, with an estimated annual prevalence affecting over 4.4% of the world's population 1 . MDD is clinically diagnosed and characterized by prolonged periods of low mood or anhedonia in addition to physical and cognitive symptoms making it a complex and heterogeneous disorder 2 . The heritability of MDD, estimated through twin studies, is 31-42%, which is considered modest 3,4 . Genome-wide association studies (GWAS) are performed to identify the common variants that increase the risk of a genetic disease. However, due to the complex nature of MDD, initial GWAS were unable to identify reproducible genetic loci, potentially suggesting that many genetic factors of small-effect contribute to the overall disease manifestation [5][6][7][8] . Moreover, genes and pathways affected differ between males and females [9][10][11][12][13][14] , which may explain some variability observed in depression phenotypes. To identify genetic variants of smaller effect, a consortium effort acquired higher power by profiling larger sample sizes. This increase was achieved by including individuals that displayed broader phenotypes of depression. Cohorts that include individuals with MDD and broader depression phenotypes were defined in the recent GWAS as depression 15 .
Howard et al. conducted the largest GWAS of depression to date (total n = 807,553) by meta-analyzing data from three previous studies of depression: Hyde et al. 16 , Howard et al. 17 and Wray et al. 18 . This large sample size resulted in the identification of 102 independent variants and 269 genes associated with depression 15 . Additionally, they found that the genes near the identified variants were expressed at higher levels in the frontal cortex and within neuronal cell types of the brain through a partitioned heritability approach using transcriptomic resources. Their results provided significant insights into the etiology of depression. However, few of the 269 genes have been studied in the context of the disorder. Furthermore, their enrichment results were based on 13 brain regions and three brain cell types. To provide additional context, we examined these genes in studies that have profiled gene expression in postmortem brain samples of MDD cases. We hypothesized that genes with greater genetic associations would be differentially expressed in these transcriptomic studies of MDD. We performed a differential expression meta-analysis to prioritize the 269 genes and tested for evidence of opposing molecular signals between males and females. In addition, we used large transcriptomic atlases to obtain a finer perspective on the specific anatomy associated with the genetic findings. Our hypothesis for this analysis was that the prefrontal cortex and neuronal cell types are more enriched for the expression of the 269 genes. Figure 1 provides an overview of these analyses. Ultimately, we sought to provide guidance for future studies of depression by narrowing genetic and anatomical targets.

Depression GWAS data
The latest GWAS of depression included 246,363 cases and identified 102 genetic variants. The included cohorts measured a broad range of phenotypes that included nerves, tension, self-reported depression and impairment, and clinically diagnosed depression. For example, the UK Biobank cohort included broad depression phenotypes and the 23andMe cohort assessed phenotypic status based on the responses provided in online surveys and that selfreported being diagnosed with depression by a professional. As the majority of the included participants did not have MDD, this was defined as a study of depression 15   Highlighted are the different brain regions sampled within each study (middle) that will help prioritize the genes (bottom). Other transcriptomic resources that were used (middle) will identify anatomical targets associated with the disease (bottom). Images are from the cited publications, Dr. David M Howard, and Wikimedia Commons (Gray's Anatomy by Henry Vandyke Carter).
summarize the variant to gene associations, Howard et al. used the MAGMA (Multi-marker Analysis of GenoMic Annotation) tool 19 . Genome-wide, MAGMA aggregated the genetic variants associated with depression to reveal the 269 of 17,842 tested genes that passed the multiple test correction threshold. Our analyses focused on these 269 depression risk genes.

MDD transcriptomic studies
MDD transcriptomic studies were selected based on the following criteria: transcriptomic profiles were obtained from human postmortem brain tissues, cases were diagnosed with MDD, results of the study included data from each sex, and the study was published within the past five years. A summary of the transcriptomic datasets used in our meta-analysis is presented in Table 1. The cases in each dataset were diagnosed with MDD through psychological autopsies that included interviews with family or individuals best-acquainted with the deceased. More information is outlined in the respective studies 13,20,21 .

Ding et al. transcriptomic analyses
Using microarray expression profiling, Ding et al. analyzed 101 human postmortem subjects (Table 1) 20 . Eight studies were conducted between the two sexes in three corticolimbic regions: four studies were performed in the subgenual anterior cingulate cortex, two in the amygdala and two in the dorsolateral prefrontal cortex. Initially, 16,689 unique genes were assayed across all studies but were further reduced. Firstly, genes were ranked based on expression level, and the lowest 20% of genes were considered non-expressed and filtered out. Then, genes were ranked based on the variation of expression and the lowest 20% were filtered out. This left Ding and colleagues with 10,680 genes. For each gene, they provided eight single-study p-values and effect sizes (one from each sexspecific study) that we used in our analyses. These statistics were calculated with a random intercept model combined with Bayesian information criterion for parameter selection by Ding and colleagues 20 .  (Table 1) and reported sexspecific transcriptional signatures of MDD using RNA sequencing. They sampled from six corticolimbic structures: the subgenual prefrontal cortex (BA25), orbitofrontal cortex (BA11), dorsolateral prefrontal cortex (BA8/9), anterior insula, nucleus accumbens and ventral subiculum 13 . Genome-wide results were provided by Labonté and colleagues and are available in our GitHub repository (24,943 genes). Of those genes, 20,386 had pvalues, and the associated log fold change values for both sexes in each brain region (12 p-values per gene), which were used in our analyses.

Ramaker et al. transcriptomic analyses
Samples from the anterior cingulate cortex, dorsolateral prefrontal cortex and the nucleus accumbens were profiled by Ramaker et al. using RNA sequencing. We used data from the controls and those with MDD for a total of 48 subjects 21 . We re-processed the metadata and raw count files obtained from GSE80655 using the BioJupies R package referencing the methods in their paper 22 . For the differential expression analysis, we included the same covariates as outlined in their paper: age, brain pH (pH), disorder (MDD), postmortem interval (PMI) and percentage of reads uniquely aligned (PRUA). Unlike the other two transcriptomic studies, Ramaker et al. did not include sex as a covariate. The normalized data was transformed to log2-counts per million using the limma's R package voom function to be linearly fitted with the full design model previously mentioned using limma's lmFit function [23][24][25] . The differentially expressed data was then calculated from the linear fit model using limma's eBayes function 24 . This resulted in 35,238 genes with the associated p-and t-values for each brain region for both sexes for downstream meta-analyses.

Differential expression statistics
We integrated differential expression statistics at the level of genes and found that most of the 269 GWAS identified genes were assayed in at least two transcriptomic studies. The Ding et al. dataset provided differential expression statistics for 155 of the 269 depression risk genes. Of the 114 genes without data, 68.4% were filtered out due to the study's filtering criteria, and the remaining 31.6% were uncharacterized in this study. Labonté et al. had complete differential expression data for 243 of the 269 genes. For the 26 missing genes, seven genes did not have p-values for both sexes across their sampled brain regions and were filtered out from our analysis. The remaining 19 genes were found to be assayed in the dataset (GSE102556), but appear to have been filtered out by the analysis pipeline of Labonté and colleagues. However, Ding et al. also filtered out 14 of the 19 genes suggesting they had low expression levels and variance. For the Ramaker et al. dataset, we re-analyzed the corresponding dataset (GSE80655), resulting in differential expression statistics for all 269 genes. Overall, differential expression statistics from all three transcriptomic studies were available for 153 of the 269 depression risk genes.

Meta-analysis
We performed study-specific meta-analyses that combined across sexes and brain regions in a single study and broader meta-analyses that joined results across studies. These meta-analyses followed one of five criteria that differ in the number of brain regions or sexes across the transcriptomic studies. For instance, the full analysis included data from all brain regions and both sexes. We also separated female from male data across all brain regions to identify sex-specific effects. The expression patterns across the cortex are relatively stable compared to the larger expression differences found across the subcortex 26 . To limit regional variability, we performed separate analyses that were restricted to cortical and subcortical samples. Select criteria were applied in our three developed models to highlight candidate genes associated with the different objectives of the models, which are further described in the sections below.
Our meta-analysis methods differed depending on the model under analysis, but all followed the same general process. First, genes were prioritized in association with MDD by performing a meta-analysis in each transcriptomic dataset. For each study, we combined the onesided p-values across the desired sex and brain regions for each gene in both directions of expression change using Fisher's combined probability test 27 . The direction with the more significant p-value was used to calculate the two-sided study-specific meta p-value and meta direction. To aggregate the three study-specific meta-analyses into one across-study meta-analysis, the one-sided study-specific p-values for each gene were combined using Fisher's method in each direction 27 . The across-study meta direction and meta p-values for each gene were calculated as described above. The Bonferroni method was used to correct for multiple testing.

First model
Our first model was the simplest, where the objective was to identify the genes that were consistently differentially expressed across the three transcriptomic datasets under the five meta-analysis criteria.

Sex-interaction model
Opposing sex-specific patterns have been previously reported in transcriptomic studies of MDD 13,14 . This model's objective was to test for genes with opposing transcriptional differences between male and female cases of MDD. To do this, we inverted each gene's direction of differential expression (multiplied by −1) for males before performing our study-specific meta-analyses. Genes were prioritized under our full, cortical and subcortical criteria.

Genome-wide ranking model
This model was designed to equally weight the per-gene statistics of each study, providing a relative assessment of the gene's significance compared to the rest of the genome. This model was applied to the results of the eight study-specific meta-analyses.
This model uses genome-wide study-specific metaanalysis results from the other two models. Howard et al. identified the 269 depression risk genes by testing 17,842 genes using MAGMA 19 . Therefore, we filtered our studyspecific results to select for those included in the 17,842 gene set. An additional step was then taken compared to the other models to convert the absolute p-value to a genome-wide relative statistic. In every study-specific meta-analysis result from each model, we calculated the proportion of genes in the genome with a smaller studyspecific meta p-value than the current gene under observation in both directions (higher and lower expression in cases). For example, a gene with a study-specific pvalue of 1.56 × 10 −6 that ranks 100th genome-wide, would be assigned an empirical p-value of 100/17842 = 0.0056. This procedure is applied for all three studies, providing a similar uniform distribution of p-values across the genome. Then, as with the other models, these p-values for each gene and direction were combined across studies using Fisher's method as described above.

Genetic and transcriptomic associations
We investigated the degree of association between our across-study meta-analyses results and the gene-based MAGMA statistics for the 269 genes using Spearman correlation. We also tested if our across-study metaanalyses statistics significantly differed for the 269 genes compared to the 17,573 genes that were not associated with depression using the Wilcoxon rank-sum test.

Neuroanatomical expression enrichment
The Allen Human Brain Atlas, a comprehensive transcriptomic atlas of the human brain, was used to characterize neuroanatomical expression patterns 28 . This atlas mapped the human brain's transcriptomic architecture from six healthy adults of five males and one female (ages . This atlas contains 3702 expression profiles of 232 distinct brain regions.
Using this atlas, we created a maximum expression map that assigns the brain region that maximally expresses each of the 269 depression risk genes. We used the probeto-gene mappings generated by the Re-Annotator software 29 . Some regions were profiled from a single donor resulting in some donor-specific bias. To reduce this bias, we filtered the brain regions that included expression data from at least four donors leaving 190 brain regions. Probe level expression values were averaged for each gene transcript across the donors in the 190 brain regions. We then filtered for the region with the greatest expression for each gene, creating our maximum expression gene-toregion mapping. We used the hypergeometric test to identify if any region was significantly enriched for maximal expression.

Cell-type taxon expression enrichment
Zeisel et al. used single-cell RNA sequencing to characterize the transcriptomic cell types within the mouse nervous system 30 . They obtained the transcriptome of 509,876 cells, which was reduced to 160,796 cells after assessing quality. These remaining cells formed 265 transcriptomic cell-type clusters, which were broadly grouped into 39 distinct cell-type taxa across the central and peripheral nervous systems.
We referenced these results to map the 269 genes to the cell-type taxon that most highly expresses it. We downloaded the study's publicly available expression matrix (level 6 taxon level 4 aggregated all cell types) loom file found at http://mousebrain.org/loomfiles_level_L6.html.
This expression matrix provides the average molecule counts for each cell-type taxon. The taxon that displayed the highest expression for each gene was selected to create our maximum expression map. The R homologene package was used to map the 269 genes to orthologous mouse genes 31 . The hypergeometric test was used to identify taxa enriched for maximal expression of the depression risk genes, and their z-scores across the 39 taxa were calculated.

Results
We prioritized the 269 depression risk genes identified in the most recent GWAS of depression. Differential expression statistics were obtained from three transcriptomic studies that examined expression in a total of 197 postmortem brains (101 MDD cases and 96 control subjects, Table 1). These studies focused on the cerebral cortex by sampling from the orbitofrontal, dorsolateral prefrontal, insular, and anterior cingulate cortices. Subcortical samples from the rostral amygdala, nucleus accumbens and the ventral subiculum were also transcriptomically profiled. Of these, the dorsolateral prefrontal and anterior cingulate were profiled in all three studies.

Full across-study meta-analysis
Beginning with the broadest prioritization perspective, we were interested in identifying the depression risk genes that were most consistently differentially expressed across all brain regions and both sexes. Our full across-study meta-analysis was a result of combining 26 p-values across the study-specific meta-analyses. In this analysis, two genes were differentially expressed: SPRY2 (p Bonf < 0.00347) with lower levels of expression and ITPR3 (p Bonf < 0.0161) with higher levels of expression in cases (Supplement Data Table S1, Fig. 2). Visualization of the differential expression statistics for SPRY2 showed overall lower expression in MDD cases, while ITPR3 was more variable across the two datasets with available data (Fig. 2). All across-study meta-analysis results are also available online as interactive spreadsheets (see Data availability).

Sex-specific across-study meta-analysis
Evidence of gender differences has been previously shown in MDD 13,14,32 . Therefore, we performed a stratified analysis to test if any depression risk genes were differentially expressed in a sex-specific manner. When restricted to male data, four genes were statistically significant: UBE2M, CKB, ITPR3, all with higher expression and HSPA1A (all p Bonf < 0.0249) had lower expression in MDD cases (Supplement Data Table S2, Fig. 2). For females, three genes were differentially expressed: SPRY2 and SAMD5 had lower levels of expression and MANEA (all p Bonf < 0.0257) displayed higher levels of expression in MDD cases (Supplement Data Table S3, Fig. 2).

Cortical and subcortical across-study meta-analysis
Cortical structures are common targets of depression research, and expression patterns across the cerebral cortex are more consistent than subcortical tissues 26,33-38 . Therefore, we restricted our analysis to cortical brain regions in both sexes by combining 18 region and sexspecific analyses. This highlighted four statistically significant genes: SAMD5, with lower levels of expression, ZC3H7B with higher levels of expression, SPRY2 with lower expression and UBE2M with higher levels of expression in MDD cases (all p Bonf < 0.0202). ZC3H7B was the only gene that was not identified in the above analyses, suggesting a stronger cortex-specific signal for this gene (Supplement Data Table S4, Fig. 2). The other three remaining genes were previously identified in the above meta-analyses.
We additionally performed a subcortical across-study analysis that combined eight region and sex-specific analyses. This analysis highlighted one gene ZNF184 (p Bonf < 0.0457), suggesting a specific subcortex signal with overall lower expression in MDD cases (Supplement Data Table S5, Fig. 2).
The cortical meta-analyses, which resulted in four differentially expressed genes in comparison to the single gene identified in the subcortical data suggest a stronger cortical signal. However, fewer regions were included in the subcortical analysis, reducing the power to detect consistent differential expression. To test this effect, we reduced the cortical meta-analyses to the same number of region and sex-specific analyses used in the subcortical analyses (eight). Of the 36 possible cortical combinations with matching power, 32 had no, or a single differentially expressed gene. This suggests our limited findings in the subcortical metaanalysis is due to less sampling of the subcortex, and expression differences are not enriched in cortical regions.

Sex-interaction across-study analyses
Previous analyses using the Ding and Labonté datasets have found that differentially expressed genes showed inverse expression differences between male and female MDD cases 13,14 . To determine if this applied to the 269 genes, we tested for opposing transcriptional changes. Using data from all assayed brain regions, we found MANEA, UBE2M, TMEM106B, CKB, LST1 and ASXL3 were differentially expressed in opposing directions between sexes (all p Bonf < 0.0235, Supplement Data Table  S6, Fig. 2). When we restrict the interaction analysis to cortical samples, the same genes were identified except LST1 and ASXL3 (Supplement Data Table S7, Fig. 2), and when restricted to subcortical brain regions, no genes were differentially expressed (Supplementary Data Table  S8, Fig. 2). Given this increased number of hits, we additionally tested if our meta-p values are lower across all 269 genes and found they are not (paired Wilcoxon rank-sum test, p = 0.24). While these results are limited, the increased number of hits from this model provides some support for previous findings of opposing gene expression signatures of MDD between males and females. Fig. 2 Heatmap visualizations of differential expression results. a Study-specific direction signed log(p-values) for the top 12 genes separated by sex and region. Cell colours range from blue to red, which represents lower and higher expression in cases compared to controls, respectively. Colour intensity represents the degree of differential expression. Missing values are marked in gray. b Corrected meta p-values for the same genes across the 8 across-study meta-analyses. Cell colours range from low (yellow) to high (purple) corrected p-values in each meta-analysis. ACC anterior cingulate cortex (two studies), DLPFC dorsal lateral prefrontal cortex, nAcc nucleus accumbens, Ins anterior insula, Sub subiculum, AMY amygdala, SI sex interaction.

Genome-wide ranking analyses
The Labonté dataset had greater influence in our across-study meta-analysis results. Specifically, ITPR3 and SPRY2 that were found in our full analysis from the first model were only significant in the full Labonté-specific meta-analysis. The full Labonté-specific meta-analysis also had the lowest p-values across the 269 genes (Labonté =1.56 × 10 −6 ; Ding = 3.3 × 10 −5 ; Ramaker = 3.38 × 10 −4 ). Labonté et al. assayed more regions, which possibly amplified donor-dependent signals when pvalues were combined. Therefore, to equalize the contributions of each study, we derived normalized ranks for each gene, relative to the rest of the genome (see "Methods"). Across the eight genome-wide meta-analyses, top genes were ZNF184 from the subcortical analysis of the first model (empirical p Bonf = 0.156), PSORS1C1 from the cortical analysis in the sex-interaction model (empirical p Bonf = 0.184) and HSPA1A in the male analysis from the first model (empirical p Bonf = 0.262) (Supplement Data Table S9-S11). The top genes from the remaining five meta-analyses had an empirical p Bonf of 1 (Supplement Data Table S12-S16). Although there is a significant loss of power, when the 269 genes are analyzed relative to the remainder of the genome ZNF184 shows the most consistent differential expression when studies are equally weighted.

Broad associations between genetic and transcriptomic results
Beyond individual gene tests, we assessed broader relationships between the genetic and differential expression results. In our 16 across-study meta-analyses, there was no correlation between the genetic and differential expression statistics (|r | < 0.04, p > 0.0598) and no significant difference between the statistics for the 269 genes and the 17,573 tested genes not associated with depression (Wilcoxon rank-sum test). Overall, a broad association between the genetic and gene expression signals was not observed.

Neuroanatomical expression enrichment
To provide a spatial perspective, we created a maximal expression map that links each depression risk gene to the brain region that most highly expresses it. To reduce donor-specific sampling biases from the Allen Human Brain Atlas, we examined 190 regions that were all assayed from at least four donors. With the exception of C7orf72, the remaining 268 genes were profiled in this Atlas. Seventy-nine brain regions maximally expressed at least one of the 268 genes. Given this large number of regions, we tested if specific brain regions were significantly enriched for maximal expression of the 268 genes than expected by chance (Supplementary Data  Table S17). The midbrain raphe nuclei had the strongest enrichment for maximal expression (p Bonf = 0.021). The six genes that were maximally expressed in this region are all members of the protocadherin alpha family (PCDHA1, PCDHA2, PCDHA3, PCDHA4, PCDHA5, PCDHA7). These genes form a cluster on chromosome 5 and have very similar sequences that can cause a single microarray probe to match several protocadherin genes 39 . This was reflected in our results where three genes (PCDHA2, PCDHA4, PCDHA7) were mapped to the same probes. After grouping these protocadherin genes together, enrichment of the midbrain raphe nuclei was no longer statistically significant; and the top brain region was replaced by the dorsal lateral geniculate nucleus of the thalamus (15 genes maximally expressed, p Bonf = 0.0806). The map showed the central glial substance maximally expressed the most genes (26 genes) but was not statistically significantly enriched (p Bonf = 1) (Supplementary Data Table S17). The combined corticolimbic structures maximally expressed 36 of the 268 genes indicating that the majority of depression associated genes are highly expressed in other brain regions. Therefore, a diverse set of regions are highly enriched for the depression risk genes.

Cell-type taxon expression enrichment
We next sought to identify cellular populations enriched for expression of the 269 depression risk genes. We created a maximum expression map of the cell-type taxon that most highly expresses each gene. Transcriptomic cell types were obtained from a clustering of cells from the mouse nervous system 30 . This maximum expression map summarizes the cell-type taxon maximally enriched for each depression risk gene. Of the 269 depression risk genes, 240 had orthologous mouse genes with expression data available. Of the 39 transcriptomic cell-type taxons, 34 had maximal expression of at least one of the risk genes. Two transcriptomic cell types were enriched for maximal expression: cholinergic and monoaminergic neurons (p Bonf = 2.26 × 10 −5 ) and enteric neurons (p Bonf = 0.00893) (Supplementary Data Table S18). In Fig. 3, expression across all 39 cell-type taxa is presented for the top differentially expressed genes. This marks the diffuse pattern of Ckb (max z-score = 2.2) and specific expression of Hspa1a and Spry2 in enteric cells (max z-score > 3.4). The enteric neuron taxon included neuronal cell clusters annotated as nitrergic and cholinergic 30 . Deeper analysis using the more granular 265 transcriptomic cell-type clusters indicated that enteric cholinergic neurons had a greater enrichment than nitrergic enteric neurons (Supplementary Data Table S19). The cholinergic and monoaminergic neuron taxon contains clusters that express various neurotransmitters and are localized to the mid-and hindbrain 30 . Within this taxon, greatest enrichment was observed in the cluster named 'afferent nuclei of cranial nerves VI-XII' followed by clusters of cholinergic and serotonergic neurons (Supplementary Data Table S19). In this mouse dataset, we again observe a diverse anatomical pattern for the 269 genes.

Predictability of Gene Expression
To assess how specific the differential expression signals are to depression, we examined the depression associated genes in the context of a large differential expression meta-analysis 40 . This meta-analysis calculated the prior probabilities for a list of genes. The higher the probability, the more likely that gene will be differentially expressed for many case-control disease studies. We included these empirical prior probabilities for the 269 genes in our result tables (Supplement Data Tables S1-S16).
For our top 12 genes, data for UBE2M was not available, and the remaining genes had empirical prior probabilities above 0.732 except for ZNF184 (0.368) and ZC3H7B (0.183). These results suggest that on an individual gene basis, differential expression of ZC3H7B and ZNF184 are specific to depression while the other nine genes may be perturbed by generic processes.

Interactive online spreadsheet
We provide all our tables as interactive online spreadsheets to promote collaborative information sharing for these 269 genes. Across-study meta-analysis results are available online as interactive spreadsheets (see Data availability). Comments are enabled, and edit access can be requested to add information as we learn more about these candidate causal genes.

Discussion
We prioritized the genes identified in the largest genetic study of depression to date by incorporating differential expression data from 197 individuals across seven unique brain regions related to reward, attention and emotion processing. We highlight 12 genes with the most consistent differential expression. Referencing transcriptomic atlases, we find that these genes are broadly expressed with some enrichment in the dorsal lateral geniculate nucleus, cholinergic, monoaminergic, and enteric neurons. Our study highlights relevant pathogenic tissues and candidate causal genes to guide future studies of depression risk factors.
Dysfunction in prefrontal cortical circuits is commonly implicated in depression pathogenesis 15,18,[41][42][43] . Furthermore, these regions primarily play a role in executive functions and emotion regulation, which are often impaired in depression [33][34][35][36][37][38]44,45 . Prior focus on the frontal cortex may have indirectly inflated its relevance to the disorder. For example, in schizophrenia, a larger number of dorsolateral prefrontal cortex associations from a transcriptome imputation analysis was driven by tissue sample size rather than the relevance of the region 7 . Howard et al. found that genes harbouring the genetic variants have specific expression enrichment in the healthy prefrontal cortex. However, in our analysis, the dorsal lateral geniculate nucleus of the thalamus was most enriched for the depression risk genes. This region that relays visual information most highly expresses CKB. In addition, MANEA, another top hit, is highly expressed in the nearby dorsolateral thalamus. Past studies have explored the association between vision impairment and depression [46][47][48][49][50][51] . Research has also identified possible sex differences related to visual perception 52 . Our lack of enrichment in the frontal cortex may be a result of our focus on the 269 genes and the finer anatomical resolution of our analyses. We suspect that experiments targeting these specific regions and genes may provide deeper insight into depression.
We provide evidence that neurons are enriched for the expression of candidate depression risk genes than expected by chance. Our findings highlighted enteric neurons, supporting previous associations between the gut microbiome and mental health 53 . Furthermore, integration of the depression GWAS results and transcriptomic data from brain and non-brain tissues found enrichment in the colon 7 . Future research should continue to explore the potential associations between the enteric nervous system and mood disorders.
Broadly, we observed no correlation between differential expression in MDD and the degree of genetic association. Similar findings were also reported in a metaanalysis of autism spectrum disorder 54 . Past consortium analysis identified 108 loci associated with schizophrenia, comparable to the 102 loci associated with depression 55 . In a transcriptomic study of schizophrenia, two genes harbouring the 108 loci were differentially expressed in the prefrontal cortex 56 . In addition, epigenetic risk scores for depression are largely independent of the polygenic genetic risk scores 57,58 . Given these previous findings of weak relationships between differential expression, methylation, and genetic hits, our number of highlighted genes is not unexpected.
Mirroring our CKB results, creatine studies have also found sex-specific signals in the context of depression. Recently, CKB was also differentially expressed in a single-nucleus study of the prefrontal cortex in MDD 59 . Creatine kinase isoenzymes, including CKB, which is specific to the brain, converts creatine to phosphocreatine to efficiently meet energy demands 60 . In rodents, creatine kinase isoenzymes are sexually dimorphic with higher activity in males than females 61 . The Human Protein Atlas indicated CKB is expressed at higher levels in male versus female tissues 62 . In MDD studies, increased creatine levels heightened depressive symptoms in male rats while females displayed antidepressant-like effects 63 . Phosphocreatine levels and depression scores were negatively correlated in the frontal lobe in adolescent females with treatment-resistant MDD 64 . Recently, a negative relationship between dietary creatine consumption and depression was found in an American sample of 22 692 adults 65 . When stratified by sex, this effect was only statistically significant in females. In support of past studies, our findings warrant further investigation of CKB activity and creatine concentrations in the context of depression.
There is a genetic correlation between depression and obesity, and shared genetic factors include Sprouty RTK Signaling Antagonist 2 (SPRY2) 15,18,66 . SPRY2 was significantly associated with body fat percentage and type 2 diabetes mellitus in large genetic studies [67][68][69] . A knockout analysis of SPRY2 found a significant increase in glucose uptake and lipid droplet accumulation in an in vitro model of human hepatocyte cells 70 . This suggests that decreased expression of SPRY2 in human hepatocytes contributes to the pathogenesis of obesity and type 2 diabetes. MDD severity in females was correlated with various measures of obesity (BMI, total body fat and visceral fat mass) 71 . Our results reflect that SPRY2 is more female-specific, with overall decreased levels of expression in cases. Additionally, SPRY2 is most highly expressed in enteric neurons suggesting an association with the gutbrain-axis. Further genetic studies may reveal the role of SPRY2 in both depression and obesity, particularly in females.
UBE2M has been associated with various cancers [72][73][74] , and dermatomyositis 75 . These illnesses predominantly affect males and commonly have overactivation of UBE2M that generally results in poorer survival [72][73][74][75] . Similarly, we show that UBE2M is a more male-specific gene with greater expression in MDD cases. Additionally, UBE2M is most highly expressed in peripheral sensory neurons, which are also affected in some cases of dermatomyositis [76][77][78][79][80][81][82] . Further studies are needed to better understand this gene in the context of both dermatomyositis and depression.
Although ITPR3 was filtered from the Ding et al. study, it remained highly prioritized with higher expression in cases, particularly males. This gene encodes a receptor protein that mediates the intracellular release of calcium 83 . In our analysis, ITPR3 was most highly expressed in the supraoptic nucleus of the hypothalamus. This region produces vasopressin, an antidiuretic hormone 84,85 . Past studies found that MDD cases have increased vasopressin plasma concentrations, which were also found to be positively correlated with psychomotor retardation [86][87][88] . Inositol and its supplementation have been studied in the context of depression with mixed results (reviewed in ref. 89 ). Additional studies are needed to assess the interrelationship between ITPR3, vasopressin, inositol, calcium and depression.
The limitations of this study are consistent with those inherent in most postmortem brain gene expression studies and must be considered when interpreting our results. By combining datasets, we sought to alleviate challenges associated with low sample size and choice of brain regions assayed. Signals of RNA quality, postmortem interval, and patient drug use may still be present despite efforts to control for these factors. Also, the differential expression signature of depression in the brain appears to be weak. When the individual publications are considered separately, two of the three did not identify differentially expressed genes after multiple test correction with Ding and colleagues identifying only nine. When examining each of the 26 studies in isolation, only two genes survive correction for 269 tests (CKB and UBE2M). Another limitation is that our use of Fisher's method assumes the independence of gene expression profiles from different regions of the same donor. As a result, the correlation between expression profiles from different regions of the same donor will probably boost signals repeated across brain regions. We performed ranked and cortical/subcortical analyses to address this, but note our analyses are biased towards expression differences that are consistent across brain regions. We also note that not all of the 269 genes had data from all three transcriptomic studies. Furthermore, the cell-type results are based on mouse rather than human data, which may not accurately translate to humans. This species difference also resulted in missing cell-type taxon assignment for some of the 269 depression risk genes without a mouse homolog. While the neuroanatomical enrichment analyses were performed on pathologically normal brains, we hope that our results will help target future cell and region-specific studies.

Conclusion
We prioritized the 269 GWAS depression risk genes and highlighted 12 that were consistently differentially expressed across three transcriptomic studies of MDD: MANEA, UBE2M, CKB, ITPR3, SPRY2, SAMD5, TMEM106B, ZC3H7B, LST1, ASXL3, ZNF184 and HSPA1A. We provide evidence of greater influence from sex compared to the brain region profiled. Our results revealed the depression risk genes are maximally expressed in various brain regions but highlight the dorsal lateral geniculate nucleus of the thalamus. In the mouse nervous system, cholinergic, monoaminergic, and enteric neurons highly express the candidate genes. Characterization of where these genes are most expressed revealed a diversity of regions, supporting depression's heterogeneous nature. Overall, our results contribute important information to guide future studies and advance our understanding of the etiology of depression.