Multi-ancestry genome-wide association study of major depression aids locus discovery, fine mapping, gene prioritization and causal inference

Most genome-wide association studies (GWAS) of major depression (MD) have been conducted in samples of European ancestry. Here we report a multi-ancestry GWAS of MD, adding data from 21 cohorts with 88,316 MD cases and 902,757 controls to previously reported data. This analysis used a range of measures to define MD and included samples of African (36% of effective sample size), East Asian (26%) and South Asian (6%) ancestry and Hispanic/Latin American participants (32%). The multi-ancestry GWAS identified 53 significantly associated novel loci. For loci from GWAS in European ancestry samples, fewer than expected were transferable to other ancestry groups. Fine mapping benefited from additional sample diversity. A transcriptome-wide association study identified 205 significantly associated novel genes. These findings suggest that, for MD, increasing ancestral and global diversity in genetic studies may be particularly important to ensure discovery of core genes and inform about transferability of findings.


Multi-ancestry genome-wide association study of major depression aids locus discovery, fine mapping, gene prioritization and causal inference
Most genome-wide association studies (GWAS) of major depression (MD) have been conducted in samples of European ancestry.Here we report a multi-ancestry GWAS of MD, adding data from 21 cohorts with 88,316 MD cases and 902,757 controls to previously reported data.This analysis used a range of measures to define MD and included samples of African (36% of effective sample size), East Asian (26%) and South Asian (6%) ancestry and Hispanic/Latin American participants (32%).The multi-ancestry GWAS identified 53 significantly associated novel loci.For loci from GWAS in European ancestry samples, fewer than expected were transferable to other ancestry groups.Fine mapping benefited from additional sample diversity.A transcriptome-wide association study identified 205 significantly associated novel genes.These findings suggest that, for MD, increasing ancestral and global diversity in genetic studies may be particularly important to ensure discovery of core genes and inform about transferability of findings.
Major depression (MD) is one of the most pressing global health challenges 1 .While genome-wide association studies (GWAS) have shown promise of uncovering biological mechanisms underlying the development of MD 2,3 , they have revealed a highly polygenic genetic architecture, characterized by variants that individually confer small risk increases 4 , probably due to the heterogeneity of MD symptoms and etiology 5 .Previous genetic research explored the impact of different outcome definitions 2,6 , sex [7][8][9] and trauma exposure [10][11][12] on heterogeneity.However, the role of ancestry and ethnicity in the genetics of MD has not yet been systematically evaluated.
So far, GWAS of MD were mostly conducted in individuals of European ancestry 2,[13][14][15] .The largest MD GWAS combined data from several studies and identified 223 independent significant single-nucleotide polymorphisms (SNPs) 15 .That study also included data from 59,600 African Americans from the Million Veteran Program (MVP) cohort.In their bi-ancestral meta-analysis, the number of significant SNPs increased to 233.Other MD GWASs were conducted in African American and Hispanic/Latin American participants with limited sample sizes, and did not find variants with statistically significant associations with MD 16,17 .
With 10,640 female Chinese participants, the CONVERGE study is the largest MD GWAS conducted outside 'Western' countries so far 3 .The study identified two genome-wide significant associations linked to mitochondrial biology and reported a genetic correlation of 0.33 with MD in European ancestry samples 18 .In line with this, our recent work demonstrated that some of the previously identified loci from GWAS conducted in samples of European ancestry are not transferable to samples of East Asian ancestry 19 .
Heterogeneity in genetic effects could impact on findings when evaluating causal effects of risk factors for MD.Previous studies in samples of European ancestry reported genetic correlations and causal relationships between MD and cardiometabolic outcomes 2,13,15,20 .Notably, our previous study indicated a contradicting direction for associations between MD and body mass index (BMI) in East Asian individuals and European ancestry individuals (positive causal effect of BMI) 19,21 .Thus, investigating causal relationships using Mendelian randomization (MR) in diverse ancestry groups and in different disease subtypes is important to ensure generalizability and to distinguish between biological and societal mechanisms underlying the relationship between a risk factor and the disease.https://doi.org/10.1038/s41588-023-01596-4and cut-offs to assign participants.However, creating such groups enabled us to look for associations that are specific to groups and to assess the transferability of previously identified loci.The studies included in the meta-analyses used the following measures to define MD: structured clinical interviews, medical healthcare records, symptoms questionnaires and self-completed surveys (Supplementary Table 1 and Supplementary Note).
The analyses included 36,818 MD cases and 161,679 controls of African ancestry, 21,980 cases and 360,956 controls of East Asian ancestry, 4,505 cases and 27,176 controls of South Asian ancestry, and 25,013 cases and 352,946 controls in the Hispanic/Latin American group (Extended Data Fig. 1 and Supplementary Figs.8-11).To account for the minor inflation found in the Hispanic/Latin American samples (λ 1,000 = 1.002 and linkage disequilibrium score regression (LDSC) intercept 1.051; Supplementary Table 2), we corrected test statistics for this analysis based on the LDSC intercept.
In the Hispanic/Latin American group, the G-allele of rs78349146 at 2q24.2 was associated with increased risk of MD (effect allele frequency (EAF) 0.04, β (regression coefficient) = 0.15, s.e.m. 0.03, P = 9.3 × 10 −9 ) (Supplementary Fig. 12).To test the role of these loci in molecular profiles, we performed colocalization for depression and multi-ancestry brain expression quantitative trait loci (eQTLs) 27 .Loci with posterior probability (PP) >90% for both traits being associated and sharing two Increasing diversity in genetic research is also important to ensure equitable health benefits 22 .In the United States, differences in presentation of MD across ethnic groups can impact on the likelihood of diagnosis 23 .Genetics optimized for European ancestry participants would primarily benefit that group of patients and could therefore further widen the disparities in diagnosis and treatment between groups.
In this Article, we used data from samples with diverse ancestries and carried out genome-wide association meta-analyses, followed by fine mapping and prioritization of target genes (Fig. 1).We assessed the transferability of genetic loci across ancestry groups.Finally, we explored bi-directional causal links between MD and cardiometa-https://doi.org/10.1038/s41588-023-01596-4different but linked variants (hypothesis (H)3) or a single causal variant (H4) were considered as colocalized.We observed significant colocalization for DPP4, RBMS1 and TANK.We tested ancestry-specific eQTLs from blood and observed RBMS1 (H3: PP (Hispanic/Latin American) 99.12%) and TANK (H3: PP (European) 97.85%; H3: PP (Hispanic/Latin American) 99.61%) at the 2q24.2locus.For the protein quantitative trait loci (pQTLs) from blood, we either did not find the genes in the cohort or the number of SNPs within the gene was too low (<20) to test for colocalization (Supplementary Table 3).
Although the lead variants at 2q24.2 and at 6q16 did not display strong evidence of association in a large published GWAS in participants of European ancestry 14 (P > 0.01), in each case there was an uncorrelated variant within 500 kb of the lead variants associated at P < 10 −6 (Supplementary Fig. 12).Hence, although the evidence does not support a shared causal variant, we cannot rule out that there is an association at the same locus, but possibly with a different causal variant in European ancestry participants.
As a sensitivity analysis, we conducted a meta-analysis for each ancestry/ethnic group for clinical depression, comprising studies in which MD was ascertained by structured clinical interviews or medical healthcare records following the International Classification of Diseases (ICD9)/10 or the Diagnostic and Statistical Manual of Mental Disorders (DSM)-IV/5 criteria for major depressive disorder (Supplementary Table 1).There were 29,389 cases and 49,999 controls of African ancestry, 7,886 cases and 14,412 controls of East Asian ancestry, 848 cases and 13,908 controls in the Hispanic/Latin American group, and 4,252 cases and 26,738 controls of South Asian ancestry (Extended Data Fig. 2 and Supplementary Figs.13-16).In the South Asian ancestry GWAS, the A allele of rs7749931 at 6q15 was associated with decreased risk of MD (effect allele: A, EAF 0.49, β = −0.15,s.e.m. 0.03, P = 4.3 × 10 −8 ) (Extended Data Fig. 2d and Supplementary Fig. 15).The variant is located downstream of STX7 (syntaxin 7).We did not observe genome-wide significant loci associated with clinical depression in any other ancestry group.

Transferability of MD associations across ancestry groups
Previous GWAS in samples of European ancestry have identified 206 loci associated with MD (Supplementary Table 4) [13][14][15] .The results for 196 of these loci were available in at least one of the ancestry/ethnic groups.We assessed whether these genetic associations are shared across different ancestry groups.Individual loci may be underpowered to demonstrate an association; therefore, we followed an approach we recently developed 28 and first estimated the number of loci we expect to see an association for when accounting for sample size (n), linkage disequilibrium (LD) and minor allele frequency (MAF).This estimate varied widely between ancestry groups, for example, we expected to detect significant associations for 65% of MD loci in the GWAS with samples of African ancestry, but only for 15% of MD loci in samples of South Asian ancestry (Fig. 2).We report the power-adjusted transferability (PAT) ratio, that is, the observed number divided by the expected number of loci.Transferability was low, with PAT ratios of 0.27 (95% confidence interval (CI) 0.19 to 0.35) in African ancestry samples, and 0.29 in both East Asian (95% CI 0.20 to 0.39) and South Asian (95% CI 0.12 to 0.46) ancestry samples.In the Hispanic/Latin American group, the PAT ratio was 0.63 (95% CI 0.55 to 0.72), notably higher than in the other groups.PAT estimates for clinical MD were close to those for broad MD, with overlapping CIs in each case (Fig. 2).We were unable to estimate PAT ratios for clinical MD in the Hispanic/Latin American group because of insufficient numbers of cases based on this definition.We also assessed the transferability of 102 loci identified in the Psychiatric Genomics Consortium-Major Depressive Disorder Working Group's (PGC-MDD) GWAS 13 and in an independent study in samples of European ancestry, the Australian Genetics of Depression Study (AGDS) 14 .The PAT ratio was 1.48, considerably higher than the cross-ancestry PAT estimates.We report evidence of transferability of individual loci (Supplementary Table 5) as well as their ancestry-specific eQTL and pQTL colocalization (Supplementary Table 6 and Supplementary Fig. 17).
In addition, we estimated trans-ancestry genetic correlations using POPCORN version 1.0 (ref.29).We only present genetic correlation estimates where the s.e.m. was less than 0.3.The sample size for the South Asian ancestry group was too small to conduct this analysis.The genetic correlations for MD between the European and the Hispanic/ Latin American, African and East Asian ancestry groups were ≥0.75.The lowest estimate was observed between the East Asian ancestry, and the Hispanic/Latin American group (r g = 0.52) (Fig. 3).

Multi-ancestry meta-analysis
We carried out a multi-ancestry meta-analysis using data from studies conducted in participants of African, East Asian and South Asian ancestry and Hispanic/Latin American samples (Supplementary Note), and combined them with previously published data for 258,364 cases and 571,252 controls of European ancestry 13,14 , yielding a total sample size of 345,389 cases and 1,469,702 controls.These analyses provided results for 22,941,580 SNPs after quality control.There was no evidence of residual population stratification (λ 1,000 = 1.001,LDSC intercept 1.019; Supplementary Table 2).We identified 190 independent genome-wide significant SNPs mapping to 169 loci that were separated from each other by at least 500 kb (Extended Data Fig. 3, Supplementary Table 7 and Supplementary Fig. 18).Fifty-three of the SNPs represent novel associations (r 2 < 0.1 and located more than ±250 kb from previously reported variants).Most of the 196 previously reported loci were associated at genome-wide significance in the multi-ancestry meta-analysis, which incorporates the discovery data for these loci (Supplementary Table 4).
As a sensitivity analysis, we also conducted a multi-ancestry meta-analysis for clinical depression.There were 57,714 cases and 110,358 controls of European ancestry under the clinical definition of MD, which were subsequently combined with the aforementioned non-European clinically ascertained studies by meta-analysis (100,089 cases and 214,415 controls in total) (Extended Data Fig. 4 and Supplementary Figs.19 and 20).This analysis identified seven genome-wide significant loci, two of which were novel (rs2085224 at 3p22.3 and rs78676209 at 5p12) (Supplementary Table 8).
We then excluded cohorts that had an extreme case-control ratio (n cases /n controls <0.25) and did not adjust for this analytically, as well as cohorts with adolescent participants.This sensitivity analysis also yielded consistent results for the 190 lead SNPs (Supplementary Fig. 21).
Finally, we re-analyzed the data using a multi-ancestry meta-analysis approach implemented in MR-MEGA, which resulted in 44 independent regions associated with MD after lambda GC correction, some of which had been missed in the main analyses due to their between-ancestry heterogeneity (Supplementary Table 9).

Multi-ancestry fine mapping
We used a multi-ancestry Bayesian fine-mapping method 30 to derive 99% credible sets for 155 loci that were associated at genome-wide significance and did not show evidence of multiple independent signals.For comparison, we also implemented single ancestry fine mapping of the same loci based on GWAS conducted in participants of European ancestry, including PGC-MDD and AGDS 13,14 .
Multi-ancestry fine mapping increased fine-mapping resolution substantially as compared with fine mapping solely based on the data from European ancestry participants.The median size of the 99% credible sets was reduced from 65.5 to 30 variants.Among the 145 loci for which we conducted fine mapping on both sample sets, 113 (77.9%) loci had a smaller 99% credible set from the multi-ancestry fine mapping, while four loci (0 from the European fine-mapping) were resolved to single putatively causal SNPs (Fig. 4 and Supplementary Table 10).For example, rs12699323, annotated as an intronic variant, is linked to expression of TMEM106B (transmembrane protein 106B).rs1806152 is a splice region variant associated with expression of the nearby gene PAUPAR (PAX6 upstream antisense RNA) on chromosome 11.At another locus, rs9564313 has been linked to expression of PCDH9 (protocadherin-9), a gene that is also highlighted in our TWAS and multi-marker analysis of genomic annotation (MAGMA) results 31,32 .

TWAS and gene prioritization
To better understand the biological mechanisms of our GWAS findings, we performed several in silico analyses to functionally annotate and prioritize the most likely causal genes.We carried out a transcriptome-wide association study (TWAS) based on the results from the multi-ancestry meta-analysis for expression in tissues relevant to MD 33 .We combined the TWAS results with functional mapping and annotation (FUMA), conventional MAGMA and HiC-MAGMA to prioritize target genes.
Forty-three genes displayed evidence of association across all four gene prioritization methods (TWAS, FUMA, MAGMA and HiC-MAGMA) and were classified as high-confidence genes (Table 1 and Supplementary Tables 11-15).These included genes repeatedly highlighted in previous studies due to their strong evidence of association and biological relevance in MD: NEGR1, DRD2, CELF4, LRFN5, TMEM161B

Fig. 3 | Genetic correlations for MD between different ancestry groups. A plot
showing the genome-wide genetic correlations between the African, European, East Asian and Hispanic/Latin American groups.The intensity of the coloring reflects the strength of the correlation.The estimated coefficients and standard errors are also shown in each cell.We only present estimates where the s.e.m. was smaller than 0.3; otherwise, the field is colored in gray.

Discussion
We present the first large-scale GWAS of MD in an ancestrally diverse sample, including data from almost 1 million participants of African, East Asian and South Asian ancestry, and Hispanic/Latin American samples.The largest previous report included 26,000 cases of African ancestry 15 .By aggregating data in ancestry-specific meta-analyses, we identified two novel loci, 2q24.2 and 6p15.In the Hispanic/Latin American group, variants at 2q24.2 were associated with MD.Most of the cases in this group were defined using symptoms questionnaires.Future studies will be required to assess whether the association of this loci with clinical MD is consistent with our estimate.While the additional association at 6q16 in the GWAS in samples of African ancestry requires further confirmation in future studies, the link with MD is biologically plausible.The lead variant was significantly associated with the expression of MCHR2 specifically in brain cortex tissue.Melanin-concentrating hormone (MCH) is a neuropeptide that is expressed in the central and peripheral nervous systems.It acts as a neurotransmitter and neuromodulator in a broad array of neuronal functions directed toward the regulation of goal-directed behavior, such as food intake, and general arousal 34 .
The diversity, in combination with the large sample size, enabled a comparison of the causal genetic architecture across ancestry groups.
We assessed to what extent the 206 previously identified loci from large European ancestry discovery GWAS were transferable to other ancestry groups.Differences in allele frequencies, linkage disequilibrium and variable sample sizes impact on power to observe associations for each group.We recently developed PAT ratios, an approach to account for all these factors by comparing observed transferable loci with what is expected for a study of a given ancestry and sample size 28 .The PAT ratios were about 30% for African, South Asian and East Asian ancestry, remarkably similar and consistently low.We previously computed PAT ratios for several other traits and found variation between traits, but the estimates for MD were at the bottom 19,28 .With a PAT ratio of 64%, the transferability of MD loci discovered in European ancestry samples was much higher for the Hispanic/Latin American group.This finding may reflect that the Hispanic/Latin American group contained many participants with a high proportion of European ancestry 35,36 .The majority of cases within this group were defined via symptom questionnaires rather than clinical MD.Hence, it may be possible that the transferability for clinical MD is even higher in this group.For African, South Asian and East Asian ancestry, the PAT ratios for clinical MD were all below 0.5 and consistent with the estimates from the main analysis, demonstrating that heterogeneity in outcome definitions does not explain the limited transferability of MD loci across ancestry groups.
To better understand mechanisms underlying individual differences in vulnerability to development of MD, we need to bridge the gap from locus discovery to the identification of target genes.Our study achieved substantial progress in this respect.Fine mapping benefitted from the additional diverse samples, with median credible sets reduced from 65.6 to 35 in size and with 32 loci resolved to ≤10 putatively causal SNPs (11 loci from the European ancestry fine mapping).
On the TWAS, the expression of 354 genes was significantly associated with MD.Out of these, 205 gene associations were novel, and 89 were overlapping with results of the largest previously published MD TWAS 15 .Furthermore, 80 genes were overlapping with associations from another, previously published, large MD TWAS with largely overlapping samples of European ancestry 33 .A number of these TWAS features, including NEGR1, ESR2 and TMEM106B, were previously also fine mapped and highlighted as putatively causal in previous post-TWAS analyses, strengthening the role of TWAS as an important tool to better understand the relationship between gene expression and MD.
Through TWAS and three other tools that incorporate the growing body of knowledge about functional annotations of the genome, we classified 43 genes as 'high confidence'.The definition admittedly remains arbitrary until the field establishes clear guidelines.Nevertheless, the high-confidence list represents an evidence-based starting point for further follow-up.It provides confirmation for several genes that have repeatedly been highlighted as being near a GWAS-associated  variant and having high biological plausibility 2,13-15,33 : NEGR1, DRD2, CELF4, LRFN5, TMEM161B and TMEM106B.Furthermore, cadherin-9 (CDH9) and protocadherins (PCDHA1, PCDHA2 and PCDHA3) were classified as high-confidence genes.Cadherins are transmembrane proteins, mediating adhesion between cells and tissues in organisms 37 .In previous studies, cadherins have been linked with MD and with other disorders involving the brain, including late-onset Alzheimer's disease, which often manifests as neuropsychiatric symptoms coupled with depression and anxiety 13,[38][39][40] .The results of our study strengthen the evidence for the involvement of cadherins and protocadherins in the etiology of MD.
Genes newly implicated in MD development in our study highlight novel pathways, pinpoint potential new drug targets and suggest opportunities for drug repurposing.NDUFAF3 encodes mitochondrial complex I assembly protein, which is the main target of the drug metformin 41 , the first-line drug for treating type 2 diabetes.Research in model organisms has provided a tentative link between metformin and a reduction in depression and anxiety 42 .Furthermore, a recent study using more than 360,000 samples from the United Kingdom Biobank (UKB) found associations between NDUFAF3 and mood instability, suggesting that energy dysregulation may play an important role in the physiology of mood instability 43 .
Previous MR studies conducted in populations of European ancestry suggested a causal relationship of higher BMI increasing the odds of depression [44][45][46] .To our knowledge, evidence of a reverse causal association (that is, MD genetic liability increases the odds of higher BMI) has not been previously reported 2 .We also observed that the genetic liability to MD was associated with higher TG levels, lower HDL cholesterol and higher LDL cholesterol levels in individuals of European ancestry, which were not significant in the only previous MR study of smaller statistical power 47 .Individuals with depression present higher levels of inflammation and are at increased risk of cardiometabolic disorders, irrespective of the age of onset 48 .The phenotypic associations between MD and cardiometabolic traits may partly reflect the genetic overlap between them 49 .However, in other ancestry groups, no significant relationship between BMI and MD was observed.Our MR analyses showed an effect of reduced TGs on increasing odds of MD in participants of East Asian ancestry.The data are presented with a β and a 95% CI.Nominally significant associations are marked with a red asterisk.Statistics have been derived using the β and standard errors for the number of variants used as IVs in each analysis, shown as N SNPs.Results are not shown for diastolic blood pressure for which there were no significant associations.*P < 0.05 (P values in order from top to bottom: 6.88 × 10 −11 , 8.22 × 10 −3 , 9.22 ×10 −3 , 7.93 × 10 −7 , 7.67 × 10 −3 , 3.43 ×'10 −3 , 0.03 and 0.01).More details can be found in Supplementary Table 16.AFR, African ancestry; EAS, East Asian ancestry; EUR, European ancestry; HIS, Hispanic/Latin American group; SAS, South Asian ancestry. https://doi.org/10.1038/s41588-023-01596-4 Therefore, we provide further evidence for an opposite direction of effect for the relationship between MD and metabolic traits in European and East Asian ancestry groups 19,21 .Instead of generalizing findings about depression risk factors across populations, further studies are needed to understand how genetic and environmental factors contribute to the complex relationships across diverse ancestry groups.
Our study has limitations.In this study, we assigned individuals into ancestry and ethnic groups.While this enabled important insights (for example, about transferability of MD loci), such categorical assignments are imprecise and some participants with admixed ancestry may still get excluded.In future research, we aim to implement different analytic strategies that are fully inclusive.
The sample size varied greatly across ancestry groups.The smallest group were individuals of South Asian ancestry.Most of the individuals included in our study live in the United States or in the United Kingdom.To characterize MD in global populations, future studies prioritizing primary data collection are needed.To contribute to this, we are currently recruiting MD patients and controls from Pakistan into the DIVERGE study 50 .However, a concerted global effort to increase diversity in genetics will be necessary to fully address the issue 22 .This also applies to the lack of other omics data and other functional databases to support downstream analyses for ancestrally diverse GWAS, such as large resources for transcriptomics or proteomics in relevant tissues 51,52 .This may have impacted on our TWAS results because the RNA sequencing data was predominantly from participants of European ancestry.
Furthermore, statistical power for discovery of genetic associations may be impacted by reduced coverage of genetic variation present in diverse ancestral groups, as well as other factors such as the reliability of outcome assessment across different groups.
Additionally, our bi-directional MR analysis tested the relationships between MD and cardiometabolic traits.When testing MD as the exposure, the results should be interpreted as the effect of MD genetic liability and not as the effect of MD itself.
This study utilized data from several existing cohorts and bioresources to achieve large sample sizes.This necessitated using different outcome definitions, covering self-administered symptom questionnaires, electronic healthcare records and structured clinical interviews.The potential advantages and disadvantages of these approaches have been extensively discussed in previous studies 2,6 .It is possible that some of the 190 genome-wide significant loci we identified are linked to a more general susceptibility to mental illness instead of being specific to MD.However, given the overlap between different psychiatric disorders 53 , such findings are nevertheless of value for our understanding of the biology and for the development of new treatments for MD.
In conclusion, in this first large-scale, multi-ancestry GWAS of MD, we demonstrated through transferability analyses that a notable proportion of MD loci are specific to samples of European ancestry.We identified novel, biologically plausible associations that were missed in European ancestry analyses and demonstrated that large, diverse samples can be important for identifying target genes and putative mechanisms.These findings suggest that for MD, a heterogeneous condition with highly complex etiology, increasing ancestral as well as global diversity in genetic studies may be particularly important to ensure discovery of core genes and to inform about transferability of findings across ancestry groups.

Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41588-023-01596-4.
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material.If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.To view a copy of this license, visit http://creativecommons. org/licenses/by/4.0/.© The Author(s) 2024 https://doi.org/10.1038/s41588-023-01596-4

Participating cohorts
For the analyses of the African, East Asian, South Asian and Hispanic/ Latin American group, we included data from 21 cohorts (Supplementary Table 1) with ancestrally diverse participants, where measurements were taken from distinct samples.Details including study design, genotyping and imputation methods and quality control for these studies had been described by previous publications (Supplementary Note).All participants provided informed consent.All studies obtained ethical approvals from local ethics review boards.Measures were taken from distinct samples rather than repeat measures from the same individual.
For each study, a principal component analysis was carried out based on the genetic similarity of individuals.Individuals who clustered around a reference group with confirmed ancestry were assigned to that specific group and included in the association analysis, except for the Hispanic/Latin American group, which was based on self-reported ethnicity.Individuals with admixture between the predefined ancestry reference groups were excluded.
We also included two previously published studies of MD, using data from ancestrally European participants, including the PGC-MDD2 (n cases = 246,241 and n controls = 558,568) 13 and the AGDS (n cases = 12,123 and n controls = 12,684) (ref.14) to conduct a multi-ancestry meta-analysis of MD (Supplementary Table 1).The total sample size of the multi-ancestry meta-analysis was 1,815,091 (n cases = 345,389 and n effhalf = 559,332).Of the participants, 70.1% (effective sample size) were of European ancestry, 8.2% East Asian, 11.8% African and 1.5% were of South Asian ancestry, and 7.9% were Hispanic/Latin American.
We used a range of measures to define depression, including structured clinical interviews, medical care records, symptom questionnaires and self-reported surveys (Supplementary Table 1).The meta-analyses were primarily conducted combining GWASs of all phenotype definitions (that is, broad MD).In addition, meta-analyses for clinical depression and relevant downstream analyses were also conducted.We considered depression ascertained by structured clinical interviews (directly assessing diagnostic criteria based on DSM-IV, DSM-5 or ICD9/10 through interviews or self-report) or medical care records (ICD9 or ICD10 from primary or secondary care units) as clinical depression.Among the GWASs, the Genes and Health study, MVP, the Genetic Epidemiology Research on Aging Study (GERA), BioVU, the Prevention Intervention Research Center First Generation Trial (PIRC), the Mexican Adolescent Mental Health Survey (MAMHS), CON-VERGE, the UKB, the Army Study to Assess Risk and Resilience in Service Members (Army-STARRS) and BioMe fulfilled the clinical definition of depression.On the basis of European ancestry data from previous published work of the PGC-MDD group 13 , all studies fulfilled the clinical definition, except for the UKB and the 23andMe, which were excluded in the analysis of clinical MD.

Study-level genetic association analyses
Throughout the manuscript, all statistical tests were two sided, unless explicitly indicated.We had access to individual-level data for Army STARRS, UKB, Women's Health Initiative (WHI), Intern Health Study (IHS), GERA, Jackson Heart Study ( JHS), Drakenstein Child Health Study and the Detroit Neighborhood Health Study.Data access was granted via our collaborators, the UKB under application ID 51119 and the dbGaP under project ID 18933.
SNP-level associations with depression were assessed through logistic regressions using PLINK version 1.90.The additive per-allele model was employed.Age, sex, principal components and other relevant study-level covariates were included as covariates.Where available, genotypes on chromosome X were coded 0 or 2 in male participants and 0, 1 and 2 in female participants.Data for variants on X were only available for some of the studies (Supplementary Table 1).The effective sample size was 1,763 for African, 58,833 East Asian, 13,099 South Asian ancestry and 79,720 for Hispanic/Latin American.Summary statistics were received from our collaborators for all other studies.Additive-effect logistic regressions were conducted by the 23andMe Inc, Taiwan-MDD study, MVP, BBJ, Rabinowitz, MAMHS, PrOMIS and BioVU.Age, sex, principal components and other relevant study-level covariates were included as covariates.
Mixed-effect models were used in the association analysis for CKB, BioME and Genes and Health with SAIGE (version 0.36.1)(ref.54).The CONVERGE study initially conducted mixed-effect model GWA tests with Bayesian and logistic regression toolkit-linear mixed model (BOLT-LMM), followed by PLINK logistic regressions to retrieve log odds ratios (ORs).For the CONVERGE study, the logORs and s.e.m. from PLINK were used in our meta-analysis.The HCHS/SOL implemented mixed-effect model GWA tests to adjust for population structure and relatedness with depression as binary outcome 16 and was conducted using GENESIS 55 .The summary statistics from GENESIS were converted into logOR and s.e.m. before meta-analysis.First, the score and its variance were transformed into β and s.e.m. by β = score/variance and s.e.m. = sqrt(variance)/variance.Afterwards, β and s.e.m. were converted into approximate logOR and s.e.m. using β = β/(pi × (1 − pi)) and s.e.m. = s.e.m./(pi × (1 − pi)), where pi is the proportion of cases in analysis 56 .
We restricted the downstream analysis to variants with imputation accuracy info score of 0.7 or higher and effective allele count (2 × MAF × (1 − MAF) × N × R 2 ) of 50 or higher.For study of small sample size, we required a minor allele frequency of no less than 0.05.The alleles for indels were re-coded as 'I' for the longer allele and 'D' for the shorter one.Indels of different patterns at the same position were removed.

Meta-analyses
We first implemented inverse variance-weighted (IVW) fixed-effect meta-analyses for GWAS from each ancestry/ethnic group (that is, African ancestry, East Asian ancestry, South Asian ancestry and the Hispanic/Latin American group) using METAL (version 2011-03-25) (ref.57).The genomic inflation factor λ was calculated for each study and meta-analysis with R package GenABEL version 1.8.0 (ref.58).Given the dependence of this estimate on sample size, we also calculated λ 1,000 (ref.59) as λ 1,000 = 1 + (λ − 1) × (1/n case + 1/n control ) × 500 (ref.60).The LDSC intercept was also calculated with an ancestry-matched LD reference panel from the Pan UKB reference panel 61 for each meta-analysis with LDSC (version 1.0.1)(ref.62).For meta-analyses with residual inflation (λ > 1.1), test statistics for variants were adjusted by LDSC intercept.Following the meta-analyses by METAL, variants present in less than two studies were filtered out.Statistical tests were generally two sided unless otherwise stated.We also performed a heterogeneity analysis with METAL to assess whether observed effect sizes (or test statistics) are homogeneous across samples.
We combined data from 71 cohorts with diverse ancestry using an IVW fixed-effects meta-analysis in METAL 57 .λ and λ 1,000 were calculated, and were 1.687 and 1.001, respectively.The LDSC intercept was also calculated with the multi-ancestry LD reference panel (Supplementary Note), which was 1.019 (s.e.m. 0.011).We adjusted the test statistics from the multi-ancestry meta-analysis using the LDSC intercept of 1.019.Only variants present in at least two studies were retained for further analysis, yielding a total of 22,941,580 variants.We also calculated the number of cases and the total number of samples for each variant based on the crude sample size and availability of each study.
We used a significance threshold of 5 × 10 −8 .To identify independent association signals, the GCTA forward selection and backward elimination process (command 'cojo-slt') were applied using the summary statistics from the multi-ancestry meta-analysis, with the aforementioned multi ancestry LD reference panel (GCTA version 1.92.0 beta2) 63,64 .It is possible that the algorithm identifies false positive secondary signals if the LD in the reference set does not match the actual LD in the GWAS data well; therefore, for each independent signal https://doi.org/10.1038/s41588-023-01596-4defined by the GCTA algorithm, locus zoom plots were generated for the 250 kb upstream and downstream region.We then inspected each of these plots manually and removed any secondary signals from our list where there was unclear LD separation, that is, some of the variants close to the secondary hit were in LD with the lead variant.
Loci were defined by the flanking genomic interval mapping 250 kb upstream and downstream of each lead SNP.Where lead SNPs were separated by less than 500 kb, the corresponding loci were aggregated as a single locus with multiple independent signals.The lead SNP for each locus was then selected as the SNP with minimum association P value.The analysis for loci identification, along with all other R-related tasks unless otherwise stated, was conducted using R (version 3.4.3)(ref.65) and figures were produced using the packages ggplot2 (version 3.2.1)(ref.66), qqman (version 0.1.4)(ref.67) and ggpubr (version 0.6.0)(ref.68).
We conducted sensitivity analyses for outcome definitions, case-control ratio and using a different multi-ancestry meta-analysis approach (Supplementary Note).

Fine mapping
We fine mapped all loci with statistically significant associations from the multi-ancestry GWAS using a statistical fine-mapping method for multi-ancestry samples 30 .Briefly, this method is an extension of a Bayesian fine-mapping approach 30,69 that utilizes estimates of the heterogeneity across ancestry groups, such that variants with different effect estimates across populations have a smaller prior probability to be the causal variant.
For each lead variant, we first extracted all nearby variants with r 2 > 0.1 as determined by the multi-ancestry LD reference.The multi-ancestry prior for each variant to be causal was calculated from a fixed-effects meta-analysis combining the summary statistics from ancestry-specific meta-analysis for each of the five major ancestry groups.I 2 statistics were calculated to estimate the heterogeneity of the effect estimates across ancestry groups.The posterior probability for a variant to be included in the credible set was proportional to its chi-square test statistic and the prior.The 99% credible set for each lead variant was determined by ranking all SNPs (within r 2 > 0.1 of the lead variant) according to their posterior probabilities and then including ranked SNPs until their cumulative posterior probabilities reached or exceeded 0.99.
As a comparison, we also conducted a Bayesian fine-mapping analysis based on the summary statistics of the European-ancestry meta-analysis.The same list of independent lead SNPs from the multi-ancestry meta-analysis were used for this fine mapping in the European ancestry data.All nearby SNPs with r 2 > 0.1 as determined by the 1,000 Genomes European LD reference panel were included in the fine mapping.The posterior probability was calculated in a similar way, but without the multi-ancestry prior.Similar to the multi-ancestry fine mapping, all SNPs were ranked, and 99% of the credible sets were derived accordingly.
Since our fine mapping was based on meta-analysis summary statistics, heterogeneity of individual studies (for example, due to differences in genotyping array) can influence the fine-mapping calibration and recall.We used a novel summary statistics-based quality control method proposed by Kanai and colleagues (SLALOM) to dissect outliers in association statistics for each fine-mapped locus 70 .This method calculates test statistics (DENTIST-S) from Z-scores of test variants and the lead variant (the variant of the lowest P value in each locus), and the LD r between test variants and the lead variant in the locus 71 .Among the 155 fine-mapped loci in our study, there were 134 loci with the largest variant posterior inclusion probability of greater than 0.1.For these 134 loci, r values were calculated for all variants within the 1 Mb region of the lead variant for each locus based on our multi-ancestry LD reference from the UKB data.In line with the criteria used by Kanai and colleagues, variants with DENTIST-S P value smaller than 1 × 10 −4 and r 2 with the lead variant greater than 0.6 were defined as outliers.Fine-mapped loci were classified as robust if there were no outlying variants.

Colocalization analysis
We performed colocalization between genetic associations with MD and gene expression in brain and blood tissues from samples of European and African ancestry and Hispanic/Latin American participants using coloc R package 72 .To select genes for testing, we mapped SNPs within a 3 Mb window at 2q24.2 and 6q16.2 using Variant Effect Predictor 31 , resulting in eight and four genes, respectively.Loci with posterior probability >90% either for both traits are associated and share two different but linked variants (H3 hypothesis) or a single causal variant (H4 hypothesis) were considered as colocalized.The European and African ancestry summary statistics for MD were tested against multi-ancestry brain eQTLs from European and African American samples 27 .For the Hispanic/Latin American group, we tested gene and protein expression of blood tissue from Multi-Ethnic Study of Atherosclerosis and Trans-omics for Precision Medicine 73 .For African ancestry, we tested gene expression of blood from GENOA study 74 and proteome expression of blood 75 .For European ancestry, we tested gene expression of blood from eQTLgen 76 , and proteome expression from blood 75 .We also carried out ancestry-specific eQTL and pQTL colocalization analyses for previously reported loci that were or were not transferable.

Assessment of transferability of MD-associated loci
We assessed whether published MD-associated loci display evidence of association in the East Asian, South Asian and African ancestry and Hispanic/Latin American samples.Pooling the independent genome-wide significant SNPs from two large GWAS of MD in samples of European ancestry yielded 195 loci [13][14][15] .The ancestrally diverse groups included in this study had smaller numbers of participants than the European ancestry discovery studies.Also, a given variant may be less frequent in another ancestry group.Therefore, individual lead variants may not display evidence of association because of lack of power.Moreover, in the discovery study, the lead variant is either the causal variant or is strongly correlated with it.However, differences in LD mean that the lead variant may not be correlated in another ancestry group and may therefore not display evidence of association.Our assessment of transferability was therefore based on PAT ratios that aggregate information across loci and account for all three factors, sample size, MAF and differences in LD 28 .
First, credible sets for each locus were generated.They consisted of lead variant plus all correlated SNPs (r 2 ≥ 0.8) within a 50 kb window of the lead variant (based on ancestry matched LD reference panels from the 1,000 Genomes data) and with P < 100 × P lead .A signal was defined as being 'transferable' to another ancestry group if at least one variant from the credible set was associated at two-sided P < 10 (log 10 0.05)−P f ×(N−1) with MD and had consistent direction of effect between the discovery and test study.N is the number of SNPs in the credible set for each locus, and P f is a penalization factor we derived from empirical estimations.The effective number of independent SNPs was often higher in other ancestry groups due to differences in LD, leading to higher multiple testing burden and higher likelihood of identifying SNPs with a low P value, by chance alone.This inflates the test statistics and was adjusted for by the penalization factor (P f ).To derive the P f for each ancestry group, we used the summary statistics from a previous GWAS on breast cancer 77 , in which phenotypes were believed to be uncorrelated with MD.A total of 441 breast cancer significant SNPs were taken from their paper, and linear regressions were conducted for the P values of these SNPs in each of our ancestrally diverse summary statistics for MD on the number of SNPs in credible sets.The coefficient estimates (slope from regressions) were treated as P f for each ancestry.As a result, P f were 0.008341, 0.007378, 0.006847 and 0.003147 for samples of African, East Asian, South Asian ancestry and for the Hispanic/Latin American group, respectively.https://doi.org/10.1038/s41588-023-01596-4 In the next step, the statistical power to detect an association of a given locus was calculated assuming an additive effect at a type I error rate of 0.05, with effect estimates from the discovery study, and allele frequency and sample size from each of the target datasets from diverse ancestry/ethnic groups.The power estimates were summed up across published loci to give an estimate of the total number of loci expected to be significantly associated.This is the expected number if all loci are transferable and accounts for the statistical power for replication.We calculated the PAT ratio by dividing the observed number of loci by the expected number.In addition, loci were defined as 'nontransferable' if they had sufficient power for identifying an association but did not display evidence of association, that is, if they contained at least one variant in the credible set with >80% power, while none of the variants in the credible set had P < 0.05 and no variant within 50 kb of locus had P < 1 × 10 −3 in the target dataset.
For comparison, we also conducted a transferability assessment for a European ancestry look-up study.The 102 significant loci reported by Howard and colleagues 13 were evaluated for their transferability in the AGDS study using the aforementioned method.
To assess whether low transferability may be due to heterogeneous outcome definitions, we carried out a sensitivity analysis, where we estimated PAT ratios based only on studies fulfilling the clinical MD definition.

Trans-ancestry genetic correlations
We estimated trans-ancestry genetic correlations using POPCORN version 1.0 (refs.29,54,78).Pairwise correlations were calculated between each combination of the five ancestry/ethnic groups (that is, African, European, East Asian, South Asian and Hispanic/Latin American) for broad depression and clinical depression separately.

Gene annotation
The summary statistic from the multi-ancestry meta-analysis was first annotated with FUMA 79 .Both positional mapping and eQTL mapping results were extracted from FUMA.The 1,000 Genomes European samples were employed as the LD reference panel for FUMA gene annotation.Datasets for brain tissue available in FUMA were employed for eQTL gene annotation.
Gene-based association analyses were implemented using Multi-marker Analysis of GenoMic Annotation (MAGMA, version 1.08) (ref.80) and Hi-C coupled MAGMA (H-MAGMA) 81 .The aforementioned multiple ancestry LD reference panel from the UKB was used as the LD reference panel.H-MAGMA assigns noncoding SNPs to their cognate genes based on long-range interactions in disease-relevant tissues measured by Hi-C 81 .We used the adult brain Hi-C annotation file.

Transcriptome-wide association analysis and drug mapping
To perform a TWAS, the FUSION software was used 82 .SNP weights were downloaded from the FUSION website 83 and were derived from multiple external studies, including (1) SNP weights from all available brain tissues, adrenal gland, pituitary gland, thyroid gland and whole blood 33 from GTEx v8 (ref.84) (based on significantly heritable genes and 'All Samples' in GTEx v8, which also includes African American and Asian individuals); (2) SNP weights from the CommonMind Consortium, which includes samples from the brain dorsolateral prefrontal cortex; (3) SNP weights from the Young Finns study; and (4) from the Netherlands Twin Register, which provides SNP weights from blood tissues (whole blood and peripheral blood, respectively).
We used the multi-ancestry LD reference panel described above.Variants present in the 1,000 Genomes European population reference panel were retained.A separate TWAS was also performed using a LD reference panel based on the 1,000 Genomes Project's samples of European ancestry, as a sensitivity analysis.
The transcriptome-wide significance threshold for the TWAS associations in this study was P < 1.37 × 10 −6 .This threshold was previously derived using a permutation-based procedure, which estimates a significance threshold based on the number of features tested 33 .
The results were compared with previous TWAS in MD, including the two largest MD TWAS so far 2,15,33,85,86 .These studies generally used smaller sets of SNP weights (except the study by Dall'Aglio and colleagues, which used similar SNP weights as the current study, but with SNP weights derived from the previous GTEx release, v7).The TWAS Z-score plot was generated using a TWAS-plotter function 87 .
To assess the relevance of novel genes to drug discovery, genes were searched in three large drug databases: GeneCards 88 , DrugBank and ChEMBL 89,90 .In Table 1, a selection of drugs (the ones reported in multiple publications) probably targeting our high-confidence prioritized gene sets are shown for each gene.

MR
We performed a bi-directional two-sample MR analysis using the TwoSampleMR R package (version 0.5.6) 91,92to test possible causal effects between MD and six cardiometabolic traits.We followed the STROBE-MR (strengthening the reporting of observational studies in epidemiology using Mendelian randomisation) guidelines (Supplementary Note).For individuals of European ancestry, the UKB was used to select instruments for BMI, fasting glucose, HDL, LDL, SBP and TGs.SBP summary data were obtained from the UKB for individuals of African and South Asian ancestry and Hispanic/Latin American participants.For samples of African, East Asian and South Asian ancestry and the Hispanic/Latin American group, a meta-analysis was performed using METAL 57 with inverse variance weighting using the UKB and the following consortia: GIANT 93 for BMI; MAGIC 94 for fasting glucose; Global Lipids Genetics Consortium 95 for HDL, LDL and TG; and Biobank Japan 95,96 for SBP in samples of East Asian ancestry.The genetic associations with quantitative variables were estimated with respect to the scale, units and models defined in the original studies.Heterogeneity analyses were also performed.To avoid sample overlap, the datasets used to define instrumental variables (IV) for the cardiometabolic traits were excluded from the MD genome-wide association statistics used for the MR analyses conducted with respect to each ancestry group.
Genome-wide significance (P = 5 × 10 −8 ) was used as the threshold to select IVs for the exposures.However, if less than ten variants were available, a suggestive threshold (P = 5 × 10 −6 ) was used to select IVs (Supplementary Table 16).We only included IVs that were present in both datasets (exposure and the outcome).We followed the three main IV assumptions for the analysis: (1) relevance: the IV is associated with the risk factor of interest; (2) independence: the IV is not associated with confounders; and (3) exclusion: the IV is only associated with the outcome through the exposure.We used the following criteria for clumping: r 2 = 0.001 and a 10,000 kb window.The following information was used in both the exposure and outcome data: SNP ID, effect size, effect allele, other allele, EAF and P value.We used five different MR methods: IVW, MR-Egger, weighted median, simple mode and weighted mode 92 .The IVW estimates were reported as the main results due to their higher statistical power 97 while the other tests were used to assess the consistency of the estimates across different methods.MR-Egger regression intercept and MR heterogeneity tests were conducted as additional sensitivity analyses.In case of significant heterogeneity, the MR-pleiotropy residual sum and outlier global test was used to remove genetic variants based on their contribution to heterogeneity 98 ).

Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.https://doi.org/10.1038/s41588-023-01596-4sharing of genome-wide summary statistics.The full GWAS summary statistics for the 23andMe discovery dataset will be made available through 23andMe to qualified researchers under an agreement with 23andMe that protects the privacy of the 23andMe participants.Please visit https://research.23andme.com/collaborate/#dataset-access/for more information and to apply to access the data.Investigators can apply for access to WHI and JHS via dbGaP (https://www.ncbi.nlm.nih.gov/gap/).The current study utilized data from dbGaP studies under application #18933.

Fig. 4 |
Fig. 4 | Resolution of the locus fine mapping based on the multi-ancestry and the European ancestry GWAS, showing the size of the credible sets for 155 significant loci.a, A box plot showing the median (central line) and interquartile range (upper and lower hinges) of the sizes of the credible set for fine-mapped loci.The whiskers extend to 1.5 times the interquartile range.Data points falling outside that range are denoted by individual dots in the figure.b, Stacked bar charts showing the number of loci within size categories for credible sets.

Fig. 5 |
Fig. 5 | Manhattan-style Z-score plot of gene associations with MD in a TWAS based on the GWAS summary statistics for broadly defined MD.Significant gene associations are shown as red dots (354 significant genes, 205 of them novel), and the 50 most significant gene names are highlighted on both sides of the plot.Novel associations are shown in black, while genes previously associated with MD are shown in gray.The red lines indicate the significance threshold (P < 1.37 × 10 −6 ).For genes on the top part of the graph, increased expression was associated with increased depression risk, while expression of the genes on the bottom part of the plot showed an inverse association.NT, novel transcript.

Fig. 6 |
Fig. 6 | Bi-directional MR tests between MD and cardiometabolic outcomes.The data are presented with a β and a 95% CI.Nominally significant associations are marked with a red asterisk.Statistics have been derived using the β and standard errors for the number of variants used as IVs in each analysis, shown as N SNPs.Results are not shown for diastolic blood pressure for which there

Extended Data Fig. 1 |Extended Data Fig. 2 |Extended Data Fig. 3 |Extended Data Fig. 4 |
https://doi.org/10.1038/s41588-023-01596-496.Sakaue, S. et al.A cross-population atlas of genetic associations for 220 human phenotypes.Nat.Genet.53, 1415-1424 (2021).97.Bowden, J. et al.A framework for the investigation of pleiotropy in two-sample summary data Mendelian randomization.Stat.Med.36, 1783-1802 (2017).98. Verbanck, M., Chen, C.-Y., Neale, B. & Do, R. Detection of widespread horizontal pleiotropy in causal relationships inferred from Mendelian randomization between complex traits and diseases.Nat.Genet.50, 693-698 (2018).Manhattan plots for genetic associations with major depression in non-European ancestries.The y-axes show the −log 10 P values for the associations between each single-nucleotide polymorphism and major depression.The x-axes show the chromosomal position (GRCh37).The red line represents the genome-wide significance threshold of 5 × 10 −8 and the blue line 10 −5 .a, Manhattan plot for African ancestry.Due to the restriction that SNPs need to be available in at least two studies, only results for 6,051 variants were available on the X chromosome.b, Manhattan plot for East Asian ancestry.c, Manhattan plot for Latin American ancestry.Association P values have been adjusted by the LDSC intercept of 1.0508.d, Manhattan plot for South Asian ancestry.Only one cohort provided data for variants on the X chromosome.Those are not included because for the meta-analysis at least two cohorts were required to provide data for each variant.Manhattan plots for genetic associations with clinical major depression in individuals of non-European ancestries.The y-axes show the −log 10 P values for the associations between each single-nucleotide polymorphism and major depression.The x-axes show the chromosomal position (GRCh37).The red line represents the genome-wide significance threshold of 5 × 10 −8 and the blue line 10 −5 .a, Manhattan plot for African ancestry.b, Manhattan plot for East Asian ancestry.c, Manhattan plot for Latin American ancestry.d, Manhattan plot for South Asian ancestry.Manhattan plot for genetic associations with major depression in the multi-ancestry meta-analysis.The y-axes show the −log 10 P values for the associations between each single-nucleotide polymorphism and major depression.The x-axes show the chromosomal position (GRCh37).The red line represents the genome-wide significance threshold of 5 × 10 −8 and the blue line 10 −5 .Association P values have been adjusted by the LDSC intercept of 1.0185.Manhattan plot for genetic associations with clinical major depression in the multi-ancestry meta-analysis.The y-axes show the − log 10 P values for the associations between each single-nucleotide polymorphism and major depression.The x-axes show the chromosomal position (GRCh37).The red line represents the genome-wide significance threshold of 5 × 10 −8 and the blue line 10 −5 .

Transferability of previously reported loci from European ancestry discovery GWAS of MD to other ancestry groups
. a, A Venn diagram showing the numbers of previously identified loci from European ancestry studies with evidence of transferability to the other ancestry/ethnic groups: African, Hispanic/Latin American, South Asian and East Asian (in black) and their intersections (in cyan).Only the 112 loci with evidence of transferability to at least one ancestry group are shown here.b, A plot showing power-
This table includes 12 genes significantly associated in the TWAS and Hi-C MAGMA, that is, not in physical proximity to a GWAS hit, and 43 genes significantly associated across all four methods (TWAS, FUMA, MAGMA and Hi-C MAGMA).a Ensembl IDs are shown for genes without symbol names.b Drugs targeting the prioritized genes or genes of the same family from GeneCards, DrugBank and ChEmbl.c Gene mapped by FUMA positional mapping or eQTL mapping.d Bonferroni adjusted two-sided P value for MAGMA or Hi-C MAGMA of z statistics (P < 0.05 implies statistical significance).e Novel report as compared with previous MAGMA and TWAS on MD. f Number of variants in the 99% credible set, only available for mapped loci from multi-ancestry fine-mapping.