Genome-wide meta-analyses of stratified depression in Generation Scotland and UK Biobank

Few replicable genetic associations for Major Depressive Disorder (MDD) have been identified. Recent studies of MDD have identified common risk variants by using a broader phenotype definition in very large samples, or by reducing phenotypic and ancestral heterogeneity. We sought to ascertain whether it is more informative to maximize the sample size using data from all available cases and controls, or to use a sex or recurrent stratified subset of affected individuals. To test this, we compared heritability estimates, genetic correlation with other traits, variance explained by MDD polygenic score, and variants identified by genome-wide meta-analysis for broad and narrow MDD classifications in two large British cohorts - Generation Scotland and UK Biobank. Genome-wide meta-analysis of MDD in males yielded one genome-wide significant locus on 3p22.3, with three genes in this region (CRTAP, GLB1, and TMPPE) demonstrating a significant association in gene-based tests. Meta-analyzed MDD, recurrent MDD and female MDD yielded equivalent heritability estimates, showed no detectable difference in association with polygenic scores, and were each genetically correlated with six health-correlated traits (neuroticism, depressive symptoms, subjective well-being, MDD, a cross-disorder phenotype and Bipolar Disorder). Whilst stratified GWAS analysis revealed a genome-wide significant locus for male MDD, the lack of independent replication, and the consistent pattern of results in other MDD classifications suggests that phenotypic stratification using recurrence or sex in currently available sample sizes is currently weakly justified. Based upon existing studies and our findings, the strategy of maximizing sample sizes is likely to provide the greater gain.


Introduction
Major Depressive Disorder (MDD) is a frequently disabling, chronic disorder for which there is substantial evidence of a genetic contribution to its liability 1 . Until recently, the largest international mega-analysis of clinically diagnosed MDD (9240 MDD cases and 9519 controls) yielded no genome-wide significant findings 2 . Given the success of similarly sized studies for other adult psychiatric disorders 3,4 , this study suggested that MDD is an extensively heterogeneous phenotype. This heterogeneity, in addition to the relatively high prevalence and low heritability of MDD, impacts substantially on the statistical power to detect genetic effects 1,5 . Possible means of improving statistical power include stratifying the phenotype into potentially more homogeneous subtypes, or considerably increasing the sample size whilst accepting a broader phenotype. Both of these approaches have since identified associations between genetic variants and MDD [6][7][8][9] .
A genome-wide association study (GWAS) from the CONVERGE consortium identified two genome-wide significant loci (chromosome 10q21.3 and 10q26.13) using a severe depressive phenotype (5053 cases and 5337 controls) in female Han Chinese individuals, treated in a hospital setting 6 . A subsequent study by the Psychiatric Genomics Consortium (PGC) stratified the PGC MDD mega-analysis sample 2 by age of onset and identified a risk conferring locus at chromosome 3q27.2 in individuals with onset after 27 years of age 9 .
In contrast to the above study designs, two studies have utilized larger sample sizes with less detailed structured clinical assessments. The first of these studies came from the CHARGE consortium and employed a quantitative assessment of depressive symptoms using the Center for Epidemiological Studies Depression Scale. In a combined dataset of 51 258 individuals, a genome-wide significant locus was identified at chromosome 5q21.2 7 . More recently, Hyde et al 8 conducted a GWAS using 23andMe data of self-reported depression in 45,773 cases and 106,354 controls, revealing 15 genome-wide significant loci. The genetic correlation (rG) between the 23andMe depression phenotype and the clinical phenotype reported by the PGC was rG(SE) = 0.73(0.09), suggesting a strong association between the additive genetic components of each trait. Previous work comparing self-reported depression and clinically defined MDD in Generation Scotland: Scottish Family Health Study (GS:SFHS) also provides evidence that these traits have substantially overlapping common genetic architectures 10 .
The findings from these four studies support both phenotypic stratification and increased sample size as strategies which may help reveal the underlying architecture of MDD. The international collaborative efforts by groups such as the PGC and the development of largescale biobanks with genetic and extensive phenotypic information will ensure ever increasing sample numbers. It is therefore timely to investigate the contrasting strategies that may be employed in the analysis of these emerging datasets.
In the current study, we sought to compare these strategies by conducting a suite of genetic analyses for depression and stratified subtypes in two UK-based cohorts: GS:SFHS 11,12 and UK Biobank (UKB) 13,14 . To maximize the sample size, an unstratified analysis was initially conducted. This used MDD diagnostic information obtained at structured clinical interview 15 in GS: SFHS (2603 cases, 16,122 controls), and a probable MDD phenotype obtained from a touchscreen questionnaire 16 , previously validated by Smith et al 17 , in UKB (8248 cases, 16,089 controls). Subsequent analyses stratified the phenotype on the basis of recurrence or sex. Each approach was evaluated using several metrics: the successful identification of variants reaching genome-wide significance in GWAS meta-analysis, an increased SNP-based heritability estimate, identification of significant genetic correlations with other traits using LD score regression, and increased variance explained by polygenic profile scores for MDD derived from three independent cohorts. These metrics aim to test whether basic stratification of the MDD phenotype improves etiological insight.

Data and code availability
Data are available to qualified researchers on a costrecovery basis via online application processes, accessible via www.gsaccess.org and www.ukbiobank.ac.uk/registerapply/. The code used in these analyses is available on request from the lead author.

Participants
Generation Scotland: The Scottish Family Health Study (GS: SFHS) GS:SFHS is a family and population-based study consisting of 23,690 participants recruited via general medical practices across Scotland. The recruitment protocol and sample characteristics are described in detail elsewhere 11,12 . Briefly, participants were over 18 years old, and not ascertained on the basis of having any particular disorder. A diagnosis of depression (MDD) was made using the structured clinical interview for DSM-IV disorders (SCID) 15 . Participants who answered yes to either of the two screening questions were invited to continue the interview, which provided information on the presence or absence of a lifetime history of MDD, age of onset and number of depressive episodes. Participants who answered no to both screening questions or who completed the SCID but did not meet the criteria for depression were assigned control status. Case definition was further refined through NHS data linkage. Controls with a history of antidepressants or who had been referred to a secondary psychiatric care centre (n = 1 072) were excluded, as were cases who had received a previous diagnosis of schizophrenia or bipolar disorder (n = 47). This resulted in 2603 depression cases (of which 1289 were recurrent) and 16,122 controls. Stratification by sex resulted in 1859 female cases, 9159 female controls, 770 male cases and 6958 male controls.

UK Biobank (UKB)
UKB 13 is a population-based health research resource consisting of approximately 500,000 people, aged between 40 and 69 years, who were recruited between the years 2006 and 2010 from across the UK 14 . Of these, 152,729 individuals were included in the first genotype data release. In the current study we restricted the sample to individuals of white British ancestry. Participants who were also in GS:SFHS, their relatives and relatives of remaining UKB participants (relatives: up to and including third degree) were identified by a kinship coefficient ≥ 0.0442, using the KING toolset 18 , and subsequently excluded (n = 7 698). Depression case/control status was assessed in 172,751 of the 500,000 individuals using a selfdiagnosed touchscreen questionnaire. Case status was defined as either "probable single lifetime episode of major depression" or "probable recurrent major depression (moderate and severe)". Control status was defined as "no mood disorder", as described by Smith et al 17 . 149,847 individuals had sufficient data to allow an assessment of case/control status. Individuals with probable bipolar disorder (n = 1 615) or mild depressive/manic symptoms (n = 26 847) were excluded. After exclusions outlined above, this resulted in 8248 depression cases (of which 6056 were recurrent) and 16,089 controls. Stratification by sex resulted in 5138 female cases, 7013 female controls, 3110 male cases and 9076 male controls. Further information on sample collection, genotyping and assessment of the depression phenotype in GS:SFHS and UKB are provided in the Supplementary Methods.

Imputation and quality control GS:SFHS
Autosomal genotype data were available for all GS:SFHS individuals in the present study (n = 18 725). Genotypes were imputed using the Haplotype Reference Consortium reference panel (HRC.r1-1) 19 via the Sanger Imputation Server pipeline (https://imputation.sanger.ac.uk). Prior to imputation, individuals with missingness ≥ 3% were excluded, as were SNPs with a call rate of ≤98%, Hardy Weinberg Equilibrium (HWE) P-value ≤ 1 × 10 −6 , and a minor allele frequency (MAF) ≤ 1%. Phasing of genotype data was performed using the SHAPEIT2 algorithm 20 utilizing the duoHMM option, which refines phasing by utilizing pedigree information. Imputation was performed using PBWT software 21 . Multi-allelic variants, monomorphic variants and SNPs with an imputation INFO score < 0.8 were removed 22 . Population outliers (more than 6SDs from the mean of the first principal component (PC)) were identified and removed from the sample 23 , as were one from each of 52 monozygotic twin pairs, identified by IBD (preferentially retaining cases), and 7 individuals who matched samples from the Psychiatric Genomics Consortium, identified using genotype checksums 24 . After imputation, individuals with missingness ≥ 2%, and genotype with a call rate of ≤98%, MAF ≤ 0.5% and HWE P-value ≤ 1E-05 were excluded using PLINK version 1.9. 25,26 . Strand ambiguous SNPs with 40% ≤ MAF ≤ 50% were also excluded.

UKB
Autosomal genotypes were available for all UKB individuals in the present study (n = 24 337). Pre-imputation QC, phasing and imputation are described elsewhere 27 . In brief, prior to phasing, multiallelic SNPs or those with MAF ≤ 1% were removed. Phasing of genotype data was performed using a modified version of the SHAPEIT2 algorithm 28 . Imputation to a reference set combining the UK10K haplotype and 1000 Genomes Phase 3 reference panels 29 was performed using IMPUTE2 algorithms 30,31 . A further QC protocol was then applied at the Wellcome Trust Centre for Human Genetics before the data was released, as described elsewhere 32 . The analyses presented here were restricted to autosomal variants with an imputation INFO score ≥ 0.9 and MAF ≥ 0.5%.
Of the SNPs which passed QC in each dataset, only SNPs in common between both datasets were used in subsequent analyses, with allele and strand in GS:SFHS harmonized to be consistent with UKB, resulting in 7,105,178 autosomal SNPs.

Statistical analysis
All analyses present here were performed on four subsets of the data: all available cases and controls (MDD), all controls and recurrent cases (rMDD), female controls and cases (fMDD), and male controls and cases (mMDD). The total sample size for each depression subgroup was n = 43 062 for MDD, n = 39 556 for rMDD, n = 23 169 for fMDD and n = 19 886 for mMDD. The number of cases and controls and demographic information for these subsets are shown in Supplementary Table 1.

Association analysis GS:SFHS
GWAS of MDD, rMDD, fMDD and mMDD in GS: SFHS were conducted using mixed linear model based association (MLMA) analysis 33 , implemented in GCTA (v1.25.) 34 . To account for population structure, two genomic relationship matrices (GRMs) were used, as this method allows the inclusion of closely and distantly related individuals in genetic analyses 35 . The first GRM included pairwise relationship coefficients for all individuals. The second GRM had off-diagonal elements < 0.05 set to 0. GRMs were created using the mixed linear model with candidate marker excluded (MLMe) approach, where GRMs are calculated excluding SNPs located on the chromosome under analysis 33 . No fixed effects covariates were fitted in this analysis as sex was being assessed as a stratifier, and the two GRMs adequately accounted for population stratification (tested using univariate LD Score Regression 36 ). MLMA employs restricted maximum likelihood methods on the linear scale. As such, test statistics (betas and their corresponding standard errors) were transformed to Odds Ratios and their corresponding 95% Confidence Intervals on the liability scale using a Taylor transformation expansion series 37,38 . Further details of GWAS can be found in the Supplementary Methods.

UKB
GWAS of MDD, rMDD, fMDD, and mMDD in UKB were conducted using logistic regression, implemented in PLINK v1.9 25 . Assessment centre, genotype array and batch were fitted as fixed effects. The first 8 PCs (out of 15) supplied by UKB were also fitted, as visual inspection indicated that these PCs resulted in multiple clusters, indicating structure in the data.

Meta-analysis, variant look-up and gene-based analysis
The meta-analysis of GS:SFHS and UKB was conducted using the classical inverse-variance approach, which weights effect sizes by sampling distribution, implemented in the METAL package 39 . SNPs with a metaanalysis P-value of P ≤ 1E-05 were subjected to clumpbased linkage disequilibrium pruning using PLINK 25 using an LD r 2 cut off of 0.1 and a 500 kb sliding window to create SNP sets of approximately independent "lead SNPs". All SNPs which surpassed genome-wide significance were entered into the NHGRI-EBI catalog of published GWAS 40,41 (www.ebi.ac.uk/gwas/) to observe whether these SNPs had been previously observed in association analysis.
Gene-based analysis was performed for MDD, rMDD, fMDD, and mMDD using MAGMA 42 . The gene-based statistics were derived using the summary statistics from each meta-analysis. Genetic variants were assigned to genes based on their position according to the NCBI 37.3 build, with a gene boundary defined by an extended region between 20 kb upstream of transcript start site and 20 kb downstream of transcript end site for each of the genes. This resulted in a total of 18 111 genes for MDD, fMDD, and mMDD, and 17,225 genes for rMDD being analyzed. The European panel of the 1000 Genomes data (phase 1, release 3) was used as a reference panel to account for linkage disequilibrium 43 . A genome-wide significance threshold for gene-based associations was calculated using the Bonferroni method (α = 0.05/18 111; P < 2.76 × 10 −6 for MDD, fMDD and mMDD; α = 0.05/ 17 225; P < 2.90 × 10 −6 for rMDD).

Pathway and functional genomic analyses
Pathway and functional genomic analyses were performed using the GWAS results for each of the MDD meta-analyses. These included DEPICT analyses 44 , reference to RegulomeDB 45 (http://www.regulomedb.org/) and to the Genotype-Tissue Expression Portal (http:// www.gtexportal.org) for independent SNPs with P < 1.0 × 10 −5 and all genome-wide significant SNPs (P < 5.0 × 10 −8 , nSNPs = 6). Further information on pathway and functional genomic analysis can be found in the Supplementary Methods.

Heritability, polygenicity and cross-trait genetic correlations
Univariate GCTA-GREML 46 analyses were used to estimate the proportion of variance explained by all common (MAF > 1%) SNPs for each of the depression phenotypes. A relatedness cutoff of 0.05 was used in the generation of the genetic relationship matrix, as including close relatives inflates heritability estimates 47 . This did not alter the sample size in UKB due to previous sample filtering, however in GS:SFHS this reduced the sample size by 38.5-58.4% (Supplementary Table 15). In GS:SFHS, the first 20 PCs were fitted as fixed effects. In UKB, batch, recruitment centre and the first 8 PCs were fitted. Univariate Linkage Disequilibrium Score regression (LDSR), implemented in LD Score (v1.0.0.) 36 , was applied to GWAS summary statistics to evaluate the proportion of inflation in the test statistics caused by confounding biases, such as population stratification, relative to genuine polygenicity. This method also provides an estimate of SNP-based heritability. Pre-computed LD Scores were used, estimated from the European-ancestry samples in the 1000 Genomes Project 43 . To obtain heritability estimates on the liability scale, sample and population prevalence estimates were used. Sample prevalence estimates were calculated as the proportion of cases in each subset. Population prevalence estimates were derived from the literature [48][49][50] . Prevalence estimates used in GCTA-GREML and LDSC are given in Supplementary Table 1. Genetic correlations between meta-analyzed depression subgroups and 200 health-related traits were calculated using bivariate LDSR 51 , implemented in the LD Hub software 52 . Traits derived from non-Caucasian or mixed ethnicity samples were removed prior to analysis. False discovery rate (FDR) correction was applied across the 800 tests to correct for multiple testing 53 .

Polygenic profiling analysis
To test the association of MDD-associated alleles with each subtype of MDD in GS:SFHS and UKB, summary statistics for major depressive disorder from the Psychiatric Genomics Consortium (minus GS:SFHS, n = 50 455 cases, 105,411 controls; minus UKB, n = 43 204 cases, 95,680 controls) 54 , UKB and GS:SFHS (from the current study) were used to provide weights for polygenic profile scores (PGS).
PGS for GS:SFHS and UKB individuals were derived at 5 GWAS P-value thresholds (P T < 0.01, < 0.05, < 0.1, < 0.5 and all SNPs) using PRSice 55 . Genotyped SNPs (with MAF > 1%) were subjected to clump-based linkage disequilibrium (LD) pruning, using an LD r 2 cut off of 0.25 and 200 kb sliding window to create SNP sets in approximate linkage equilibrium. PGS were then standardized to have a mean of zero and a unit standard deviation.
In GS:SFHS, the associations between PGS and MDD, rMDD, fMDD, and mMDD were tested using a mixed linear model, covarying for the first 20 PCs to account for population stratification. Prior to this analysis, the requisite number of PCs was established using a stepwise linear regression approach, adding one PC at a time, and using a likelihood-ratio test (LRT), the output of which was assessed against a mixed 0.5(χ 2 ) + 0.5(0) distribution 56 . An additive genetic component was fitted as a random effect to account for the increased relatedness within GS: SFHS. To ensure that common environment was adequately modeled, models incorporating shared parentoffspring, sibling, and spousal environmental components as additional random effects were tested using a stepwise LRT approach, however no environmental component improved model fit. Further details of mixed linear model selection are provided in the Supplementary Methods. Fstatistics, degrees of freedom, effect sizes, Z-scores and Pvalues were derived using the Wald Conditional F-test 57 , in ASReml-R 58 .
In UKB, the association between PGS and MDD, rMDD, fMDD, and mMDD was tested in a generalized linear model framework by regressing the PGS onto the phenotype, covarying for assessment centre, genotype array and batch and the first eight PCs.
FDR correction was applied across the 80 tests to correct for multiple testing 53 . For both GS:SFHS and UKB, trait variance explained by the PGS was calculated using: (var(x × β))/var(y), where x was the standardized PGS, β was the corresponding regression coefficient and y was the phenotype 59 .

Meta-analysis of depression in GS:SFHS and UKB
One genomic region on chromosome 3p22.3 achieved genome-wide significance in the males only case/control (mMDD) analysis (index SNP rs4478037, β(SE) = 0.29 (0.05), P = 2.29 × 10 −8 ). None of the SNPs achieving genome-wide significance (nSNPs = 6) associated with any phenotype in currently published GWAS available via the NHGRI-EBI catalog. One variant (rs7613051) within the local genomic region (defined 3:33000000-33200000) has previously shown an association with Atopic dermatitis 60 , however this SNP is not in LD with the genomewide significant SNPs (r 2 < 0.1). Meta-analysis of MDD, rMDD, and fMDD did not yield any genome-wide significant findings. Manhattan plots are shown for each trait in Fig. 1, and summary statistics for independent loci with a meta-analysis association P ≤ 1 × 10 −6 are shown in Table 1. A regional association plot for genome-wide significant index SNP, rs4478037, is shown in Fig. 2. Regional association plots for this SNP in other depression subtypes are shown in Supplementary Figure 4, demonstrating that this locus does not replicate in any other depression subtype (minimum P = 8.05 × 10 −4 in MDD, β(SE) = 0.10(0.05)). Full details of all independent loci used in downstream analyses (P ≤ 1 × 10 −5 ) are shown in Supplementary Tables 4-11. The QQ plots (Supplementary Figure 3) demonstrate λ GC ranged from 1.02-1.06, comparable to the value (1.056) observed in the PGC mega-analysis of MDD 2 . Univariate LDSR analyses estimated that meta-analyzed MDD subtypes had mean chi-squared statistic (μχ 2 ) values ranging from 1.018 (mMDD) to 1.062 (MDD) with a Ratio, defined as (Intercept-1)/(μχ 2 -1), ≤ 0.35 across subtypes, indicating that any inflation in μχ 2 can be attributed to polygenicity rather than residual population stratification 36 .

Gene based analysis of MDD subtypes
Three genes at chromosome 3p22.3 (CRTAP, GLB1, and TMPPE) were significantly associated with mMDD after Bonferroni correction (Supplementary Table 12). Whilst CRTAP and GLB1 have not previously shown association with psychiatric disorders, both genes are members of the CNTN1 PPI subnetwork. This subnetwork contains CNTN1, which encodes a protein that may play a role in the formation of axon connections in the developing nervous system 61 . Furthermore, the CNTN1 PPI subnetwork also contains HTR1A, which encodes a serotonin (5-HT) receptor subtype that binds endogenous 5-HT 62 . To assess whether significant association of these 3 genes was due to LD in the region, the meta-analysis of MDD in males was re-run conditional on SNPs with an R 2 > 0.9 with the top ranking SNP, rs4478037. This analysis, implemented in GCTA (v1.25) 34 , indicated that the signal was being driven by LD across the region (Supplementary Figure 5). There were no significant gene-based associations with MDD, rMDD or fMDD.

Estimating SNP-based heritability and polygenicity
Using GCTA-GREML methods 46 , the SNP-based heritability (h 2 SNP ) estimates in UKB were consistent and significant across MDD subtypes, with h 2 SNP (SE) estimates of MDD = 0.20(0.04); rMDD = 0.20(0.03); fMDD = 0.22(0.06) and mMDD = 0.18(0.06). Due to the unrelated subset of individuals in GS:SFHS being markedly smaller than the full sample (n max = 7 795), the heritability estimates were non-significant across all MDD definitions. Results from GCTA-GREML are shown for MDD subtypes in Supplementary Table 15. LDSR yielded lower h 2 SNP estimates than GCTA-GREML methods (Supplementary Table 16). This is to be expected as LDSR utilizes summary scores, which have usually been subjected to genomic control, as opposed to full SNP data.

Genetic correlation with health-related traits
Bivariate LDSR showed nominally significant (P < 0. and systemic lupus erythematosus (rG (SE) = 0.28(0.08); P = 8.00 × 10 −4 ). These findings are consistent with previously reported, well-established relationships between MDD and neuroticism [63][64][65] , bipolar disorder 51,66 , PGC cross-disorder 66 , depressive symptoms 7,67 and subjective well-being 68 . Relationships between MDD and age at first birth, and SLE have been previously reported although these have been based on phenotypic correlations [69][70][71] . The majority of these traits (with the exception of age at first birth and systemic lupus erythematosus) demonstrated rG of a similar magnitude, direction and significance with recurrent and female MDD. In contrast to these results, bivariate genetic correlations between mMDD and health-related traits were all non-significant after adjustment for multiple testing (Supplementary Table 17). Summary statistics from univariate and bivariate LDSR (Supplementary Tables 14 and 17) indicate that the lack of association between mMDD and other health-related traits is due to reduced statistical power, rather than a genuine sex difference. Univariate LDSR of mMDD returned a mean χ 2 = 1.018, indicating low power (as a minimum mean χ 2 = 1.02 is deemed appropriate for LDSR 72 ). In addition, the univariate LDSR h 2 SNP (SE) = 0.05(0.03) for mMDD. The LDSR rG is cal- where ρ g is the genetic covariance between traits, h 2 i is the heritability of trait i and h 2 j is the heritability of trait j 36 . Near-zero h 2 estimates can therefore cause the rG estimate to become out of bounds (rG > 1), as observed in three out of five nominally significant traits, with large standard errors 72 , as observed in all nominally associated traits. Fig. 3 shows the genetic correlation of meta-analyzed depression subtypes with significantly correlated health traits.

Polygenic profiling analysis
The GWAS results from the MDD phenotype in 3 discovery samples (PGC MDD29, UKB and GS:SFHS) were used to build polygenic profile scores (PGS) in GS: SFHS and UKB, incorporating SNPs with a discovery sample association P-value cut-off of P T ≤ 0.01, P T ≤ 0.05, P T ≤ 0.1, P T ≤ 0.5, and all SNPs (P T ≤ 1). Results from P T ≤ 0.05 are shown in Fig. 4, as this P T explained the most variance in the target datasets. Results from all P T are shown in Supplementary Tables 18-21. PGS derived using information from the PGC MDD29 GWAS yielded significant associations between depression phenotype and PGS across almost all thresholds in both GS:SFHS and UKB (with the exception of mMDD at P T ≤ 0.01 in GS:SFHS). PGS derived using information from the UKB MDD GWAS yielded significant associations with MDD, rMDD, and mMDD phenotypes in GS:SFHS at all thresholds, however fMDD did not survive multiple testing correction at any P T . PGS derived using information from the GS:SFHS MDD GWAS yielded significant associations with MDD, rMDD, and fMDD phenotypes in UKB at all thresholds, except P T ≤ 0.01-presumably due to the low number of SNPs contributing to the score. However, mMDD did not survive multiple testing correction at any P T . Across all associations, the largest proportion of variance explained in GS:SFHS was 0.66% for fMDD using the MDD polygenic score derived using SNPs at P T ≤ 0.5 using weights from the PGC MDD29 GWAS. The largest proportion of variance explained in UKB was 0.72% for fMDD, again using SNPs at P T ≤ 0.5 using weights from the PGC MDD29 GWAS.

Discussion
For many years, the depressed phenotype has been refractory to genetic inquiry due to issues regarding statistical power. Recently, studies have successfully identified loci associated with depression by either substantially increasing the sample size 7,8 or by refining the phenotype by illness course 9 , recurrence and sex 6 . In this study we used techniques designed to interrogate complex traits to ascertain whether maximizing the sample size (n max = 43,062) or phenotypic stratification by recurrence or sex was more advantageous for investigating the genetic architecture of MDD, using data from two large UK cohorts. Each MDD definition was evaluated using several metrics: the successful identification of variants reaching genome-wide significance in GWAS meta-analysis, an increased SNP-based heritability estimate, identification of significant genetic correlations with other traits using LD score regression, and increased variance explained by polygenic profile scores for MDD derived from three independent cohorts.
For all analyses, MDD, recurrent MDD and MDD in females returned similar results: overlapping SNP-based heritability estimates; genetic correlations with consistent magnitude of effect, direction, and significance with six health-related traits; low trait variance explained (<1%) and overlapping effect size estimates in polygenic profiling analysis, and no genome-wide significant findings from Fig. 3 Genetic correlation (rG) between meta-analyzed MDD subsets and other health-related traits, derived using GWAS summary statistics and LD score regression. Traits presented showed a significant rG with MDD subsets after multiple testing correction (FDR p ≤ 0.05) and are coloured by category (personality, psychiatric, reproductive and autoimmune). No rG between mMDD and other health-related traits survived multiple testing correction  Tables 15-18 GWAS meta-analysis (akin to similarly sized published GWAS of these phenotypes 2,9 ).
With the exception of polygenic profiling analysis, MDD in males generally did not conform to the pattern of results demonstrated by other MDD definitions. Some of these differences, such near-zero SNP-based heritability estimate and subsequently no genetic correlations with other traits surviving multiple testing correction, can be attributed to reduced statistical power in this MDD definition. Interestingly, genome-wide meta-analysis yielded a single genome-wide significant locus in depression in males. The locus at chromosome 3p22.3 includes TMPPE, CRTAP, and GLB1 genes; all of which were significant in gene-based testing. Conditional GWAS on the lead signal demonstrated that the signal which spanned these three genes was due to high LD in the region. However, functional genomic analysis of the lead SNP returned eQTL evidence for GLB1 and CRTAP, suggesting that the causal variant is more likely to affect the expression of these genes rather than TMPPE. The lack of replication of this signal in other published GWAS is unfortunate, but unsurprising given that the current study is the largest GWAS of depression in males to date (relative to published GWAS). Ever-increasing sample sizes from international consortia will provide muchneeded larger replication datasets for corroborating or dispelling this finding. The lack of replication also highlights the importance of moving towards linking results at the functional level.
There are several limitations to this study. Firstly, the sample size for MDD and rMDD groups are very similar (n = 43,062 and n = 39,556, respectively), therefore perhaps rMDD is not the best stratifier in this sample. The sample size of all depressed cases could have been increased by including individuals with mild depressive/ manic symptoms (n = 26,847), however as case classification was based on very few items (two symptoms and help-seeking behavior), it wasn't possible to determine whether mild symptoms should be classified as cases or controls 17 , therefore including these individuals could introduce further phenotypic heterogeneity. Whilst the differential model selection in GS:SFHS and UKB used to adequately account for differential family and population structure introduces analytical heterogeneity, the genetic correlation of MDD GWAS summary statistics from the two samples was rG(SE) = 0.997(0.26). Lastly, the higher prevalence of females in GS:SFHS caused a gender imbalance in the sample sizes, resulting in lower statistical power for the male only analysis.
Overall our results suggest that there was little benefit to stratifying depression by either sex or recurrence for currently available data sizes. Extreme differences between sexes, such as opposite directions of effect in the two sexes, would have to exist to necessitate their analysis separately. The power implications of stratifying on these traits is likely to out-weigh the identification of such loci. In situations where the effect of a SNP is only found in one sex and zero effect in the other, such as rs4778037 in this study, it is still better to analyze sexes together to reduce the multiple testing burden of separate analyses. The increased trait variance explained demonstrated by using the largest available training and discovery datasets (PGC MDD29 and UKB MDD, respectively) in polygenic profiling supports increasing the sample size over phenotypic refinement. Similarly, the lack of discernible difference between h 2 SNP and rG estimates between depression subtypes suggests that the best approach, currently, is to maximize the sample size in order to reduce sampling error and obtain more accurate point estimates. However, addressing recurrence, sex and ancestral heterogeneity in a large ascertained cohort does have intrinsic merit, as demonstrated previously by CONVERGE 6 , with the implication that addressing several sources of heterogeneity has more utility than implementing recurrence and sex separately 6 .
Phenotypic stratification still has plenty of scope for aiding the tractability of genetic analysis in depression. There are many traits which could be used to create subgroups, including treatment response and physical comorbidities, and perhaps these will be more successful than sex and recurrence. However, it is worth noting that due to the limitations of statistical power with current sample sizes, the performance of the stratified phenotypes presented here is a lower bound of the stratification strategy. Recently, age-at onset 9 and use of polygenic risk scores derived from health-related traits 73 have been shown to result in subsets of depression with improved heritability. Our study suggests that, until a better understanding of the determinants of genetic heterogeneity in depression exist-increasing sample number remains the optimal strategy.