We report a genome-wide association study (GWAS) of major depressive disorder (MDD) in 1221 cases from the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) study and 1636 screened controls. No genome-wide evidence for association was detected. We also carried out a meta-analysis of three European-ancestry MDD GWAS data sets: STAR*D, Genetics of Recurrent Early-onset Depression and the publicly available Genetic Association Information Network–MDD data set. These data sets, totaling 3957 cases and 3428 controls, were genotyped using four different platforms (Affymetrix 6.0, 5.0 and 500 K, and Perlegen). For each of 2.4 million HapMap II single-nucleotide polymorphisms (SNPs), using genotyped data where available and imputed data otherwise, single-SNP association tests were carried out in each sample with correction for ancestry-informative principal components. The strongest evidence for association in the meta-analysis was observed for intronic SNPs in ATP6V1B2 (P=6.78 × 10−7), SP4 (P=7.68 × 10−7) and GRM7 (P=1.11 × 10−6). Additional exploratory analyses were carried out for a narrower phenotype (recurrent MDD with onset before age 31, N=2191 cases), and separately for males and females. Several of the best findings were supported primarily by evidence from narrow cases or from either males or females. On the basis of previous biological evidence, we consider GRM7 a strong MDD candidate gene. Larger samples will be required to determine whether any common SNPs are significantly associated with MDD.
Major depressive disorder (MDD) is the leading cause of disability for adults under 45 years of age,1 and has a lifetime prevalence of 12–20%.2 Twin studies suggest a heritability of approximately 40% (perhaps higher in clinical samples), with a two- to threefold increased risk to first-degree relatives of MDD probands.3 There are no established neurobiological mechanisms or definitive genetic associations. In this study, we report on a new genome-wide association study (GWAS) of MDD in the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) sample, and on a meta-analysis of STAR*D and two other data sets: the Genetics of Recurrent Early-onset Depression (GenRED) GWAS reported in a companion article;4 and Genetic Association Information Network–MDD (GAIN–MDD), a data set that was analyzed in the first MDD GWAS report5 and that has been made available to scientists through the database of Genotypes and Phenotypes repository (dbGaP).6
The new GWAS sample includes 1221 cases from STAR*D, a multi-center, National Institute of Mental Health (NIMH)-sponsored antidepressant clinical trial.7, 8 The GenRED GWAS4 included 1020 cases, with 1636 controls from the Molecular Genetics of Schizophrenia (MGS) study9 (excluding controls who reported any history of MDD). The STAR*D analysis uses the same control data, and our meta-analysis corrects for that overlap. We accessed the GAIN–MDD data set and carried out a new analysis (for methodological consistency) of 1716 cases and 1792 controls, slightly smaller than the published sample5 but with very similar results.
Genome-wide association study methods evaluate the contribution of common single-nucleotide polymorphisms (SNPs) to common diseases. They have identified robust associations to many non-psychiatric disorders10 and to bipolar disorder,11 schizophrenia12, 13, 14 and autism.15 No genome-wide significant findings were reported for GAIN–MDD5 or GenRED,4 or for a GWAS (not included in this meta-analysis) of 1514 recurrent MDD cases and 2052 controls (without lifetime depressive or anxiety disorders) from a German clinical sample and a Swiss population-based sample.16 This is not surprising, as most GWAS findings have emerged when multiple data sets were combined to achieve large sample sizes (often 10 000–20 000 cases plus controls) with power to detect variants that produce small increases in risk.10 We have reported separate GenRED and STAR*D analyses, because their distinctive characteristics could prove relevant to interpreting results across studies in the future, but to achieve a larger sample size we also report a meta-analysis of STAR*D, GenRED and GAIN–MDD data.
Materials and methods
Cases were participants in STAR*D. Individuals (ages 18–75) were enrolled from primary care or psychiatric outpatient clinics if they had a diagnosis of MDD (by clinician rating of Diagnostic and Statistical Manual of Mental Disorders, 4th edition (DSM-IV) criteria) and a current 17-item Hamilton Depression Rating Scale (HAM-D) score of at least 14 by independent raters7, 8 (although that score did not capture the severity of past depression). Of 1953 participants who donated DNA, we selected the 1500 who self-identified as ‘white’ as they represented most of the sample and European-ancestry controls were available. After quality control (QC) procedures (described below), 1221 cases were available for analyses. All subjects signed informed consent for genetic studies. Work described here was approved by the institutional review board of the University of California, San Francisco.
Controls were the same as those used in the GenRED GWAS analysis.4 Details are described elsewhere.9, 13 They were recruited for MGS by a survey research company (Knowledge Networks, Inc., Menlo Park, CA, USA) from a nationally representative internet-based panel that was selected by random digit dialing. Participants had completed an online version of the Composite International Diagnostic Interview-Short Form17 for lifetime history of common mood, anxiety and substance use disorders. They consented to anonymization and deposition of their DNA and clinical information in the NIMH repository for use in any medical research. The 1636 European-ancestry controls used here had no lifetime history of MDD (or of recurrent depression missing MDD by one criterion) by Composite International Diagnostic Interview-Short Form criteria (which over-diagnose MDD18). The MGS collaboration gave permission for us to use genotypes for the part of the control sample that is still under a dbGaP publication embargo. Clinical and demographic characteristics are summarized in Table 1.
GenRED cases (N=1020) were recruited from multiple clinical settings and media and internet announcements and advertisements. Cases were assessed with the Diagnostic Interview for Genetic Studies19 (version 3; http://nimhgenetics.org) and consensus best estimate diagnoses were assigned by review of the Diagnostic Interview for Genetic Studies, informant report and available psychiatric records.4 Probands had recurrence (two or more episodes, or one episode lasting at least 3 years), onset before age 31, and recurrent MDD in a sibling or parent with onset before age 41 (but no suspected bipolar-I disorder in a sibling or parent), features that predict greater familial liability to MDD.3, 20, 21 The GenRED GWAS used the same MGS controls as STAR*D (see above). GAIN–MDD recruited individuals from a twin registry and two population-based samples in the Netherlands, selecting cases who received MDD diagnoses based on a Composite International Diagnostic Interview (CIDI), and controls without MDD and without high neuroticism scores.5 Each study excluded bipolar disorder, schizophrenia or schizoaffective disorder and more severe substance use disorders, with minor differences in exclusion criteria.
For meta-analysis, we defined two phenotypic models: Broad (all 3957 MDD cases from the three samples, vs 3428 controls), and Narrow (2191 cases with onset before age 31 and recurrence, including GenRED chronic cases). We did not require positive family history because STAR*D and GAIN–MDD assessed this by proband response to a single question. Exploratory separate analyses of males and females were carried out for each phenotype, because females are at a twofold increased risk, and twin studies suggest partial independence of genetic risk factors for females and males.22, 23 Characteristics of the three samples are summarized in Table 2.
Genotypic data were managed and analyzed using PLINK v1.04–1.06, except for imputation analyses and analysis of imputed data as described below and in Supplementary Methods.24 STAR*D results were compiled and visualized using WGAViewer v1.25T-Z25 and HaploView v4.1.26
Genotyping was conducted for 754 cases by Affymetrix, Inc. (South San Francisco, CA, USA), with the Affymetrix GeneChip Human Mapping 500 K Array Set and genotypes called with the Bayesian Robust Linear Model with Mahalanobis distance classifier.27 We genotyped the remaining 746 cases with the Affymetrix Genome-Wide Human SNP 5.0 Array and called genotypes with the updated Bayesian Robust Linear Model with Mahalanobis-P algorithm. There were 500 568 SNPs that were assayed by both arrays.
The GenRED cases and MGS controls were genotyped at the Broad Institute on the Affymetrix Genome-Wide Human SNP 6.0 Array, and genotypes were called with Birdseed version 2.4, 13 The GAIN–MDD sample was genotyped with the Perlegen platform.5
Quality control analyses
STAR*D: DNA samples were genotyped on three related platforms: cases on Affymetrix 500 K and 5.0, and controls on Affymetrix 6.0, resulting in 382 598 SNPs that were assayed on all three platforms and that passed QC for the MGS/GenRED controls. To ensure consistency of results, we then excluded SNPs for all samples in the STAR*D analysis based on cross-platform data as follows:
Using data for 806 controls genotyped on Affy 6.0 and 500 K,28 61 440 SNPs were excluded for which more than 1% of samples had discordant calls (>8 for autosomal SNPs, >7 for chromosome X);
Using 12 cases genotyped with Affy 500 K and 5.0, 4049 SNPs had one or more discordant calls and were excluded;
We also examined data for 12 controls genotyped by us with Affy 5.0 and 6.0, but found no additional SNPs (not already excluded) with one or more discordancies.
Single-nucleotide polymorphisms were also excluded for deviation from Hardy–Weinberg equilibrium in controls at a P<1 × 10−6, SNP call rate <98% in either cases or controls, a 2% or greater difference in call rate between cases and controls, or minor allele frequency <0.05. After all QC there were 260 474 SNPs available for analysis that captured an estimated 52.2% of common variation at an r2 threshold of 0.8 and 66.3% at a threshold of 0.5 (that better reflects the power of a GWAS29). Total genotyping rates in the final post-QC data sets were 99.8 and 99.9% for autosomal and X SNPs, respectively.
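The SNP-level exclusion thresholds listed above can be written as a single predicate. The following is a minimal sketch, not the PLINK workflow used in the study: the `SnpStats` record and its field names are hypothetical, and the per-SNP summary statistics are assumed to have been computed upstream.

```python
from dataclasses import dataclass


@dataclass
class SnpStats:
    """Hypothetical per-SNP summary statistics (field names are illustrative)."""
    hwe_p_controls: float      # Hardy-Weinberg P-value in controls
    call_rate_cases: float     # fraction of cases with a called genotype
    call_rate_controls: float  # fraction of controls with a called genotype
    maf: float                 # minor allele frequency


def passes_snp_qc(s: SnpStats) -> bool:
    """Apply the four thresholds stated in the text."""
    if s.hwe_p_controls < 1e-6:
        return False  # Hardy-Weinberg deviation in controls
    if min(s.call_rate_cases, s.call_rate_controls) < 0.98:
        return False  # call rate below 98% in either group
    if abs(s.call_rate_cases - s.call_rate_controls) >= 0.02:
        return False  # 2% or greater case/control call-rate difference
    if s.maf < 0.05:
        return False  # minor allele frequency below 0.05
    return True
```

The cross-platform concordance exclusions are not covered here, because they depend on duplicate genotyping data rather than on per-SNP summaries.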
GenRED and GAIN–MDD: SNP QC for the GenRED sample is described in the companion paper4 and Supplementary Methods. We carried out new QC analyses of the GAIN–MDD data set (Supplementary Methods), to ensure consistency across the data sets and because final post-QC data were not available from dbGaP. We included 434 312 SNPs (vs 435 291 in the published GWAS report5).
Cluster plots of genotype intensity data were visually examined for all top results discussed below for STAR*D or the meta-analysis, including genotyped SNPs or (for the meta-analysis) those critical for the imputation of ungenotyped SNPs that produced strong signals.
Table 2 summarizes the numbers of SNPs available for each data set for meta-analysis.
Cases were initially evaluated with PLINK24 using a subset of approximately 85 000 SNPs. Pairwise estimates of identity-by-descent detected three unexpected duplicates and 21 cryptic relatives (estimated kinship 0.1); for each pair the sample with the lower call rate was excluded. Four additional cases were removed for unusual degrees of SNP heterozygosity. To evaluate ancestry differences, multidimensional scaling vectors were computed and plotted, and 230 outliers to the main European-ancestry cluster were removed; most self-identified Hispanics were excluded, but 24 had scores within the main European cluster and were retained. We also removed cases with ambiguous gender (N=20), or call rate <97% (N=1 for autosomal and 11 for chromosome X analyses), leaving 1221 cases for autosomal analyses and 1211 for chromosome X. QC procedures for the 1636 controls have been described in the companion paper4 and in Supplementary Methods; briefly, samples were excluded for genotyping call rate <97%; inconsistency between reported and genotypic gender; outlier values for mean heterozygosity across genotypes; outliers in the distributions of principal component scores for ancestry; outliers in the number of other subjects with which kinship was estimated at >10%; and cryptic relatives (retaining the sample with the best call rate).
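As a consistency check, the case exclusions can be tallied from the 1500 genotyped cases. This sketch uses only counts stated in the text. The variable names are illustrative, and it assumes that each excluded case is counted under exactly one criterion.

```python
genotyped_cases = 1500

# Exclusions applied before the call-rate filter (counts from the text).
exclusions = {
    "unexpected duplicates": 3,
    "cryptic relatives": 21,
    "unusual heterozygosity": 4,
    "ancestry outliers": 230,
    "ambiguous gender": 20,
}
before_call_rate_filter = genotyped_cases - sum(exclusions.values())

# Call rate < 97%: 1 case for autosomal analyses, 11 for chromosome X.
autosomal_cases = before_call_rate_filter - 1
chrx_cases = before_call_rate_filter - 11

print(autosomal_cases, chrx_cases)
```

Under that assumption the tallies reproduce the reported 1221 autosomal and 1211 chromosome X cases.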
Quality control procedures for GenRED and GAIN–MDD (similar to methods described above for controls) are described in Supplementary Methods. For GAIN–MDD, we excluded slightly more ancestry outliers based on principal component scores. Genomic control λ values are shown in Table 3 for each analysis. Quantile-quantile plots are shown in Supplementary Figures S8–11.
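For readers unfamiliar with the genomic control λ, it is conventionally computed as the median observed 1-df chi-square statistic divided by the median expected under the null (about 0.455). The sketch below follows that convention using only the Python standard library. It is an illustration and not necessarily the exact computation used for Table 3, and the function name is hypothetical.

```python
from statistics import NormalDist, median

_NORMAL = NormalDist()

# Median of a 1-df chi-square distribution: the square of the normal
# quantile at 0.75 (about 0.4549).
CHI2_1DF_MEDIAN = _NORMAL.inv_cdf(0.75) ** 2


def genomic_control_lambda(p_values):
    """Median observed 1-df chi-square divided by its null expectation.

    P-values must lie in (0, 1]. Each two-sided P-value is converted to a
    chi-square statistic via the lower-tail normal quantile.
    """
    chi2 = [_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN
```

Values close to 1 indicate little inflation of test statistics from population stratification or technical artifacts.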
To obtain consistent ancestry-informative covariates, we carried out a final principal components analysis30 of all subjects, using the 82 361 autosomal SNPs common to the three data sets. Subjects who were outliers to the distributions of the two largest components were excluded (no additional STAR*D cases had to be excluded beyond those noted above), and the first 10 principal component (PC) scores were entered into the analyses as covariates to correct for population substructure.
Imputation of data for non-genotyped single-nucleotide polymorphisms
For the meta-analysis, to create genotypic data for the same SNPs for all data sets, we imputed data for each sample for HapMap II SNPs that were not genotyped in that sample, using MACH 1.0.31 (autosomal SNPs) or IMPUTE32 (X chromosome). For each data set, imputation was based on SNPs that passed QC for both cases and controls. MACH and IMPUTE are two of several available methods with similar accuracy.33 Using a Hidden Markov Model algorithm with phased Centre d'Etude du Polymorphisme Humain from Utah (CEU) HapMap haplotypes as training data, a non-integer ‘allele dosage’ is assigned to each individual for each SNP based on weighted probabilities of possible genotypes. For each SNP, an r2 value estimates concordance with actual genotypes (and thus the predicted concordance with the association tests they would produce). A low r2 predicts greater variance in the concordance of genotypes and of test statistics. This uncertainty is taken into account in the meta-analysis procedure. SNPs were excluded from analysis if minor allele frequency was <1% in any data set or if imputation r2 was <0.3. This threshold was used in four previous large meta-analyses because it removed most poorly imputed SNPs but few well-imputed SNPs.34, 35, 36, 37 The meta-analysis included 2 391 203 SNPs (2 339 408 autosomal and 51 795 X chromosome SNPs).
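The ‘allele dosage’ described above is commonly expressed as the expected count of one allele given the posterior genotype probabilities. The sketch below illustrates that definition and the stated exclusion thresholds. The function names are hypothetical, this is not the MACH or IMPUTE interface, and applying the r2 threshold separately in each data set is an assumption, because the text does not say how r2 was combined across samples.

```python
def allele_dosage(p_het: float, p_hom_alt: float) -> float:
    """Expected count (0 to 2) of the counted allele.

    p_het is the probability of the heterozygous genotype and p_hom_alt
    the probability of carrying two copies of the counted allele.
    """
    return p_het + 2.0 * p_hom_alt


def keep_for_meta_analysis(maf_by_dataset, r2_by_dataset) -> bool:
    """Thresholds from the text: MAF >= 1% in every data set and r2 >= 0.3.

    Assumption: r2 is checked per data set (r2 = 1 where the SNP was
    genotyped rather than imputed).
    """
    return min(maf_by_dataset) >= 0.01 and min(r2_by_dataset) >= 0.3


# SNP counts reported in the text: autosomal plus X chromosome.
total_snps = 2_339_408 + 51_795
```

For example, genotype probabilities of 0.1, 0.3 and 0.6 for zero, one and two copies give a dosage of 1.5.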
Analysis of genetic association
For each data set, separate association analyses were carried out for Broad and Narrow phenotypes (all GenRED cases were Narrow) for all subjects and then for males and for females separately. The a priori primary analyses (for STAR*D and for the meta-analysis) considered the Broad phenotype for all subjects. For STAR*D, the primary analysis was limited to genotyped SNPs; for the meta-analysis it included genotyped plus imputed SNPs.
For each analysis, single-SNP tests were carried out for each data set by logistic regression for genotyped and imputed SNPs. For discrete genotypes without covariates, logistic regression is asymptotically equivalent to a trend test for additive effects, while permitting covariates. We used custom software to implement the same logistic regression approach for imputed non-integer genotype ‘dosages’. Covariates included the first 10 ancestry-informative PCs, plus an indicator for sex for X chromosome SNPs. Combined analysis (‘mega-analysis’) of genotypes was not straightforward because of the overlapping STAR*D/GenRED controls, with different numbers of genotyped SNPs for the two case groups. We could have assigned unique subsets of controls to GenRED and STAR*D, but some power is lost when imputation information content is much lower in one sample (see Supplementary Methods). Therefore, we used a meta-analysis procedure as described in Supplementary Methods. Briefly, for each SNP, the procedure weights the Z-score for each data set by the case and control sample sizes and imputation r2 values (r2=1 for genotyped SNPs), while correcting for the shared controls between STAR*D and GenRED. Combined odds ratios were obtained with a similar procedure. This method takes into account the direction of association in the data sets (that is, which allele is associated), assuming that the same allele should be associated in samples with closely related ancestries. This increases power compared with the classical procedure, which ignores direction. For the primary analysis, P<5 × 10−8 was considered the 5% genome-wide significance threshold.38, 39, 40
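The exact procedure is given in Supplementary Methods and is not reproduced here. As an illustration of the general idea, the sketch below combines signed Z-scores with weights and divides by the standard deviation implied by a null correlation matrix, so that studies sharing controls are not treated as independent. The weight formula (square root of effective sample size times imputation r2) and the correlation formula for fully shared controls are assumptions made for this example, and all function names are hypothetical.

```python
import math


def study_weight(n_cases, n_controls, r2=1.0):
    """Hypothetical weight: sqrt(effective sample size x imputation r2)."""
    n_eff = 4.0 / (1.0 / n_cases + 1.0 / n_controls)
    return math.sqrt(n_eff * r2)


def shared_control_correlation(cases_a, cases_b, shared_controls):
    """Approximate null correlation of two Z-scores that share all controls.

    Assumption: both studies use exactly the same control set, in which
    case the correlation is sqrt(cases_a * cases_b / (n_a * n_b)).
    """
    n_a = cases_a + shared_controls
    n_b = cases_b + shared_controls
    return math.sqrt(cases_a * cases_b / (n_a * n_b))


def combined_z(z_scores, weights, corr):
    """Weighted Z: sum(w_i * z_i) / sqrt(w' R w), with R the null correlation.

    Z-scores are signed with respect to the same allele in every data set,
    so opposite directions of effect cancel instead of adding.
    """
    k = len(weights)
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    variance = sum(
        weights[i] * weights[j] * corr[i][j]
        for i in range(k)
        for j in range(k)
    )
    return numerator / math.sqrt(variance)


# STAR*D (1221 cases) and GenRED (1020 cases) share 1636 controls.
r_shared = shared_control_correlation(1221, 1020, 1636)
```

With these sample sizes the assumed correlation is about 0.41, so ignoring the overlap would overstate the combined evidence whenever the two case groups point in the same direction.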
We also examined STAR*D and meta-analysis results for SNPs within 50 kb of 41 earlier noted MDD candidate genes. For the meta-analysis, we used a permutation-based procedure to determine whether the distribution of P-values observed for these SNPs deviated from chance expectation (see Supplementary Methods for details).
Power analysis methods are described on page S-19 and results shown in Supplementary Tables S3 and S4 and Figure S13. Power was computed for a genome-wide significance threshold of P<5 × 10−8 and additive inheritance. For the primary STAR*D analysis, there was 80% power to detect an allele with a genotypic relative risk of 1.70, 1.50 and 1.43 for allele frequencies of 0.1, 0.2 and 0.3, respectively; and for the primary meta-analysis, power was approximately 50% for an allele with genotypic relative risks of 1.19 or 1.16 for allele frequencies of 20 or 50%, and was approximately 80% with genotypic relative risk of 1.20 and frequency of 30%.
Genotypic and clinical data are available to qualified scientists through controlled-access repository programs: the NIMH repository program (http://nimhgenetics.org) for the GenRED and STAR*D case samples; dbGaP for the MGS control sample and the GAIN–MDD sample.
The distribution of P-values is similar to chance expectation (Figure 1), with a genomic control λ value of 1.022. Figure 1 also summarizes association findings by chromosomal location. The top 25 findings are listed in Table 4, and all results with P<0.001 in any analysis are provided online in stard_supplementary_data.txt. There were no genome-wide significant findings. Our top finding (rs12462886, P=1.73 × 10−6) is located in a gene desert in 19q12. Brain-expressed genes tagged by the top 100 SNPs include: LPHN2, SRD5A2, DYSF, RPRM, CCDC109B, CTNND2, MSR1, SLC18A1, ANKRD46, CSMD3, SLC5A12, MARK2, RCOR2, KCTD14, SYN3, NLGN4X and FGF13. None of the genes had strong signals in more than one linkage disequilibrium block, but in several instances there were clusters of SNPs with strong signals within a linkage disequilibrium block, which is evidence against genotyping error. For sex-specific analyses, signals (among the top 100 for either sex) in genes of known neurobiological function or expressed in brain include: in males, SNPs in CTNND2, GRIA1, SLC18A1, PLEKHA7, ERBB2IP, KIFAP3, CLTCL1, THRB and SYN3; and in females, SNPs in CSMD3, CACNA2D4, SV2B and NRXN3.
Results for SNPs in 41 previous MDD candidate genes are shown in Supplementary Table S7. The best finding was for rs3788477, a SNP intronic to SYN3 (P=1.64 × 10−4). No other SNP in this analysis achieved P<10−3.
No genome-wide significant result was observed. Figure 2 illustrates results for all genotyped and imputed SNPs. Table 5 (Broad) and Table 6 (Narrow) summarize results for all regions with at least one SNP with P<10−5. Results for SNPs with P<10−3 in any analysis are provided in online files meta-analysis_broad_supplementary_data.txt and meta-analysis_narrow_supplementary_data.txt. The Annotation columns of Tables 5 and 6 provide information regarding the closest gene (within 250 kb) or other functional elements annotated in the UCSC browser (full gene names and summaries of known functions are provided in Supplementary Results). For all regions with no genes or elements listed, peaks of high homology with known regulatory sequences were detected by the evolutionary and sequence pattern extraction through reduced representations method for estimating regulatory potential.41
There are annotated reports of copy number variants in some of these regions, but none were detected in a survey of HapMap data,42 and Birdsuite42 (Birdseye module) copy number variant analysis of the GenRED data set showed that no SNP listed in Tables 5 and 6 was spanned by a copy number variant in more than a few subjects.
Figure 3 illustrates annotation information and P-values for all SNPs in the three best-supported gene-containing regions (8p21.2/ATP6V1B2, 3p26.1/GRM7 and 7p15.3/SP4).
Results of the analyses of SNPs in or near 41 MDD candidate genes are summarized in Supplementary Table S8 and online file candidate_gene_results.xls. The aggregate analysis did not support the hypothesis of an excess of low P-values among these SNPs.
The GWAS of STAR*D for the MDD phenotype (1221 cases and 1636 controls) did not produce genome-wide significant findings. Several regions with modest levels of significance in STAR*D were more strongly supported in the meta-analysis, including SLC18A1, ATP6V1B2 and PLEKHA7 for the Broad phenotype and SYN3 for the Narrow phenotype. As genotypes were assayed on three different platforms, stringent QC measures were required to avoid spurious findings. The very low genomic control inflation factor (λ) suggests that these measures succeeded, but they also reduced the number of SNPs (260 474) available for analysis.
In the meta-analysis of 3957 cases (2191 with a Narrow phenotype) and 3428 controls, genome-wide significant evidence for association to MDD was not observed for 2 391 203 genotyped or imputed HapMap II SNPs, suggesting that if any common SNPs are associated with MDD, their individual genotypic relative risks are likely to be small. Such associations could be detected in future, larger GWAS meta-analyses, a strategy that has succeeded for many other common diseases.43, 10 In samples of one or a few thousand cases, many such loci will produce unimpressive results, but the regions with the strongest evidence for association are statistically most likely to be true associations. We discuss here the three genes in which P-values of approximately 10−6 or lower were observed in the primary meta-analysis: ATP6V1B2, SP4 and GRM7.
ATP6V1B2 encodes a subunit for a vacuolar proton pump ATPase. H+-ATPases consist of three A, three B and two G domains. In a bipolar disorder GWAS,28 a P-value of 3.32 × 10−5 was observed in ATP6V1G1, encoding the G subunit of the same cytosolic V1 domain to which ATP6V1B2 contributes and which forms a complex with the transmembrane V0 domain for organelle acidification, critical to some forms of receptor-mediated endocytosis and generation of proton gradients across synaptic vesicle membranes. Modest association to bipolar disorder was also reported in an adjacent gene, SLC18A1 (previously VMAT1), which transports monoamines into synaptic vesicles.44 Our signal lies in a distinct linkage disequilibrium block within ATP6V1B2, but SLC18A1 could conceivably have regulatory sequences in this upstream region.
SP4 encodes the brain-specific Sp4 zinc-finger transcription factor.45 In several small samples, modest association to bipolar disorder was observed for SNPs in an Sp4 binding site in the promoter of ADRBK2 (beta adrenergic receptor kinase 2; earlier G-protein receptor kinase 3)46 as well as in SP4 itself.47 SP4 mutant mice showed decreased granule cell density in the hippocampal dentate gyrus,48 deficits in sensorimotor gating and contextual learning,49 and infertility in surviving male knockout mice despite histologically intact testes and mature sperm, suggesting a possible behavioral deficit.50 In our data, association is observed primarily in females; it may be noteworthy that Sp4 forms gene-regulating complexes with estrogen receptors.51 Sp4 may also have a role in glutamate-induced neurotoxicity.52, 53
GRM7 encodes metabotropic glutamate receptor 7, which may be involved in mood regulation.54, 55 Chronic treatment with mood stabilizers (lithium or valproate) decreased a hippocampal micro-RNA, increasing GRM7 expression.56 A metabotropic glutamate receptor 7 agonist (AMN082) had antidepressant-like effects in mice that were blocked by knockout of GRM7,57 and chronic antidepressant treatment with citalopram in rodents decreased metabotropic glutamate receptor 7 immunoreactivity in hippocampus and frontal cortex.58 This is the third GWAS to report evidence of association to mood disorders in this long gene (880 kb). Our lowest P-value (1.11 × 10−6) was at 7.5 Mb (3p26.1), with P-values less than 10−4 extending to 7.56 Mb. In the German/Swiss recurrent MDD GWAS,16 the lowest P-value (0.0001) was at 7.68 Mb, with P-values around 0.01 overlapping our signals. In the Wellcome Trust Case-Control Consortium bipolar disorder GWAS,59 the best P-value in GRM7 (0.0001 in a genotypic analysis) was at 7.63 Mb. Larger samples will be required to determine the significance of these findings, but the biological evidence suggests that GRM7 merits further investigation.
The most strongly associated non-genic regions contain multiple peaks of high regulatory potential, but no known regulatory elements. Strong associations in non-genic regions should not be ignored; for example, several cancers are strongly associated with non-genic SNPs on chromosome 8q24,60 whose functional relevance is now under intensive study. In our secondary analyses, very low P-values were observed in non-genic regions (3q26.32 in females, Broad phenotype, P=3.85 × 10−8; 3p14.1 in males, Narrow phenotype, P=3.81 × 10−8). These values are not significant after accounting for multiple testing, and on 3q26.32 there is no support from other SNPs in the region (Figure S16).
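To see why P-values just below 5 × 10−8 do not reach significance in the secondary analyses, consider a simple Bonferroni adjustment. This is only an illustration: the text does not state which correction was applied, and the count of six analyses (Broad and Narrow, each for all subjects, males and females) is an assumption based on the analysis design described in the methods. Because these analyses are correlated, Bonferroni is conservative.

```python
GENOME_WIDE_ALPHA = 5e-8
N_ANALYSES = 6  # assumption: Broad/Narrow x (all subjects, males, females)

bonferroni_threshold = GENOME_WIDE_ALPHA / N_ANALYSES

# Secondary-analysis P-values reported in the text.
secondary_hits = {
    "3q26.32 (females, Broad)": 3.85e-8,
    "3p14.1 (males, Narrow)": 3.81e-8,
}

for region, p in secondary_hits.items():
    below_single = p < GENOME_WIDE_ALPHA
    below_adjusted = p < bonferroni_threshold
    print(f"{region}: single-analysis={below_single}, adjusted={below_adjusted}")
```

Both values fall below the single-analysis threshold but remain above the adjusted threshold of roughly 8.3 × 10−9.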
For the Narrow (recurrent early-onset) phenotype, the strongest signal was in chromosome 18q22.1. The SNP with the lowest P-value had low imputation r2 values, but two other nearby SNPs had P-values less than 10−5. This region has previously been of interest in linkage studies of both bipolar disorder and MDD (see discussion in the companion paper4), and given that support for this region varied widely across our three samples, one might wonder whether they differed with respect to bipolar features, but we lacked the relevant data to compare the data sets. GenRED provided the strongest support and also had the most specific procedures to exclude bipolar disorder in probands and relatives, although the severe, recurrent, early-onset phenotype more closely resembles bipolar disorder. The next strongest signals were in a non-genic region of 5p13.2, 220-kb upstream of GDNF (glial cell-derived neurotrophic factor); and in a cluster of histone genes on 6p22.1, in the same region in which significant association to schizophrenia was recently observed.12, 13, 14 The latter finding was detected in a meta-analysis that included MGS, using a superset of the GenRED/STAR*D controls. However, MGS contributed very little of the statistical support for 6p22.1 association to schizophrenia.
Our meta-analysis findings were generally not more strongly supported by the Narrow analysis, but that sample was also smaller (55% of cases). Narrow cases provided most of the support for such signals in the Broad analysis as ATP6V1B2, GRM7, SP4, PLEKHA7, ITPK1/C14orf109 and regions 10p11.23, 10q11.21, 6p23 and 2q22.1 (Tables 5 and 6 and Supplementary Files). Larger samples of cases with this phenotype might prove useful.
Several candidate genes were supported primarily in one gender such as SP4 (females) and PLEKHA7 (males). PLEKHA7, which encodes a poorly understood protein (pleckstrin homology domain containing, family A member 7), is associated with systolic blood pressure.61 Sex differences are likely to exist for genetic effects in MDD.
The strongest signal in the published GAIN–MDD GWAS was in PCLO (P=7.7 × 10−7),5 encoding Piccolo, a protein involved in cycling of synaptic vesicles including at monoaminergic synapses. The association was supported in only one of five follow-up data sets (that totaled 6079 cases and 5893 controls), and it (like GAIN–MDD) was population-based, suggesting possible phenotypic heterogeneity. P-values in PCLO were less significant in our meta-analyses (∼10−5) than in GAIN–MDD alone. Recurrent early-onset cases provided most of the evidence for association in GAIN–MDD, but the lowest P-value in the GenRED sample was 0.017. We have no independent data to test whether association is stronger in population-based samples.
In conclusion, a meta-analysis of three GWAS data sets did not detect genome-wide significant evidence for association to MDD. Of the best-supported genes and regions, GRM7 has the greatest previous biological support for involvement in processes such as mediation of response to antidepressant and antimanic drugs. It is likely that much larger samples will be required to clarify the role of common SNPs in genetic susceptibility to MDD. We are participating in the efforts of the Psychiatric GWAS Consortium10, 62 to carry out meta-analyses incorporating additional samples. Given the moderate heritability and clinical heterogeneity of MDD, larger samples with careful phenotypic characterization would be useful.
The STAR*D GWAS study acknowledges Shaun Purcell (Broad Institute) for technical assistance and Eric Jorgenson (UCSF) for helpful discussion. Genotyping of STAR*D was supported by an NIMH grant to SPH (MH072802), and made possible by the laboratory of Pui Kwok (UCSF) and the UCSF Institute for Human Genetics. This work was further supported by NIMH training funds to SIS (R25 MH060482 & T32 MH19126) and to HAG (F32 MH082562 & T32 MH19552); a NARSAD Young Investigators Award to HAG (A109584); the State of New York, which provided partial support to PJM for this work. The authors appreciate the efforts of the STAR*D Investigator Team for acquiring, compiling and sharing the STAR*D clinical data set. STAR*D was funded by the National Institute of Mental Health through a contract (N01MH90003) to the University of Texas Southwestern Medical Center at Dallas (A John Rush, principal investigator). The authors thank Stephen Wisniewski, PhD, Director, STAR*D Data Coordinating Center, University of Pittsburgh, for demographic data. The GenRED project is supported by grants from NIMH (see online Supplementary Acknowledgements). We acknowledge the contributions of Dr George S Zubenko and Dr Wendy N Zubenko, Department of Psychiatry, University of Pittsburgh School of Medicine, to the GenRED I project. The NIMH Cell Repository at Rutgers University and the NIMH Center for Collaborative Genetic Studies on Mental Disorders made essential contributions to this project. Genotyping was carried out by the Broad Institute Center for Genotyping and Analysis with support from grant U54 RR020278 (which partially subsidized the genotyping of the GenRED cases) from the National Center for Research Resources. The meta-analysis was supported by grants from NIMH and the National Cancer Institute, and by support from the State of New York. 
GWAS data for the GAIN–MDD data set were accessed by DFL through the Genetic Association Information Network (GAIN), through dbGaP accession number phs000020.v2.p1 (http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000020.v2.p1); samples and associated phenotype data for Major Depression: Stage 1 Genome-wide Association in Population-Based Samples were provided by P Sullivan. Data for Molecular Genetics of Schizophrenia (MGS) control subjects were used here by permission of the MGS project. Collection and quality control analyses of the control data set were supported by grants from NIMH and the National Alliance for Research on Schizophrenia and Depression. Genotyping of the controls was supported by grants from NIMH and by the Genetic Association Information Network (GAIN) (http://www.fnih.org/index.php?option=com_content&task=view&id=338&Itemid=454). Control data are available through dbGaP (http://www.ncbi.nlm.nih.gov/gap). We are grateful to Knowledge Networks, Inc. (Menlo Park, CA, USA) for assistance in collecting the control data set. The authors express their profound appreciation to the individuals who participated in this project, and to the many clinicians who facilitated the referral of participants to the study.
Supplementary Information accompanies the paper on the Molecular Psychiatry website (http://www.nature.com/mp)