Original Article

Molecular Psychiatry (2011) 16, 1117–1129; doi:10.1038/mp.2010.96; published online 14 September 2010

GWA study data mining and independent replication identify cardiomyopathy-associated 5 (CMYA5) as a risk gene for schizophrenia

X Chen1,2, G Lee1, B S Maher1, A H Fanous1,3,4, J Chen1, Z Zhao5, A Guo5, E van den Oord1,6, P F Sullivan7, J Shi8, D F Levinson8, P V Gejman9, A Sanders9, J Duan9, M J Owen10, N J Craddock10, M C O'Donovan10, J Blackman10, D Lewis10, G K Kirov10, W Qin11, S Schwab11, D Wildenauer11, K Chowdari12, V Nimgaonkar12, R E Straub13, D R Weinberger13, F A O'Neill14, D Walsh15, M Bronstein16, A Darvasi16, T Lencz17, A K Malhotra17, D Rujescu18, I Giegling18, T Werge19, T Hansen19, A Ingason19, M M Nöethen20, M Rietschel21,22, S Cichon20,23, S Djurovic24, O A Andreassen24, R M Cantor25, R Ophoff25,26, A Corvin27, D W Morris27, M Gill27, C N Pato28, M T Pato28, A Macedo29, H M D Gurling30, A McQuillin30, J Pimm30, C Hultman31, P Lichtenstein31, P Sklar32,33, S M Purcell32,33, E Scolnick32,33, D St Clair34, D H R Blackwood35 and K S Kendler1,2 and the GROUP investigators, the International Schizophrenia Consortium36

  1. Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA, USA
  2. Department of Molecular and Human Genetics, Virginia Commonwealth University, Richmond, VA, USA
  3. Washington VA Medical Center, Washington, DC, USA
  4. Department of Psychiatry, Georgetown University School of Medicine, Washington, DC, USA
  5. Departments of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA
  6. Department of Pharmacy, Virginia Commonwealth University, Richmond, VA, USA
  7. Departments of Genetics, Psychiatry, and Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  8. Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
  9. Center for Psychiatric Genetics, NorthShore University HealthSystem Research Institute, Evanston, IL, USA
  10. Department of Psychological Medicine, MRC Centre for Neuropsychiatric Genetics and Genomics, School of Medicine, Cardiff University, Cardiff, UK
  11. School of Psychiatry, University of Western Australia and Western Australian Institute for Medical Research, Perth, WA, Australia
  12. Department of Psychiatry, University of Pittsburgh School of Medicine and Graduate School of Public Health, Pittsburgh, PA, USA
  13. Genes Cognition and Psychosis Program, Clinical Brain Disorders Branch, National Institute of Mental Health, National Institutes of Health, Bethesda, MD, USA
  14. The Department of Psychiatry, The Queens University, Belfast, Northern Ireland, UK
  15. The Health Research Board, Dublin, Ireland
  16. Department of Genetics, The Hebrew University of Jerusalem, Jerusalem, Israel
  17. Zucker Hillside Hospital, Psychiatry Research, North Shore-Long Island Jewish Health System, Glen Oaks, NY, USA
  18. Division of Molecular and Clinical Neurobiology, Ludwig-Maximilians-University, Munich, Germany
  19. Institute of Psychiatry, Mental Health Center Sct. Hans, Copenhagen University Hospital, Roskilde, Denmark
  20. Department of Genomics, Life & Brain Center, Institute of Human Genetics, University of Bonn, Bonn, Germany
  21. Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health Mannheim, University of Heidelberg, Mannheim, Germany
  22. Department of Psychiatry, University of Bonn, Bonn, Germany
  23. Institute of Neuroscience and Medicine (INM-1), Research Center Juelich, Juelich, Germany
  24. Institute of Psychiatry, University of Oslo and Oslo University Hospital, Oslo, Norway
  25. Department of Human Genetics, UCLA Center for Neurobehavioral Genetics, Los Angeles, CA, USA
  26. Department of Medical Genetics, University Medical Center Utrecht, CG Utrecht, The Netherlands
  27. Department of Psychiatry and Institute of Molecular Medicine, Neuropsychiatric Genetics Research Group, Trinity College Dublin, Dublin, Ireland
  28. Center for Genomic Psychiatry, University of Southern California, Los Angeles, CA, USA
  29. Department of Psychiatry, University of Coimbra, Coimbra, Portugal
  30. Research Department of Mental Health Sciences, Molecular Psychiatry laboratory, Windeyer Institute of Medical Sciences, University College London Medical School, London, UK
  31. Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
  32. Psychiatric and Neurodevelopmental Genetics Unit, Boston, MA, USA
  33. Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA, USA
  34. Institute of Medical Sciences, University of Aberdeen, Aberdeen, UK
  35. Division of Psychiatry, University of Edinburgh, Royal Edinburgh Hospital, Edinburgh, UK

Correspondence: Dr X Chen, Department of Psychiatry, Virginia Commonwealth University, 800 E. Leigh Street, Suite 390, Richmond, VA 23298-0424, USA. E-mail: xchen@vcu.edu

36Members of the GROUP Investigators and the International Schizophrenia Consortium are listed in Appendix.

Received 10 April 2010; Revised 3 August 2010; Accepted 11 August 2010; Published online 14 September 2010.



We conducted data-mining analyses using the Clinical Antipsychotic Trials of Intervention Effectiveness (CATIE) and molecular genetics of schizophrenia genome-wide association study supported by the genetic association information network (MGS-GAIN) schizophrenia data sets and performed bioinformatic prioritization for all the markers with P-values ≤0.05 in both data sets. In this process, we found that in the CMYA5 gene, there were two non-synonymous markers, rs3828611 and rs10043986, showing nominal significance in both the CATIE and MGS-GAIN samples. In a combined analysis of both the CATIE and MGS-GAIN samples, rs4704591 was identified as the most significant marker in the gene. Linkage disequilibrium (LD) analyses indicated that these markers were in low LD (rs3828611–rs10043986, r²=0.008; rs10043986–rs4704591, r²=0.204). In addition, CMYA5 was reported to interact physically with the DTNBP1 gene, a promising candidate for schizophrenia, suggesting that CMYA5 may be involved in the same biological pathway and process. On the basis of this information, we performed replication studies for these three single-nucleotide polymorphisms. The rs3828611 marker showed conflicting results in our Irish samples and was dropped without further investigation. The other two markers were verified in 23 other independent data sets. In a meta-analysis of all 23 replication samples (family samples, 912 families with 4160 subjects; case–control samples, 11380 cases and 15021 controls), we found that both markers are significantly associated with schizophrenia (rs10043986, odds ratio (OR)=1.11, 95% confidence interval (CI)=1.04–1.18, P=8.2 × 10⁻⁴ and rs4704591, OR=1.07, 95% CI=1.03–1.11, P=3.0 × 10⁻⁴). The results were also significant for the 22 Caucasian replication samples (rs10043986, OR=1.11, 95% CI=1.03–1.17, P=0.0026 and rs4704591, OR=1.07, 95% CI=1.02–1.11, P=0.0015). Furthermore, haplotype-conditioned analyses indicated that the association signals observed at these two markers are independent. On the basis of these results, we concluded that CMYA5 is associated with schizophrenia and that further investigation of the gene is warranted.


Keywords: association study; cardiomyopathy; GWA data mining; meta-analysis; schizophrenia



Schizophrenia is a psychiatric disorder with a worldwide prevalence of 1%. It is characterized by delusions, hallucinations and deficits of cognition and emotion. There are sufficient data from family and twin studies to suggest that genetic factors have a significant role in the etiology of the disease. In recent years, a large number of genetic association studies have identified many candidate genes for the disease; however, most of these genes have not been satisfactorily replicated. Most recently, several genome-wide association (GWA) studies have been reported.1, 2, 3, 4, 5, 6 Of the many potential leads discovered by these studies, the broad region in chromosome 6p is the most consistent finding.2, 3, 4 Another gene, ZNF804A, has reached genome-wide significance when samples from both schizophrenia and bipolar disorder are combined.6 Other genes, although not reaching genome-wide significance in the initial samples, have been consistently replicated in many independent samples.7, 8

These recent GWA studies of schizophrenia are promising, but they also illustrate the limitations of the approach in detecting individual candidate genes with small effects on disease risk. Alternative approaches are therefore needed. In this study, we implemented a method that combines data mining of GWA data sets with bioinformatic prioritization to select promising candidate genes, followed by verification and meta-analyses in a large number of independent data sets. Specifically, we conducted GWA analyses of the Clinical Antipsychotic Trials of Intervention Effectiveness (CATIE) and molecular genetics of schizophrenia GWA study supported by the genetic association information network (MGS-GAIN) samples and selected all candidate single-nucleotide polymorphisms (SNPs) with P-values ≤0.05 in both the CATIE and MGS-GAIN data sets. These markers were then analyzed by comprehensive bioinformatic prioritization procedures. Top candidates emerging from these analyses were further verified in independent samples. Using this approach, we analyzed 25 independent samples with a total of over 33000 individuals and identified two SNPs, including a non-synonymous SNP, in and around the CMYA5 gene that are significantly associated with schizophrenia. Here, we report the results from this study.


Materials and methods

Subjects and genotyping

In this study, we used 25 samples with a total of 33834 subjects, including 912 families with 4160 subjects, 13038 cases and 16636 controls (the overlapping subjects between the CATIE and MGS-GAIN and MGS non-GAIN were excluded from these numbers). The CATIE and MGS-GAIN samples were used as our data-mining and hypothesis-generating samples in the first stage of our two-stage study. The other 23 samples were used as replication samples. Twenty-four of the 25 samples were of Caucasian ancestry; one sample, MGS-GAIN-AA, was of African American ancestry. Of these samples, 20 were used in GWA studies by individual groups, and the subjects in these samples were typed by either the Affymetrix or Illumina microarray methods. Five samples, the Irish family (IFAM), Irish case–control (ICC), Bonn, Pittsburgh and Ashkenazi, were typed by the TaqMan method.9 The quality of genotyping was assessed by the individual groups to be satisfactory. The principal investigators, sample sizes and genotyping methods are listed in Table 1.

Data mining and bioinformatic prioritization

We used the PLINK program10 to conduct the GWA analyses. The GWA analyses were conducted with the quality-control filtered markers from the NIMH (http://nimhgenetics.org/) and GAIN (http://www.ncbi.nlm.nih.gov/sites/entrez?Db=gap) repositories for the CATIE and MGS-GAIN samples, respectively. In these analyses, only Caucasian subjects (CATIE, 492 cases and 523 controls; MGS-GAIN, 1166 cases and 1368 controls including the 236 overlapping controls between the two samples) were used and markers with a minor allele frequency <1% or a Hardy–Weinberg equilibrium P-value <0.0001 were excluded. For the CATIE data set, the seven principal components identified in the previous study1 were used as covariates and a total of 446225 markers were analyzed. For the MGS-GAIN sample, no covariate was used because previous analyses found no significant stratification in the sample.2 The number of markers analyzed for the MGS-GAIN sample was 727905. Note that we did not know at the time of the GWA analyses that there were some overlapping subjects between the CATIE and MGS-GAIN samples; therefore, the two samples used in the data mining and bioinformatic prioritization were not completely independent. In the subsequent analyses of the markers common to the two data sets, the 236 overlapping subjects were excluded.
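The marker exclusion rule described above (minor allele frequency <1% or Hardy–Weinberg P-value <0.0001) can be illustrated with a minimal Python sketch. It is not the PLINK implementation: the Hardy–Weinberg P-value here comes from a simple 1-d.f. chi-square goodness-of-fit test, which is an assumption made for brevity, and the function names and genotype counts are hypothetical.

```python
import math

def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Frequency of the rarer allele, computed from genotype counts."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1.0 - freq_a)

def hwe_chi_square_p(n_aa, n_ab, n_bb):
    """Approximate HWE P-value from a 1-d.f. chi-square goodness-of-fit test.

    Assumption: a chi-square approximation is used here for simplicity;
    the analysis software may use a different (for example, exact) test.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    if min(expected) == 0:
        return 1.0  # monomorphic marker: no departure can be tested
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 d.f.
    return math.erfc(math.sqrt(chi2 / 2.0))

def passes_qc(n_aa, n_ab, n_bb, maf_min=0.01, hwe_min=1e-4):
    """Keep a marker only if MAF >= 1% and HWE P-value >= 0.0001."""
    return (minor_allele_frequency(n_aa, n_ab, n_bb) >= maf_min
            and hwe_chi_square_p(n_aa, n_ab, n_bb) >= hwe_min)

# Hypothetical genotype counts (AA, AB, BB), for illustration only.
common_marker_ok = passes_qc(250, 500, 250)    # in equilibrium, MAF = 0.5
rare_marker_ok = passes_qc(990, 10, 0)         # MAF = 0.005, excluded
distorted_marker_ok = passes_qc(400, 200, 400) # strong heterozygote deficit
```

In this sketch a marker is retained only when both thresholds are met, mirroring the two exclusion criteria stated in the text.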

For bioinformatic prioritization, we first selected all markers with P-values ≤0.05 in the two data sets, and matched them against each other. After the matching, there were 1128 SNPs with unadjusted P-values ≤0.05 in both the CATIE and GAIN samples. We then conducted bioinformatic prioritization of these 1128 SNPs based on whether they are located in evolutionarily conserved regions, genic regions (exons, introns, untranslated regions, or within 2 kb of a gene) or transcription factor-binding sites, whether they are located in known schizophrenia candidate genes (as listed in the sczgene database http://www.schizophreniaforum.org/res/sczgene/default.asp by June 2008) and whether the SNPs are non-synonymous. SNPs in each of these categories were assigned an empirical score: 2 for the non-synonymous and known schizophrenia candidate gene categories, 1 for the evolutionarily conserved region, transcription factor-binding site, untranslated region and synonymous SNP categories and 0.5 for the ‘within 2 kb of a gene’ category. Finally, SNPs were ranked by the sum of the scores.11
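The matching and scoring steps can be sketched as follows. This is only an illustration of the scoring rule stated above, not the pipeline actually used: the category labels, function names, SNP identifiers and P-values are all hypothetical, and categories without a stated score (for example, introns) are assumed to contribute 0.

```python
# Empirical scores taken from the text; the category labels are hypothetical.
CATEGORY_SCORES = {
    "non_synonymous": 2.0,
    "known_candidate_gene": 2.0,
    "conserved_region": 1.0,
    "tf_binding_site": 1.0,
    "utr": 1.0,
    "synonymous": 1.0,
    "within_2kb_of_gene": 0.5,
}

def shared_nominal_markers(pvals_a, pvals_b, threshold=0.05):
    """Markers present in both data sets with P <= threshold in each."""
    return sorted(
        snp for snp in pvals_a.keys() & pvals_b.keys()
        if pvals_a[snp] <= threshold and pvals_b[snp] <= threshold
    )

def rank_snps(annotations):
    """Rank SNPs by the sum of their category scores, highest first.

    Assumption: categories without a score in the text contribute 0.
    Ties are broken alphabetically by SNP identifier.
    """
    scores = {
        snp: sum(CATEGORY_SCORES.get(cat, 0.0) for cat in cats)
        for snp, cats in annotations.items()
    }
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical P-values and annotations, for illustration only.
catie = {"snp_a": 0.01, "snp_b": 0.20, "snp_c": 0.04}
mgs_gain = {"snp_a": 0.03, "snp_b": 0.01, "snp_c": 0.05}
annotations = {
    "snp_a": {"non_synonymous", "conserved_region"},
    "snp_c": {"within_2kb_of_gene"},
}

shared = shared_nominal_markers(catie, mgs_gain)
ranking = rank_snps({snp: annotations.get(snp, set()) for snp in shared})
```

Here `snp_b` is dropped because it is nominally significant in only one data set, and the remaining markers are ordered by their summed scores.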

When the CMYA5 gene was identified as the leading candidate, we performed LD structure analyses of the gene using the HAPLOVIEW program.12 We extracted all markers in the gene plus 20 kb of upstream and downstream sequence for the CATIE and MGS-GAIN samples, and selected the markers common to the two data sets. Data from the two data sets were combined. Association analysis for the combined samples was also conducted using the UNPHASED program.13

Replication and meta-analyses of independent samples

On the basis of the prioritization, we initiated genotyping for three SNPs, rs3828611, rs10043986 and rs4704591, in our IFAM and ICC samples. For rs10043986 and rs4704591, the results from our Irish samples were consistent with those observed in the CATIE and MGS-GAIN data sets. The rs3828611 marker had inconsistent results between our Irish samples; it was therefore dropped without further investigation. To verify the association observed for rs10043986 and rs4704591, we requested genotyping of two additional samples (Bonn and Pittsburgh) and solicited data from GWA studies from 21 independent samples (see Table 1). The MGS-non-GAIN sample also had 208 overlapping control subjects with the CATIE data set. To maintain the independence among the samples used in the replication study, these overlapping subjects were removed from the MGS-non-GAIN sample.

Meta-analyses were conducted both for all samples and for the replication samples only. We generated combined odds ratios (ORs) of the family-based and case–control samples using the information included in the primary analyses and standard meta-analytic techniques. For the IFAM sample, we used a PDT-like approach to generate the OR.14 The PDT statistic compares the number of times a given parental allele (‘risk’ allele) is transmitted versus non-transmitted and examines allele sharing between affected and unaffected sibling pairs, whereas standard case–control approaches examine allele frequencies in cases versus controls. The parental transmission OR is constructed as (a/c)/(b/d), where a is the number of transmissions of the high-risk allele, c is the number of non-transmissions of the high-risk allele, b is the number of transmissions of all other alleles and d is the number of non-transmissions of all other alleles. In the sibling pair sample and the population-based samples, including each of the European ancestry (EA) case–control samples, which compare case to control allele frequencies, we constructed an OR as (a/c)/(b/d), where a is the number of major alleles present in cases, c is the number of minor alleles present in cases, b is the number of major alleles in controls and d is the number of minor alleles present in controls. In the AA sample, we fitted a logistic regression model including the first principal component of population stratification as a covariate. The regression coefficient of the effect of the SNP allowed us to estimate an OR and variance for inclusion in the meta-analysis.
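As a worked illustration of the (a/c)/(b/d) construction for case–control allele counts, the following Python sketch computes the OR together with a confidence interval. The interval uses the standard Woolf (log-OR) variance, which is an assumption made here because the text does not state how the variance was obtained for these samples; the function name and the allele counts are hypothetical.

```python
import math

def allelic_odds_ratio(a, b, c, d, z=1.96):
    """OR = (a/c)/(b/d) with a Woolf-type confidence interval.

    a: major alleles in cases      c: minor alleles in cases
    b: major alleles in controls   d: minor alleles in controls
    Assumption: SE(log OR) = sqrt(1/a + 1/b + 1/c + 1/d); all counts > 0.
    """
    odds_ratio = (a / c) / (b / d)
    se_log_or = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper, se_log_or

# Hypothetical allele counts, for illustration only.
odds_ratio, ci_low, ci_high, se = allelic_odds_ratio(a=600, b=500, c=400, d=500)
```

The log OR and its standard error returned here are the per-sample quantities that a fixed-effects pooling step would take as input.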

We used formal meta-analytic techniques to combine ORs across study types. We used a fixed-effects (Mantel–Haenszel) approach to meta-analysis.15 Before pooling, we performed Cochran's (Q) χ2 test of heterogeneity to ensure that each group of studies was suitable for meta-analysis. Generally, in meta-analysis, when significant heterogeneity is found, the studies are deemed unsuitable for pooling through a fixed-effects approach. In the summary meta-analysis of all studies, including the discovery and replication samples, there was a known overlap in controls between the MGS-GAIN and CATIE samples. We calculated the asymptotic correlation between the Z-scores of the two studies and performed a Z-score-based meta-analysis correcting for the correlation caused by the shared controls to ensure an appropriate type-I error rate.16
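The pooling and heterogeneity steps can be sketched in Python as below. Note the substitution: for brevity this sketch uses inverse-variance fixed-effects pooling of log ORs rather than the Mantel–Haenszel estimator named in the text, and the combination of two correlated Z-scores is the generic weighted formula for a given correlation r, not necessarily the exact procedure of reference 16. Function names and input values are hypothetical.

```python
import math

def fixed_effect_pool(log_ors, std_errs):
    """Inverse-variance fixed-effects pooling with Cochran's Q.

    Returns (pooled log OR, its standard error, Q, degrees of freedom).
    Q is compared against a chi-square distribution with k - 1 d.f.
    """
    weights = [1.0 / se ** 2 for se in std_errs]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    q_stat = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))
    return pooled, pooled_se, q_stat, len(log_ors) - 1

def combine_correlated_z(z1, z2, r, w1=1.0, w2=1.0):
    """Weighted combination of two Z-scores whose correlation is r.

    The denominator is the standard deviation of w1*Z1 + w2*Z2 under the
    null, so the combined statistic is again standard normal.
    """
    return (w1 * z1 + w2 * z2) / math.sqrt(w1 ** 2 + w2 ** 2 + 2.0 * w1 * w2 * r)

# Hypothetical per-sample log ORs and standard errors, for illustration only.
pooled, pooled_se, q_stat, dof = fixed_effect_pool(
    [0.10, 0.05, 0.12], [0.05, 0.04, 0.06]
)
pooled_or = math.exp(pooled)

# Two studies sharing controls: positive correlation shrinks the combined Z.
z_independent = combine_correlated_z(2.0, 2.0, r=0.0)
z_shared_controls = combine_correlated_z(2.0, 2.0, r=0.2)
```

Ignoring a positive correlation between the two Z-scores would overstate the combined evidence, which is why the correction matters for the type-I error rate.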

Testing independent effects of rs10043986 and rs4704591

As there were two SNPs showing association in the CMYA5 gene, we evaluated whether the association signals observed at rs10043986 and rs4704591 were statistically independent. We took the approach implemented in the PLINK program17 that compares the risk of haplotypes with identical alleles at the background locus, but different alleles at the locus to be evaluated. In this case, we inferred all four haplotypes for rs10043986–rs4704591, and tested the effects of haplotypes with the same allele at rs4704591, but different alleles at rs10043986. Our aim was to evaluate whether the effect of rs10043986 is independent of rs4704591. We used the UNPHASED program13 to conduct this analysis as, unlike PLINK, it is able to combine family data and case–control data for such haplotype-based analyses.



Results

GWA studies data mining and bioinformatic prioritization

From the GWA analyses of the CATIE and MGS-GAIN data sets, there were 24160 and 68371 markers with unadjusted P-values ≤0.05, respectively. Although none of the markers in the CATIE and MGS-GAIN samples reached genome-wide significance, the number of markers reaching nominal significance (that is, 68371) was significantly larger than expected (that is, 37725) in the MGS-GAIN sample, suggesting that there were markers with true effects in this pool of nominally significant markers. Of these markers, there were 1228 markers having P≤0.05 in both data sets (Supplementary Table S1). These markers constituted the pool we used for further bioinformatic prioritization. From these markers, the informatics procedures revealed several top candidate genes (Table 2). Of these top candidates, CMYA5 and PTPN21 each had two non-synonymous SNPs. As the two non-synonymous markers in CMYA5, rs3828611 and rs10043986, had different frequencies and were located in two different exons of the gene, we thought they might represent independent association signals. In contrast, the two markers in the PTPN21 gene had very similar frequencies. Therefore, we decided to focus on the CMYA5 gene. There were other genes that had multiple markers with different frequencies (Table 2). These included LRP1B, COLQ, SERINC1, PTPN21, EML5, NTRK3 and NUTF2. Further analyses of these genes may be necessary to verify their functions in schizophrenia.

We conducted a literature search for the CMYA5 gene and found that it was reported to interact physically with DTNBP1, a leading candidate for schizophrenia that was first reported in our IFAM sample18 included in this study. We also analyzed single-marker association for the SNPs shared between the CATIE and MGS-GAIN data sets in this interval by combining the CATIE and MGS-GAIN samples. These analyses identified the most significant marker, rs4704591, which is located about 9 kb downstream of the gene. Note that at the time of our GWA analyses, we did not realize that there were overlapping subjects between the CATIE and MGS-GAIN studies. After removing the overlapping subjects between the CATIE and MGS-GAIN data sets, an analysis of the combined samples was performed. The P-values for the three markers were 0.0078 (OR=1.31, 95% confidence interval (CI)=1.07–1.60), 0.0050 (OR=1.19, 95% CI=1.06–1.30) and 0.00032 (OR=1.17, 95% CI=1.08–1.24) for rs3828611, rs10043986 and rs4704591, respectively (Figure 1a). The LD analyses of the 27 common markers shared by the CATIE and MGS-GAIN studies were performed using the HAPLOVIEW program12 for the gene and 20 kb of flanking sequence (Figure 1b). The LD between rs3828611 and rs10043986 was 0.008 (r²), the LD between rs10043986 and rs4704591 was 0.208 (r²) and the LD between rs3828611 and rs4704591 was 0.016 (r²). As the LD among these three markers was relatively low, it was likely that they represented different association signals. There were two other markers showing a similar level of association to rs3828611. The rs6880680 marker was in high LD with rs3828611 (r²=0.713), so its effect may not be independent. The rs6870619 marker was in low LD with all other markers in this region. However, as its signal was not as strong as that of rs4704591 and it did not reach nominal significance in the CATIE sample, it was not pursued.
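For readers less familiar with the pairwise LD measure reported here, r² can be computed from a two-marker haplotype frequency and the two allele frequencies as r² = D²/(pA(1−pA)pB(1−pB)), with D = pAB − pA·pB. The short Python sketch below illustrates the formula only; it is not the HAPLOVIEW implementation, and the function name and frequencies are hypothetical.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Pairwise LD r^2 from haplotype and allele frequencies.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first marker
    p_b:  frequency of allele B at the second marker
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both markers must be polymorphic")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))

# Hypothetical frequencies, for illustration only.
r2_complete = ld_r_squared(p_ab=0.50, p_a=0.5, p_b=0.5)     # alleles always co-occur
r2_independent = ld_r_squared(p_ab=0.25, p_a=0.5, p_b=0.5)  # no association
```

Values near 0, such as those reported for the three CMYA5 markers, indicate that the genotype at one marker carries little information about the other.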

Figure 1. (a) Association analysis of the combined samples. The markers selected for replication are highlighted. (b) LD structure of the 27 markers typed in both the CATIE and MGS-GAIN samples. Pair-wise LD values (r²) are shown. The three markers studied were in low LD, suggesting that the association signals observed at these markers may be distinct.

Verification of CMYA5 association in the Irish samples

On the basis of the data mining and bioinformatic prioritization, we initiated a confirmation study using our IFAM and ICC samples for these three SNPs (rs3828611, rs10043986 and rs4704591). We used the UNPHASED program,13 which was designed to combine case–control and family samples, to analyze our combined samples. The results from our Irish samples support the association of rs10043986 and rs4704591. For rs10043986, both the case–control and family samples showed the same direction of association for the same allele as that in the CATIE and MGS-GAIN data sets. However, neither the individual samples (case–control and family samples) nor the combined samples reached significance. For rs4704591, the case–control sample had a P-value of 0.2066 and the family sample had a P-value of 0.0083. The combined case–control and family samples had a P-value of 0.0041. The association of rs3828611 was in opposite directions in our ICC and family samples (data not shown). Owing to these conflicting results, rs3828611 was dropped without further investigation.

Meta-analysis of rs10043986 and rs4704591

The results from the ICC and family samples were encouraging. For further confirmation, we solicited data and replication from 23 more independent samples for rs10043986 and rs4704591. The information on all samples is summarized in Table 1, including the CATIE and MGS-GAIN samples used in our data-mining exercise. Overall, we had a total sample size of 33834 subjects, including 912 families with 4160 subjects, 13038 cases and 16636 controls (the overlapping subjects were excluded from these numbers). Genotyping was conducted by individual groups using a variety of techniques (see Table 1). To ensure genotyping quality, we examined the intensity plots for these two markers. Then, meta-analyses were performed. For the family samples, the counts of transmitted and untransmitted alleles were used. For the case–control samples, allele counts for cases and controls were used. In a meta-analysis of all 23 replication samples (family samples, 912 families with 4160 subjects; case–control samples, 11380 cases and 15021 controls), we found that both markers are significantly associated with schizophrenia (rs10043986, OR=1.11, 95% CI=1.04–1.18, P=8.2 × 10⁻⁴ and rs4704591, OR=1.07, 95% CI=1.03–1.11, P=3.0 × 10⁻⁴; Table 3). There was no significant heterogeneity among the samples (test of heterogeneity: rs10043986, Q=13.88, d.f.=16, P=0.61; rs4704591, Q=17.15, d.f.=19, P=0.58). The results were also significant for the 22 Caucasian replication samples (rs10043986, OR=1.11, 95% CI=1.03–1.17, P=0.0026; rs4704591, OR=1.07, 95% CI=1.02–1.11, P=0.0015). The analysis of the combined sample, including the CATIE and MGS-GAIN samples and accounting for the overlap, yields a combined P-value of 1.11 × 10⁻⁵ (Z=4.39) for rs4704591 and of 1.47 × 10⁻⁵ (Z=4.33) for rs10043986.
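The relation between the combined Z-scores and the P-values quoted above can be checked with a few lines of Python, assuming a two-sided test against the standard normal distribution (the text does not state sidedness, so this is an assumption). Because the reported Z-scores are rounded to two decimals, only approximate agreement with the reported P-values should be expected.

```python
import math

def two_sided_p(z):
    """Two-sided P-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Z-scores as reported in the text (rounded to two decimals).
p_rs4704591 = two_sided_p(4.39)
p_rs10043986 = two_sided_p(4.33)
```

Both values fall close to 10⁻⁵, in line with the combined P-values reported for the two markers.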

rs10043986 and rs4704591 are independently associated with schizophrenia

As we observed that two SNPs in the CMYA5 gene are significantly associated with schizophrenia, we sought to evaluate whether these two association signals are independent. In the data-mining data sets, the LD between these two SNPs was relatively low (r²=0.208) and similar results were obtained in our combined European samples (r²=0.212), including the MGS-GAIN and CATIE samples. To test whether the effect of rs10043986 is independent of that of rs4704591, we inferred the haplotypes from these two markers for the combined European samples and evaluated whether those haplotypes sharing identical alleles at rs4704591, but different alleles at rs10043986, have different disease risks. If these haplotypes have significantly different risks, then the effects of these two markers would be at least partially independent. Table 4 summarizes our results. From the table, it is clear that haplotypes sharing the same allele background at the rs4704591 locus, that is, C-C versus T-C and C-G versus T-G, showed significantly different disease risks. In other words, rs10043986 had an effect independent of that of rs4704591. We also checked the analyses with the PLINK program using all European case–control samples, as PLINK could not combine family data with case–control data for such an analysis. In this analysis, we checked the independent effect of rs10043986 by comparing the haplotypes sharing the same alleles at rs4704591, and the result was significant (OR=1.07, P=0.0006).



Discussion

In recent years, GWA studies have identified promising candidates in a number of complex disorders such as type 2 diabetes,19, 20, 21 lung cancer,22, 23, 24 Parkinson's disease,25, 26 rheumatoid arthritis27 and systemic lupus erythematosus.28 The results for schizophrenia have generally been less successful. Except for the broad region in 6p and the TCF4 and NRGN regions,4 individual GWA studies have not yet produced candidates reaching genome-wide significance. Of the many possible factors leading to these outcomes, insufficient power in the individual studies and the need to correct for the large number of markers tested may be important. However, aggregated analyses indicated that there may be true findings among the markers passing nominal significance,3 and we believe that this is one of the most important contributions of GWA studies. Given that there are markers/genes with true effects buried in the large number of tested markers, how to identify those markers is a practical issue facing the field. In this study, we adopted a two-stage approach, leading to the identification of two markers in the CMYA5 gene. In the first, hypothesis-generating stage, we conducted GWA analyses for two publicly available data sets, the CATIE and MGS-GAIN data sets, and selected and ranked markers by statistical and bioinformatic procedures. These procedures combined statistical and biological evaluations of markers with an emphasis on the relevance of potential functions in disease. In this study, the finding of two non-synonymous markers in the CMYA5 gene reaching nominal significance and the low LD between these markers played an important role in the selection of the gene for further verification and replication. The reported direct interaction between CMYA5 and DTNBP1,29 a leading candidate gene for schizophrenia, suggested that these genes may be involved in a common pathway or biological process. This piece of information moved the CMYA5 gene to the top of our ranking list. In the second stage, a total of 23 independent data sets were used to evaluate the significance of these markers by standard meta-analyses. With these approaches, we found that both markers in the gene are significantly associated with schizophrenia and that there is no significant heterogeneity across the samples used in this study, including the MGS-GAIN-AA sample. Furthermore, the association signals observed at these two markers are independent. As we used a two-stage design in this study, the results should be evaluated against the number of markers tested in the second stage, even though we mined two GWA data sets in our discovery stage. On the basis of this criterion and considering the large number of independent samples and the combined sample size, our results are significant. Importantly, one of our markers may have direct functional consequences, as it changes the 4063rd amino acid of the protein from proline to leucine, which would change the residue size and hydrophobicity at the C-terminus of the protein, a region that was reported to interact with the regulatory subunit of protein kinase A.30 The function of this non-synonymous SNP provides an opportunity to test its effect in the biology of schizophrenia directly.

Our motivation for this study was to find a way to reduce the multiple-testing penalty imposed by GWA studies and to identify markers with true effects that do not necessarily reach conventional levels of genome-wide significance. The GWA study is a powerful tool. Its systematic and hypothesis-free approach is objective and has great potential. However, to accomplish its aim, sufficient power and/or sample homogeneity are required to compensate for the steep penalty that has to be paid for testing hundreds of thousands of markers. This creates a situation in which many markers may have true associations, but fail to reach GWA standards. On the basis of this rationale, we took the approach described in this study, leading to the identification of the CMYA5 gene as a candidate for schizophrenia.

In retrospect, several aspects of the approach could have been improved. First, we did not take into account the differences in sample size and power between the MGS-GAIN and CATIE studies and used the same cutoff (P≤0.05) for both samples. Second, in matching the markers selected from the two data sets, we did not consider the sign (direction) of the association. A more objective approach might have been to perform a formal meta-analysis of the selected markers for the two data sets and take the meta-analysis P-values into consideration when ranking the markers. Third, in our bioinformatic prioritization, we focused on single SNP markers. We could have extended these properties to markers in high LD with these markers, including imputation of untyped markers in the near neighborhood. Fourth, for a gene or region that had multiple markers associated with the disease, a haplotype analysis and testing of independent effects could have been conducted to select the best and independent markers for verification.

Our study provides an example showing that there are markers with true effects in the GWA studies and that, given sufficiently large sample sizes, these markers can be identified. In this study, the observed ORs for rs10043986 and rs4704591 were 1.11 and 1.07, respectively, comparable with those observed for the ZNF804A gene.6 For the CMYA5 gene, there may be other association signals. The rs3828611 marker is the other non-synonymous marker selected by our data-mining procedures, and it has low LD with both rs10043986 and rs4704591. We did not pursue it further after the conflicting results from our ICC and family samples. In retrospect, our decision to drop rs3828611 may have been premature.

The CMYA5 gene, also known as myospryn, was first identified as a gene associated with cardiomyopathy.31 The gene is highly expressed in skeletal muscle and heart, and is modestly expressed in brain (unpublished data). It is reported to be associated with left ventricular wall thickness in patients with hypertension.32 However, the function of the gene remains unknown. Its product has been reported to interact with DTNBP1,29, 33 the regulatory subunit of protein kinase A30 and desmin34 in muscle cells. The interaction with DTNBP1, another leading candidate for schizophrenia,35, 36 is an interesting lead. DTNBP1 was first reported to be associated with schizophrenia in our Irish sample.18 Subsequently, many studies, including several that used samples37, 38, 39, 40, 41 included in this paper, provided supporting evidence for the association. This interaction suggests that CMYA5 may also be involved in the biogenesis of lysosome-related organelles complex 1 (BLOC-1) processes that have been suggested to be involved in schizophrenia.42, 43, 44, 45 The interaction with the regulatory subunit of protein kinase A suggests that CMYA5 may be involved in the regulation of the cAMP signaling pathway, which is also implicated in schizophrenia.46, 47 These potential connections indicate that further studies could test for epistatic interaction between these interacting partners and examine their functions in the molecular, developmental and pathophysiological processes in schizophrenia.

In summary, using a two-stage design and with one of the largest sample sizes reported in recent literature, we report evidence that two SNPs with relatively low LD to each other in the CMYA5 gene are independently associated with schizophrenia. These results suggest that there may be many markers in GWA data sets that have true but small effects. To identify these markers, a large sample size and collaborative work across many groups are essential.


Conflict of interest

The authors declare no conflict of interest.



  1. Sullivan PF, Lin D, Tzeng JY, van den OE, Perkins D, Stroup TS et al. Genomewide association for schizophrenia in the CATIE study: results of stage 1. Mol Psychiatry 2008; 13: 570–584.
  2. Shi J, Levinson DF, Duan J, Sanders AR, Zheng Y, Pe’er I et al. Common variants on chromosome 6p22.1 are associated with schizophrenia. Nature 2009; 460: 753–757.
  3. Purcell SM, Wray NR, Stone JL, Visscher PM, O’Donovan MC, Sullivan PF et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 2009; 460: 748–752.
  4. Stefansson H, Ophoff RA, Steinberg S, Andreassen OA, Cichon S, Rujescu D et al. Common variants conferring risk of schizophrenia. Nature 2009; 460: 744–747.
  5. Lencz T, Morgan TV, Athanasiou M, Dain B, Reed CR, Kane CR et al. Converging evidence for a pseudoautosomal cytokine receptor gene locus in schizophrenia. Mol Psychiatry 2007; 12: 572–580.
  6. O’Donovan MC, Craddock N, Norton N, Williams H, Peirce T, Moskvina V et al. Identification of loci associated with schizophrenia by genome-wide association and follow-up. Nat Genet 2008; 40: 1053–1055.
  7. O’Donovan MC, Norton N, Williams H, Peirce T, Moskvina V, Nikolov I et al. Analysis of 10 independent samples provides evidence for association between schizophrenia and a SNP flanking fibroblast growth factor receptor 2. Mol Psychiatry 2009; 14: 30–36.
  8. Ingason A, Giegling I, Cichon S, Hansen T, Rasmussen HB, Nielsen J et al. A large replication study and meta-analysis in European samples provides further support for association of AHI1 markers with schizophrenia. Hum Mol Genet 2010; 19: 1379–1386.
  9. Livak KJ. Allelic discrimination using fluorogenic probes and the 5′ nuclease assay. Genet Anal 1999; 14: 143–149.
  10. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007; 81: 559–575.
  11. Sun J, Jia P, Fanous AH, Webb BT, van den Oord EJ, Chen X et al. A multi-dimensional evidence-based candidate gene prioritization approach for complex diseases-schizophrenia as a case. Bioinformatics 2009; 25: 2595–6602.
  12. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 2005; 21: 263–265.
  13. Dudbridge F. Likelihood-based association analysis for nuclear families and unrelated subjects with missing genotype data. Hum Hered 2008; 66: 87–98.
  14. Martin ER, Monks SA, Warren LL, Kaplan NL. A test for linkage and association in general pedigrees: the pedigree disequilibrium test. Am J Hum Genet 2000; 67: 146–154.
  15. Mantel N, Haenszel W. Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst 1959; 22: 719–748.
  16. Shyn SI, Shi J, Kraft JB, Potash JB, Knowles JA, Weissman MM et al. Novel loci for major depression identified by genome-wide association study of sequenced treatment alternatives to relieve depression and meta-analysis of three studies. Mol Psychiatry 2010 (in press).
  17. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007; 81: 559–575.
  18. Straub RE, Jiang Y, MacLean CJ, Ma Y, Webb BT, Myakishev MV et al. Genetic variation in the 6p22.3 gene DTNBP1, the human ortholog of the mouse dysbindin gene, is associated with schizophrenia. Am J Hum Genet 2002; 71: 337–348.
  19. Scott LJ, Mohlke KL, Bonnycastle LL, Willer CJ, Li Y, Duren WL et al. A genome-wide association study of type 2 diabetes in Finns detects multiple susceptibility variants. Science 2007; 316: 1341–1345.
  20. Sladek R, Rocheleau G, Rung J, Dina C, Shen L, Serre D et al. A genome-wide association study identifies novel risk loci for type 2 diabetes. Nature 2007; 445: 881–885.
  21. Zeggini E, Weedon MN, Lindgren CM, Frayling TM, Elliott KS, Lango H et al. Replication of genome-wide association signals in UK samples reveals risk loci for type 2 diabetes. Science 2007; 316: 1336–1341.
  22. Amos CI, Wu X, Broderick P, Gorlov IP, Gu J, Eisen T et al. Genome-wide association scan of tag SNPs identifies a susceptibility locus for lung cancer at 15q25.1. Nat Genet 2008; 40: 616–622.
  23. Spitz MR, Amos CI, Dong Q, Lin J, Wu X. The CHRNA5-A3 region on chromosome 15q24-25.1 is a risk factor both for nicotine dependence and for lung cancer. J Natl Cancer Inst 2008; 100: 1552–1556.
  24. Thorgeirsson TE, Geller F, Sulem P, Rafnar T, Wiste A, Magnusson KP et al. A variant associated with nicotine dependence, lung cancer and peripheral arterial disease. Nature 2008; 452: 638–642.
  25. Satake W, Nakabayashi Y, Mizuta I, Hirota Y, Ito C, Kubo M et al. Genome-wide association study identifies common variants at four loci as genetic risk factors for Parkinson's disease. Nat Genet 2009; 41: 1303–1307.
  26. Simon-Sanchez J, Schulte C, Bras JM, Sharma M, Gibbs JR, Berg D et al. Genome-wide association study reveals genetic risk underlying Parkinson's disease. Nat Genet 2009; 41: 1308–1312.
  27. Raychaudhuri S, Remmers EF, Lee AT, Hackett R, Guiducci C, Burtt NP et al. Common variants at CD40 and other loci confer risk of rheumatoid arthritis. Nat Genet 2008; 40: 1216–1223.
  28. Graham RR, Cotsapas C, Davies L, Hackett R, Lessard CJ, Leon JM et al. Genetic variants near TNFAIP3 on 6q23 are associated with systemic lupus erythematosus. Nat Genet 2008; 40: 1059–1061.
  29. Benson MA, Tinsley CL, Blake DJ. Myospryn is a novel binding partner for dysbindin in muscle. J Biol Chem 2004; 279: 10450–10458.
  30. Reynolds JG, McCalmon SA, Tomczyk T, Naya FJ. Identification and mapping of protein kinase A binding sites in the costameric protein myospryn. Biochim Biophys Acta 2007; 1773: 891–902.
  31. Sarparanta J. Biology of myospryn: what's known? J Muscle Res Cell Motil 2008; 29: 177–180.
  32. Nakagami H, Kikuchi Y, Katsuya T, Morishita R, Akasaka H, Saitoh S et al. Gene polymorphism of myospryn (cardiomyopathy-associated 5) is associated with left ventricular wall thickness in patients with hypertension. Hypertens Res 2007; 30: 1239–1246.
  33. Talbot K, Cho DS, Ong WY, Benson MA, Han LY, Kazi HA et al. Dysbindin-1 is a synaptic and microtubular protein that binds brain snapin. Hum Mol Genet 2006; 15: 3041–3054.
  34. Kouloumenta A, Mavroidis M, Capetanaki Y. Proper perinuclear localization of the TRIM-like protein myospryn requires its binding partner desmin. J Biol Chem 2007; 282: 35211–35221.
  35. Owen MJ, Williams NM, O’Donovan MC. Dysbindin-1 and schizophrenia: from genetics to neuropathology. J Clin Invest 2004; 113: 1255–1257.
  36. Williams NM, O’Donovan MC, Owen MJ. Is the dysbindin gene (DTNBP1) a susceptibility gene for schizophrenia? Schizophr Bull 2005; 31: 800–805.
  37. Duan J, Martinez M, Sanders AR, Hou C, Burrell GJ, Krasner AJ et al. DTNBP1 (Dystrobrevin Binding Protein 1) and schizophrenia: association evidence in the 3′ end of the gene. Hum Hered 2007; 64: 97–106.
  38. Kirov G, Ivanov D, Williams NM, Preece A, Nikolov I, Milev R et al. Strong evidence for association between the dystrobrevin binding protein 1 gene (DTNBP1) and schizophrenia in 488 parent-offspring trios from Bulgaria. Biol Psychiatry 2004; 55: 971–975.
  39. Riley B, Kuo PH, Maher BS, Fanous AH, Sun J, Wormley B et al. The dystrobrevin binding protein 1 (DTNBP1) gene is associated with schizophrenia in the Irish Case Control Study of Schizophrenia (ICCSS) sample. Schizophr Res 2009; 115: 245–253.
  40. Williams NM, Preece A, Morris DW, Spurlock G, Bray NJ, Stephens M et al. Identification in 2 independent samples of a novel schizophrenia risk haplotype of the dystrobrevin binding protein gene (DTNBP1). Arch Gen Psychiatry 2004; 61: 336–344.
  41. Schwab SG, Knapp M, Mondabon S, Hallmayer J, Borrmann-Hassenbach M, Albus M et al. Support for association of schizophrenia with genetic variation in the 6p22.3 gene, dysbindin, in sib-pair families with linkage and in an additional sample of triad families. Am J Hum Genet 2003; 72: 185–190.
  42. Ghiani CA, Starcevic M, Rodriguez-Fernandez IA, Nazarian R, Cheli VT, Chan LN et al. The dysbindin-containing complex (BLOC-1) in brain: developmental regulation, interaction with SNARE proteins and role in neurite outgrowth. Mol Psychiatry 2010; 15: 115, 204–15.
  43. Iizuka Y, Sei Y, Weinberger DR, Straub RE. Evidence that the BLOC-1 protein dysbindin modulates dopamine D2 receptor internalization and signaling but not D1 internalization. J Neurosci 2007; 27: 12390–12395.
  44. Morris DW, Murphy K, Kenny N, Purcell SM, McGhee KA, Schwaiger S et al. Dysbindin (DTNBP1) and the biogenesis of lysosome-related organelles complex 1 (BLOC-1): main and epistatic gene effects are potential contributors to schizophrenia susceptibility. Biol Psychiatry 2008; 63: 24–31.
  45. Guo AY, Sun J, Riley BP, Thiselton DL, Kendler KS, Zhao Z. The dystrobrevin-binding protein 1 gene: features and networks. Mol Psychiatry 2009; 14: 18–29.
  46. Molteni R, Calabrese F, Racagni G, Fumagalli F, Riva MA. Antipsychotic drug actions on gene modulation and signaling mechanisms. Pharmacol Ther 2009; 124: 74–85.
  47. Siuciak JA. The role of phosphodiesterases in schizophrenia: therapeutic implications. CNS Drugs 2008; 22: 983–993.
  48. Chen X, Wang X, Hossain S, O’Neill FA, Walsh D, Pless L et al. Haplotypes spanning SPEC2, PDZ-GEF2 and ACSL6 genes are associated with schizophrenia. Hum Mol Genet 2006; 15: 3329–3342.
  49. Chowdari KV, Mirnics K, Semwal P, Wood J, Lawrence E, Bhatia T et al. Association and linkage analyses of RGS4 polymorphisms in schizophrenia. Hum Mol Genet 2002; 11: 1373–1380.
  50. Schwab SG, Hoefgen B, Hanses C, Hassenbach MB, Albus M, Lerer B et al. Further evidence for association of variants in the AKT1 gene with schizophrenia in a sample of European sib-pair families. Biol Psychiatry 2005; 58: 446–450.
  51. Egan MF, Goldberg TE, Kolachana BS, Callicott JH, Mazzanti CM, Straub RE et al. Effect of COMT Val108/158 Met genotype on frontal lobe function and risk for schizophrenia. Proc Natl Acad Sci USA 2001; 98: 6917–6922.
  52. Olsen L, Hansen T, Jakobsen KD, Djurovic S, Melle I, Agartz I et al. The estrogen hypothesis of schizophrenia implicates glucose metabolism: association study in three independent samples. BMC Med Genet 2008; 9: 39.
  53. Kahler AK, Djurovic S, Kulle B, Jonsson EG, Agartz I, Hall H et al. Association analysis of schizophrenia on 18 genes involved in neuronal migration: MDGA1 as a new susceptibility gene. Am J Med Genet B Neuropsychiatr Genet 2008; 147B: 1089–1100.
  54. Shifman S, Johannesson M, Bronstein M, Chen SX, Collier DA, Craddock NJ et al. Genome-wide association identifies a common variant in the reelin gene that increases the risk of schizophrenia only in women. PLoS Genet 2008; 4: e28.



The members of the Genetic Risk and Outcome in Psychosis (GROUP):

René S Kahn1, Don H Linszen2, Jim van Os3, Durk Wiersma4, Richard Bruggeman4, Wiepke Cahn1, Lieuwe de Haan2, Lydia Krabbendam3 and Inez Myin-Germeys3

1Department of Psychiatry, Rudolf Magnus Institute of Neuroscience, University Medical Center Utrecht, Postbus 85060, 3508 AB, Utrecht, The Netherlands; 2Department of Psychiatry, Academic Medical Centre University of Amsterdam, Amsterdam, NL326 Groot-Amsterdam, The Netherlands; 3Maastricht University Medical Centre, South Limburg Mental Health Research and Teaching Network, P. Debyelaan 25, 6229 HX Maastricht, Maastricht, The Netherlands; 4Department of Psychiatry, University Medical Center Groningen, University of Groningen, PO Box 30.001, 9700 RB Groningen, The Netherlands.

The members of the International Schizophrenia Consortium:

Cardiff University: Michael C O’Donovan6, George K Kirov6, Nick J Craddock6, Peter A Holmans6, Nigel M Williams6, Lyudmila Georgieva6, Ivan Nikolov6, N Norton6, H Williams6, Draga Toncheva16, Vihra Milanova17, Michael J Owen6; Karolinska Institutet/University of North Carolina at Chapel Hill: Christina M Hultman11,12, Paul Lichtenstein11, Emma F Thelander11, Patrick Sullivan7; Trinity College Dublin: Derek W Morris9, Colm T O’Dushlaine9, Elaine Kenny9, Emma M Quinn9, Michael Gill9, Aiden Corvin9; University College London: Andrew McQuillin8, Khalid Choudhury8, Susmita Datta8, Jonathan Pimm8, Srinivasa Thirumalai18, Vinay Puri8, Robert Krasucki8, Jacob Lawrence8, Digby Quested19, Nicholas Bass8, Hugh Gurling8; University of Aberdeen: Caroline Crombie15, Gillian Fraser15, Soh Leh Kuan14, Nicholas Walker20, David St Clair14; University of Edinburgh: Douglas HR Blackwood10, Walter J Muir10, Kevin A McGhee10, Ben Pickard10, Pat Malloy10, Alan W Maclean10, Margaret Van Beck10; Queensland Institute of Medical Research: Naomi R Wray5, Stuart Macgregor5, Peter M Visscher5; University of Southern California: Michele T Pato13, Helena Medeiros13, Frank Middleton21, Celia Carvalho13, Christopher Morley21, Ayman Fanous13,22,23,24, David Conti13, James A Knowles13, Carlos Paz Ferreira25, Antonio Macedo26, M Helena Azevedo26, Carlos N Pato13; Massachusetts General Hospital: Jennifer L Stone1,2,3,4, Douglas M Ruderfer1,2,3,4, Andrew N Kirby2,3,4, Manuel AR Ferreira1,2,3,4, Mark J Daly2,3,4, Shaun M Purcell1,2,3,4, Pamela Sklar1,2,3,4; Stanley Center for Psychiatric Research and Broad Institute of MIT and Harvard: Shaun M Purcell1,2,3,4, Jennifer L Stone1,2,3,4, Kimberly Chambert3,4, Douglas M Ruderfer1,2,3,4, Finny Kuruvilla4, Stacey B Gabriel4, Kristin Ardlie4, Jennifer L Moran4, Mark J Daly2,3,4, Edward M Scolnick3,4, Pamela Sklar1,2,3,4.

1Psychiatric and Neurodevelopmental Genetics Unit, 2Center for Human Genetic Research, Massachusetts General Hospital, 185 Cambridge Street, Boston, MA 02114, USA; 3Stanley Center for Psychiatric Research, The Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; 4The Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; 5Queensland Institute of Medical Research, 300 Herston Road, Brisbane, Queensland 4006, Australia; 6Department of Psychological Medicine, MRC Centre for Neuropsychiatric Genetics and Genomics, School of Medicine, Cardiff University, Cardiff C14 4XN, UK; 7Departments of Genetics, Psychiatry, and Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; 8Research Department of Mental Health Sciences, Molecular Psychiatry Laboratory, University College London Medical School, Windeyer Institute of Medical Sciences, 46 Cleveland Street, London W1T 4JF, UK; 9Department of Psychiatry and Institute of Molecular Medicine, Neuropsychiatric Genetics Research Group, Trinity College Dublin, Dublin 2, Ireland; 10Division of Psychiatry, University of Edinburgh, Royal Edinburgh Hospital, Edinburgh EH10 5HF, UK; 11Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, SE-171 77 Stockholm, Sweden; 12Department of Neuroscience, Psychiatry, Ulleråker, Uppsala University, SE-750 17 Uppsala, Sweden; 13Center for Genomic Psychiatry, University of Southern California, Los Angeles, CA 90033, USA; 14Institute of Medical Sciences, 15Department of Mental Health, University of Aberdeen, Aberdeen AB25 2ZD, UK; 16Department of Medical Genetics, University Hospital Maichin Dom, Sofia 1431, Bulgaria; 17Department of Psychiatry, First Psychiatric Clinic, Alexander University Hospital, Sofia 1431, Bulgaria; 18West Berkshire NHS Trust, 25 Erleigh Road, Reading RG3 5LR, UK; 19Department of Psychiatry, University of Oxford, Warneford Hospital, Headington, Oxford OX3 7JX, UK; 20Ravenscraig Hospital, Inverkip Road, Greenock PA16 9HA, UK; 21State University of New York Upstate Medical University, Syracuse, NY 13210, USA; 22Washington VA Medical Center, Washington DC 20422, USA; 23Department of Psychiatry, Georgetown University School of Medicine, Washington DC 20057, USA; 24Department of Psychiatry, Virginia Commonwealth University, Richmond, VA 23298, USA; 25Department of Psychiatry, Sao Miguel, 9500-310 Azores, Portugal; 26Department of Psychiatry, University of Coimbra, 3004-504 Coimbra, Portugal.



We thank the volunteers, patients and their family members for participating in this study. This study was supported in part by a research grant (07R-1770) from the Stanley Medical Research Institute and an Independent Investigator Award from NARSAD to XC, and by grants to investigators involved in the collection and analyses of the samples from CATIE, GAIN, the International Schizophrenia Consortium and other independent samples (National Institutes of Health (MH41953, MH63480, MH56242, MH078075); NARSAD Young Investigator Award; Donald & Barbara Zucker Foundation, USA; the Medical Research Council and the Wellcome Trust Foundation, UK; the Research Council of Norway (Grant No. 163070/V50, 167153/V50); the South-Eastern Norway Health Authority (123/2004); Science Foundation Ireland and Health Research Board, Ireland; the Lundbeck Foundation and Danish National Advanced Technology Foundation, Denmark). The Ashkenazi samples are part of the Hebrew University Genetic Resource (HUGR). The principal investigators of the CATIE trial were Jeffrey A Lieberman, T Scott Stroup and Joseph P McEvoy. The CATIE trial was funded by a grant from the National Institute of Mental Health (N01 MH900001) along with MH074027 (PI PF Sullivan). Genotyping was funded by Eli Lilly and Company. The principal investigators for the MGS were Pablo Gejman and Douglas Levinson. The MGS study was supported by funding from the National Institute of Mental Health and the National Alliance for Research on Schizophrenia and Depression. Genotyping of part of the sample was supported by GAIN and the Paul Michael Donovan Charitable Foundation. Genotyping was carried out by the Center for Genotyping and Analysis at the Broad Institute of Harvard and MIT with support from the National Center for Research Resources.

Supplementary Information accompanies the paper on the Molecular Psychiatry website