INTRODUCTION

Susceptibility to schizophrenia is determined by multiple genetic and possibly environmental factors. Recent studies addressing the role of highly penetrant rare variants (Walsh et al, 2008; Xu et al, 2008, 2009, 2011, 2012) or common genetic variants with low effect (ISC, 2008; Lee et al, 2012; O'Donovan et al, 2008; Ripke et al, 2011; Shi et al, 2009, 2011; Stefansson et al, 2009) suggest that patient genomes contain risk alleles at a wide range of frequencies, some driving and some merely modifying disease risk and expression, which in concert may affect the structure and function of neural circuits (ISC, 2008; O'Donovan et al, 2008; Rodriguez-Murillo et al, 2012; Shi et al, 2009; Stefansson et al, 2009; Xu et al, 2008, 2009, 2011).

In complex diseases, the genetic structure of linkage signals most likely involves one or several rare alleles with strong effect on disease risk, or a combination of rare and common alleles, in the same or different genes (Bowden et al, 2010). Also, linkage analyses of inbred mice have shown that more than one gene can contribute to the same linkage signal for a given QTL trait (Karst et al, 2011). Along the same lines, association studies coupled with targeted resequencing have suggested that the same genes carrying common risk variants can also show an excess of rare risk variants implicated in the disease (Cirulli and Goldstein, 2010; Di Rienzo, 2006; Manolio et al, 2009; Trynka et al, 2011).

A 9-cM genome-wide linkage scan of families from the European-descent Afrikaner population of South Africa identified three linkage signals on chromosomes 1, 9, and 13 (Abecasis et al, 2004). Subsequently, we increased the genomic coverage to better define the linkage regions and performed a 2-cM genome-wide linkage scan on an extended set of Afrikaner families. This scan identified chromosome 13q32–34 as the most robustly linked locus in this population. We also addressed the contribution of rare CNVs to schizophrenia in this cohort and found that, at the level of resolution of the linkage scan, none of the linkage signals observed in these families is likely to be caused by CNVs within the linked genomic intervals (Xu et al, 2009).

Here, we present the results of our ongoing systematic effort to elucidate the genetic structure of our 13q32–34 linkage peak obtained in our 2-cM genome scan, by analyzing the contribution of local common variants via a multistage association study. In addition to genuine contributions to the risk associated with a given linkage signal, even in cases where the linkage signal is accounted for only by rare variants, common variants may in some cases help pinpoint with more accuracy the location of rare risk variants (Dickson et al, 2010; Lin et al, 2004; Sanna et al, 2011). With this aim, first, we genotyped 1223 individuals from 415 Afrikaner families for 723 SNPs localized within 13q32–34. Subsequently, the most significant SNPs were followed up in two independent family-based replication samples of European origin. One SNP showed replicated association in one of the two independent samples and remained significant after meta-analysis and correction for multiple testing. This SNP is located within the MYO16 (myosin XVI) gene (Patel et al, 2001; Yokoyama et al, 2011). Second, we performed a comprehensive fine-scale mapping of the genetic contribution of this gene with respect to common variation, by genotyping an independent set of families from the Afrikaner population for 102 SNPs within MYO16 and by imputing the rest of the HapMap SNPs within the gene boundaries. These analyses identified a preponderance of common variants implicated in schizophrenia within introns 2–6 of the gene MYO16. Furthermore, expression analysis of the MYO16 gene in brain samples from patients and controls identified a significantly elevated level of expression in patients with schizophrenia.

MATERIALS AND METHODS

We used a family-based approach studying families with at least one affected individual per family. Data sets are presented in Supplementary Table 1.

Afrikaner Cohorts

Affected families were recruited and diagnosed as part of our ongoing, large-scale genetic study of schizophrenia in the European descent Afrikaner population from South Africa, as described previously (Abecasis et al, 2004; Karayiorgou et al, 2004; Xu et al, 2008, 2009). Affected subjects were classified as either narrowly or broadly affected. The narrow diagnosis includes subjects with schizophrenia or schizoaffective disorder-depressive type, as described previously (Abecasis et al, 2004; Xu et al, 2009). The broad diagnosis includes all individuals classified under the narrow definition, as well as individuals with schizoaffective disorder-bipolar type (Xu et al, 2009).

Afrikaner Set 1

This data set includes the 143 families used for the linkage scan, plus an additional 272 families. The entire set comprises 474 affected individuals who meet the narrow diagnostic criteria or 741 who meet the broad diagnostic criteria.

Afrikaner Set 2

This data set includes 237 families, 85 of which have a family history of schizophrenia in the previous two generations. Two hundred and thirty-two individuals in these families meet the narrow diagnostic criteria, whereas 266 individuals meet the broad diagnostic criteria.

Rutgers Families

From the entire set of families collected under the NIMH Schizophrenia Genetics Initiative, maintained by the Rutgers University Cell and DNA Repository, we selected a subset of 301 families matched according to ethnicity. Our selected subset of Caucasian families of European ancestry includes a total of 1241 individuals (631 affected with schizophrenia).

US Families

Two hundred and ten trios (each consisting of one affected individual and both unaffected biological parents, for a total of 630 individuals) were included in this sample of Caucasian families of European descent recruited from the United States. All probands met full diagnostic criteria for schizophrenia or schizoaffective disorder. This data set and the methods of subject selection and clinical evaluation have been described previously by Sobin et al (2001, 2003).

GAIN Data Set

This study is part of the Genetic Association Information Network (GAIN) (ID phs000021.v.2.p1). Details on inclusion criteria and participants are available at dbGap (database of Genotype and Phenotype) (see URLs) (Suarez et al, 2006). In total, 1314 cases and 1368 controls of European descent were included in the final set.

MGS_nonGAIN Data Set

This study is part of the Molecular Genetics of Schizophrenia (MGS) genome-wide association study (ID phs000167). Details on inclusion criteria and participants are available at dbGap (see URLs). In all, 1405 cases and 1347 controls of European descent were included in the final set.

PGC Data Set

This data set is part of the Schizophrenia Psychiatric Genome-wide association study consortium (Ripke et al, 2011). We included the results from stage 1 mega-analysis published by Ripke et al (2011) that correspond to the MYO16 gene region overlapping SNPs genotyped or imputed in our Afrikaner Set 2 (SAF2) data set. This data set included 9394 schizophrenia cases.

Genotyping, Quality Control, and Imputation

SAF1

Family members were genotyped for 723 SNPs covering 14.65 Mb under the 13q32–34 linkage peak and within candidate genes (ZIC2, ZIC5, NALCN, FGF14, G72, and EFNB2) in the immediate vicinity of the linkage peak (Supplementary Table 2), on the Illumina GoldenGate platform at the Center for Inherited Disease Research (CIDR).

Rutgers and US samples

Family members were genotyped for 22 SNPs on a TaqMan Open Array Genotyping Platform (Applied Biosystems). These 22 SNPs were chosen from among the top-associated SNPs resulting from the association analysis in stage 1 (Afrikaner Set 1 (SAF1)) or surrogates of those (ie, in strong LD with at least one of the top-associated SNPs).

SAF2

This set of families was genotyped as part of a wider genotyping project on a Human Genome-Wide SNP Array 5.0 (Affymetrix), which contains 500 568 SNPs (manuscript in preparation). Samples were processed as described previously (Xu et al, 2008). Average call rate on arrays used in this study was 99.43%. All microarray experiments were performed in the Vanderbilt Microarray Shared Resource.

GAIN and MGS

Individual genotypes as well as phenotypic information were available to download from the dbGap website. Only individuals of European descent were included in the analysis.

For all data sets, quality control procedures per family, individual, and marker were performed with PLINK (Purcell et al, 2007) and PedStats (see URLs). We retained only samples with a call rate >95%. We eliminated from the analysis duplicated SNPs, monomorphic SNPs, and SNPs with Hardy–Weinberg exact test P<10⁻⁶. Only SNPs with a minor allele frequency over 0.01 were included in the downstream analyses. We also checked for Mendelian inheritance errors among families and removed SNPs with more than four Mendelian errors in the total sample. For the case–control data sets, we corrected for population stratification with the program EIGENSTRAT, eliminating outliers from the downstream analyses.
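The marker-level and sample-level filters described above can be summarised in a short sketch. This is only an illustration in Python and not the PLINK/PedStats workflow that was actually run; the `SnpStats` record, its field names, and the function names are hypothetical, and only the thresholds are taken from the text.

```python
from dataclasses import dataclass

# Thresholds quoted in the text; everything else here is a hypothetical layout.
MIN_MAF = 0.01               # keep SNPs with minor allele frequency over 0.01
MIN_HWE_P = 1e-6             # drop SNPs with Hardy-Weinberg exact test P < 10^-6
MAX_MENDEL_ERRORS = 4        # drop SNPs with more than four Mendelian errors
MIN_SAMPLE_CALL_RATE = 0.95  # keep samples with call rate > 95%


@dataclass
class SnpStats:
    """Per-SNP summary statistics (hypothetical structure)."""
    snp_id: str
    maf: float
    hwe_p: float
    mendel_errors: int


def snp_passes_qc(snp: SnpStats) -> bool:
    """Return True if the SNP survives all marker-level filters."""
    if snp.maf == 0.0:  # monomorphic SNP
        return False
    return (
        snp.maf > MIN_MAF
        and snp.hwe_p >= MIN_HWE_P
        and snp.mendel_errors <= MAX_MENDEL_ERRORS
    )


def sample_passes_qc(call_rate: float) -> bool:
    """Return True if the sample's genotype call rate exceeds 95%."""
    return call_rate > MIN_SAMPLE_CALL_RATE
```

Removal of duplicated SNPs and of stratification outliers is not shown, since both depend on tool-specific output.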

Imputation of non-genotyped HapMap SNPs for the SAF2, GAIN, and MGS data sets was performed with MACH (see URLs) using 100 Markov iterations with the two-step procedure recommended in the manual. HapMap phased haplotypes (release 22) from CEU subjects were used in the imputation. After imputation, only SNPs with a MACH R² over 0.3 were considered further. This statistic estimates the squared correlation between imputed and true genotypes; a value <0.3 flags poorly imputed SNPs (Li et al, 2010). In addition, Mendelian checks (for the family-based samples) and Hardy–Weinberg equilibrium tests were performed to eliminate unreliable imputation calls before imputed genotypes were included in downstream analyses. Imputed SNPs were then analyzed in the same way as the genotyped SNPs.

Statistical Analyses

Family-based association testing for single SNPs was performed using LAMP (see URLs) (Li et al, 2005, 2006). We adopted a free model for the analysis that does not constrain the penetrances for the three genotypes. Haplotype-based associations were assessed by means of the transmission disequilibrium test (TDT) for haplotypes implemented in PLINK. For the case–control data sets, a trend test was performed to evaluate the SNP association. We applied Bonferroni correction in all tests to obtain an α-corrected threshold. We calculated the number of independent tests in each case based on LD patterns between SNP pairs. These procedures were performed in PLINK (see URLs).
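For readers unfamiliar with the TDT, the basic single-allele form of the test can be sketched as follows. This is a simplified illustration in Python under stated assumptions: it takes counts of transmissions from heterozygous parents as input and does not reproduce the haplotype phasing performed by PLINK or the likelihood model used by LAMP. The function name `tdt` is hypothetical.

```python
import math
from typing import Tuple


def tdt(transmitted: int, untransmitted: int) -> Tuple[float, float]:
    """Basic transmission disequilibrium test.

    `transmitted` / `untransmitted` count how often heterozygous parents
    passed the allele (or haplotype) of interest to an affected child,
    or failed to. Returns (chi-square statistic, P-value on 1 df).
    """
    n = transmitted + untransmitted
    if n == 0:
        raise ValueError("no informative transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / n
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Under the null hypothesis of no linkage or association, each allele of a heterozygous parent is transmitted with probability 0.5, so a marked excess of untransmitted counts corresponds to the "undertransmission" pattern.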

Meta-analysis of the results for the independent samples was performed with METAL (see URLs). The algorithm checks for heterogeneity and performs meta-analysis under a fixed-effects model. All base pair positions are based on the current Human genome assembly (hg19) (see URLs).
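As an illustration of how P-values from independent samples can be combined under a fixed-effects model, the sketch below uses a sample-size-weighted Z-score. This is an assumption made for the example: the text does not state which weighting scheme was selected, and the heterogeneity check is not reproduced. The function name and the input layout are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def weighted_z_meta(studies):
    """Combine two-sided P-values with a sample-size-weighted Z-score.

    `studies` is a list of (p_value, direction, n) tuples, where
    0 < p_value <= 1, direction is +1 or -1 for the effect of the
    reference allele, and n is the study's sample size.
    Returns (combined Z, combined two-sided P-value).
    """
    numerator = 0.0
    sum_sq_weights = 0.0
    for p_value, direction, n in studies:
        # Convert the two-sided P-value into a signed Z-score.
        z = _STD_NORMAL.inv_cdf(1.0 - p_value / 2.0) * direction
        weight = sqrt(n)
        numerator += weight * z
        sum_sq_weights += weight * weight
    z_combined = numerator / sqrt(sum_sq_weights)
    p_combined = 2.0 * (1.0 - _STD_NORMAL.cdf(abs(z_combined)))
    return z_combined, p_combined
```

With this scheme, studies that agree in direction reinforce each other, whereas studies with opposite directions partly cancel, which is why consistency of direction across data sets matters when interpreting a combined P-value.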

To identify duplicated individuals and family relationships between individuals across data sets, we performed identity-by-descent analysis of GAIN, MGS_nonGAIN, and Rutgers samples merged together using PLINK. Duplicated and related individuals across data sets were removed from all but one of the data sets to avoid bias in the analysis. Specifically, GAIN and MGS_nonGAIN included 10 duplicated individuals that were removed from the larger MGS_nonGAIN data set.

Expression Analysis

Total RNA from frontal cortex was obtained from the Stanley Medical Research Institute (SMRI) (Bethesda, MD) (see URLs). The SMRI Array Collection includes 35 individual subjects in each of three groups: control, schizophrenia, and bipolar disorder subjects (Torrey et al, 2000). qRT-PCR was performed with a pre-designed TaqMan Gene Expression assay (Applied Biosystems; ABI assay number Hs01031284_m1) on a 7900HT Fast Real-Time PCR system (Applied Biosystems). Human glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was used as the endogenous control. Relative expression was compared across the three groups (schizophrenia, bipolar disorder, and controls) using generalized linear models (GLM) that incorporated covariates. Descriptive statistics, comparisons of means, and GLM were computed with R statistics software.
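The normalisation of a target gene to an endogenous control can be illustrated with the comparative Ct calculation. This is an assumption for illustration only: the text does not specify the exact quantitation formula, and the sketch below does not include the GLM step or covariates. Function names are hypothetical.

```python
from statistics import mean


def delta_ct(ct_target: float, ct_reference: float) -> float:
    """Normalise the target gene's Ct to the endogenous control's Ct."""
    return ct_target - ct_reference


def fold_change(case_delta_cts, control_delta_cts) -> float:
    """Relative expression of cases vs controls by the 2^(-ddCt) rule.

    Both arguments are sequences of per-subject delta-Ct values.
    A result above 1 means higher expression in cases.
    """
    delta_delta_ct = mean(case_delta_cts) - mean(control_delta_cts)
    return 2.0 ** (-delta_delta_ct)
```

Because a lower Ct means more template, a case group whose mean delta-Ct is one cycle lower than that of controls corresponds to roughly a twofold higher expression level.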

RESULTS

SNP Association in a Discovery set of Afrikaner Families, Replication, and Meta-Analysis

To follow up the 13q32–34 linkage signal obtained through our 2-cM coverage linkage scan, we genotyped 723 SNPs from chromosome 13 in 1223 individuals from the 143 Afrikaner families included in the 2-cM linkage scan plus 272 additional families from the same homogeneous population (SAF1). We performed family-based association tests on these 415 families using LAMP (Li et al, 2005, 2006). One hundred and fifteen SNPs reached nominal significance at this stage (Figure 1 and Supplementary Table 2). None of these SNPs were within candidate genes (ZIC2, ZIC5, NALCN, FGF14, G72, and EFNB2) abutting the linkage peak. Twenty-two associated SNPs were followed up in two independent samples of European descent (Rutgers and US samples, see Materials and methods). These 22 SNPs were selected from the top-associated SNPs, or surrogates of these, based on the LD structure (each one representing one independent LD block) and the availability of a genotyping assay on the TaqMan Open Array Genotyping platform (Applied Biosystems). Family members from the replication samples (Rutgers and US) were genotyped for these 22 SNPs (Table 1). Following quality control procedures, seven SNPs were removed from the analysis: two because of bad calls, four because of Mendelian errors (these were not concentrated in specific families), and one because of deviation from Hardy–Weinberg equilibrium. The remaining 15 SNPs were tested for association in the replication samples using LAMP. Table 1 shows the P-values and odds ratios for the association in the discovery sample (SAF1) and in the replication samples (Rutgers and US). Subsequently, meta-analysis was performed combining P-values obtained from the SAF1 and both replication samples. Meta-analysis identified one SNP with combined P-values that survive Bonferroni correction for multiple testing (Table 1) (corrected α=0.0023). The top-associated SNP, rs9583277, has meta-analysis P-values of 1.86 × 10⁻⁴ and 2.25 × 10⁻⁴ in the combined SAF1–Rutgers–US family sample for the narrow and broad definitions of schizophrenia, respectively. It is worth noting that the Bonferroni correction we used to declare significance reflects the number of independent tests (n) we performed (n=22, corrected α=0.05/22=0.0023) and is therefore not as stringent as the thresholds used in genome-wide association studies (GWAS), which reflect corrections for 1 million tests. The identified variant (rs9583277) maps to 109 333 749 bp on chromosome 13q33.3, within the second intron of the MYO16 gene. Our previous linkage analysis indicated dominant inheritance for the risk locus at 13q32–34 (Xu et al, 2009). Consistent with this finding, we did not detect any excess of homozygosity (an indication of a recessive mode of inheritance) at rs9583277, either when the entire SAF1 data set was considered or upon stratified analysis including only families linked to 13q32–34 (data not shown). Overall, our analysis, using densely spaced SNPs to fine map the prior 13q32–34 linkage peak region in a discovery and two replication family samples (a total of 923 families), highlighted a potential contribution of the MYO16 gene locus.
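The significance threshold quoted above follows directly from dividing the nominal α by the number of independent tests. A minimal Python sketch of that arithmetic, using only values stated in the text (the function and variable names are hypothetical):

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Stage 1 follow-up: 22 independent tests at a nominal alpha of 0.05.
threshold = bonferroni_threshold(0.05, 22)  # about 0.0023

# Meta-analysis P-values reported for rs9583277 (narrow, broad definition).
P_NARROW = 1.86e-4
P_BROAD = 2.25e-4
survives = [p < threshold for p in (P_NARROW, P_BROAD)]
```

Both reported P-values fall below the corrected threshold, whereas neither would pass a genome-wide threshold based on 1 million tests (0.05/1 000 000).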

Figure 1 Genome-wide linkage and fine mapping. Bd, broad; LOD, logarithm of the odds; Na, narrow; SNP, single-nucleotide polymorphism.

Table 1 LAMP P-Values and Risk Alleles for the Discovery (SAF1) and Replication (Rutgers and US) Family Samples

Fine Mapping of the Common Variant Association Signal Using MYO16 HapMap SNPs

Having identified a significant association with a genetic variant (rs9583277) within the MYO16 gene, we then performed a comprehensive fine-scale mapping of the common variant association signal in an independent set of families and cases. To this end, we genotyped, or imputed when necessary, all HapMap SNPs within the MYO16 gene boundaries, according to UCSC genome browser genome positions (hg19). First, we examined a sample of 228 Afrikaner families with an average of 3.2 individuals per family (SAF2). Following quality control procedures (see Materials and methods), 102 genotyped and 470 imputed HapMap SNPs were available for analysis of the MYO16 gene locus with respect to underlying common risk variants. It should be noted that there is no overlap with MYO16 SNPs genotyped in SAF1 and that the SNP previously found associated in the SAF1 sample (rs9583277) was not genotyped in the SAF2 sample as it is not a HapMap SNP nor is it present in the common genotyping platforms. Therefore, this stage is not intended to be a replication of the previous findings, but a deeper characterization of the common variation within MYO16 in the context of schizophrenia. Even though several SNPs genotyped at this stage are located in the general vicinity of rs9583277 within the MYO16 gene, rs9583277 is in a region of low LD.

Figure 2 shows LAMP P-values for the association of MYO16 SNPs for both narrow and broad definitions of schizophrenia, along with the recombination frequency across the region. Table 2 shows the top-associated SNPs within this data set. Notably, four SNPs showed significant association with narrow definition schizophrenia after correction for multiple testing (corrected α=2.17 × 10⁻⁴, based on a Bonferroni correction after estimating the number of independent tests at 230, taking into account the LD pattern among SNP pairs). Three of these four SNPs also showed association with the disease under its broader definition. All four SNPs were located within intron 3 of the MYO16 gene, within an LD block that extends from intron 2 to intron 6.

Figure 2 The plot depicts the negative logarithm of the SNP association P-values for the Afrikaner Set 2 data set with narrow (SAF2(NARROW)) and broad (SAF2(BROAD)) schizophrenia diagnosis, as well as the meta-analysis P-values for the combined sample SAF2(BROAD)/GAIN/MGS data sets (META(BROAD)). The background graph represents the recombination rate throughout the region. GAIN, Genetic Association Information Network; MGS, Molecular Genetics of Schizophrenia.

Table 2 LAMP P-values and Maximum-Likelihood Estimates of Penetrance, GRR, PAR, and OR for Top-Associated SNPs in SAF2

We also investigated if there was any specific configuration of alleles or haplotypes conferring susceptibility to schizophrenia for either narrow or broad definition. Only directly genotyped SNPs were used to test association on haplotypes. Haplotype-based association in Afrikaner families was assayed with the TDT. First, we estimated haplotype blocks based on the LD structure by means of the default procedure implemented in Haploview. Subsequently, each haplotype within each block was tested for association with the hap-tdt option implemented in PLINK. In this manner, we tested 74 haplotypes, each comprising 2–10 SNPs. Table 3 shows the top-associated haplotypes for either schizophrenia definition. Two distinct two-SNP haplotypes show undertransmission and significant association with schizophrenia (corrected α=6.76 × 10⁻⁴, 74 independent tests). It is worth noting that, of the two haplotypes with significant P-values, the CG haplotype including SNPs rs558322 and rs4976845 is associated with the narrow definition, and the GA haplotype including rs4578513 and rs10492418 is associated with the broad definition. Notably, these two haplotypes reside within distinct haplotype blocks, suggesting that the observed association signals are independent of each other. Of note, there are other examples where independent haplotypes are associated with distinct forms of a disease (Cruz et al, 2008). In our study, the two independent haplotypes might be acting as modifiers of the clinical presentation or reflect two distinct patient sub-populations.

Table 3 Haplotypic Association

We extended our follow-up studies to two additional, independent, case–control data sets, which are part of genome-wide genotyping projects (GAIN and MGS). To this end, we extracted SNP genotypes located within the MYO16 gene boundaries and also imputed non-genotyped HapMap SNPs from this region to facilitate comparison with the SNPs in our SAF2 data set. The SNPs extracted from GAIN and MGS data sets matched the SNPs in SAF2, and so the LD patterns were equivalent. Therefore, we used the same significance threshold for the SAF2 as well as the GAIN and MGS data sets. Following quality control procedures, 572 single SNP P-values from the three data sets, SAF2, GAIN, and MGS (a total of 3307 cases), were meta-analyzed. The lowest combined P-value after meta-analysis was 1.1 × 10⁻³ for the combined sample SAF2 (broad status of SCZ)-GAIN-MGS for SNP rs4772996 (Table 4). Although this SNP does not survive the correction for multiple testing when considering 230 independent tests (corrected α level, 0.05/230=2.17 × 10⁻⁴), it is important to note that the direction of association is consistent across all three data sets for the top-associated SNPs. Moreover, all top-associated SNPs following meta-analysis are located within intron 4 of the gene, in complete LD with the top-associated SNPs in the SAF2 data set (D′=1), strongly suggesting that the association signal obtained upon meta-analysis points to the same associated region within the MYO16 gene. The fact that these SNPs do not reach significance after correction for multiple testing likely reflects the presence of heterogeneity across data sets (Table 4). Sample heterogeneity also likely explains the change in ranking among top SNPs. Specifically, although the top-associated SNPs in the SAF2 data set continue to show nominally significant association in the meta-analysis, they are not present among the top-ranking SNPs (Table 4). However, top-ranking SNPs from either data set are in high LD with each other and likely represent the same association signal.

Table 4 P-Values for the Individual Replication Samples and p-Values Following Meta-Analysis (Meta-p)

We further compared the results obtained in our SAF2 data set with recently available results from the Schizophrenia Psychiatric GWAS Consortium (Ripke et al, 2011). The PGC study is a meta-analysis that combines various data sets, including GAIN and MGS. Therefore, meta-analysis of our data set and the PGC data set is not intended as a replication, but as a test of our hypothesis using a more extensive set of data. We extracted results for SNPs mapping within MYO16 and performed a meta-analysis following the same procedure as with the SAF2-GAIN-MGS data sets. We meta-analyzed 248 SNPs overlapping across data sets, for a total of 22 640 individuals. The top-associated SNP following meta-analysis is rs9284246 (Supplementary Table 3), located within intron 2 of the MYO16 gene (109 327 788 bp). This finding further points to the region across introns 2–6 as the most likely region to harbor common variants implicated in schizophrenia.

Expression Analysis

In seeking convergent supporting evidence, we also tested the expression levels of the MYO16 gene in brains of patients with schizophrenia. Our analysis of the SMRI Array Collection using qRT-PCR showed that mean levels of expression of MYO16 were significantly higher in the frontal cortex of schizophrenia patients as compared with controls (F(1, 66)=4.2; p=0.044). The significance holds when we incorporate either sex and age at death (F(3,64)=3.008; p=0.037) or brain pH and post-mortem interval (F(3,64)=2.778; p=0.048) as covariates in our analysis. The comparison of the bipolar group with controls did not show a significant difference, although mean levels of expression were slightly higher in the bipolar group (see Supplementary Figure 1 for a scatter plot of expression levels). Furthermore, 6 out of 11 expression studies that have profiled the SMRI Array Collection samples using array technology reported increased levels of MYO16 expression in schizophrenia patients vs controls.

We also tested 28 SNPs genotyped in the Stanley Array Collection, located at both ends of our significant SNPs, but none of these SNPs showed association with MYO16 expression levels (p>0.05). It should be noted that SNP rs9583277 as well as most of the significant SNPs in SAF2 were not included in this set as they had not been genotyped in the Stanley Array Collection.

DISCUSSION

This study used seven patient cohorts and a dense array of SNPs to fine map the prior linkage region at the 13q32–34 locus. We provide evidence suggesting that variants within MYO16 contribute to the genetic liability to schizophrenia conferred by the 13q32–34 locus. The MYO16 gene spans 611 856 bp on chromosome 13q33. It consists of 35 exons and has several isoforms. All associated SNPs from SAF1, SAF2, and meta-analysis, and one haplotype from SAF2, are located within introns 2–6 of the gene. Considering that there was no significant excess of total genotyped SNPs in this region, this finding indicates that the signal related to common risk variation from this gene is likely localized in this region of the gene. It should be noted that incorporating the initial findings in the meta-analyses is necessary because of the small effect sizes of common variants and the need to increase the power of our association study, despite the potential for introducing biases (Zeggini and Ioannidis, 2009). The effect of the associated SNPs on the function of the MYO16 gene remains unknown. It should be noted, however, that one of the most significant SNPs in the SAF2 data set (rs9301323, P-value=1.7 × 10⁻³), which is in strong LD with the top-associated SNP in the same data set (rs9520990), is located within a splice site region in intron 6 (see URLs) and could affect the pattern of splicing of the MYO16 gene. This position is conserved in the mouse. The Human Splicing Finder program (Desmet et al, 2009) indicates that the minor allele (G) of rs9301323 disrupts a predicted branch point sequence in intron 6.

Myosin XVI appeared very recently during the evolution of mammals and is unique in both its structure and function (Thompson and Langford, 2002). Earlier evidence suggested that MYO16 is important for neuronal migration and brain development (Patel et al, 2001). More recently, MYO16 has been implicated in neuronal phosphoinositide 3-kinase (PI3K) signaling (Yokoyama et al, 2011), an extensively studied pathway involved in neuronal function and morphogenesis, as well as in a number of neurological and psychiatric disorders, including schizophrenia and autism (Waite and Eickholt, 2010). MYO16 is a member of the neuronal tyrosine-phosphorylated adaptor for the PI3K (NYAP) family of phosphoproteins, which comprises NYAP1, NYAP2, and Myo16/NYAP3. The NYAPs are expressed predominantly in developing neurons and, upon stimulation with Contactin5, are tyrosine phosphorylated by Fyn. Phosphorylated NYAPs interact with PI3K p85 and activate PI3K, Akt, and Rac1. In addition, NYAPs interact with the WAVE1 complex, thus serving as a bridge for a PI3K–WAVE1 interaction, which mediates PI3K-dependent remodeling of the actin cytoskeleton. Importantly, disruption of the NYAP genes in mice affects brain size and neurite elongation (Yokoyama et al, 2011). Notably, meta-analysis of the SAF2, GAIN, and MGS data sets (a total of 2956 cases) showed a gene-wise significant association (P-value of 1.8 × 10⁻⁵) with a SNP located within the third intron of the NYAP2 gene (rs1897227), suggesting that variation within this gene family may be modulating the risk of schizophrenia (data not shown).

Additional supporting evidence was provided by expression analysis in brain samples (frontal cortex), which revealed a significant increase in the levels of MYO16 expression in schizophrenia patients compared with controls. Further convergent supporting evidence could be found in the existing literature. First, according to the SCAN database (see URLs), the top-associated MYO16 SNP rs9583277 is a potential trans-acting eQTL (expression quantitative trait locus) for the MAP3K13 (mitogen-activated protein kinase kinase kinase 13) gene on chromosome 3q27 (p=8 × 10⁻⁵). Given a potential convergence of MAP3K13 and PI3K pathways (Ambacher et al, 2012), regulation in trans of MAP3K13 may be mediated by altered MYO16 activity. Interestingly, MAP3K13 can phosphorylate MAP2K7 (mitogen-activated protein kinase kinase 7), which has been recently implicated in schizophrenia (Winchester et al, 2012). In addition, the seven top-associated SNPs identified by our meta-analysis of the SAF2-GAIN-MGS data sets (Table 4) are reported by the SCAN database to have a trans-acting effect on the expression of PAG1 (phosphoprotein associated with glycosphingolipid microdomains 1) on chromosome 8q21.23 (p=2 × 10⁻⁶), a gene implicated in brain maturation (Lindquist et al, 2011). Notably, we have previously reported a non-synonymous de novo mutation within PAG1 in a schizophrenia proband (Xu et al, 2011, 2012). Finally, a previous study (Nakayama et al, 2002) reported a physical interaction between the gene products of MYO16 and NRXN1, a synaptic neuronal adhesion molecule that connects presynaptic and postsynaptic neurons and has an important role in cognitive processes (Sudhof, 2008). Rare and recurrent deletions disrupting NRXN1 have been reported in patients with schizophrenia and neurodevelopmental disorders. Furthermore, MYO16 has been identified as a candidate risk gene in a genome-wide association study of autism, where suggestive association signals were reported in two independent discovery cohorts (Wang et al, 2009), as well as in GWAS of alcohol response (Joslyn et al, 2010) and smoking cessation (Rose et al, 2010).

Although our results suggest that common variation within MYO16 may contribute to the genetic liability to schizophrenia, we cannot exclude the possibility that common variants within MYO16 act in combination with or as surrogates of rare alleles with strong effect in the same or different genes to generate the observed linkage signal in the 13q32–34 locus. We started addressing this question using inherited exonic variant data extracted from our recent whole-exome sequencing study in 146 Afrikaner and 85 US parent–proband trios afflicted with schizophrenia or schizoaffective disorder (Xu et al, 2012). Trios used in the present study and the study by Xu et al (2012) overlap by 50% (72% if we considered just the South African sample). None of the MYO16 variants located in exons 2–6 (Supplementary Table 4) are in LD with associated SNPs, show differential enrichment in cases vs controls, or show strong allele transmission distortion in affected families. Also, no homozygous or compound heterozygous carriers were identified. Although further analysis in expanded samples and in linked families is required, these results suggest that the association observed with common variants of the MYO16 gene is unlikely to be due to rare exonic variants.

Our results establish MYO16 as a novel candidate gene for schizophrenia. Interpretation of our findings awaits replication in independent data sets.

FUNDING AND DISCLOSURE

This work was supported in part by National Institute of Mental Health (NIMH) Grant MH061399 (to MK) and the Lieber Center for Schizophrenia Research at Columbia University. LRM was partially supported by a Gray Matters Fellowship and BX was partially supported by an NARSAD Young Investigator Award. The authors declare no conflict of interest.