We agree with Dr Costas¹ that the investigation of rare genetic variants in multifactorial disorders raises particular methodological concerns. We disagree, however, with his conclusions regarding our experimental evidence implicating rare SH3 and multiple ankyrin repeat domains 2 (SHANK2) variants in schizophrenia (SCZ).

First, Dr Costas comments on the potential bias introduced by using a minor allele frequency of <1% in controls to define rare variants, as this may lead to an increase in type I errors. We entirely agree with Dr Costas on the issue of potential bias, and were, of course, aware of this during the study design phase. At that time, we opted to perform an initial genetic discovery step followed by functional validation in order to test the adverse functional consequences of newly identified patient-specific variants. It is essential to clarify here that in the genetic discovery step, our primary aim was not to minimize type I errors but rather to minimize type II errors. The acceptance of an increase in type I errors is an inherent aspect of this approach. In contrast to the impression created by Dr Costas, the authors of the two cited references²,³ discuss these opposing effects, and refrain from drawing simple conclusions. Lemire,² for example, states that an approach such as that used by our group (that is, defining rare variants on the basis of allele frequency in controls) has certain advantages, namely, ‘working with a set of rare variants defined as those with a frequency calculated in the controls below a certain threshold (as opposed to, say, a frequency estimated from the combined sample of cases and controls) is that this procedure imposes no bounds on how high the frequency may get in the cases, which is a desirable effect.’ Similarly, Pearson³ expresses caution when he suggests that researchers should ‘use a definition of a rare variant as those SNPs having minor allele frequency below some threshold (for example <1%) in the combined set of cases and controls, possibly in addition to an analysis based on the frequency in controls alone’.
The approach suggested by Dr Costas (that is, defining rare variants on the basis of allele frequency in the combined case–control sample) introduces an additional source of type II error in a situation where both rare risk variants and rare protective variants are present. If the combined approach is used in this scenario, the opposing effects may result in a (false) negative finding. Indeed, the recent study by Duan et al.,⁴ which investigated the contribution of rare variants at the MIR137/MIR2682 locus in SCZ and bipolar disorder, defined variants as being rare if they fell below a given threshold in either cases or controls in order to capture both risk and protective variants. Therefore, we consider our definition of rare variants to be appropriate for our particular study design.
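The difference between these definitions can be illustrated with a minimal sketch. All allele counts below are hypothetical and chosen only for illustration; they are not taken from our study or from any of the cited references. The `either` rule is one plausible reading of a definition based on frequency in either cases or controls, not a reproduction of any published pipeline.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    name: str
    case_alt: int    # minor-allele count in cases (hypothetical)
    case_total: int  # total alleles genotyped in cases
    ctrl_alt: int    # minor-allele count in controls (hypothetical)
    ctrl_total: int  # total alleles genotyped in controls


def maf_cases(v: Variant) -> float:
    return v.case_alt / v.case_total


def maf_controls(v: Variant) -> float:
    return v.ctrl_alt / v.ctrl_total


def maf_combined(v: Variant) -> float:
    return (v.case_alt + v.ctrl_alt) / (v.case_total + v.ctrl_total)


def is_rare(v: Variant, definition: str, threshold: float = 0.01) -> bool:
    """Apply one of three illustrative rare-variant definitions."""
    if definition == "controls":   # frequency in controls only
        return maf_controls(v) < threshold
    if definition == "combined":   # frequency in cases and controls pooled
        return maf_combined(v) < threshold
    if definition == "either":     # below threshold in cases or in controls
        return min(maf_cases(v), maf_controls(v)) < threshold
    raise ValueError(f"unknown definition: {definition}")


# Hypothetical variants: 2000 alleles genotyped per group.
variants = [
    Variant("risk_like", 40, 2000, 10, 2000),        # 2.0% cases, 0.5% controls
    Variant("protective_like", 10, 2000, 40, 2000),  # 0.5% cases, 2.0% controls
    Variant("neutral_rare", 6, 2000, 6, 2000),       # 0.3% in both groups
]

retained = {
    d: [v.name for v in variants if is_rare(v, d)]
    for d in ("controls", "combined", "either")
}

for definition, names in retained.items():
    print(f"{definition:>8}: {names}")
```

With these made-up counts, the pooled frequency of both the risk-like and the protective-like variant is 1.25%, so the combined definition discards both, whereas the controls-only definition retains the risk-like variant and the either-group definition retains all three. The sketch covers the filtering step only; it does not quantify the accompanying increase in type I error that Dr Costas rightly points out.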

The second point raised by Dr Costas was that the observed association is likely to be attributable to population stratification. His argument is based on frequency differences for a single variant (p.Y967C) between controls of European origin from public databases and our German controls. No significant frequency differences exist for any other variants in SHANK2. In view of this, we find it difficult to follow Dr Costas’s reasoning in stating that there is ‘strong evidence for population stratification’ and in failing to consider other sources of type I error. For the disease-associated variant p.A1731S, we have now performed a more detailed investigation of the possibility of a founder effect because the identity-by-state analysis presented in our publication only excludes a very close familial relationship among the carriers. For this purpose, we performed a phased haplotype analysis of the four A1731S carriers using markers spanning a 380-kb region flanking the SHANK2-A1731S variant. To obtain frequency estimates for the observed haplotypes, we performed a similar investigation in 120 population-based controls. This analysis generated no evidence for a founder effect, as neither a rare nor a common haplotype was shared among all four carriers. Three carriers shared a haplotype, but this was the most common haplotype in the population. These findings render the possibility of a founder effect unlikely and, consequently, provide no support for the hypothesis that population stratification is the cause of the observed association.

Dr Costas states that Peykov et al.⁵ selected patient-specific variants for functional assays on the basis of a hypothesis. This is a misunderstanding. Functional data for variants confined to controls, and those for variants found in both patients and controls, were already available,⁶ and we were therefore able to focus our functional analysis on the patient-specific variants. Thus, data for all three classes of variants were available. We refer Dr Costas to Table S8 of our publication, which provides a comprehensive overview of all functionally analyzed variants in controls and patients from both studies⁵,⁶ and lists the effect of SHANK2 variants on synaptic density in: (1) controls; (2) SCZ patients and controls; and (3) SCZ patients only. Our experiments generated unequivocal evidence that SCZ-specific SHANK2 variants lead to a more pronounced decrease in synapse number compared to variants detected in controls. For the four analyzed SCZ patient variants, a substantial reduction in synaptic density was detected. Only one of the six variants from controls showed a comparable effect size.⁵,⁶ In summary, the approach used by both Peykov et al.⁵ and Leblond et al.⁶ has clearly demonstrated that the functional impact of variants identified in SCZ and autism spectrum disorder patients is significantly more pronounced than that of variants found in controls. The only valid point raised by Dr Costas in this respect is that ideally, functional analyses should be performed for all of the variants detected in patients and controls. However, the selection of specific variants for functional studies is a widely used and accepted validation strategy.

In conclusion, while we naturally agree that replication studies in larger cohorts should be performed to substantiate our findings, we stand firm in our opinion that the combination of genetic and functional data presented in our paper is sufficiently strong to suggest a causative role for the rare SHANK2 variants in SCZ.