Schizophrenia has a strong genetic component involving rare and common alleles distributed across many genes1,2. Common alleles confer weak effects (OR < 1.1 in general) but collectively account for a substantial proportion of genetic liability to the disorder3. On the other hand, some rare genetic variants, such as copy number variants (CNVs), confer larger risks for schizophrenia4,5. A combined analysis of published datasets has shown strong evidence for association between 11 individual CNVs and risk of schizophrenia and weaker evidence for 4 additional risk CNVs6. Duplications of 22q11.2 have also been associated with reduced risk of schizophrenia7.

A gender bias in prevalence has been repeatedly observed in neurodevelopmental disorders. In autism spectrum disorder (ASD), the mean male-to-female ratio is about 4:1 (ref. 8), whereas in schizophrenia the male-to-female ratio is 1.4:1, though this varies by age, reflecting sex-related differences in age at onset9. Sex differences in schizophrenia have also been reported for structural and functional brain abnormalities whose origins are thought to occur during brain development10,11,12,13,14.

A recent study found an excess of deleterious autosomal CNVs in females compared with males in a neurodevelopmental disorders cohort and in an ASD cohort15. Females with ASD have also been found to possess a higher burden of de novo CNVs and genes disrupted by de novo CNVs than males with ASD16,17,18. The lower prevalence of ASD and higher mutational burden in females with the disorder is consistent with a different liability threshold for females compared to males whereby females require a greater risk factor load to manifest neurodevelopmental disorders15,19,20. A higher female autosomal burden of large, rare CNVs, attributed to lower rates of foetal loss, has also been reported in the general population21.

Given the excess burdens of CNVs in females in both neurodevelopmental and control samples, we tested whether similar differences in burden exist in schizophrenia using a large case-control dataset (N = 31,139 individuals). We evaluated the CNV burden in females and males at a genome-wide level and for 11 strongly associated schizophrenia risk loci. We also tested whether associations between specific CNVs and schizophrenia are robust to adjusting for gender. Consistent with earlier findings, we found an increased burden of large (≥500 kb), rare (<1%) CNVs in female controls and now report an excess in female cases with the disorder. Although female cases also had an excess burden compared with males for 11 CNVs that have been implicated in schizophrenia, the association between those CNVs and schizophrenia was not diminished after controlling for gender.


Females carry an excess of CNVs both genome-wide and at specific loci

Females had a genome-wide excess of large (≥500 kb) and rare (<1%) CNVs in cases (OR = 1.11, 95% CI = 1.00–1.23, P = 0.045) and controls (OR = 1.17, 95% CI = 1.06–1.29, P = 0.0012) (Fig. 1, Supplementary Table S1), although only the excess in female controls survives correction (P = 0.0048) for 4 independent tests (CNVs < 500 kb and CNVs ≥ 500 kb, tested separately for cases and controls). While the excess CNV burden in female cases does not survive correction for multiple testing, the size of the effect is, importantly, not significantly different from that in female controls (Z-test P = 0.24). The effect sizes for schizophrenia conferred by all large, rare CNVs in females (OR = 1.24, 95% CI = 1.12–1.38, P = 8.22 × 10⁻⁵) and in males (OR = 1.32, 95% CI = 1.20–1.45, P = 4.13 × 10⁻⁹) did not differ significantly (Z-test P = 0.19).
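The multiple-testing step above can be illustrated with a short sketch. It assumes a Bonferroni-style correction (each P value multiplied by the number of tests), which is consistent with the reported corrected value but is not named explicitly in the text; the function name `bonferroni` is introduced here for illustration only.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-style correction: scale P by the number of tests, capped at 1."""
    return min(1.0, p_value * n_tests)

# Uncorrected P values for the female excess of large, rare CNVs (from the text).
uncorrected = {"female cases": 0.045, "female controls": 0.0012}

# Four tests: CNVs < 500 kb and CNVs >= 500 kb, each in cases and in controls.
N_TESTS = 4
corrected = {group: bonferroni(p, N_TESTS) for group, p in uncorrected.items()}

for group, p in corrected.items():
    status = "survives" if p < 0.05 else "does not survive"
    print(f"{group}: corrected P = {p:.4f} ({status} at alpha = 0.05)")
```

Under this assumption the control result corresponds to 0.0012 × 4 = 0.0048, while the case result rises above 0.05.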

Figure 1

Gender CNV burden in the combined dataset.

CNV burden is compared between males and females for both case and control samples. The combined CLOZUK, MGS and ISC dataset consists of 9,172 male and 4,104 female case samples and 8,807 male and 9,056 female control samples.

When deletions and duplications were analysed separately (Supplementary Table S1), we found that the contribution from large deletions to the excess of CNVs in control females was significantly greater than that from large duplications (Z-test P = 0.013). In cases, no significant difference was observed between the contribution of large deletions and duplications to the CNV excess in females (Z-test P = 0.44, Supplementary Table S1). No difference in the burden of smaller CNVs (<500 kb) was observed between males and females for cases or controls (Fig. 1, Supplementary Table S1). A breakdown of CNV burden in the datasets that constitute our case and control samples can be found in Supplementary Table S2. Despite the excess burden of large deletions in female controls, we found no significant difference between males and females for the number of genes overlapping CNVs for any class of CNV tested (Supplementary Table S3).

The burden of 11 schizophrenia risk CNVs was significantly higher in female cases than in male cases (OR = 1.38, 95% CI = 1.10–1.73, P = 0.0055) but not in female controls compared with male controls (OR = 1.12, 95% CI = 0.80–1.55, P = 0.52) (Fig. 1, Supplementary Table S1). However, the effect sizes in these two tests did not differ significantly (Z-test P = 0.15). When we compared the risk for schizophrenia conferred by the set of 11 CNV loci in females (OR = 3.56, 95% CI = 2.66–4.75, P = 2.06 × 10⁻¹⁸) and males (OR = 2.85, 95% CI = 2.15–3.76, P = 3.66 × 10⁻¹⁵), we found no significant difference in effect size (Z-test P = 0.14). Hence, the non-significant test by sex in the controls might reflect lower power, given that (by definition) there are fewer observations of these pathogenic CNVs in controls.

Association of specific CNVs with schizophrenia is not confounded by gender

Table 1 shows association statistics between individual CNVs and schizophrenia. These CNVs include 11 known schizophrenia risk loci, 4 additional risk CNVs that have yet to be implicated in schizophrenia with strong evidence and 1 protective CNV (see Methods for details). Controlling for gender had a negligible effect on the significance of association between specific CNVs and schizophrenia. Supplementary Table S4 presents a breakdown of these associations for the individual datasets (CLOZUK, ISC and MGS).

Table 1 Effect of gender stratification on schizophrenia CNV associations.


We have carried out an analysis of gender-stratified CNV burden in schizophrenia at both a genome-wide level and for previously associated loci. We restricted our analysis to the autosomes as sex chromosomal CNVs were not available for all samples. We found a genome-wide excess of large, rare CNVs in females compared with males, irrespective of disease status. Similar observations have been reported in ASD and general population cohorts15,21. We also report an excess burden of CNVs at 11 schizophrenia risk loci in female cases compared with male cases, an observation that is reminiscent of an excess of deleterious variants in females with childhood neurodevelopmental disorders15. However, when males and females were analysed separately, we found no difference in effect size for risk of schizophrenia conferred by the total burden of the 11 associated loci. Moreover, association of individual CNV loci was not affected when statistical tests that accounted for gender were used. These results confirm that associations between schizophrenia and 11 specific risk CNVs and the protective 22q11.2 duplication are robust to differences in gender proportion in case and control samples. The weaker evidence for 4 additional risk CNVs cannot be attributed to gender-related heterogeneity. Interestingly, despite the excess number of CNVs in females, we found no significant difference by sex in the number of genes disrupted by CNVs, suggesting that, on average, CNVs are less gene-dense in females than they are in males.

At a genome-wide level, an excess female CNV burden was only observed for large CNVs (≥500 kb). However, we cannot confidently exclude a similar effect from smaller CNVs given the limited resolution to which microarrays can accurately call CNVs. When deletions and duplications ≥500 kb were analysed separately, we found that deletions primarily accounted for the excess genome-wide CNV burden seen in female controls.

Recent findings of an excess mutational burden in females provide support for the view that females are relatively protected from neurodevelopmental disorders15,17,18,19,22,23. Schizophrenia is now generally considered to be at least in part a neurodevelopmental disorder; it is known to share genetic risk alleles with autism and other neurodevelopmental disorders24,25 and there is also an excess risk of the disorder in males, albeit much more modest than it is for other neurodevelopmental disorders such as ID, ASD and ADHD. The excess burden of CNVs in females might be considered evidence for female robustness to schizophrenia, but caution is required given the pattern of findings in controls. As we and others21 have shown, female controls also have higher rates of large, rare CNVs, which in general are the class of CNV that contributes to neurodevelopmental disorders. This elevation in the rate of CNVs in both cases and controls could potentially point to a mechanism that goes beyond protection from disorders such as ASD and schizophrenia: for example, protection from female foetal loss or premature death (pre- or post-natal). However, addressing this question requires truly representative population cohorts, as the gender CNV bias in controls might be an artefact resulting from female protection from conditions (including neurodevelopmental disorders) that would normally result in exclusion from control cohorts.

It is clear that CNV burden differs between males and females, both in patients diagnosed with neurodevelopmental disorders such as schizophrenia and in the general population. However, our data indicate that variation in CNV burden by gender does not affect known schizophrenia-CNV associations. We provide further evidence that in both case and control samples, the rate of deleterious CNVs is greater in females than in males, although investigating the reason for this will require large population cohorts.


Study samples and QC

Case and control CNVs were derived from three published samples: CLOZUK6, the ISC4 and the MGS26. A full description of these samples, the arrays they were genotyped on and CNV calling procedures can be found in the original publications4,6,26. Quality control was performed to remove low quality samples (see original publications) and details of those samples that passed quality control in each study can be found elsewhere6. We excluded samples of unknown gender (sex chromosome molecular genetic data were not available for all samples) and putative CNVs < 15 kb in size and/or covered by <15 probes. After filtering, a total of 31,139 individuals were included in the current study (17,959 from CLOZUK and its corresponding controls, 6,600 from MGS and 6,580 from the ISC, Table 2). All CNVs have a population frequency <1%.
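The filtering rules described above can be sketched as follows. This is an illustrative outline only: the `CNVCall` record, its field names and the example calls are hypothetical and do not reflect the data format used in the original studies; only the thresholds (15 kb, 15 probes, known gender) come from the text.

```python
from dataclasses import dataclass

@dataclass
class CNVCall:
    # Hypothetical record layout, introduced for illustration only.
    sample_id: str
    sex: str          # "F", "M", or "" when gender is unknown
    size_kb: float    # CNV length in kilobases
    n_probes: int     # number of array probes covering the call

def passes_filters(call, min_size_kb=15.0, min_probes=15):
    """Keep calls from samples of known gender that are >= 15 kb and covered by >= 15 probes."""
    return (
        call.sex in ("F", "M")
        and call.size_kb >= min_size_kb
        and call.n_probes >= min_probes
    )

# Made-up example calls.
calls = [
    CNVCall("s1", "F", 620.0, 210),  # kept
    CNVCall("s2", "M", 12.0, 40),    # dropped: smaller than 15 kb
    CNVCall("s3", "F", 80.0, 9),     # dropped: fewer than 15 probes
    CNVCall("s4", "", 900.0, 300),   # dropped: unknown gender
]
kept = [c for c in calls if passes_filters(c)]
print([c.sample_id for c in kept])
```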

Table 2 Female and male samples included in the schizophrenia datasets.

Selection of specific schizophrenia-associated CNVs

We tested the burden of 11 schizophrenia risk CNV loci that showed significant association with schizophrenia in our previous publication6: deletions at 1q21.1, NRXN1 (exonic CNVs only), 3q29, 15q11.2, 15q13.3 and 22q11.2; and duplications at 1q21.1, 16p11.2, 16p13.11, the Prader-Willi/Angelman syndrome (PWS/AS) region and the Williams-Beuren syndrome (WBS) region. In our analysis of individual CNV loci, we tested these 11 schizophrenia risk CNVs, along with previously implicated risk CNVs for which the overall evidence is less strong6, postulating that some of the variable evidence for these might be due to gender-related heterogeneity. These included duplications of VIPR2 and deletions of 17p12, 17q12 and distal 16p11.2. We also tested the impact of gender on the protective effect of the 22q11.2 duplication7.

Statistical analysis

We tested for CNV burden (≥500 kb; <500 kb) and burden of 11 schizophrenia risk loci in females versus males by comparing the change in deviance between the following logistic regression models (R glm function, family = binomial(“logit”)) using a two-sided test (ANOVA). Odds ratios (ORs) were derived by exponentiating the regression coefficients for number of CNVs.

  1. logit (pr (gender)) ~ number of CNVs + study source + microarray platform

  2. logit (pr (gender)) ~ study source + microarray platform

To test whether females had more genes disrupted by CNVs than males, a similar analysis was performed where ‘number of CNVs’ in the above model was replaced with ‘number of genes overlapping CNVs’.
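The deviance comparison between the two nested models can be sketched in a few lines. The following is a simplified pure-Python illustration, not the original R analysis: it omits the study source and microarray platform covariates, fits the single-predictor model by Newton-Raphson, and uses made-up toy data. All function and variable names are introduced here for illustration.

```python
import math

def fit_logistic(x, y, max_iter=50, tol=1e-10):
    """Newton-Raphson fit of logit(P(y = 1)) = b0 + b1 * x; returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient w.r.t. intercept
            g1 += (yi - p) * xi     # gradient w.r.t. slope
            h00 += w                # entries of X'WX
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1

def deviance(x, y, b0, b1):
    """Deviance (-2 * log-likelihood) of a logistic model on binary outcomes."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return -2.0 * total

# Toy data (made up): y = 1 for female, 0 for male; x = number of large CNVs carried.
x = [1] * 50 + [0] * 100
y = [1] * 30 + [0] * 20 + [1] * 40 + [0] * 60

# Model 1: gender ~ number of CNVs. Model 2: intercept only (closed-form fit).
b0_full, b1_full = fit_logistic(x, y)
mean_y = sum(y) / len(y)
b0_null = math.log(mean_y / (1.0 - mean_y))

# Change in deviance, referred to a chi-square distribution with 1 degree of freedom.
lr_stat = deviance(x, y, b0_null, 0.0) - deviance(x, y, b0_full, b1_full)
p_value = math.erfc(math.sqrt(lr_stat / 2.0))
odds_ratio = math.exp(b1_full)  # OR per additional CNV

print(f"OR = {odds_ratio:.2f}, change in deviance = {lr_stat:.2f}, P = {p_value:.3f}")
```

In R, the comparable step would be fitting both models with `glm` and comparing them with `anova(..., test = "Chisq")`; the sketch above only mirrors that logic for a single predictor.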

To test whether the risk of schizophrenia conferred by large CNVs (≥500 kb) or the set of 11 specific schizophrenia CNVs is significantly different for females and males, we compared the effect size in females with the effect size in males using the coefficients and standard errors (Z-test, see below) derived from the following logistic regression model (R glm function, family = binomial(“logit”)):

logit (pr(case)) ~ number of schizophrenia CNVs + study source + microarray platform

We note that almost identical P values were obtained when the above model was replaced with Firth’s procedure27 (data not shown).

Differences in effect size between tests were evaluated with a Z-test, which was constructed using equation (1):

$$Z = \frac{\beta_1 - \beta_2}{\sqrt{\sigma_1^2 + \sigma_2^2}} \qquad (1)$$

where β is the coefficient for the variable of interest taken from the regression model, σ is that variable’s standard error, and the subscripts 1 and 2 denote the two tests being compared. P values were generated by comparing the absolute value of the Z statistic against the cumulative distribution function of the standard normal distribution.
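As a worked illustration of the Z-test, the sketch below assumes the usual form for comparing two independent coefficients, Z = (β1 − β2) / √(σ1² + σ2²), and reads the P value as the upper tail 1 − Φ(|Z|). Both are assumptions about the procedure rather than details confirmed by the text. Because the fitted coefficients are not available here, β and σ are approximated from the rounded ORs and 95% confidence intervals reported in the Results, so the output is approximate.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()
Z_975 = STD_NORMAL.inv_cdf(0.975)  # about 1.96, half-width multiplier of a 95% CI

def beta_and_se(odds_ratio, ci_low, ci_high):
    """Approximate the log-odds coefficient and its standard error from an OR and 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_975)
    return beta, se

def z_test(beta1, se1, beta2, se2):
    """Z statistic for the difference between two coefficients and its upper-tail P value."""
    z = (beta1 - beta2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p = 1.0 - STD_NORMAL.cdf(abs(z))  # assumed reading of "absolute Z against the normal CDF"
    return z, p

# Risk of schizophrenia conferred by large, rare CNVs (values reported in the Results).
beta_f, se_f = beta_and_se(1.24, 1.12, 1.38)  # females
beta_m, se_m = beta_and_se(1.32, 1.20, 1.45)  # males

z, p = z_test(beta_f, se_f, beta_m, se_m)
print(f"Z = {z:.2f}, P = {p:.2f}")
```

With these rounded inputs the P value comes out at approximately 0.19, in line with the value reported for this comparison.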

To assess whether a difference in gender proportion between cases and controls confounds association between schizophrenia and specific CNVs, we calculated individual locus association statistics using a Fisher’s Exact test (two-sided) and a Cochran-Mantel-Haenszel Exact test (two-sided) that was stratified by gender.
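The idea of checking a locus with and without gender stratification can be sketched as below. This is a hedged illustration, not the original analysis: Fisher's exact test is computed directly from the hypergeometric distribution, and, in place of the exact Cochran-Mantel-Haenszel test, the simpler Mantel-Haenszel pooled odds ratio is shown to illustrate stratification. All counts are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability of k in the top-left cell with fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    p_obs = prob(a)
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    total = sum(prob(k) for k in range(k_min, k_max + 1) if prob(k) <= p_obs * (1 + 1e-7))
    return min(1.0, total)

def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio across strata of (a, b, c, d) tables."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Hypothetical counts per stratum:
# (carrier cases, non-carrier cases, carrier controls, non-carrier controls)
males = (12, 988, 4, 996)
females = (9, 491, 5, 995)

# Unstratified table: sum the two strata cell by cell.
pooled = tuple(m + f for m, f in zip(males, females))
a, b, c, d = pooled

print("Crude OR:", (a * d) / (b * c))
print("Fisher's exact P (unstratified):", fisher_exact_two_sided(*pooled))
print("Mantel-Haenszel OR (stratified by gender):", mantel_haenszel_or([males, females]))
```

If the crude and stratified odds ratios are similar, gender proportion is unlikely to be confounding the association at that locus, which is the question the stratified test addresses.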

All statistical tests were conducted using R Statistical Software.

Additional Information

How to cite this article: Han, J. et al. Gender differences in CNV burden do not confound schizophrenia CNV associations. Sci. Rep. 6, 25986; doi: 10.1038/srep25986 (2016).