Schizophrenia (SCZ) is a severe and debilitating neuropsychiatric disorder with an estimated heritability of ~80%. Recently, de novo mutations, identified by next-generation sequencing (NGS) technology, have been suggested to contribute to the risk of developing SCZ. Although these studies show an overall excess of de novo mutations among patients compared with controls, it is not easy to pinpoint which of the genes hit by de novo mutations are actually involved in the disease process. Importantly, support for a specific gene can be provided by the identification of additional alterations in several independent patients. We took advantage of existing genome-wide single-nucleotide polymorphism data sets to screen for deletions or duplications (copy number variations, CNVs) in genes previously implicated by NGS studies. Our approach was based on the observation that CNVs constitute part of the mutational spectrum in many human disease-associated genes. In a discovery step, we investigated whether CNVs in 55 candidate genes, suggested from NGS studies, were more frequent among 1637 patients compared with 1627 controls. Duplications in RB1CC1 were overrepresented among patients. This finding was followed up in large, independent European sample sets. In the combined analysis, totaling 8461 patients and 112 871 controls, duplications in RB1CC1 were found to be associated with SCZ (P=1.29 × 10−5; odds ratio=8.58). Our study provides evidence for rare duplications in RB1CC1 as a risk factor for SCZ.
Schizophrenia (SCZ) is a severe neuropsychiatric disorder characterized by impaired thinking, emotions and behavior. Based on a meta-analysis of published twin studies, its heritability was estimated to be ~80%.1 Last year, two studies were published reporting results from exome-wide next-generation sequencing of SCZ patients and their parents.2,3 Both studies implicated de novo mutations as increasing susceptibility to SCZ. Girard et al.2 sequenced the exomes of 14 patients with SCZ and their parents, whereas Xu et al.3 analyzed the exomes of 53 patients with SCZ, 22 unaffected individuals and their parents. Girard et al.2 detected a higher frequency of de novo mutations among patients than expected, identifying a total of 15 de novo mutations in eight patients. Xu et al.3 identified 40 de novo mutations in 27 patients and showed that these were likely to affect protein structure and function. The large number of genes reported to carry de novo mutations, together with the very low frequency of mutations among the patients, makes it difficult to implicate specific genes in disease pathogenesis. Not every gene hit by a de novo mutation is necessarily involved in the development of SCZ. Therefore, genetic studies in independent samples are warranted, as identification of additional alterations in patients will provide important support for specific genes. It is logical for these follow-up studies to include the investigation of copy number variants (CNVs), as it is known that in many human disease-associated genes, both deletions and duplications contribute substantially to the mutational spectrum. Furthermore, CNVs, including deletions in 1q21.1,4, 5, 6 15q11.24,7 and 15q13.3,4, 5, 6 and duplications at 7q36.36,8 and 16p11.2,6,8,9 have been implicated as risk factors for SCZ. 
In this study, we took advantage of an existing genome-wide single-nucleotide polymorphism (SNP) array data set to screen 1637 patients with SCZ or schizoaffective disorder and 1627 controls for the presence of CNVs in genes reported to carry a de novo mutation. Our top finding was followed up in an additional 6824 patients and 111 244 controls.
Materials and methods
The study was approved by the ethics committees of all study centers. Each participant provided written informed consent before inclusion, and all aspects of the study complied with the Declaration of Helsinki. All individuals were of German descent according to self-reported ancestry.
A total of 1831 patients were recruited from consecutive admissions to psychiatric inpatient units. A lifetime ‘best estimate’ diagnosis10 of SCZ or schizoaffective disorder, according to DSM-IV criteria,11 was assigned on the basis of a Structured Clinical Interview12 or the OPCRIT,13 medical records and family history.
In addition, 1643 controls were included; a detailed phenotypic description of these is provided elsewhere.14
Our top finding was followed up using several independent samples: (i) 3111 patients and 2267 controls from the International Schizophrenia Consortium;4 (ii) 1564 patients and 6944 controls from the Wellcome Trust Case Control Consortium 2; (iii) 604 patients and 497 controls from Munich, Germany4,15 (Munich I–III); (iv) 834 Dutch patients and 672 controls;16 and (v) 711 patients and 100 864 population-based controls from deCODE genetics. Details are provided in Supplementary Table 1.
Genotyping, CNV detection and quality control
Venous blood samples were drawn and genotyped using Illumina BeadArrays HumanHap550v3, Human610-Quadv1 and Human660W-Quad (Illumina, San Diego, CA, USA). Only those markers shared by all three chips were used for CNV detection.
To avoid technical artifacts in CNV calling, stringent quality control criteria were applied before computational CNV prediction. BeadArray data were analyzed with QuantiSNP (version 2.1, http://www.well.ox.ac.uk/QuantiSNP)17 and PennCNV (version 01 May 2010, http://www.openbioinformatics.org/penncnv/)18. A detailed description of quality control measures applied and the CNV detection protocol are provided in Degenhardt et al.14
Information regarding genotyping is provided in Supplementary Table 1.
Identification of CNVs in genes reported to carry a de novo mutation
We analyzed our discovery sample for the presence of CNVs in 55 genes reported to carry a de novo mutation in patients with SCZ in whole-exome studies2,3 (Supplementary Table 2). The transcription start and end position of each RefSeq gene was determined according to NCBI build 36, using the UCSC Genome Browser (http://genome.ucsc.edu/cgi-bin/hgGateway). In order to be considered for downstream analyses, the CNV had to fulfill the following criteria: (i) lie within 20 kb up- or downstream of the boundaries of the longest isoform in a RefSeq gene; (ii) be detected by both QuantiSNP and PennCNV; (iii) span at least 10 consecutive SNPs; and (iv) have a log Bayes Factor (lBF; QuantiSNP) or confidence value (PennCNV) of at least 10. In addition, only genes in which at least two patients carried a CNV were taken forward for further analysis. Data from the X-chromosome were not analyzed.
As different CNV detection platforms were used for the various samples (with unequal spacing of the markers between the platforms), the CNVs in the follow-up sample were filtered not on the number of consecutive SNPs but on the size of the genomic region they spanned. In order to be included in our study, CNVs had to fulfill the following criteria: (i) be at least 100 kb in size and (ii) lie within 20 kb up- or downstream of the boundaries of the longest isoform in a RefSeq gene.
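The two sets of filter criteria can be sketched in code. The following is a minimal illustration, not the pipeline used in the study: the class and function names, the toy coordinates and the scores are hypothetical. It assumes that the thresholds are minimums, that "within 20 kb" means overlap with the gene interval extended by 20 kb on each side, and that the score criterion is met if either algorithm's value reaches the threshold (a literal reading of "or").

```python
from dataclasses import dataclass
from typing import Optional

FLANK_BP = 20_000      # 20 kb up- or downstream of the gene boundaries
MIN_SNPS = 10          # discovery: consecutive SNPs (assumed minimum)
MIN_SCORE = 10.0       # discovery: lBF or PennCNV confidence (assumed minimum)
MIN_SIZE_BP = 100_000  # follow-up: CNV size (assumed minimum)


@dataclass
class Gene:
    chrom: str
    tx_start: int  # transcription start of the longest isoform
    tx_end: int    # transcription end of the longest isoform


@dataclass
class CnvCall:
    chrom: str
    start: int
    end: int
    n_snps: int
    quantisnp_lbf: Optional[float]  # None if QuantiSNP made no call
    penncnv_conf: Optional[float]   # None if PennCNV made no call


def near_gene(cnv: CnvCall, gene: Gene, flank: int = FLANK_BP) -> bool:
    """True if the CNV overlaps the gene interval extended by `flank` bp."""
    if cnv.chrom != gene.chrom:
        return False
    return cnv.start <= gene.tx_end + flank and cnv.end >= gene.tx_start - flank


def passes_discovery(cnv: CnvCall, gene: Gene) -> bool:
    """Discovery criteria (i)-(iv) under the assumptions stated above."""
    called_by_both = cnv.quantisnp_lbf is not None and cnv.penncnv_conf is not None
    if not (near_gene(cnv, gene) and called_by_both):
        return False
    # Literal "or"; requiring both scores would be a stricter alternative.
    good_score = cnv.quantisnp_lbf >= MIN_SCORE or cnv.penncnv_conf >= MIN_SCORE
    return cnv.n_snps >= MIN_SNPS and good_score


def passes_followup(cnv: CnvCall, gene: Gene) -> bool:
    """Follow-up criteria: size-based instead of SNP-count-based."""
    return near_gene(cnv, gene) and (cnv.end - cnv.start) >= MIN_SIZE_BP


# Hypothetical toy coordinates, not real gene positions.
toy_gene = Gene("8", 1_000_000, 1_100_000)
toy_call = CnvCall("8", 950_000, 1_050_000, 25, 30.0, 40.0)
print(passes_discovery(toy_call, toy_gene), passes_followup(toy_call, toy_gene))
```

The gene-level requirement that at least two patients carry a CNV would be applied afterwards, by counting passing calls per gene.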
Technical verification of predicted CNVs
All CNVs detected in the discovery sample were visually inspected in GenomeStudio (v2011.1, http://www.illumina.com/software/genomestudio_software.ilmn). For CNV verification, we used TaqMan Copy Number Assays (Applied Biosystems, Foster City, CA, USA). For CNVs in RB1-inducible coiled-coil 1 (RB1CC1), we used three pre-designed assays (Hs02263567_cn, Hs06225510_cn and Hs06195263_cn) and for CNVs in OR4C46, we used two pre-designed assays (Hs04402646_cn and Hs03290371_cn) and one custom-made assay. Copy numbers were calculated using the ΔΔCt method implemented in CopyCaller Software (v1.0, http://www.appliedbiosystems.com/support/software/copycaller/). CNVs in DGCR2 were not subject to technical verification.
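For orientation, the textbook form of the ΔΔCt calculation is sketched below. This is not the CopyCaller implementation, whose internal algorithm may differ. The sketch assumes 100% amplification efficiency and a calibrator sample known to carry two copies; the function name and all Ct values are hypothetical.

```python
def copy_number_ddct(ct_target, ct_reference,
                     cal_ct_target, cal_ct_reference,
                     calibrator_copies=2):
    """Estimate copy number from Ct values with the ddCt method.

    ct_target / ct_reference: Ct of the target and reference assays in the test sample.
    cal_ct_target / cal_ct_reference: the same two Ct values in the calibrator sample.
    Assumes perfect doubling per cycle (100% PCR efficiency).
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = cal_ct_target - cal_ct_reference
    ddct = delta_ct_sample - delta_ct_calibrator
    return calibrator_copies * 2 ** (-ddct)


# Hypothetical Ct values: the target amplifies about 0.585 cycles earlier
# than in the calibrator, which corresponds to roughly three copies.
estimate = copy_number_ddct(24.415, 25.0, 25.0, 25.0)
print(round(estimate, 2))
```

Under these assumptions, a heterozygous duplication yields an estimate near 3 and a heterozygous deletion an estimate near 1; intermediate values may point to mosaicism or assay noise.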
Statistical analyses of CNVs
To test for an association between SCZ and CNVs in the selected candidate genes, P-values and odds ratios (ORs) were calculated with a two-sided Fisher's exact test using R version 184.108.40.206 The P-values for CNVs in RB1CC1 in the discovery–follow-up sample were calculated using both Fisher's exact test and the Cochran–Mantel–Haenszel (CMH) test.
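The Fisher's exact test can be illustrated with the carrier counts reported in this paper for RB1CC1 duplications (discovery: 5 of 1637 patients and 1 of 1627 controls; combined: 9 of 8461 patients and 14 of 112 871 controls). The sketch below is a standard-library re-implementation for illustration, not the R code used in the study, and the function names are hypothetical. It reports the sample (cross-product) OR, whereas R's fisher.test reports a conditional maximum-likelihood estimate that can differ slightly.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]].

    Rows: patients, controls. Columns: carriers, non-carriers.
    Sums the hypergeometric probabilities of all tables that are no more
    likely than the observed one, given fixed margins.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def log_p(k):
        return log_comb(row1, k) + log_comb(row2, col1 - k) - log_comb(total, col1)

    k_min = max(0, col1 - row2)
    k_max = min(col1, row1)
    log_observed = log_p(a)
    p = sum(exp(log_p(k)) for k in range(k_min, k_max + 1)
            if log_p(k) <= log_observed + 1e-7)  # small tolerance for ties
    return min(p, 1.0)


def sample_odds_ratio(a, b, c, d):
    """Cross-product odds ratio; undefined if b or c is zero."""
    return (a * d) / (b * c)


# Discovery sample: carriers and non-carriers among patients and controls.
discovery = (5, 1637 - 5, 1, 1627 - 1)
# Combined discovery and follow-up samples.
combined = (9, 8461 - 9, 14, 112871 - 14)

for label, table in (("discovery", discovery), ("combined", combined)):
    print(label, fisher_exact_two_sided(*table), sample_odds_ratio(*table))
```

A table with an empty carrier cell has no finite cross-product OR, which is why sparse strata are difficult to combine in a stratified test.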
After quality control, intensity data from 1637 patients and 1627 controls from the discovery sample were available, and both QuantiSNP and PennCNV detected CNVs in seven different genes (Supplementary Table 3); however, only CNVs in three genes fulfilled all of our filter criteria and were detected in at least two patients in the discovery sample. These were: (i) duplications at chromosome 8q11.23 affecting RB1CC1 (five patients and one control); (ii) duplications at chromosome 11p11.2 affecting OR4C46 (five patients and two controls); and (iii) deletions affecting DGCR2 on chromosome 22q11.2 (two patients).
CNVs spanning RB1CC1 were successfully verified in our patients. One control carried a duplication in RB1CC1. No DNA was available from this control to allow technical verification; however, based on the 100% technical verification rate in the patients’ samples, the duplication in the control was also presumed to be genuine. We were not able to unambiguously verify the duplications in OR4C46 and they were, therefore, removed from our data set. Deletions spanning DGCR2 were not technically verified for two reasons: (i) based on their size (>500 SNPs; >2.5 Mb), the predicted CNVs were highly likely to be genuine; and (ii) the CNVs overlapped 80% with a chromosome 22q11.2 deletion previously reported to be associated with SCZ.4,6,8,20
Duplications affecting RB1CC1
In the discovery sample, duplications at chromosome 8q11.23 affecting the gene RB1CC1 were detected in five patients (0.3%) and one control (0.06%) (P-value=0.218; OR=4.98; 95% confidence interval (CI): 0.56–235.50). The six CNVs in this region had different putative break points and spanned 12–66 consecutive SNPs (Figure 1). This CNV was followed up in an additional 6824 patients and 111 244 controls. Among those individuals, duplications in four additional patients and 13 controls were identified (P-value=0.015; OR=5.02; 95% CI: 1.19–16.25) (Figure 1).
In the combined analysis (discovery+follow-up samples), duplications in RB1CC1 were significantly associated with SCZ (Fisher’s exact test: P-value=1.29 × 10−5; OR=8.58; CMH: P-value=0.058; OR=4.86; Table 1). Owing to their large number, the Icelandic deCODE controls could potentially lead to a bias. After excluding them, duplications in RB1CC1 were still associated with SCZ (Fisher’s exact test: P-value=0.035; OR=4.26; CMH: P-value=0.049; OR=5.29; Table 1).
The Database of Genomic Variants (beta version April, 2012, http://dgvbeta.tcag.ca/dgv/app/home?ref=NCBI36/hg18)21 is a catalog of human genomic structural variation containing CNV data from 37 healthy control studies. The Copy Number Variation project at the Children's Hospital of Philadelphia lists CNVs from 2026 healthy children, all genotyped on Illumina BeadArrays HumanHap550v3 (http://cnv.chop.edu/).22 Neither the Database of Genomic Variants nor the Children's Hospital of Philadelphia database contained any duplications affecting RB1CC1.
In addition, we checked the interactive web-based database DECIPHER (Database of Chromosomal Imbalance and Phenotype in Humans Using Ensembl Resources; http://decipher.sanger.ac.uk/about) for CNVs in RB1CC1. This database collects deletions and duplications identified in patients with developmental disorders and/or congenital malformations. In DECIPHER, seven individuals with a duplication and one patient with a deletion affecting RB1CC1 were listed. In patient 256062, the identified duplication spanned only RB1CC1 and did not affect any other gene. No additional CNV was detected in the genome of this patient. The patient was diagnosed with mild developmental delay. The CNV was inherited from the mother who was not reported to have a neuropsychiatric phenotype. The clinician in charge of this patient provided us with details of a second patient who had a duplication affecting only RB1CC1 and who is currently not listed in DECIPHER. This patient had a severe intellectual disability, facial dysmorphism and cortical gyration anomalies.
Deletions spanning DGCR2
In our discovery data set, we observed two patients with a deletion in the chromosomal region 22q11.21, spanning the gene DGCR2. Both deletions shared identical break points, spanned 504 SNPs and overlapped 80% with the typical larger 2.5 Mb deletion that is associated with SCZ.4,6,8,20 As the frequency of this deletion among patients with SCZ is well established, we did not aim for an additional replication of this CNV.
CNVs not overrepresented in patients
In four genes, we identified CNVs that passed all our quality filter criteria, but were only identified in a single patient. CNVs in ALS2CL, CASP4, PIK3CB and TRAK1 were, therefore, not analyzed further (Supplementary Table 3). These CNVs were not subject to technical verification but were visually confirmed in GenomeStudio.
Phenotypic characterization of RB1CC1 CNV carriers
Databases containing information regarding each patient carrying an RB1CC1 duplication were searched for data concerning the patient’s medical and family histories. As duplications in RB1CC1 have previously been described in children with intellectual disability,23 we gathered additional information regarding the highest educational qualification obtained by the patients. All patient RB1CC1 duplication carriers were diagnosed with SCZ. The pseudonymization of the patients (A–E) corresponds to Figure 1.
Patient A is a 27-year-old male with a diagnosis of chronic residual type SCZ (DSM-IV: 295.30; age-at-onset (AAO), 17 years). He is single and lives alone. He attended school for 10 years, passed his final examinations and is currently unemployed. His disease course has been chronic and deterioration is evident. Lifetime symptoms include paranoia, delusions, anxiety, affective symptoms and suicidality. His symptoms have responded poorly to medication including clozapine. He has no history of pre-morbid cognitive impairment or somatic disorder. Patient A reports that a maternal aunt committed suicide, but that there are no known cases of SCZ in the family.
Patient B is a 33-year-old female with a diagnosis of disorganized SCZ and paranoia (DSM-IV: 295.10; AAO, 23 years). She left school without qualifications and failed to complete her job apprenticeship. She has since worked in unskilled positions. At the time of the interview, she had no prior history of inpatient treatment. Treatment with haloperidol led to a fast reduction of positive symptoms, but to no improvement of negative symptoms. Extensive information on neurocognitive performance is available for patient B, and this indicates severe cognitive impairment (Supplementary Figure 1). Patient B reports an unremarkable family history.
Patient C is a 29-year-old female with a diagnosis of SCZ, disorganized type (DSM-IV: 295.10; AAO 25 years). She is currently going through a divorce. She attended school for 10 years. She failed to complete any form of apprenticeship and is currently employed. Her lifetime symptoms include affective symptoms, social anxiety, paranoia and auditory command hallucinations. She has a history of attempted suicide in response to these commands. Patient C has experienced multiple episodes of illness, which have shown only partial response to high-dose clozapine and benperidol. She has no history of somatic disorder or pre-morbid cognitive impairment. A maternal and a paternal aunt were diagnosed with SCZ, and her mother has major depressive disorder.
Patient D is a 57-year-old divorced male with a diagnosis of paranoid SCZ (DSM-IV: 295.10; AAO, 41 years). He attended school for 10 years, completed an apprenticeship and is now a self-employed craftsman. Patient D has experienced multiple episodes of illness with partial remission. During acute phases, he appears cognitively intact and socially withdrawn, and experiences delusions of persecution and guilt. He also displays pronounced psychomotor tension and affective symptoms but no suicidality. The patient has limited insight and poor treatment compliance. He also has subacute eczema. He has no history of pre-morbid cognitive impairment. Two maternal aunts received treatment for unspecified psychiatric disorders.
Patient E is a 69-year-old female with a diagnosis of chronic residual type SCZ (DSM-IV: 295.30; AAO, 29 years). She was married and employed before the disorder started. She has no history of poor pre-morbid work and social adjustment or any pre-morbid personality disorder. Her disease course was characterized by several episodes with partial remission between episodes, and deterioration is evident. Lifetime symptoms include paranoia, delusions, auditory hallucinations, formal thought disorder, bizarre behavior, and blunted and inappropriate affect. Psychotic symptoms dominate the clinical picture, although occasional affective disturbance occurs. Her symptoms have responded to medication. The patient reports a family history of psychiatric disorders, but that there are no known cases of SCZ in the family. The results from the quantitative PCR indicated a mosaicism regarding the duplication.
These patients have SCZ, and experience severe paranoia and hallucinations. An affective component is also evident. They have shown only partial or no response to treatment. As a result, they have experienced either only partial remission between episodes or gradual deterioration. For patient B, extensive information on her neurocognitive performance was available (Supplementary Figure 1).
The examination of the neurocognitive profile (after being diagnosed with SCZ) of the latter patient indicated severely impaired cognitive functions compared with both a healthy control group and a group of SCZ patients, while not fulfilling the DSM-IV criteria for intellectual disability. Based on the educational history of this patient, it is likely that the cognitive deficits were already present before the onset of the neuropsychiatric disorder.
No phenotypic information was available for the duplication carriers identified in the follow-up samples.
We identified an association between rare duplications in RB1CC1 and SCZ in 8461 patients and 112 871 controls (OR=8.58). The brain expressed gene RB1-inducible coiled-coil 1 (OMIM *606837) is located in chromosomal region 8q11.23. So far, information regarding the biological function of RB1CC1 is limited. It has been implicated in cell cycle progression,24 cell growth, cell proliferation, cell survival, cell spreading/migration25 and neurodegeneration.26 In vitro, RB1CC1 insufficiency or dysfunction has been shown to cause neuronal cell atrophy and death.27 Wang et al.28 demonstrated in mice that the deletion of FIP200 (also known as Rb1cc1) caused a progressive loss of neural stem cells. Furthermore, in the postnatal brain of the mice carrying the FIP200 ablation, the neuronal differentiation was dysfunctional (Wang et al.28).
Applying exome sequencing, Xu et al.3 identified a frameshift deletion in RB1CC1, which was predicted by PolyPhen to be damaging. We hypothesize that both the frameshift deletion and the duplications lead to a change in gene dosage. As we did not have gene expression data for our CNV carriers, we were unable to examine the effect of the duplication on RB1CC1 gene expression.
Searching the literature, we discovered one additional study on neuropsychiatric phenotypes that reported duplications in RB1CC1. Cooper et al.23 created a CNV morbidity map of developmental delay based on data derived from 15 767 children with intellectual disability and/or developmental delay and various congenital anomalies. In total, 10 duplications, but no deletion in RB1CC1, were identified among these patients. Only one of 8329 controls carried a duplication in this gene (OR=5.29).23 The frequency observed in controls is consistent with the frequency in our study, providing further support for the validity of our finding. In the literature, there are several examples of specific CNVs conferring risk to both SCZ and intellectual disability, for example, deletions at chromosome 1q214,5,29 and duplications at chromosome 16p13.1.30,31
Patient B showed severe cognitive impairment, although there was only sparse or no information available on the cognitive performance of the other patients. The available information did not indicate any profound pre-morbid cognitive dysfunction; however, it is of note that, in the discovery sample, any patient with intellectual disability would have been excluded from the study.
Using a candidate gene approach, we were able to successfully identify an association of SCZ with duplications that have not previously been reported in the context of this disease. The failure of previous studies to detect this association may be explained by the rarity of the duplication of RB1CC1 and hence the limited power of their investigated samples.16
We were unable to unambiguously verify the CNVs detected in OR4C46 in the discovery sample. This is not surprising for two reasons: first, the CNVs were smaller in size (<20 kb) and their calls therefore less reliable compared with larger variants; and second, the gene is located close to the centromere. Previous studies have excluded this genomic region from CNV analyses, as it is known that the centromere is prone to producing false-positive CNV results.6,15,32 Therefore, we removed the duplications in OR4C46 from our analysis.
Adjustment for population stratification or sample-specific effects was difficult. Typically, a CMH test or a logistic regression with sample indicator covariates would be well suited for this purpose. In our case, however, their application was difficult as most sub-samples contained empty cells and, as a consequence, had OR estimates of either 0 or infinity. For this reason, integration of the samples in a meta-analysis fashion was also not possible.
Furthermore, we followed up on exome-sequencing data derived from only 67 trios. Among these individuals, de novo mutations were identified in 55 genes. Not every gene hit by a de novo mutation is necessarily involved in the development of SCZ. Therefore, it is quite likely that the risk gene list used for this study contained false positives. At the same time, many true risk genes have not yet been identified, as only a very limited number of individuals have been included in the exome-sequencing studies so far.
This study provides the first evidence that rare duplications in RB1CC1 are associated with SCZ. This interesting candidate gene was first implicated based on exome-sequencing data. Our study provides further evidence for the involvement of this gene in the development of SCZ. In order to better understand the mutational spectrum in this gene, more CNV and exome-sequencing studies in larger samples are warranted. Not least, functional studies are needed to obtain insights into pathophysiological consequences of the identified mutations.
We thank all of the patients for participating in this study. We are grateful to Professor H-E Wichmann for supplying the SNP-chip data from the KORA control cohort and Professor S Schreiber for providing access to the SNP-chip data from the PopGen control cohort. We thank all of the probands from the Heinz Nixdorf Recall (HNR) study and the MooDS Imaging controls. This study makes use of data generated by the DECIPHER Consortium. A full list of centers who contributed to the generation of the data is available from http://decipher.sanger.ac.uk and by email from email@example.com. Funding for the project DECIPHER was provided by the Wellcome Trust. This study was supported by the German Federal Ministry of Education and Research (BMBF) through the Integrated Genome Research Network (IG) MooDS (Systematic Investigation of the Molecular Causes of Major Mood Disorders and Schizophrenia; grant 01GS08144 to MMN, SC and HW, grant 01GS08147 to MR), under the auspices of the National Genome Research Network plus (NGFNplus). MMN also received support from the Alfried Krupp von Bohlen und Halbach-Stiftung. MR was also supported by the 7th Framework Programme of the European Union (ADAMS project, HEALTH-F4-2009-242257; CRESTAR project, HEALTH-2011-1.1-2) grant 279227. IN was supported by IZKF Jena (Junior Scientist Grant). In 2007 and 2008, Heinrich Sauer received fees for board membership from Lilly and Otsuka, a travel grant from Lilly, speaker's honoraria from Pfizer and Lilly and fees for a consultancy from Wyeth. The Heinz Nixdorf Recall cohort was established with the support of the Heinz Nixdorf Foundation. The WTCCC2 schizophrenia analysis was funded by the Wellcome Trust Case Control Consortium 2 project (085475/B/08/Z and 085475/Z/08/Z) and the Wellcome Trust (072894/Z/03/Z, 090532/Z/09/Z and 075491/Z/04/B).
We acknowledge use of the British 1958 Birth Cohort DNA collection funded by the Medical Research Council (grant G0000934) and the Wellcome Trust (grant 068545/Z/02), the UK National Blood Service controls funded by the Wellcome Trust. These funding sources had no involvement in the study design, the collection, analysis and interpretation of data, the writing of the report or the decision to submit the paper for publication. We thank deCODE for sharing frequencies of the RB1CC1 duplications in their large genotyped sample of cases and controls.
Genetic Risk and Outcome in Psychosis (GROUP) Consortium Members:
René S Kahn1, Don H Linszen2, Jim van Os3, Durk Wiersma4, Richard Bruggeman4, Wiepke Cahn1, Lieuwe de Haan2, Lydia Krabbendam3 and Inez Myin-Germeys3
1Department of Psychiatry, Rudolf Magnus Institute of Neuroscience, University Medical Center Utrecht, Utrecht, The Netherlands; 2Academic Medical Centre University of Amsterdam, Department of Psychiatry, Amsterdam, The Netherlands; 3Maastricht University Medical Centre, South Limburg Mental Health Research and Teaching Network, Maastricht, The Netherlands and 4University Medical Center Groningen, Department of Psychiatry, University of Groningen, Groningen, The Netherlands.
Supplementary Information accompanies the paper on the Translational Psychiatry website (http://www.nature.com/tp)