Introduction

Autism spectrum disorder (ASD) represents a group of childhood-onset neurodevelopmental disorders characterized by abnormal social interactions, impaired verbal and nonverbal communication, and restricted interests and repetitive behaviors, with long-term persistence of core features and functional impairment1,2. The prevalence of ASD varies across regions and has shown an increasing trend over the years3,4,5, with a male excess at a male-to-female ratio of approximately 5:1 6,7,8. In the USA, the prevalence of ASD increased over the past decade according to a report from the Centers for Disease Control and Prevention8. It was estimated that around 1 in 68 persons aged eight years in the USA in 2010 was affected with ASD8. However, an increasing trend in ASD prevalence was not observed in the UK9. The estimated prevalence of ASD in the Chinese population ranged from 2.8 to 29.5 per 10,000 according to a recent review that summarized findings from several areas3. The prevalence of ASD in Taiwan is approximately 0.3% based on an analysis of the national health insurance research dataset10 and 1% based on Taiwan’s most recent national survey of child and adolescent mental disorders11, with a male-to-female ratio of approximately 4:1 10. Because of its high prevalence, its long-term impairment with a great impact on individuals, families, and society12,13, and the strong evidence of genetic components in its etiology14, this severe developmental disorder has been prioritized for molecular genetic studies15.

The heritability estimate of ASD is greater than 90%, attesting that genetic factors play a major role in the pathogenesis of ASD16,17,18. However, the genetics of ASD is very complex. Several genome-wide association studies (GWAS) have identified common single nucleotide polymorphisms (SNPs) associated with the risk of ASD, such as common variants on 20p12.1, 5p14.1, and 1p13.2 19,20,21,22,23. These common SNPs, however, have only small effects on ASD risk, and, of note, few if any of them have been replicated across studies. Furthermore, accumulating evidence suggests that rare genetic and genomic mutations also contribute to the genetics of ASD24. Conventional cytogenetic studies of ASD have revealed a variety of rare chromosomal abnormalities associated with ASD25,26,27,28, indicating that aberrant genomic rearrangements are part of the genetic mechanism of ASD. Notably, the advent of array-based comparative genomic hybridization (aCGH) technology has led to the discovery of various submicroscopic copy number variations (CNVs) of genomic DNA associated with ASD29,30, lending further support to the idea that ASD is a genomic disorder in a subset of patients. These ASD-associated CNVs are usually individually unique and of low frequency, but together they account for approximately 5–10% of idiopathic ASD29 and hence constitute a part of the genetic architecture of ASD22,31,32. The discovery of genomic mutations in ASD-associated CNVs not only helps decipher the genetic complexity of ASD23 but also sheds light on the neurobiology and pathogenesis of ASD32,33,34,35,36,37.

In our previous studies, we reported four pathogenic CNVs in certain ASD patients38,39, indicating that CNVs also play a role in the genetic architecture of ASD in our patients. To have a better understanding of the scope of rare genomic CNVs in our ASD patient population, we recruited a sample of more than 300 ASD patients and conducted a genome-wide CNV screening in this sample.

Results

Clinical characteristics

A total of 335 (95.7%) out of 350 cases and 1093 (98.4%) out of 1111 controls passed a series of quality control steps for the CNV experiments. We investigated the ethnicity of cases and controls by performing principal component analysis (PCA) with SNP genotype data from all the participants of this study and the individuals included in the HapMap study. The results demonstrated that the cases and controls clustered together with the Han Chinese (Supplementary Figure 1). Therefore, the ethnicity of the participants of this study was confirmed to be Han Chinese. All the CNV data were then subjected to the burden analysis. The patient group consisted of 299 boys and 36 girls with a mean age of 9.4 ± 4.0 years, while the control group consisted of 525 males and 568 females with a mean age of 68.1 ± 10.1 years. The ADI-R (Autism Diagnostic Interview-Revised) interviews revealed that the 335 patients scored 20.43 ± 6.12 in the “qualitative abnormalities in reciprocal social interaction”, 14.75 ± 4.32 in the “qualitative abnormalities in communication, verbal”, 8.19 ± 3.33 in the “qualitative abnormalities in communication, nonverbal”, and 6.95 ± 2.47 in the “restricted, repetitive and stereotyped patterns of behaviors.” All the participants with ASD were noted to have had abnormal development at or before 36 months of age. Their current average intelligence quotients (IQ) were 94.85 ± 22.55 (range, 40 to 148) for full-scale IQ, 96.74 ± 2.04 (range, 41 to 145) for performance IQ, and 95.08 ± 23.79 (range, 44 to 148) for verbal IQ. Among the 335 ASD patients, nine had been diagnosed with epilepsy (3.04%), four had been suspected of having seizures (1.35%), and 19 had a history of febrile convulsion (6.42%). These data are also provided in Supplementary Table 1.

CNV findings

The rates of rare CNVs (<1% in the patients) on the autosomes and the X-chromosome were compared between the patient and control groups. CNV regions on autosomes were analyzed in all samples, while CNV regions on sex chromosomes were analyzed in male samples only. We found a significant excess in the overall rate of rare autosomal CNVs in the ASD patients (2.71) compared with the control subjects (0.77). The over-representation of rare autosomal CNVs in ASD was still present when the rare CNVs were grouped into deletions and duplications or classified according to size as <100 kb, 100–400 kb, and >400 kb (Table 1). In the analysis of rare CNVs on the X-chromosome, we compared the rate of rare CNVs only between male patients and male control subjects. A significant excess in the overall rate of rare CNVs was observed in the male ASD patients (0.214) compared with the male control subjects (0.011). The excess of rare CNVs on the X-chromosome was still present when the CNVs were stratified into deletions and duplications or into different size groups (Table 2). We did not compare the rate of rare X-chromosome CNVs between female patients and female controls because of the random inactivation of the X-chromosome in females. Additionally, the number of female patients was small (n = 36) compared with the number of female controls (n = 568), and the proportion of females was skewed between patients and controls (0.11 vs. 0.52).
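As an illustration of the stratified rates described above, the following minimal sketch bins CNVs by size and computes a per-subject rate for each bin. It assumes that a rate is the number of rare CNVs divided by the number of subjects in the group, and that a CNV of exactly 100 kb or 400 kb falls in the middle bin; neither detail is stated in the text. The function names `size_bin` and `rates_by_bin` and the example sizes are hypothetical.

```python
from collections import Counter

# Bin labels follow the size classes named in the text.
BINS = ("<100 kb", "100-400 kb", ">400 kb")


def size_bin(size_bp: int) -> str:
    """Assign a CNV to a size class (boundary handling is an assumption)."""
    kb = size_bp / 1000
    if kb < 100:
        return "<100 kb"
    if kb <= 400:
        return "100-400 kb"
    return ">400 kb"


def rates_by_bin(cnv_sizes_bp, n_subjects: int) -> dict:
    """Rare-CNV rate per subject in each size class (assumed: count / subjects)."""
    counts = Counter(size_bin(s) for s in cnv_sizes_bp)
    return {b: counts[b] / n_subjects for b in BINS}


# Hypothetical example: four CNVs observed in a group of four subjects.
example = rates_by_bin([50_000, 150_000, 400_000, 900_000], n_subjects=4)
```

The same helper can be applied separately to deletions and duplications, or to male samples only for the X-chromosome comparison.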

Table 1 Comparisons of rare autosomal CNVs in patients with autism spectrum disorders and control subjects.
Table 2 Comparisons of rare X-chromosome CNVs in male patients with autism spectrum disorders and male controls.

CNVs at “hot spots”

We compared the rare CNVs found in our patients with the selected genetic “hot spots” of ASD reported in the paper entitled “Clinical genetics evaluation in identifying the etiology of autism spectrum disorders: 2013 guideline revisions” from the practice guideline of the American College of Medical Genetics and Genomics40. The “hot spots” were defined as CNVs that have an especially strong association with ASD according to this paper. We identified a total of 14 patients who had pathogenic CNVs located at several of the “hot spots.” The detailed information on these CNVs, including the locations, sizes, and genes encompassed in the CNV regions, is listed in Table 3, while the clinical data of each patient are listed in Supplementary Table 2.

Table 3 CNVs at the “hot spots” identified in this study.

Other rare pathogenic CNVs

Besides the CNVs at the “hot spots,” we further identified a total of 49 rare putative pathogenic CNVs in our sample according to the “American College of Medical Genetics standards and guidelines for interpretation and reporting of postnatal constitutional copy number variants”41. A pathogenic CNV was defined as “documented as clinically significant in multiple peer-reviewed publications, even if penetrance and expressivity of the CNV are known to be variable. This category includes large CNVs, which may not be described in the medical literature at the size observed in the patient but which overlap a smaller interval with clearly established clinical significance”41. These putative pathogenic CNVs overlapped with the pathogenic CNVs reported in the Clinical Genome Resources and DECIPHER. Table 4 presents the detailed information on these CNVs, including locations, types, origins, and genes encompassed. The clinical data of each patient are provided in Supplementary Table 3.

Table 4 Other rare pathogenic CNVs identified in patients and control subjects in this study.

Two-hit CNVs

Five patients were found to carry two different putative pathogenic CNVs simultaneously in this study. Patient U-2075 inherited the 4q amplification and the 5p deletion from his mother and father, respectively (Fig. 1A). Patient U-1753 had the 8q amplification and the 8p deletion transmitted from his mother and father, respectively (Fig. 1B). Patient U-1255 acquired the 10q amplification and the 18p amplification from his mother and father, respectively (Fig. 1C). Patient U-1414 had the 8p amplification from his father and a de novo 9q duplication (Fig. 1D). Patient U-1999 had two de novo amplifications at 17q25.3 simultaneously (Fig. 1E). All the parents of these five patients were carefully assessed, and none of them had ASD based on the self-administered questionnaires and clinical evaluation by the corresponding author. The detailed information on these CNVs, including the locations, sizes, and genes encompassed, is listed in Table 5, and the clinical data of each patient are provided in Supplementary Table 4.

Figure 1 The pedigrees of five patients who carry two CNVs and the origins of these CNVs. Dup: duplication, Del: deletion.

Table 5 Locations, sizes and types of CNVs in patients with two hits.

Discussion

In this study, we compared the frequencies of rare CNVs (<1%) between 335 patients with ASD and 1093 control subjects from Taiwan. We found a significantly higher overall frequency of rare CNVs in patients with ASD than in the control group. The significantly higher frequencies of rare CNVs in the ASD group were still present when the CNVs were subdivided into deletions and duplications or by size. Our data are compatible with several previous studies24,33,42. Pinto and colleagues conducted a genome-wide CNV analysis of 996 ASD individuals of European ancestry and 1,287 matched controls. They found a higher global burden of rare genic CNVs in ASD patients43. The findings were replicated by the same group in another genome-wide CNV analysis consisting of 2,446 families with ASD33. In our study, we did not limit our CNV analysis to genic CNVs, as non-genic CNVs may exert position effects on the expression of genes outside the CNV regions. Our finding of an increased global burden of rare CNVs in ASD indicates that genomic rearrangement is one of the genetic mechanisms of ASD.

Some other studies reported an increased CNV burden in female patients. Jacquemont and colleagues reported a 3-fold increase in deleterious autosomal CNVs in female patients compared to male probands in a sample of 762 ASD families44. Desachy and colleagues recently reported that mothers of patients with autism had a higher deletion burden than control mothers in a matched case–control population. Also, to their surprise, they found a higher autosomal burden of large, rare CNVs in females in the general population. They speculated that the increased rare CNV burden in females in the general population might contribute to the decreased female fetal loss in the population, but the ASD-specific maternal CNV burden may contribute to high sibling recurrence45. In our study, we did not conduct a similar analysis because of the relatively small number of female patients.

In our CNV analysis, we identified 14 patients who had CNVs located at the ASD “hot spots” reported in the “Clinical genetics evaluation in identifying the etiology of autism spectrum disorders: 2013 guideline revisions”40. Our findings support the strong association of the CNVs at “hot spots” with ASD in our patient population. Among these CNVs, those located at 22q11.2, 22q13.3, and 15q11-13 are the most common in our sample. Besides the CNVs located at the “hot spots,” we also detected 49 rare (<1%) CNVs larger than 400 kb that overlapped with the pathogenic CNVs reported in the Clinical Genome Resources and DECIPHER. These CNVs met the criteria of “pathogenic” according to the “American College of Medical Genetics standards and guidelines for interpretation and reporting of postnatal constitutional copy number variants”41. Some of these rare putative pathogenic CNVs were inherited, while others were de novo mutations. The broad distribution of both “hot spot” CNVs and rare pathogenic CNVs detected in our patients suggests extremely high genetic heterogeneity of ASD in our patients. Some studies suggested that female patients have a higher CNV burden than male patients33,44,45. However, in our study, all the patients who had CNVs at “hot spots” were male (Table 3), and only two of the 49 patients who had other rare (<1%) pathogenic CNVs were female (Table 4). The discrepancy might be due to the small proportion of female patients in this study (36 females vs. 299 males).

In this study, five patients were found to have two rare pathogenic CNVs concomitantly in their genome. The finding is consistent with our previous report of a patient who had inherited two CNVs from his parents and supports the two-hit model of ASD39. Several studies have also proposed two-hit and multiple-hit models of ASD, suggesting that the genetic underpinnings of ASD stem from the combinatorial effects of mutations in a few or many genes at different loci39,46,47,48. Leblond and colleagues reported three patients with deletions at the SHANK2 gene locus. These three patients also had another inherited CNV at 15q11-13 that was associated with other psychiatric disorders46. Two patients carried a duplication of the nicotinic receptor CHRNA7, and one patient had a deletion of the synaptic translation repressor CYFIP1 46. Steinberg and Webber conducted a pathway-association test of target genes regulated by fragile-X mental retardation protein (FMRP) in ASD patients; they found rigorous support for the multiple-hit genetic etiology of ASD47. In fact, emerging evidence suggests that the presence of multiple pathogenic CNVs in psychiatric patients is not rare. Hu and colleagues recently reported a novel maternally inherited 8q24.3 CNV and a rare paternally inherited 14q23.3 CNV in a family with neurodevelopmental disorders49. Williams and colleagues conducted CNV analysis in patients with velo-cardio-facial syndrome (VCFS), with or without psychosis. They found a significantly higher proportion of second CNV hits in patients with psychosis, suggesting that the two-hit hypothesis may be relevant to a proportion of VCFS patients with psychosis50. Rudd and colleagues found a slightly higher proportion of multiple conservative CNVs in schizophrenia patients compared to controls, indicating a potential role for a multiple-hit model in schizophrenia51. Hence, it is likely that multiple hits (including two hits) are a common and important genetic mechanism associated with ASD. In this study, we reported 5 patients who had two putative pathogenic CNVs larger than 400 kb. We believe that if CNVs smaller than 400 kb are included in future analyses, we might find more patients with two or more CNV hits. In the family study of these 5 patients with two-hit CNVs, we found that some of these putative pathogenic CNVs were inherited from the parents and some were de novo mutations. However, the parents who carried one putative pathogenic CNV did not manifest ASD symptoms after careful clinical evaluation, suggesting incomplete penetrance of these inherited pathogenic CNVs. Further, we searched for these CNVs in the 1093 control subjects and found none of them in the control group, except the duplication of 18p11.31-p11.2 (483 kb), which was found in 5 out of 1093 control subjects (0.46%). These data provide further evidence that these CNVs may confer increased risk of ASD. In the future, we might be able to identify unaffected carriers of high-risk CNVs as more data are accumulated.

The identification of pathogenic CNVs associated with ASD may help discover candidate genes of ASD. The genes encompassed by the CNVs in our patients are listed in Tables 3, 4 and 5. Notably, several genes have been reported to be associated with autism or other major psychiatric disorders, such as TACR1 52, CNTNAP5 53,54, ADGRL3 55, ZNF827 56, POU6F2 20, KMT2E 57, KCND2 58, SND1 59, CNTNAP2 60,61, CSMD1 62,63, PARD3 64, HERC2 65, FMN1 66, and RBFOX1 67. These findings not only provide further clues to the high genetic heterogeneity of ASD but also indicate the pleiotropic clinical effects of mutations in these genes. Accumulating evidence shows that shared heritability and genetic mutations among different categories of psychiatric disorders seem to be the rule rather than the exception. A study analyzing genotype data from the Psychiatric Genomics Consortium (PGC) for cases and controls in schizophrenia, bipolar disorder, major depressive disorder, ASD and attention-deficit/hyperactivity disorder (ADHD) revealed moderate to high shared genetic etiology among these psychiatric disorders68. Li and colleagues recently reported that the prevalence of de novo mutations shared by four different categories of neuropsychiatric disorders (autism spectrum disorder, epileptic encephalopathy, intellectual disability, and schizophrenia) was significantly elevated69.

The present study has several limitations. First, although to the best of our knowledge our study has the largest sample size among studies of the Chinese population70,71, the sample is still relatively limited for obtaining a comprehensive picture of CNVs in our patient population, especially given the high heterogeneity of CNVs associated with ASD. Second, we only searched for pathogenic CNVs larger than 400 kb because larger CNVs are more likely to be pathogenic72. We understand that small CNVs can also be pathogenic. Hence, further analysis of CNVs smaller than 400 kb may discover more ASD-associated CNVs in our patients. Third, the phenotypic interpretation of the pathogenic CNVs found in our study is not straightforward given the incomplete penetrance, varied expressivity and pleiotropic effects of the pathogenic CNVs identified in our sample. Also, we cannot exclude the interaction of these CNVs with the genetic background and with other yet-to-be-identified genetic or genomic mutations in the affected patients.

In conclusion, we found a significantly increased global burden of rare CNVs in our ASD patients compared to the control subjects, indicating that rare CNVs play a part in the genetic landscape of ASD in our population. Also, we identified several pathogenic CNVs at “hot spots” and various private putative pathogenic CNVs in our patients, suggesting high genetic heterogeneity of ASD in our patients. Our study also supports the view that the high-resolution oligonucleotide SNP array is a useful tool for uncovering the genetic underpinnings of ASD. In the future, we will continue to analyze our data for CNVs smaller than 400 kb, and we expect to find more pathogenic CNVs associated with ASD, more candidate genes of ASD, and more patients with two CNV hits. Thus, we will have a better understanding of the genetic architecture of ASD in our population.

Materials and Methods

Participants and Procedures

The study protocol was approved by the Research Ethics Committee at National Taiwan University Hospital (approval number: 9561709027), Taipei, Taiwan, and Chang Gung Memorial Hospital-Linkou (approval number: 93-6244), Taiwan, for the recruitment of the patients with ASD, and Academia Sinica (approval number, AS-IBMS-MREC-91-10), Taiwan for the control group. All the experiments and informed consent procedures were performed in accordance with relevant guidelines and regulations set by the research ethics committees of the three institutes.

Patient Participants with ASD

The study was part of a molecular genetics study of patients with ASD who were Han Chinese residing in Taiwan. The detailed recruitment and evaluation of the patients and their family members were described in our previous publication73. In brief, patients who were aged 3 to 17 years and met the clinical diagnosis of autistic disorder as defined by the Diagnostic and Statistical Manual of Mental Disorders-IV (DSM-IV)74 were recruited from the Children’s Mental Health Center, National Taiwan University Hospital, Taipei, Taiwan and the Department of Psychiatry, Chang-Gung Memorial Hospital, Kuei-Shan, Taiwan. The clinical diagnoses were made by board-certified child psychiatrists experienced in the assessment of and intervention for ASD and were further confirmed by interviewing the parents using the Chinese version of the Autism Diagnostic Interview-Revised (ADI-R)75,76. The ADI-R, translated into Mandarin by Gau and colleagues, was approved by Western Psychological Services in 2007 as the ADI-R in the Chinese language75,77. All these patient participants further received clinical evaluation according to the DSM-5 diagnostic criteria for ASD1, which revealed that all 350 participants with DSM-IV autistic disorder met the diagnosis of DSM-5 ASD. Moreover, all these patient participants received intelligence tests. For ages of 3 to 7.5 years, the Wechsler Preschool and Primary Scale of Intelligence-Revised (WPPSI-R) was given; for ages of 6 to 16 years 11 months, the Wechsler Intelligence Scale for Children-3rd Edition (WISC-III) was given; for ages of 16 years and above, the Wechsler Adult Intelligence Scale (WAIS) was given.

Patients with known chromosomal abnormalities and associated medical conditions, including fragile X syndrome and Rett’s disorder, based on DNA testing or clinical assessments were not included during the recruitment process28,78. Probands with a previously identified chromosomal structural abnormality associated with autism, or with any other major neurological or medical condition, were also excluded28. The study protocol was approved by the Research Ethics Committee of National Taiwan University Hospital (approval number, 9561709027) and Chang Gung Memorial Hospital (approval number, 93-6244), Taiwan. Written informed consent was obtained from the participants (if applicable; otherwise, child assent) and their parents after the procedures were fully explained. Genomic DNA was prepared from the peripheral blood of each participant using the Gentra Puregene Blood kit according to the manufacturer’s instructions (Qiagen, Hilden, Germany).

Healthy control subjects

The control subjects (n = 1111) were chosen from the Han Chinese Cell and Genome Bank (HCCGB) in Taiwan; they had received a physical check-up and questionnaire screening to ensure that they did not have any abnormal physical condition or mental illness79. Written informed consent was obtained from the participants after the procedures were fully explained.

CNV analysis

We used the Affymetrix Genome-Wide Human SNP Array 6.0 (Affymetrix, Santa Clara, CA, USA) for genome-wide CNV screening. The SNP 6.0 array contains more than 1.8 million markers, including more than 906,600 probes for SNPs and more than 946,000 probes for CNVs. These probes are evenly distributed across the whole genome with a median distance between probes of ~0.7 kb. The microarray experiment was conducted by the National Genotyping Center (Academia Sinica, Taipei, Taiwan) (http://ncgm.sinica.edu.tw/ncgm_02/index.html). The hybridization intensities were captured by the GeneChip Scanner 3000 (Affymetrix, Santa Clara, CA). CNVs were called using Affymetrix Genotyping Console software v.4.1 (Affymetrix, Santa Clara, CA). The average call rate was 99.49 ± 0.29%, and all samples passed genotyping quality control (call rate >= 95%). Gender was called based on the cn-probe-chrXY-ratio_gender method from Affymetrix Power Tools (Affymetrix, CA, USA). Samples whose computed gender did not match the case information were excluded from analysis. Duplicated samples detected by kinship analysis using PLINK software were also excluded. A minimum of twenty contiguous probes showing deletion or duplication signals was required to call a CNV in this study. CNV regions that overlapped with centromeric regions (hg19, UCSC), antibody variable regions (PennCNV, http://www.openbioinformatics.org/penncnv/penncnv_faq.html#ig) and T-cell receptor loci (NCBI Gene, http://www.ncbi.nlm.nih.gov/gene/) were filtered out. CNVs with a size equal to or larger than 10 kb and a frequency of less than 1% in the patients were selected for analysis in this study. CNVs were considered to localize at the same locus if they overlapped by at least 80% of their length. Genes overlapping the CNV regions were reported according to UCSC genes (GRCh37/hg19). The ethnicity of cases and controls was assessed by performing principal component analysis (PCA) with SNP genotype data from all the participants of this study and the individuals included in the HapMap study.
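The size filter and the 80% overlap rule for assigning CNVs to the same locus can be sketched as follows. This is only an illustration under stated assumptions: the overlap is taken to be reciprocal (at least 80% of the length of each CNV), coordinates are half-open base-pair intervals, and the names `CNV`, `overlap_bp`, `same_locus` and `passes_size_filter` are hypothetical. The text does not specify these details, and this is not the software used in the study.

```python
from typing import NamedTuple


class CNV(NamedTuple):
    chrom: str
    start: int  # start position in bp (inclusive)
    end: int    # end position in bp (exclusive)


def overlap_bp(a: CNV, b: CNV) -> int:
    """Number of base pairs shared by two CNVs (0 if on different chromosomes)."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def same_locus(a: CNV, b: CNV, min_frac: float = 0.8) -> bool:
    """True if the overlap covers at least min_frac of BOTH CNVs.

    Assumption: the 80% rule is applied reciprocally.
    """
    shared = overlap_bp(a, b)
    return (shared >= min_frac * (a.end - a.start)
            and shared >= min_frac * (b.end - b.start))


def passes_size_filter(cnv: CNV, min_size_bp: int = 10_000) -> bool:
    """Keep CNVs that are at least 10 kb long, as stated in the text."""
    return (cnv.end - cnv.start) >= min_size_bp


# Hypothetical coordinates for illustration only.
cnv_a = CNV("chr8", 1_000_000, 1_500_000)
cnv_b = CNV("chr8", 1_050_000, 1_520_000)  # shares 450 kb with cnv_a
cnv_c = CNV("chr8", 1_400_000, 2_000_000)  # shares only 100 kb with cnv_a
```

A reciprocal criterion prevents a small CNV nested inside a much larger one from being treated as the same event; a one-sided criterion would be a different, equally plausible reading of the rule.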

Burden assay

Both genic and non-genic CNVs were included in the analysis. The likelihood ratio chi-square test was used to compare CNV rates between ASD patients and healthy controls, with a pre-selected alpha of 0.05. Bonferroni correction was used to adjust for multiple testing; thus, the significance threshold for the p-value was set at 0.005.
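A minimal sketch of a likelihood ratio chi-square (G) test with a Bonferroni-adjusted threshold is given below. It rests on assumptions not stated in the text: the data are arranged as a 2×2 table (for example, carriers versus non-carriers in patients and controls), and ten comparisons are corrected for, which is simply what an adjustment from 0.05 to 0.005 implies. The function names and the example counts are hypothetical and are not the study data.

```python
import math


def g_test_2x2(a: int, b: int, c: int, d: int):
    """Likelihood ratio chi-square (G) test for the table [[a, b], [c, d]].

    Returns (G, p) with 1 degree of freedom. For 1 df, the chi-square
    survival function equals erfc(sqrt(G / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    cells = ((a, row1, col1), (b, row1, col2), (c, row2, col1), (d, row2, col2))
    g = 0.0
    for observed, row_total, col_total in cells:
        if observed > 0:  # empty cells contribute 0 to G
            g += observed * math.log(observed * n / (row_total * col_total))
    g = max(2.0 * g, 0.0)  # guard against tiny negative rounding error
    p = math.erfc(math.sqrt(g / 2.0))
    return g, p


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Assumed number of comparisons: 10 (0.05 / 10 = 0.005).
threshold = bonferroni_threshold(0.05, 10)

# Hypothetical counts: 30 of 100 cases and 10 of 100 controls carry a CNV.
g_stat, p_value = g_test_2x2(30, 70, 10, 90)
significant = p_value < threshold
```

If the comparison was instead made on per-subject CNV counts, a likelihood ratio test from a count model would be the analogous approach; the 2×2 form is used here only because it is the simplest self-contained case.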

Real-time quantitative PCR (RT-qPCR)

RT-qPCR was used to validate the CNVs detected in this study and, in a family study, to identify their parental origin. RT-qPCR was performed using the SYBR-Green PCR reagents kit (Applied Biosystems, Foster City, California, USA), and the CNV was assessed using a relatively standard method in the laboratory38. The experiment was implemented using the ABI StepOnePlus following the manufacturer’s protocol (Applied Biosystems, Foster City, California, USA). The primer sequences, optimal annealing temperatures, and amplicon sizes are available upon request.

Pathogenic CNV evaluation

The pathogenic CNVs were evaluated according to two practice guidelines from the American College of Medical Genetics and Genomics. First, according to the “Clinical genetics evaluation in identifying the etiology of autism spectrum disorders: 2013 guideline revisions”40, CNVs in our results that overlapped with those at the “hot spots” reported in this guideline were considered pathogenic. Second, for rare CNVs outside the “hot spots,” we focused on the analysis of rare CNVs equal to or larger than 400 kb for convenience’s sake. Although large CNVs are more likely to have clinical significance, we understand that small CNVs can be pathogenic and large CNVs can be benign. CNVs that overlapped with the pathogenic CNVs reported in the Clinical Genome Resources (https://www.clinicalgenome.org/) or DECIPHER (https://decipher.sanger.ac.uk/) were defined as pathogenic according to the “American College of Medical Genetics standards and guidelines for interpretation and reporting of postnatal constitutional copy number variants”41. The sizes of CNVs and the genes encompassed by the CNVs were generated according to the gene annotation of the UCSC genome browser (GRCh37/hg19) (http://genome.ucsc.edu/cgi-bin/hgGateway).