High resolution analysis of rare copy number variants in patients with autism spectrum disorder from Taiwan

Rare genomic copy number variations (CNVs) (frequency <1%) contribute in part to the genetic underpinnings of autism spectrum disorders (ASD). This study aimed to understand the scope of rare CNVs in Taiwanese patients with ASD. We conducted a genome-wide CNV screening of 335 ASD patients (299 males, 36 females) from Taiwan using the Affymetrix Genome-Wide Human SNP Array 6.0 and compared the incidence of rare CNVs with that of 1093 control subjects (525 males, 568 females). We found a significantly increased global burden of rare CNVs in the ASD group compared to the controls, both overall and when the rare CNVs were classified by size and type. Further analysis confirmed the presence in our sample of several rare CNVs at regions reported in the literature to be strongly associated with ASD. Additionally, we detected several new private pathogenic CNVs in our samples and five patients carrying two pathogenic CNVs. Our data indicate that rare genomic CNVs contribute in part to the genetic landscape of our ASD patients. These CNVs are highly heterogeneous, and the clinical interpretation of the pathogenic CNVs of ASD is not straightforward given incomplete penetrance, varied expressivity, and individual genetic background.


Results
Clinical characteristics. A total of 335 (95.7%) of 350 cases and 1093 (98.4%) of 1111 controls passed the quality control procedures of the CNV experiments. We investigated the ethnicity of cases and controls by performing principal component analysis (PCA) on SNP genotype data from all the participants of this study together with the individuals included in the HapMap study. The cases and controls clustered together with the Han Chinese (Supplementary Figure 1), confirming that the participants of this study were Han Chinese. All the CNV data were then subjected to the burden analysis. The patient group consisted of 299 boys and 36 girls with a mean age of 9.4 ± 4.0 years, while the control group consisted of 525 males and 568 females with a mean age of 68.1 ± 10.1 years. The ADI-R (Autism Diagnostic Interview-Revised) interviews revealed that the 335 patients scored 20.43 ± 6.12 in "qualitative abnormalities in reciprocal social interaction", 14.75 ± 4.32 in "qualitative abnormalities in communication, verbal", 8.19 ± 3.33 in "qualitative abnormalities in communication, nonverbal", and 6.95 ± 2.47 in "restricted, repetitive and stereotyped patterns of behaviors". All the participants with ASD were noted to have had abnormal development at or before 36 months of age. Their current average intelligence quotients (IQ) were 94.85 ± 22.55 (range, 40 to 148) for full-scale IQ, 96.74 ± 2.04 (range, 41 to 145) for performance IQ, and 95.08 ± 23.79 (range, 44 to 148) for verbal IQ. Among the 335 ASD patients, nine had been diagnosed with epilepsy (3.04%), four had been suspected of having seizures (1.35%), and 19 had a history of febrile convulsion (6.42%). These data are also provided in Supplementary Table 1.
CNV findings. The rates of rare CNVs (<1% in the patients) on autosomes and the X chromosome were compared between the patient and control groups.
CNV regions on autosomes were analyzed in all samples, while CNV regions on sex chromosomes were analyzed in male samples only. We found a significant excess in the overall rate of rare autosomal CNVs in the ASD patients (2.71) compared with the control subjects (0.77). The over-representation of rare autosomal CNVs in ASD remained when the rare CNVs were grouped into deletions and duplications or classified by size as <100 kb, 100-400 kb, and >400 kb (Table 1). In the analysis of rare CNVs on the X chromosome, we compared the rate of rare CNVs only between male patients and male control subjects. A significant excess in the overall rate of rare CNVs was observed in the male ASD patients (0.214) compared with the male control subjects (0.011). The excess rate of rare CNVs on the X chromosome remained when the CNVs were stratified by type (deletion/duplication) or by size group (Table 2). We did not compare the rate of rare CNVs on the X chromosome between female patients and female controls because of the random inactivation of the X chromosome in females. Additionally, the sample size of the female patients is relatively small.
CNVs at "hot spots". We compared the rare CNVs found in our patients with the selected genetic "hot spots" of ASD reported in "Clinical genetics evaluation in identifying the etiology of autism spectrum disorders: 2013 guideline revisions", a practice guideline of the American College of Medical Genetics and Genomics 40. The "hot spots" were defined as CNVs that have an especially strong association with ASD according to this guideline. We identified a total of 14 patients who had pathogenic CNVs located at several of the "hot spots." The detailed information on these CNVs, including the locations, sizes, and genes encompassed in the CNV regions, is listed in Table 3, while the clinical data of each patient are listed in Supplementary Table 2.
Other rare pathogenic CNVs. Besides the CNVs at the "hot spots", we further identified a total of 49 rare putative pathogenic CNVs in our sample according to the "American College of Medical Genetics standards and guidelines for interpretation and reporting of postnatal constitutional copy number variants" 41. A pathogenic CNV was defined as "documented as clinically significant in multiple peer-reviewed publications, even if penetrance and expressivity of the CNV are known to be variable. This category includes large CNVs, which may not be described in the medical literature at the size observed in the patient but which overlap a smaller interval with clearly established clinical significance" 41. These putative pathogenic CNVs overlapped with the pathogenic CNVs reported in the Clinical Genome Resources and DECIPHER. Table 4 presents the detailed information on these CNVs, including locations, types, origins, and genes encompassed. The clinical data of each patient are provided in Supplementary Table 3.
Two-hit CNVs. Five patients were found to carry two different putative pathogenic CNVs simultaneously in this study. Patient U-2075 inherited the 4q amplification and the 5p deletion from his mother and father, respectively (Fig. 1A). Patient U-1753 had the 8q amplification and the 8p deletion transmitted from his mother and father, respectively (Fig. 1B). Patient U-1255 acquired the 10q amplification and the 18p amplification from his mother and father, respectively (Fig. 1C). Patient U-1414 had the 8p amplification from his father and a de novo 9q duplication (Fig. 1D). Patient U-1999 had two de novo amplifications at 17q25.3 simultaneously (Fig. 1E). All the parents of these five patients were carefully assessed, and none of them had ASD based on the self-administered questionnaires and clinical evaluation by the corresponding author. The detailed information on these CNVs, including the locations, sizes, and genes encompassed, is listed in Table 5, and the clinical data of each patient are provided in Supplementary Table 4.
Table 3. CNVs at the "hot spots" identified in this study. *Only male controls (n = 525) were screened for this CNV at X chromosome.

Discussion
In this study, we compared the frequencies of rare CNVs (<1%) between 335 patients with ASD and 1093 control subjects from Taiwan. We found a significantly higher frequency of global rare CNVs in patients with ASD than in the control group. The significantly higher frequencies of rare CNVs in the ASD group remained when the CNVs were subdivided by type (deletion/duplication) or by size. Our data are compatible with several previous studies 24,33,42. Pinto and colleagues conducted a genome-wide CNV analysis of 996 ASD individuals of European ancestry and 1,287 matched controls and found a higher global burden of rare genic CNVs in ASD patients 43. The findings were replicated by the same group in another genome-wide CNV analysis consisting of 2,446 families with ASD 33. In our study, we did not limit our CNV analysis to genic CNVs, as non-genic CNVs may exert position effects on the expression of genes outside the CNV regions. Our findings of an increased global burden of rare CNVs in ASD indicate that genomic rearrangement is one of the genetic mechanisms of ASD. Some other studies reported an increased burden of CNVs in female patients. Jacquemont and colleagues reported a 3-fold increase in deleterious autosomal CNVs in female patients compared to male probands in a sample of 762 ASD families 44. Desachy and colleagues recently reported that mothers of patients with autism had a higher deletion burden than control mothers in a matched case-control population. Also, to their surprise, they found a higher autosomal burden of large, rare CNVs in females in the population. They speculated that the increased rare CNV burden in females in the general population might contribute to the decreased female fetal loss in the population, but that the ASD-specific maternal CNV burden may contribute to high sibling recurrence 45. In our study, we did not conduct a similar analysis because of the relatively small sample size of female patients.
In our CNV analysis, we identified 14 patients who had CNVs located at the ASD "hot spots" reported in the "Clinical genetics evaluation in identifying the etiology of autism spectrum disorders: 2013 guideline revisions" 40. Our findings support the strong association of the CNVs at "hot spots" with ASD in our patient population. Among these CNVs, those located at 22q11.2, 22q13.3, and 15q11-13 were the most common in our sample. Besides the CNVs located at the "hot spots," we also detected 49 rare (<1%) CNVs larger than 400 kb that overlapped with the pathogenic CNVs reported in the Clinical Genome Resources and DECIPHER. These CNVs met the criteria of "pathogenic" according to the "American College of Medical Genetics standards and guidelines for interpretation and reporting of postnatal constitutional copy number variants" 41. Some of these rare putative pathogenic CNVs were inherited, while some were de novo mutations. The broad distribution of both "hot spots" CNVs and rare pathogenic CNVs detected in our patients suggests extremely high genetic heterogeneity of ASD in our patients. Some studies suggested that female patients have a higher burden of CNVs than male patients 33,44,45. In our study, however, all the patients who had CNVs at "hot spots" were male (Table 3), and only two of the 49 patients who had other rare (<1%) pathogenic CNVs were female (Table 4). The discrepancy might be due to the small proportion of female patients in this study (36 females vs. 299 males).
In this study, five patients were found to carry two rare pathogenic CNVs concomitantly in their genome. The finding is consistent with our previous report of a patient who had inherited two CNVs from his parents and supports the two-hit model of ASD 39. Several studies have also proposed two-hit and multiple-hit models of ASD, suggesting that the genetic underpinnings of ASD stem from the combinatorial effects of mutations in a few or in multiple genes at different loci 39,46-48. Leblond and colleagues reported three patients with deletions at the SHANK2 gene locus. These three patients also had another inherited CNV at 15q11-13 that was associated with other psychiatric disorders 46: two patients carried a duplication of the nicotinic receptor gene CHRNA7, and one patient had a deletion of the synaptic translation repressor gene CYFIP1 46. Steinberg and Webber conducted a pathway-association test of target genes regulated by fragile-X mental retardation protein (FMRP) in ASD patients and found rigorous support for the multiple-hit genetic etiology of ASD 47. In fact, emerging evidence suggests that the presence of multiple pathogenic CNVs in psychiatric patients is not rare. Hu and colleagues recently reported a novel maternally inherited 8q24.3 CNV and a rare paternally inherited 14q23.3 CNV in a family with neurodevelopmental disorders 49. Williams and colleagues conducted CNV analysis in patients with velo-cardio-facial syndrome (VCFS), with or without psychosis. They found a significantly higher proportion of second CNV hits in patients with psychosis, suggesting that the two-hit hypothesis may be relevant to a proportion of VCFS patients with psychosis 50. Rudd and colleagues found a slightly higher proportion of multiple conservative CNVs in schizophrenia patients compared to controls, indicating a potential role for a multiple-hit model in schizophrenia 51.
Hence, it is likely that multiple hits (including two hits) are a common and important genetic mechanism associated with ASD. In this study, we reported five patients who had two putative pathogenic CNVs larger than 400 kb. We believe that if CNVs smaller than 400 kb are included in future analyses, we might find more patients with two-hit and multiple-hit CNVs. In the family study of these five patients with two-hit CNVs, we found that some of these putative pathogenic CNVs were inherited from the parents and some were de novo mutations. However, the parents who carried one putative pathogenic CNV did not manifest ASD symptoms after careful clinical evaluation, suggesting incomplete penetrance of these inherited pathogenic CNVs. Further, we searched for these CNVs in the 1093 control subjects and found none of them in the control group, except the duplication of 18p11.31-p11.2 (483 kb), which was found in 5 of the 1093 control subjects (0.46%). These data provide further evidence that these CNVs may confer increased risk of ASD. In the future, we might be able to identify unaffected carriers of high-risk CNVs if more data are accumulated.
The identification of pathogenic CNVs associated with ASD may help discover candidate genes of ASD. The genes encompassed by the CNVs in our patients are listed in Tables 3, 4 and 5. Notably, several genes had been reported to be associated with autism or other major psychiatric disorders, such as TACR1 52 66 , and RBFOX1 67 . These findings not only provide further clues to the high genetic heterogeneity of ASD but also indicate the pleiotropic clinical effects of mutations in these genes. Accumulating evidence shows that shared heritability and shared genetic mutations among different categories of psychiatric disorders seem to be the rule rather than the exception. A study analyzing the genotype data from the Psychiatric Genomics Consortium (PGC) for cases and controls in schizophrenia, bipolar disorder, major depressive disorder, ASD and attention-deficit/hyperactivity disorder (ADHD) revealed moderate to high shared genetic etiology among these psychiatric disorders 68. Li and colleagues recently reported that the prevalence of de novo mutations shared by four different categories of neuropsychiatric disorders (autism spectrum disorder, epileptic encephalopathy, intellectual disability, and schizophrenia) was significantly elevated 69.
Table 4. Other rare pathogenic CNVs identified in patients and control subjects in this study. *Only male controls (n = 525) were screened for CNVs at X chromosome.
The present study has several limitations. First, although to the best of our knowledge our study has the largest sample size among studies of Chinese populations 70,71, the sample is still relatively limited for obtaining a comprehensive picture of CNVs in our patient population, especially given the high heterogeneity of CNVs associated with ASD. Second, we searched only for pathogenic CNVs larger than 400 kb because larger CNVs are more likely to be pathogenic 72. We understand that small CNVs can also be pathogenic; hence, further analysis of CNVs smaller than 400 kb may discover more ASD-associated CNVs in our patients. Third, the phenotypic interpretation of the pathogenic CNVs found in our study is not straightforward given the incomplete penetrance, varied expressivity and pleiotropic effects of the pathogenic CNVs identified in our sample. Also, we cannot exclude the interaction of these CNVs with the genetic background and with other yet-to-be-identified genetic or genomic mutations in the affected patients.
In conclusion, we found a significantly increased global burden of rare CNVs in our ASD patients compared to the control subjects, indicating that rare CNVs play a part in the genetic landscape of ASD in our population. We also identified several pathogenic CNVs at "hot spots" and various private putative pathogenic CNVs in our patients, suggesting high genetic heterogeneity of ASD in our patients. Our study also supports the view that the high-resolution oligonucleotide SNP array is a useful tool for uncovering the genetic underpinnings of ASD. In the future, we will continue to analyze our data for CNVs smaller than 400 kb, and we expect to find more pathogenic CNVs associated with ASD, more candidate genes of ASD, and more patients with double-hit CNVs. We will thus gain a better understanding of the genetic architecture of ASD in our population.

Materials and Methods
Participants and Procedures. The study protocol was approved by the Research Ethics Committee at National Taiwan University Hospital (approval number: 9561709027), Taipei, Taiwan, and Chang Gung Memorial Hospital-Linkou (approval number: 93-6244), Taiwan, for the recruitment of the patients with ASD, and Academia Sinica (approval number, AS-IBMS-MREC-91-10), Taiwan for the control group. All the experiments and informed consent procedures were performed in accordance with relevant guidelines and regulations set by the research ethics committees of the three institutes.
Patient Participants with ASD. The study was part of a molecular genetics study of patients with ASD who were Han Chinese residing in Taiwan. The detailed recruitment and evaluation of the patients and their family members were described in our previous publication 73. In brief, patients who were aged 3 to 17 years and met the clinical diagnosis of autistic disorder as defined by the Diagnostic and Statistical Manual of Mental Disorders-IV (DSM-IV) 74 were recruited from the Children's Mental Health Center, National Taiwan University Hospital, Taipei, Taiwan and the Department of Psychiatry, Chang-Gung Memorial Hospital, Kuei-Shan, Taiwan. The clinical diagnoses were made by board-certified child psychiatrists experienced in the assessment of and intervention for ASD and were further confirmed by interviewing the parents using the Chinese version of the Autism Diagnostic Interview-Revised (ADI-R) 75,76. The ADI-R, translated into Mandarin by Gau and colleagues, was approved by Western Psychological Services in 2007 as the ADI-R in the Chinese language 75,77. All these patient participants further received clinical evaluation according to the DSM-5 diagnostic criteria for ASD 1, which revealed that all 350 participants with DSM-IV autistic disorder met the diagnosis of DSM-5 ASD. Moreover, all these patient participants received intelligence tests. For ages 3 to 7.5 years, the Wechsler Preschool and Primary Scale of Intelligence-Revised (WPPSI-R) was given; for ages 6 to 16 years 11 months, the Wechsler Intelligence Scale for Children-3rd Edition (WISC-III) was given; for ages 16 years and above, the Wechsler Adult Intelligence Scale (WAIS) was given.
Patients with known chromosomal abnormalities and associated medical conditions, including fragile X syndrome and Rett's disorder, based on DNA testing or clinical assessments were not included during the recruitment process 28,78. Probands with a previously identified chromosomal structural abnormality associated with autism or with any other major neurological or medical condition were also excluded 28. Written informed consent was obtained from the participants (where applicable; otherwise, child assent) and their parents after the procedures were fully explained. Genomic DNA was prepared from the peripheral blood of each participant using the Gentra Puregene Blood kit according to the manufacturer's instructions (Qiagen, Hilden, Germany).
Healthy control subjects. The control subjects (n = 1111) were chosen from the Han Chinese Cell and Genome Bank (HCCGB) in Taiwan; they had received a physical check-up and questionnaire screening to ensure that they did not have any abnormal physical condition or mental illness 79. Written informed consent was obtained from the participants after the procedures were fully explained.
CNV analysis. We used the Affymetrix Genome-Wide Human SNP Array 6.0 (Affymetrix, Santa Clara, CA, USA) for genome-wide CNV screening. The SNP 6.0 array contains more than 1.8 million markers, including more than 906,600 probes for SNPs and more than 946,000 probes for CNVs. These probes are evenly distributed across the whole genome, with a median distance between probes of ~0.7 kb. The microarray experiment was conducted by the National Genotyping Center (Academia Sinica, Taipei
Burden assay. Both genic and non-genic CNVs were included in the analysis. The likelihood ratio chi-square test was used to compare the CNV rates between ASD patients and healthy controls, with a pre-selected alpha of 0.05. Bonferroni correction was used to adjust for multiple testing; thus, the significance threshold for the P value was set at 0.005.
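As an illustration of the burden comparison, the following is a minimal sketch of a likelihood ratio chi-square (G) test on a 2 x 2 table, followed by a Bonferroni threshold. It is not the authors' pipeline: the function names `g_test_2x2` and `bonferroni_alpha`, the carrier/non-carrier layout of the table, and the carrier counts are hypothetical, and the number of tests (10) is only inferred from the stated thresholds (0.05 / 0.005).

```python
import math


def g_test_2x2(case_yes, case_no, ctrl_yes, ctrl_no):
    """Likelihood ratio chi-square (G) statistic and P value for a 2x2 table.

    Rows are groups (cases, controls); columns are carrier / non-carrier.
    The table layout is an assumption made for this sketch.
    """
    observed = [[case_yes, case_no], [ctrl_yes, ctrl_no]]
    row_sums = [case_yes + case_no, ctrl_yes + ctrl_no]
    col_sums = [case_yes + ctrl_yes, case_no + ctrl_no]
    total = sum(row_sums)

    g = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            e = row_sums[i] * col_sums[j] / total  # expected count under independence
            if o > 0:  # a zero cell contributes nothing to G
                g += 2.0 * o * math.log(o / e)
    g = max(g, 0.0)  # guard against tiny negative values from rounding

    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(g / 2.0))
    return g, p


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Hypothetical carrier counts, for illustration only (group sizes are from the text).
g_stat, p_value = g_test_2x2(60, 335 - 60, 50, 1093 - 50)
threshold = bonferroni_alpha(0.05, 10)  # 10 tests is inferred, not stated in the text
print(f"G = {g_stat:.2f}, P = {p_value:.3g}, significant: {p_value < threshold}")
```

The closed-form P value relies on the table having one degree of freedom; a larger table would need a general chi-square survival function instead.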

Real-time quantitative PCR (RT-qPCR).
RT-qPCR was used to validate the CNVs detected in this study and, in the family study, to identify their parental origin. RT-qPCR was performed using the SYBR-Green PCR reagents kit (Applied Biosystems, Foster City, California, USA), and the CNV was assessed using a relatively standard method in the laboratory 38. The experiment was implemented using the ABI StepOnePlus following the manufacturer's protocol (Applied Biosystems, Foster City, California, USA). The primer sequences, optimal annealing temperatures, and amplicon sizes are available upon request.
Pathogenic CNV evaluation. The pathogenic CNVs were evaluated according to two practice guidelines from the American College of Medical Genetics and Genomics. First, according to the "Clinical genetics evaluation in identifying the etiology of autism spectrum disorders: 2013 guideline revisions" 40, CNVs in our results that overlapped with those at the "hot spots" reported in this guideline were considered pathogenic. Second, for rare CNVs outside the "hot spots," we focused on the analysis of rare CNVs equal to or larger than 400 kb for convenience's sake. Although large CNVs are more likely to have clinical significance, we understand that small CNVs can be pathogenic and large CNVs can be benign. CNVs that overlapped with the pathogenic CNVs reported in the Clinical Genome Resources (https://www.clinicalgenome.org/) or DECIPHER (https://decipher.sanger.ac.uk/) were defined as pathogenic according to the "American College of Medical Genetics standards and guidelines for interpretation and reporting of postnatal constitutional copy number variants" 41. The sizes of the CNVs and the genes encompassed by the CNVs were generated according to the gene annotation of the UCSC genome browser (GRCh37/hg19) (http://genome.ucsc.edu/cgi-bin/hgGateway).
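The second criterion (size of at least 400 kb plus overlap with a reported pathogenic CNV) can be sketched as a simple interval check. This is an illustrative sketch only: the `Region` type, the function names, and all coordinates are hypothetical, and it ignores aspects a real evaluation would weigh, such as CNV type (deletion versus duplication) and the extent of overlap. CNVs at the "hot spots" are handled by the first criterion and are not covered here.

```python
from typing import Iterable, NamedTuple

MIN_SIZE_BP = 400_000  # size cut-off used for CNVs outside the "hot spots"


class Region(NamedTuple):
    """A genomic interval with 1-based, inclusive coordinates (assumed convention)."""
    chrom: str
    start: int
    end: int


def size_bp(region: Region) -> int:
    """Length of the interval in base pairs."""
    return region.end - region.start + 1


def overlaps(a: Region, b: Region) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def is_putative_pathogenic(cnv: Region, known: Iterable[Region],
                           min_size: int = MIN_SIZE_BP) -> bool:
    """Apply the size cut-off, then require overlap with any known pathogenic region."""
    return size_bp(cnv) >= min_size and any(overlaps(cnv, k) for k in known)


# Hypothetical coordinates, for illustration only.
known_pathogenic = [Region("chr22", 18_900_000, 21_500_000)]
candidates = [
    Region("chr22", 19_000_000, 19_500_000),  # large and overlapping
    Region("chr22", 19_000_000, 19_100_000),  # overlapping but below the cut-off
    Region("chr1", 1_000_000, 1_600_000),     # large but no overlap
]
for cnv in candidates:
    print(cnv, is_putative_pathogenic(cnv, known_pathogenic))
```

In practice the list of known regions would be exported from the curated databases named above rather than typed in by hand.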