Autism is a pervasive developmental disorder characterized by a restricted, repetitive range of behaviors and interests, and by challenges with social interaction and communication. Autism spectrum disorders (ASD) typically present before 3 years of age, affect as many as 1% of children, and are diagnosed in four times as many boys as girls. Autism is highly heritable, with up to 90% concordance in monozygotic twins.1

Recent evidence suggests autism is associated with genomic structural variation both genome wide and at hotspots, with de novo and inherited CNVs, and with overrepresentation of certain genes within rare CNVs.2, 3, 4, 5, 6, 7, 8, 9, 10 Furthermore, many structural variants associated with autism overlap CNVs associated with other neuropsychiatric conditions or syndromes.11, 12, 13 These findings indicate that autism is characterized by marked genetic heterogeneity, with the involvement of many different highly penetrant mutations across many different genes and pathways. Although the association between rare copy number mutations and autism is well established, specific causal mechanisms remain unclear, particularly in the case of CNVs transmitted by unaffected parents.

The objective of this study was to determine whether, using denser array comparative genomic hybridization (aCGH) platforms, association between autism and rare CNVs could be extended to de novo and inherited structural variants as small as 10 kb. Secondarily, we sought to identify biologically relevant changes in expression after de novo mutation or between affected probands and unaffected transmitting parents.

Materials and methods

Subjects included 41 youth with ASD and their parents.14 Of the 41 cases, 35 were male and 6 were female. Cases were predominantly European American (EA; 29/41), and also included three Asian Americans, two Hispanic Americans, two African Americans (AA), and five subjects who did not report ancestry. For this study, we analyzed only complete mother–father–child trios as verified by forensic microsatellite markers. Controls were 367 individuals (319 EA and 48 AA) who were at least 30 years of age with no history of psychiatric symptoms by self-report.15

DNA from individuals with autism, their parents, and controls was hybridized to Nimblegen HD2 (Roche Nimblegen, Madison, WI, USA) microarray slides against a well-characterized reference subject, SKN1. The HD2 array features 2.1 million probes spaced at an average of 1.2 kb across the genome. aCGH data were normalized and CNVs were called using a sliding-window algorithm with a 10-probe (10 kb) minimum size threshold. Parent-of-origin for CNVs in autism probands was determined by probe-by-probe signal comparison of the proband and both parents across the CNV. Controls were split into two groups: 244 were used for the identification of normal variation and 123 were used to test for differences in rare CNV prevalence compared with autism cases. Rare CNVs were defined as having <30% overlap with a CNV identified in our control sample or in the Database of Genomic Variants.16 We chose the <30% threshold in advance to permit some overlap with known polymorphisms, while still stringently removing known CNVs whose reported break points differ because of differences in signal strength, aCGH platform, or detection algorithm. Enrichment of functional gene classes was tested using PANTHER17 and followed up by direct case versus control group comparison.
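The rare-CNV definition above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `CNV` class and the `overlap_fraction` and `is_rare` helpers are hypothetical names, and the sketch assumes that overlap is measured as the fraction of the candidate CNV's length covered by a single known CNV on the same chromosome. The text does not specify whether overlap was computed this way, reciprocally, or against the union of known CNVs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CNV:
    """A copy number variant as a half-open interval [start, end) in bp."""
    chrom: str
    start: int
    end: int


def overlap_fraction(query: CNV, known: CNV) -> float:
    """Fraction of the query CNV's length covered by one known CNV.

    Assumption: overlap is expressed relative to the query length only.
    """
    if query.chrom != known.chrom:
        return 0.0
    shared = min(query.end, known.end) - max(query.start, known.start)
    return max(shared, 0) / (query.end - query.start)


def is_rare(query: CNV, known_cnvs: list[CNV], max_overlap: float = 0.30) -> bool:
    """True if the query overlaps every known CNV by less than max_overlap."""
    return all(overlap_fraction(query, k) < max_overlap for k in known_cnvs)


# Hypothetical coordinates, for illustration only.
known = [CNV("chr7", 100_000, 150_000)]
print(is_rare(CNV("chr7", 140_000, 190_000), known))  # 20% overlap
print(is_rare(CNV("chr7", 120_000, 160_000), known))  # 75% overlap
```

Under these assumptions the first candidate would be retained as rare and the second would be filtered out as a known polymorphism.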

For a selected set of CNVs in regions with previous evidence of association to neurological phenotypes and of dosage sensitivity or allele-specific effects, we tested for either a novel stable transcript or a change in expression of affected genes relative to the transmitting parents and/or wild-type unaffected subjects. In both cases, total RNA or cDNA was prepared from transformed lymphoblasts from each trio and one healthy wild-type control. Expression levels were estimated using TaqMan expression assays run in triplicate with TBP as an internal control, following the manufacturer's protocol.


Results

In comparison with controls, we found enrichment in the percent of cases harboring rare genic CNVs 10 kb or larger (odds ratio (OR)=2.42; 95% CI: 1.07–6.10; P=0.04) (Table 1). The increased carrier frequency was accompanied by a significant increase in CNV and gene burden in cases, as evaluated by Wilcoxon test. Odds ratios for carrying a rare or de novo CNV were greater for deletions, and increased with CNV size (Figure 1). A log-linear test for trend of OR across size classes was significant at P=3 × 10^−9 for deletions, but not significant for duplications. The majority of rare CNVs identified in autism cases were inherited (133/136) based on aCGH analysis, indicating a minimal false-positive call rate for CNVs included in our analysis. The other three CNVs in cases, all deletions, were predicted to be de novo. Break points of these de novo deletions were confirmed by PCR, which verified their presence in DNA of the cases and their absence in the parents. Supplementary Table 1 contains all rare CNVs identified in cases and controls.

Table 1 Rates of rare CNVs in autism cases and controls
Figure 1

Size-stratified odds ratios (OR) for rare genic deletions and duplications. OR generated by comparing the number of cases and controls carrying one or more rare genic deletions or duplications of the given size class. Significance tested using χ2-test. All size classes are independently significant for deletions at P<0.05, whereas for duplications only mutations 50–200 kb show a significant OR.

Considering gene representation within rare CNVs, three functional classes were identified as enriched in cases, whereas no functional classes were enriched in controls (Supplementary Table 2). The strongest enrichment was for transcription-related functions (OR=9.73; P=0.0008) with 15% of cases harboring a deletion overlapping a gene annotated to this term versus less than 2% of controls. Other functional classes significantly enriched in case events were receptor and receptor activity (OR=7.84; P=0.00002), and nervous system development (OR=6.46; P=0.0004). Overall, 15/41 (37%) cases had at least one deleted gene annotated to one of these functional classes versus 8/123 (7%) of controls (OR=8.08; P=0.000002). Four patients (10%) carried both maternally and paternally transmitted deletions that affected enriched gene classes, thus supporting a two-hit model for some cases.18
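As a worked illustration of the carrier comparison above, the following Python sketch computes a simple cross-product odds ratio from the reported counts (15 of 41 cases versus 8 of 123 controls). The function names are hypothetical, and the Woolf (log-scale normal approximation) confidence interval is an assumption chosen for brevity. The text does not state which estimator was used, so the cross-product value need not match the reported OR exactly; an exact or conditional maximum-likelihood estimate, for example, would differ slightly.

```python
import math


def two_by_two(case_carriers, case_total, control_carriers, control_total):
    """Return the 2x2 cell counts (a, b, c, d) from carrier counts and totals."""
    a = case_carriers
    b = case_total - case_carriers
    c = control_carriers
    d = control_total - control_carriers
    return a, b, c, d


def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Cross-product (sample) odds ratio: (a * d) / (b * c)."""
    a, b, c, d = two_by_two(case_carriers, case_total,
                            control_carriers, control_total)
    return (a * d) / (b * c)


def woolf_ci(case_carriers, case_total, control_carriers, control_total, z=1.96):
    """Approximate 95% CI on the log scale (Woolf method, an assumption here)."""
    a, b, c, d = two_by_two(case_carriers, case_total,
                            control_carriers, control_total)
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Counts taken from the text: 15/41 cases versus 8/123 controls.
print(round(odds_ratio(15, 41, 8, 123), 2))
print(woolf_ci(15, 41, 8, 123))
```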

We tested one predicted truncation and nine full-gene deletions or duplications for transcript level differences associated with rare case CNVs in genomic loci that have previous evidence supporting a link to autism (Figure 2). The specific genes we selected for expression analysis have evidence supporting dosage sensitivity or allele-specific effects, increasing the likelihood that expression changes linked to copy number variation have functional relevance. For the one predicted truncation, we detected a stable mutant transcript of ARID1B after de novo deletion. Sanger sequencing of the mutant transcript identified the deleted exons and the premature stop codon.

Figure 2

CNVs with transcript-level differences between probands and parents. For all trios shown, individual 1 is the case, 2 is the mother, and 3 is the father. The left side of each panel shows the aCGH data and the right side shows the transcript effect. Error bars on expression plots represent ±2 SD. (a) Deletion within ARID1B that results in a truncated protein missing the ARID domain and the uncharacterized DUF3518 domain. The stable transcript was amplified from cDNA in the proband, but not in the parents. Trace images show the deletion break points and the new stop after codon 520. (b) De novo deletion on 22q11 with decrease in relative PRODH expression shown for the deletion carrier. (c) Maternally transmitted duplication overlapping the Beckwith–Wiedemann region at 11p15. Expression of ZNF214 in the trio and a control individual shows a significant stepwise decrease in the unaffected mother and affected proband. (d) Maternally transmitted promoter deletion near CNTNAP2. Relative expression of CNTNAP2 in the trio and a control individual shows a significant stepwise decrease in the unaffected carrier mother and affected proband.

Of the genes that were fully deleted or duplicated by rare CNVs, we observed differential expression of three genes. We found a decreased PRODH expression level after de novo deletion in the proband versus wild-type parents and one control (P=0.01–0.03). We observed decreased ZNF214 expression in the proband compared with the transmitting mother (P=0.003), the wild-type father (P=0.002), and a control individual (P=0.001). Expression in the transmitting mother was also significantly lower than in the father (P=0.005) and control (P=0.001). No significant trend was observed in ZNF215 expression, which may also be affected by the CNV. Similarly, a case carrying a promoter deletion exhibited decreased CNTNAP2 expression relative to the transmitting mother (P=0.0004), the wild-type father (P=0.004), and control (P=0.002). Expression in the mother was also significantly decreased relative to the father (P=0.02) and control (P=0.02). We did not see an expression effect associated with an inherited intronic deletion in GRID1. Other genes tested (GRIK2, SIM1, SNRPN, AFF2) were not expressed at detectable levels in lymphoblasts.
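The relative expression comparisons above rest on normalizing each target gene to the TBP internal control. A minimal Python sketch of one common way to do this, the comparative Ct (2^−ΔΔCt) calculation, is given below. The text does not state which quantification formula was used, so the method itself is an assumption, as is the 100% amplification efficiency it implies. The function name and all Ct values are hypothetical and do not come from the study.

```python
from statistics import mean


def relative_expression(target_ct, reference_ct,
                        calib_target_ct, calib_reference_ct):
    """Fold change of a sample relative to a calibrator by 2^-ddCt.

    Each argument is a list of replicate Ct values (e.g. triplicates).
    target_ct / reference_ct: gene of interest and internal control (here TBP)
    in the sample; calib_*: the same two assays in the calibrator individual.
    Assumes 100% amplification efficiency for both assays.
    """
    delta_sample = mean(target_ct) - mean(reference_ct)
    delta_calib = mean(calib_target_ct) - mean(calib_reference_ct)
    return 2 ** -(delta_sample - delta_calib)


# Hypothetical triplicate Ct values, for illustration only.
proband = relative_expression(
    target_ct=[26.0, 26.5, 25.5],        # gene of interest in the proband
    reference_ct=[22.0, 22.0, 22.0],     # TBP in the proband
    calib_target_ct=[25.0, 25.0, 25.0],  # gene of interest in the control
    calib_reference_ct=[22.0, 22.0, 22.0],  # TBP in the control
)
print(proband)  # 0.5: one extra cycle corresponds to half the transcript level
```

In this made-up example the proband needs one more cycle than the control after TBP normalization, which corresponds to half the relative transcript abundance, the pattern expected for a heterozygous deletion of a dosage-sensitive gene.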


Discussion

Previous studies have demonstrated a clear link between rare CNVs and autism. Our analysis extends previous findings to indicate a relationship between autism and smaller (down to 10 kb) rare and de novo CNVs, and demonstrates expression-level phenotypes specific to affected carriers. Although heterogeneous, the affected genes identified in this study share common functions and regulatory mechanisms.

An inherent challenge in identifying rare disease-causing mutations is the lack of power to demonstrate a statistically significant association for any given variant, as each mutation may occur in only a small number of affected individuals. However, for rare or common alleles, the ultimate proof of causality rests on the establishment of biological relevance. To this end, we demonstrate transcript-level differences related to inherited and de novo CNVs.

Each of the genes for which we identified transcript-level differences has evidence suggesting a potential causal relationship with autism. ARID1B mutations, such as the de novo deletion identified here, could have effects in autism similar to the effects of mutations in the related SMARCA2 gene in schizophrenia.19, 20 Findings of decreased PRODH expression in autism cases carrying a deletion in this gene have been reported5 and PRODH is known to be dosage-sensitive in mouse neurodevelopment.21 Imprinting has been shown to cause allele-specific expression of ZNF215 but not the neighboring ZNF214 gene,22 and this region is associated with maternally transmitted balanced chromosomal abnormalities causing Beckwith–Wiedemann syndrome, which can be comorbid with autism.23 Described ZNF215 isoforms that contain a 3′ region that is antisense to ZNF214 are completely duplicated by the CNV reported here, and these isoforms have been hypothesized to have a role in ZNF214 regulation.22 CNTNAP2 has been implicated in autism through association, linkage studies, and brain expression pattern,24, 25, 26 and there is some evidence for a parent-of-origin effect for CNTNAP2 mutations.25 Although other rare case CNVs overlapped regions with parent-of-origin effects (eg, SNRPN27 and GRIK228), the genes were not expressed in lymphoblasts, precluding expression analysis.

We report a strong enrichment in rare CNVs and specific gene classes in autism cases despite a relatively small number of trios profiled. The small number of genes and subjects tested and the use of lymphoblasts in the expression analysis are limitations of this study. Further characterization of normal variation in transcript abundance will be useful to determine whether the expression differences we report are biologically relevant. Ultimately, functional experiments such as assays for dosage-sensitive or allele-specific dominant phenotypes in neuronal cell lines are necessary. Finally, for expression analysis, we focused solely on genes with previous evidence suggesting a link to autism. There are additional case CNVs in loci associated with neurological processes and phenotypes that may be important for autism.

Although rare CNVs are overrepresented in autism, such mutations are also found in unaffected individuals. We show here that certain mutations are associated with differential expression relative to the transmitting parent. This suggests that allele-specific expression may control penetrance and may explain why some carriers are unaffected. Our results indicate that screening and functional analysis combined with testing of transcript-level phenotypes may be an effective way to identify which rare mutations truly contribute to disease.