Introduction

Childhood apraxia of speech (CAS) is a rare, severe, and persistent speech sound disorder characterized by a deficit in planning/programming oral and laryngeal movements for speech.1 Speech motor profiles consistent with CAS appear to occur both as an idiopathic disorder limited to a core motor speech deficit and deficits in other speech processing domains, and in the context of complex neurodevelopmental disorders. In the latter context, CAS typically co-occurs with deficits in multiple domains, including intellectual disability, language impairment, nonverbal oral apraxia, dysarthria, and/or craniofacial and other dysmorphologies.2

The genetic origins of CAS are poorly understood. The most significant genomic finding to date is a mutation in the coding sequence affecting the forkhead box P2 (FOXP2) gene associated with CAS in approximately half of the widely cited multigenerational pedigree, the “KE” family.3,4,5 Reports of other affected individuals with sporadic and inherited translocations and disruptions affecting the FOXP2 locus have confirmed its role in speech and language impairment.6,7,8,9,10 FOXP2 is also the continuing focus of a large number of studies and discussions on the evolutionary biology of speech–language in humans.11,12 Recent findings have reported a speech disorder consistent with CAS in several complex neurodevelopmental disorders, including galactosemia13 and rolandic epilepsy.14

Identification of genes or loci that confer risk for CAS has been hampered by the low prevalence of CAS, the complexity of the phenotype, and the lack of a diagnostically conclusive assessment protocol. Although the latter two constraints have prohibited point and period prevalence estimates of idiopathic CAS, there is clinical consensus that CAS likely meets the criterion prevalence rate for a rare disorder in the United States of ~1/1,500.15 The goal of our report was to identify candidate causal genes or regions of interest in 24 well-characterized participants with idiopathic CAS using custom array comparative genomic hybridization analysis (aCGH). aCGH has not been used to date in studies of the genomic origins of pediatric motor speech disorders. Although not as comprehensive as whole-genome or whole-exome sequencing, aCGH is a well-established whole-genome-analysis method for initial study of nonsyndromic intellectual and developmental disabilities. As described, the increased coverage of regions associated with CAS provided the sensitivity to identify smaller potential copy-number variants (CNVs) within these regions (i.e., >1-kb as opposed to >100-kb gains or losses).

Materials and Methods

Participants

Participants were recruited and consented for a study of pediatric motor speech disorders approved by institutional review boards at the data collection and data analyses institutions. All participants carried the diagnosis of CAS or suspected CAS from referring clinicians. The participants were assessed by one of two examiners using the Madison Speech Assessment Protocol, a 2-h protocol developed for research in speech sound disorders across the lifespan, including CAS.16 The Madison Speech Assessment Protocol includes 15 measures that provide a range of speaking conditions for age–sex standardized scores that profile a speaker’s speech processing and speech production competence, precision, and stability. Digital recordings of responses to the Madison Speech Assessment Protocol speech tasks were processed using computer-aided methods for perceptual and acoustic analyses. Construct and concurrent validation studies have supported the diagnostic accuracy of four speech and prosodic signs to identify CAS across developmental periods (ref. 17 and unpublished data). All 24 participants were positive on at least three of the four signs of CAS recently validated as a behavioral marker of CAS. One of the four signs indexes transcoding (planning/programming) deficits in speech processing and the other three are acoustic-perceptual signs of deficits in phrasing, rate, and linguistic stress (unpublished data).

The Madison Speech Assessment Protocol also includes measures of intellectual function, receptive and expressive language, oral mechanism structure and function, oral-nonverbal motor function, and parental information on a participant’s developmental, educational, and behavioral histories.

Table 1 includes individual descriptive information for the 12 of the 24 participants with genetic findings plausibly associated with CAS (to be described) and summarized information for the remaining 12 participants with noninformative aCGH findings. Individual participant data for the informative group in Table 1 are aggregated by developmental group (preschool, early elementary, adolescence) without sex status information to maintain anonymity.

Table 1 Phenotype data for 12 participants with childhood apraxia of speech (CAS) and informative aCGH findings and summary data for 12 participants with CAS and noninformative aCGH findings

Beginning with the individual participant data in the informative group, participants had a 2:1 male:female ratio, consistent with sex ratios reported in the idiopathic CAS literature.1 Consistent with findings indicating that CAS is a persistent disorder even with treatment meeting the standard of care,1 individual participants in the informative group had been receiving speech services for CAS for as long as 10 years. The 70% familial aggregation rate, coded as positive if the proband had at least one other biological family member with any type of speech sound disorder and adjusted for missing data, is appreciably higher than the 56% familial aggregation rate estimated for children with speech delay, the most prevalent class of speech sound disorders. According to the parent informants, only 1 of the 24 probands in this subsample of a larger study group had another nuclear family member with a clinical diagnosis of CAS.

The profiles of cognitive, language, and motor impairment scores of the 12 participants were similar to those summarized in a technical report on CAS.1 Adjusted for missing data, 30% of the participants had an intellectual disability, 91% were delayed in the onset of speech–language, 64% had impairments in language comprehension, and 73% had impairments in language expression. Last, 73 and 80% of participants, respectively, had impairments in gross motor and oral–nonverbal movements. These behavioral profiles of participants with CAS are consistent with the perspective that their processing constraint in planning/programming the articulatory gestures for speech is appropriately viewed as the signature deficit in what is otherwise a multiple-domain disorder.17

Crucially for the goals of our study, the summarized data were similar for participants with and without CNVs detected by array testing. As shown in Table 1, the summarized data for the two groups indicate similar average age and years of treatment and approximately similar (within 15 percentage points) percentages of participants with positive findings on the seven behavioral measures. Additional analyses indicated that the two groups had approximately similar phenotypes indexing severity of CAS.

FOXP2 sequencing

All 24 participants evaluated by array were also evaluated for FOXP2 mutation status by sequencing each of the seventeen FOXP2 coding exons (NCBI reference sequence: NM_014491.3). The exons were PCR amplified (AmpliTaq Gold PCR Master mix; Applied Biosystems, Carlsbad, CA) using the oligos listed in Supplementary Table S1 (online). PCR amplification and amplicon size were verified by gel electrophoresis. Sequences of each PCR amplicon were generated in both forward- and reverse-direction sequencing reactions with Big Dye Terminator v 3.1 (Applied Biosystems), purified with AxyPrep Mag DyeClean beads (Axygen Biosciences, Union City, CA), and run on either an ABI 3730xl or 3130xl instrument. Exon 15 of participant 5 was sequenced in triplicate for confirmation of a heterozygous change to be described. DNASTAR SeqMan Pro v 9.1.1, PolyPhen-2 (version 2.2.2), and the UCSC genome browser NCBI36/hg18 (tracks: dbSNP build 135, HGDP Allele Freq [Human Genome Diversity Project], HapMap, DGV [Database of Genomic Variants], and Genome Variants [variant base calls from nine genomes]) were used for data analysis and interpretation of variants.

aCGH analyses

Genomic DNA was purified using the Qiagen PureGene DNA extraction kit reagents (Qiagen, Valencia, CA). Subject DNA was labeled and cohybridized with sex-mismatched labeled control genomic DNA (Promega). Copy-number analysis was performed using a customized 385K Nimblegen array (Roche Nimblegen) with increased coverage of genes and regions previously associated with CAS. Table 2 includes information on these areas, including 21 genes or regions of interest associated with CAS or language phenotypes associated with multiple-domain involvement (i.e., cognitive, language, motor).

Table 2 Genes and genomic regions of interest with additional probe coverage in the array comparative genomic hybridization analyses. Genomic coordinates refer to NCBI36/hg18

Laboratory methods were performed according to manufacturer specifications. Data analysis was performed using CytoSure Interpret Software Version 3.4.3 (Oxford Gene Technology, Begbroke, Oxfordshire, UK). Regions of benign CNV, as reported by the Database of Genomic Variants, the ISCA database (The International Standards for Cytogenomic Arrays Consortium), and the CHOP CNV database (Children’s Hospital of Philadelphia), were excluded from the final results. No recurrent CNVs were introduced into the analysis by the custom design, as evaluated across specimens with sex-mismatched normal controls. The hg18 human genome build, NCBI build 36.1, was used in the analysis and mapping. Deletions and duplications were required to contain at least five contiguous probes. The log threshold factor for gains was set to 0.3 and that for losses was set to −0.6. Array validation studies for several specimens indicated that the incidence of CNVs for targeted regions did not differ from the incidence in the clinical whole-genome arrays.
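
The calling criteria above (five contiguous probes, separate log-ratio thresholds for gains and losses) can be illustrated with a minimal sketch. This is not the algorithm used by CytoSure Interpret, which is not described here; it is a simplified per-probe thresholding example. It assumes that the loss threshold is applied as a negative log ratio and that thresholds are inclusive, and all function and variable names are hypothetical.

```python
from typing import List, Tuple

GAIN_THRESHOLD = 0.3    # log ratio at or above which a probe supports a gain (from the text)
LOSS_THRESHOLD = -0.6   # assumption: the loss threshold is applied as a negative log ratio
MIN_PROBES = 5          # minimum number of contiguous probes per call (from the text)


def classify(log_ratio: float) -> str:
    """Label one probe as 'gain', 'loss', or 'neutral' (inclusive thresholds assumed)."""
    if log_ratio >= GAIN_THRESHOLD:
        return "gain"
    if log_ratio <= LOSS_THRESHOLD:
        return "loss"
    return "neutral"


def call_cnvs(log_ratios: List[float]) -> List[Tuple[str, int, int]]:
    """Return (state, first_index, last_index) for each run of at least
    MIN_PROBES consecutive probes that share a non-neutral state."""
    calls = []
    run_state, run_start = "neutral", 0
    # A trailing sentinel closes any run that reaches the last probe.
    states = [classify(r) for r in log_ratios] + ["end"]
    for i, state in enumerate(states):
        if state != run_state:
            if run_state in ("gain", "loss") and i - run_start >= MIN_PROBES:
                calls.append((run_state, run_start, i - 1))
            run_state, run_start = state, i
    return calls


# Illustrative log ratios only; these are not data from the study.
example = [0.0, -0.7, -0.8, -0.9, -0.7, -0.65, 0.1, 0.4, 0.5, 0.35, 0.0]
calls = call_cnvs(example)
```

In this made-up example, the five-probe run of losses is called, whereas the three-probe run of gains is discarded because it falls short of the five-probe minimum.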

Results

FOXP2 sequencing

One participant (Table 1, participant 5) was found to have a heterozygous mutation, c.1789A>C, in exon 15. This base substitution causes a missense change, N597H, in the C-terminal region of the protein, just outside the forkhead domain (Supplementary Figure S1, online). No common SNPs have been identified in the region, and this mutation has not been reported previously in the literature. The PolyPhen-2 prediction/confidence scores were 0.995 and 0.795 for HumDiv and HumVar, respectively, suggesting that this variant is likely to be pathogenic.18
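
As a consistency check on the nomenclature, the coding-sequence position and the amino acid change can be related by simple arithmetic. The sketch below does not consult the NM_014491.3 reference sequence; it uses only the 1-based position given in the text and the standard genetic code, and the function name is hypothetical.

```python
from typing import Tuple


def codon_position(cdna_pos: int) -> Tuple[int, int]:
    """Map a 1-based coding-sequence position to
    (codon number, position within the codon), both 1-based."""
    codon = (cdna_pos - 1) // 3 + 1
    offset = (cdna_pos - 1) % 3 + 1
    return codon, offset


# Asparagine (N) and histidine (H) codons in the standard genetic code.
ASN_CODONS = {"AAT", "AAC"}
HIS_CODONS = {"CAT", "CAC"}

codon, offset = codon_position(1789)

# Replacing the first base (A) of either Asn codon with C gives a His codon.
mutated = {"C" + c[1:] for c in ASN_CODONS}
```

Position 1789 is the first base of codon 597, and an A>C change at the first base of an asparagine codon yields a histidine codon, which agrees with the N597H designation.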

aCGH analyses

Table 3 is a summary of findings for the 12 participants in Table 1 with CNVs. The 16 row-wise entries in Table 3 include CNVs with plausible neural consequences for cognitive, speech, language, and motor processes in development and performance. The CNVs occurred on 10 chromosomes, three of which included two to four candidate regions: chromosome 2 (four regions), chromosome 8 (two regions), and chromosome 16 (three regions). Chromosomes 4, 6, 7, 9, 13, 14, and 17 each included one region. The 16 CNVs ranged in size from ~36 kb to 1.8 Mb. As shown in the right-most columns in Table 3, three participants had more than one CNV: participant 2 (3 regions), participant 6 (2 regions), and participant 12 (2 regions). Two of the participants in Table 1 had CNVs that included deletions of the same gene or genes. Additional information and comment on these and other findings in Table 3 are provided in the following section.
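
The counts in this paragraph can be cross-checked with a short tally. The sketch below uses only the numbers stated in the text. It assumes that each participant not listed as having multiple CNVs has exactly one, which follows from all 12 participants in Table 3 having at least one CNV. Variable names are hypothetical.

```python
# Candidate regions per chromosome, as enumerated in the text.
regions_per_chromosome = {2: 4, 4: 1, 6: 1, 7: 1, 8: 2, 9: 1, 13: 1, 14: 1, 16: 3, 17: 1}

n_chromosomes = len(regions_per_chromosome)
total_by_chromosome = sum(regions_per_chromosome.values())
multi_region_chromosomes = sorted(c for c, n in regions_per_chromosome.items() if n >= 2)

# Participants reported with more than one CNV (participant number -> CNV count).
multi_cnv_participants = {2: 3, 6: 2, 12: 2}
n_participants = 12

# Assumption: every other participant in Table 3 has exactly one CNV.
total_by_participant = sum(multi_cnv_participants.values()) + (
    n_participants - len(multi_cnv_participants)
)
```

Both tallies give 16 CNVs across 10 chromosomes, matching the 16 row-wise entries described for Table 3.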

Table 3 Array comparative genomic hybridization findings for 12 participants with childhood apraxia of speech

Discussion

Chromosome 2

As shown in Table 3, four deletions were identified on chromosome 2, at 2p14, 2q24.1, 2q31.1, and 2q31.2.

2p14

A deletion of ~67 kb at 2p14 was detected as the single CNV in participant 7. This deletion eliminates a portion of SPRED2, a regulator of differentiation via the MAP kinase cascade.19,20 Two deletions have been reported in this region in the ISCA and Decipher databases, but both are much larger (2.2 and 4 Mb) than the deletion identified in participant 7. However, based on the function of the affected gene in differentiation of neuronal cells,19 the 67 kb deletion in this participant plausibly affects speech processing.

2q24.1

A deletion of ~667 kb at 2q24.1 (participant 5) involves the UPP2, CCDC148, PKP4, and AK126351 (uncharacterized) genes. This participant was also found to have a heterozygous, likely pathogenic, FOXP2 mutation. The ISCA and Decipher databases report six deletions in this region, all greater than 2.3 Mb. Of particular interest is the phenotypic description of Decipher individual 254867, who has a 2.3-Mb deletion. This individual is reported to have speech delay, microcephaly, and intellectual and developmental disabilities, as well as tall stature. This deletion overlaps the deletion found in the present participant by ~170 kb and involves the UPP2 and CCDC148 genes. CCDC148 is a putative transcriptional modulator. Additional phenotype evaluation will be necessary to determine the contributory significance of the CNV, if any, to the FOXP2-associated phenotype.
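
Overlap between a participant's CNV and a database entry, as in the ~170-kb overlap described above, comes down to an interval comparison. The sketch below shows the calculation with invented coordinates: the genomic positions are hypothetical and were chosen only to reproduce the approximate sizes given in the text (~667 kb, 2.3 Mb, ~170 kb). They are not the coordinates from Table 3 or from Decipher.

```python
def overlap_bp(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Number of base pairs shared by two half-open intervals [start, end)
    on the same chromosome; 0 if they do not overlap."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


# Hypothetical coordinates for illustration only (not from the study).
participant_del = (158_000_000, 158_667_000)  # ~667-kb deletion
database_del = (155_870_000, 158_170_000)     # 2.3-Mb deletion

shared = overlap_bp(*participant_del, *database_del)
```

With these illustrative coordinates, the shared segment is 170,000 bp. The same function could be used to screen calls against a list of benign CNV regions, given real coordinates on a common genome build.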

2q31.1

A 1.8 Mb deletion at 2q31.1, the largest deletion detected in this patient cohort, contains the DLX1 and DLX2 genes, which belong to a family of transcription factors involved in craniofacial patterning and forebrain development. RAPGEF4, also within this deletion, is involved in memory retrieval and spiny synapse remodeling.21,22 RAPGEF4 is reported as a putative target for activation by FOXP2,23 as well as an autism susceptibility gene.24 Other genes in this region include HAT1, MAP1D, ITGA6 (cell surface–mediated signaling), PDK1, AL157450 (hypothetical gene), CGEF2, ZAK, CDCA7, and MLK7-AS1. Deletions overlapping this region have been reported in the Decipher and ISCA databases, but all are much larger and associated with a more severe phenotype including intellectual and developmental disabilities as well as multiple congenital anomalies.

2q31.2

A 2q31.2 deletion in participant 3 includes ~182 kb, deleting several exons of the PDE11A gene. Mutations in this gene have been reported in association with the autosomal dominant disorder pigmented nodular adrenocortical disease. A deletion of 245 kb affecting PDE11A has been reported in the ISCA database; however, it was described as a finding of unknown significance. All other reported deletions within this region are substantially larger.

Chromosome 6

The duplication of ~715 kb identified at 6p12.1 in participant 11 overlaps with two reported duplications—a 15-Mb duplication reported as pathogenic and a 793-kb duplication reported as of uncertain significance (ISCA database; pathogenic CNVs, and uncertain CNVs 19 January 2011). The smaller duplication of uncertain significance reported a phenotype of developmental delay and hypotonia. Hypotonia was also reported in the case history information by parental report for 2 of the 12 participants, but not for participant 11. This region contains the DST, BEND6, ZNF451, BAG2, RAB23, and PRIM2 genes. DST encodes an adhesion junction plaque protein involved in anchoring neural intermediate filaments to the actin cytoskeleton. Homozygous loss of this gene produces a progressive neuropathy in mouse models.25 RAB23 is a negative regulator of Sonic hedgehog signaling. Expression of RAB23 is high in spinal cord, somites, limb buds, and cranial mesenchyme in the developing mouse embryo. Adult mice show high levels of expression in the brain, heart, and lung.26 RAB23 is also associated with recessive Carpenter syndrome characterized by distinctive skeletal anomalies and other congenital anomalies with no mention of speech disorder (OMIM no. 201000).27 It is unclear how duplication of these genes might be associated with the neurodevelopmental substrates of CAS.

Chromosome 7

Among the notable findings of this study is the identification of a 35-kb deletion within intron 13 of the CNTNAP2 gene. CNTNAP2 has become of wide-ranging interest in emerging studies reporting its association with a number of complex neurodevelopmental disorders including autism, language impairment, speech delay, dyslexia, and CAS.28 Of note, this gene, reported to be associated with a number of cortical functions, is regulated by the FOXP2 transcription factor.29,30 Although the detected deletion maps within the noncoding portion of the gene (intron 13), it could affect regulatory elements important for CNTNAP2 expression. A challenging finding is that this variant was identified in participant 6, who, as discussed subsequently, also has a deletion at 16p13.2 that includes ABAT, TMEM186, and PMM2.

The FOXP2 sequencing findings for participant 8 described previously will be addressed in a separate report that integrates speech findings for this participant with those reported for two other families with FOXP2 mutations and affected members of the KE family.7,9 As reported next, this participant also had a deletion on chromosome 8 detected by array.

Chromosome 8

Two CNVs, a 223-kb duplication at 8q11.23 and a 128-kb deletion at 8q21.13, were identified in participants 8 and 10. Neither region contains a validated gene, and all reported CNVs are substantially larger than those identified in these participants, leaving the significance of these findings uncertain. Additional phenotyping of participant 8 as well as in vitro and model organism studies will be necessary to define possible contributions to the CAS phenotype.

Chromosome 9

The deletion of ~70.6 kb at 9q32 identified in participant 9 contains a hypothetical protein, LOC169834, as well as the zinc-finger protein 37. The zinc-finger protein has been implicated in chondrocyte differentiation as extrapolated from screening of a human fetal cartilage-specific cDNA library.31 It is a proposed candidate gene for Nager syndrome (acrofacial dysostosis), which is characterized by multiple congenital anomalies including skeletal anomalies, conductive hearing loss, and speech delay (OMIM no. 154400).27

Chromosome 16

Two findings on chromosome 16 are among the most significant for ongoing genomic research in CAS. Participant 4 and participant 6 had overlapping deletions at 16p13.2 that include the ABAT, TMEM186, and PMM2 genes. A disruption in the ABAT gene is causally associated with autosomal recessive gamma-aminobutyric acid transaminase deficiency syndrome (OMIM no. 613163).27 TMEM186 is a transmembrane protein possibly associated with the mitochondria. PMM2 deficits are associated with congenital disorder of glycosylation type 1a, an autosomal recessive disorder. The disorders associated with disruptions in ABAT and PMM2 are characterized as severe; phenotypes have not been reported for carriers. It is possible that haploinsufficiency of these genes individually would not be sufficient for a CAS phenotype, but the contiguous deletion could have an additive effect possibly sufficient for CAS. Participant 6 also has a 35-kb deletion in the region that includes CNTNAP2. The deletion in CNTNAP2 may be the causative factor, with the 16p13.2 deletion being a rare variant. Haploinsufficiency of these genes individually may have an associated phenotype, but it has not been fully characterized in carriers. Additional functional and inheritance studies would be necessary to define the phenotypic effects of these deletions. This study did not include parental DNA to determine inheritance.

The second notable finding for chromosome 16 was for participant 2. Participant 2 has three CNVs, including a 13q13.3 duplication, a 14q23.3 deletion, and a 16p11.2 microdeletion. The 13q13.3 duplication includes RFXAP, SMAD9, ALG5, EXOSC8, and a partial deletion of FAM48. The 14q23.3 deletion contains no known genes. Of note, participant 2 also has a 16p11.2 microdeletion, a widely discussed, prevalent deletion recently characterized as the 16p11.2 microdeletion syndrome. As discussed elsewhere, the history and behavioral profiles of this participant and another with CAS are consistent with the emerging literature on the 16p11.2 microdeletion syndrome and extend the phenotype of this syndrome to include CAS.32

Multiple CNVs

The findings that three participants had more than one CNV, including the participant with the 16p11.2 microdeletion, are consistent with recent observations that carriers of microdeletions such as 16p11.2 show enrichment for “second-hit” CNVs.33 These authors suggest that a “two-hit hypothesis” might include the expectation of elevated, double-hit rates among pathogenic CNVs with clearly variable penetrance and expressivity. In this study, the hypothesis would predict that the three participants with multiple CNVs are at risk for expression of more heterogeneous CAS phenotypes due to the additional genomic modifiers. Such genotype–phenotype hypotheses are readily testable in CAS research with samples of sufficient size.

Conclusion

We hypothesized that, as has been found for a number of complex neurodevelopmental disorders, rare CNVs in genomic DNA may be associated with increased risk for CAS. Procedures to identify CAS included a standardized assessment protocol and well-developed perceptual and acoustic diagnostic classification methods. Consistent with the hypothesis, whole-genome high-resolution oligo array comparative genomic hybridization studies in 24 participants with CAS identified one participant with a 16p11.2 microdeletion, two participants with overlapping deletions affecting genes on 16p13.2, and 10 participants with potentially pathogenic copy-number changes reported in the genetics literature to be associated with neural function or more directly with speech–language disorders. CNTNAP2 was the only gene given additional probe coverage in the customized aCGH chip that had a potentially pathogenic variant in our sample. DLX1, PDE11A, RAPGEF4, and ZFP37, as well as other genes with currently unknown function, were identified as additional strong candidate genes for CAS. Several of our identified genes had gene family members identified in other studies of verbal trait disorders. These gene families include ALG, BAG, CCDC, CDC, EXOSC, MAP, PDE, RAB, TMEM, and ZFP.23,34 These families are ubiquitous and redundant in function, and systems studies are needed to delineate their individual and interactive contributions to CAS. Our design did not include the genomic information required for follow-up segregation analyses.

To summarize, our study findings underscore the genetic and biochemical complexity of pediatric motor speech disorders such as CAS, which historically were expected to be associated with monogenic causal pathways with high attributable risk. In contrast, the identification of new genes and regions of interest is consistent with a trend recognizing the likelihood of complex gene-to-gene interactions underlying CAS in neurogenetic and complex neurodevelopmental disorders.35 For clinical needs, such genetic heterogeneity reduces the likelihood of informative diagnostic yields from single-gene assays for CAS and favors high-resolution, whole-genome approaches such as array comparative genomic hybridization and comprehensive genome sequencing. The findings also underscore the complex challenges of confounding molecular mechanisms in next-generation sequencing of complex neurodevelopmental disorders.36

Disclosure

The authors declare no conflict of interest.