An intronic mutation in Chd7 creates a cryptic splice site, causing aberrant splicing in a mouse model of CHARGE syndrome

Alternative splicing is a critical regulator of gene expression in eukaryotes; however, genetic mutations can cause erroneous splicing and disease. Most recorded splicing disorders are caused by mutations of splice donor/acceptor sites, but intronic mutations can also affect splicing. Clinical exome analyses largely ignore intronic sequence, limiting the detection of mutations to coding regions. We describe ‘Trooper’, a novel mouse model of CHARGE syndrome harbouring a pathogenic point mutation in Chd7. The mutation is 18 nucleotides upstream of exon 10 and creates a cryptic acceptor site, causing exon skipping and partial intron retention. This mutation, though detectable in exome sequence, was initially dismissed by computational filtering due to its intronic location. The Trooper strain exhibited many of the previously described CHARGE-like anomalies of CHD7-deficient mouse lines, including hearing impairment, vestibular hypoplasia and growth retardation. However, more common features such as facial asymmetry and circling were rarely observed. Recognition of these characteristic features prompted manual re-examination of the Chd7 sequence and subsequent validation of the intronic mutation, highlighting the importance of phenotyping alongside exome analyses. The Trooper mouse serves as a valuable model of atypical CHARGE syndrome and reveals a molecular mechanism that may underpin milder clinical presentation of the syndrome.

murine model of CHARGE syndrome carrying an ethylnitrosourea (ENU) mutagenesis-induced point mutation in Chd7. Interestingly, the pathogenic non-canonical splice mutation occurs 16 nucleotides upstream of the 3′ AG splice site, yet both intron retention and exon skipping occur. The Trooper mouse presents with a mild CHARGE-like phenotype, exhibiting a range of anomalies including growth retardation, hearing impairment and vestibular hypoplasia. Thus, the Trooper mouse is a valuable model of atypical human CHARGE syndrome and the molecular mechanisms that underpin this condition.

Mice. Colonies of mice were maintained at the WEHI and the MCRI. Mice housed at MCRI were group-housed in individually ventilated micro-isolator cages (Tecniplast, Buguggiate, VA, Italy) on a 14 hour light/10 hour dark cycle. Mice housed at WEHI were group-housed in individually ventilated micro-isolator cages (Airlaw, Smithfield, NSW, Australia) on a 12 hour light/12 hour dark cycle. Animals were provided with standard Barastoc mouse chow (Ridley AgriProducts, Melbourne, VIC, Australia) and sterilized water ad libitum. Environmental enrichment was provided in the form of cardboard toys and sunflower seeds.

Mutagenesis Screen.
Male BALB/c mice were injected intraperitoneally with 85 mg/kg ethylnitrosourea (ENU, Sigma-Aldrich, Castle Hill, NSW, Australia) weekly for 3 weeks as previously described 7 . After a rest period of 12 weeks to recover fertility, treated males were backcrossed with untreated BALB/c females. The resulting progeny (G1) were screened using an acoustic startle response (ASR) test at 8 weeks of age. Mice with an ASR below 200 mV in response to white noise bursts of 115 dB SPL were test-mated to determine heritability of the phenotype. The Trooper strain was serially backcrossed to BALB/c for 15 generations prior to phenotypic analysis.
Acoustic Startle Response. The SR-LAB plug and play ASR system (San Diego Instruments, San Diego, CA, USA) was used to measure the startle response. Testing was conducted in an illuminated environment during the light phase of the lighting cycle. A Perspex restraint chamber was used and mice were acclimatised to 70 dB SPL of background white noise for one minute. Clicks were presented in a pre-programmed pseudorandom order and separated by intervals of three to eight seconds. Each mouse underwent six trials of 70, 85, 90, 95 and 100 dB SPL and 16 trials of 115 dB SPL (40 ms white noise click). The largest and smallest recordings were deleted before the average startle amplitude was calculated. Prism v 6.0b software was used for data compilation (GraphPad Software Inc, La Jolla, CA, USA).
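The averaging step described above (discard the largest and smallest recording, then average the rest) can be sketched as follows. This is a minimal illustration only: the function name and the example amplitudes are hypothetical, and it assumes the trimming is applied per mouse and per stimulus level, which the text does not state explicitly.

```python
def trimmed_mean_startle(amplitudes):
    """Average startle amplitude after dropping one largest and one smallest trial."""
    if len(amplitudes) < 3:
        raise ValueError("need at least three trials to drop both extremes")
    ordered = sorted(amplitudes)
    kept = ordered[1:-1]  # remove exactly one minimum and one maximum
    return sum(kept) / len(kept)


# Hypothetical amplitudes (mV) for six trials at a single stimulus level.
example_trials = [120.0, 95.0, 310.0, 140.0, 150.0, 135.0]
example_mean = trimmed_mean_startle(example_trials)
```

With these made-up values, the 95 mV and 310 mV trials are discarded and the remaining four are averaged.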

Auditory Brainstem Response. An evoked potentials workstation (Tucker Davis Technologies, Alachua, FL, USA) was used to assess the auditory brainstem response (ABR) as previously described 9 . Mice were anesthetized by intraperitoneal injection of 100 mg/kg ketamine and 20 mg/kg xylazine and kept on a heat pad, with eyes protected using REFRESH night time (Allergan, NSW, Australia). A free-field magnetic speaker (model FF1, Tucker Davis Technologies) was placed 10 cm from the left pinna. Computer-generated clicks (100 µs duration, with a spectrum of 0-50 kHz) and 3 ms pure tone stimuli of 4, 8, 16 and 32 kHz were presented with maximum intensities of 100 dB SPL. Subdermal needle electrodes (S06666-0, Rochester Electro-Medical, Inc., Lutz, FL, USA) were positioned at the apex of the skull (+ve), the left cheek (−ve) and the left hind leg (ground). ABR traces were calculated and analysed using the average of 512 stimulus repeats in BioSig software (Tucker Davis Technologies). The threshold was determined by visual analysis as the lowest intensity stimulus that reproducibly evoked an ABR.

Mutation Identification
Massively Parallel Sequencing. Exome sequencing was completed by the Australian Genome Research Facility (AGRF) using the 100803_MM9_exome_rebal_2_EZ_HX1 exome capture array (Roche Nimblegen, Madison, WI, USA), TruSeq Sample Preparation Kit (Illumina, San Diego, CA, USA) and HiSeq 2000 Sequencing System (Illumina) using DNA isolated from two N7 Chd7 +/Trooper liver samples. The Bioinformatics Team at the Australian Phenomics Facility then utilised a custom analysis pipeline to align sequence reads to the reference genome (C57BL/6 NCBI m37). Raw single nucleotide variant (SNV) calls were filtered and a list of candidate SNVs created as described 10 .

SPL) were intercrossed, generating 336 F1N1 progeny. Progeny were ABR-tested at 8 weeks of age and euthanized for the collection of tissues. Genomic DNA was isolated from liver sections as previously described 8 . Twenty hearing-impaired F1N1 progeny (click ABR ≥45 dB SPL) were genotyped for 660 SNPs spaced at 5-10 Mb intervals throughout the genome using the iPLEX Gold method 11 .

Transcriptome analysis. Mice were euthanized by cervical dislocation. Brain was immediately collected, bisected and placed into RNAlater (Life Technologies) on wet ice. Following a 12 hour incubation at 4 °C, tissue was disrupted using a mortar and pestle. RNA was extracted using a Qiagen RNeasy Mini Kit and cDNA was then synthesized using the SensiFAST ™ cDNA Synthesis Kit (Bioline). cDNA was amplified by PCR using primers CTCTCAGAGATTGAGGATGACCT and TTTTTGCACCTGCTCTTCCG and Pfx polymerase (Invitrogen) according to the manufacturer's recommendations. Purified PCR products were blunt ligated into pBluescript II phagemids digested with EcoRV, and E. coli NEB10-Beta cells were transformed with the ligation reactions. Colonies containing inserts were identified by blue-white screening, and sequenced using the alternative reverse primer (ACAACCACGTTCAACTCCGT).

X-ray Micro-Computed Tomography (µCT). Mice were euthanized by cervical dislocation. Cochleae were dissected from the temporal bones and stored in 4% neutral buffered formalin. µCT measurements were carried out using an Xradia© micro XCT200 (Carl Zeiss X-ray Microscopy, Inc.), which uses a microfocus X-ray source with a rotating sample holder and an imaging detector system. The source is a closed X-ray tube (tube voltage 40 kV, peak power 10 W). One data acquisition set consisted of 361 equiangular projections over 180 degrees, providing a complete tomographic reconstruction. The exposure time was 8 seconds for each projection. The tomographic scan involved rotating the sample whilst recording transmission images on the CCD. Each projection image was corrected for the non-uniform illumination in the imaging system, determined by taking a reference image of the beam without sample. A filtered back-projection algorithm was used to obtain the 3D reconstructed image. The final three-dimensional reconstructed image size was 512 × 512 × 512 voxels, with a voxel size of 7.6 µm along each side and a field of view (FOV) of (3.9 mm)³. Avizo-6.2 software (Mercury Computer Systems Inc., France) was used for image segmentation. Imaging, 3D modelling and evaluation were performed blinded to genotype.

Statistical Analysis. Statistical analysis was completed in Prism 6 software (GraphPad Software Inc.). Testing included two-way ANOVAs with post hoc t-tests and Wilcoxon-Mann-Whitney tests, and was corrected for multiple comparisons using the Holm-Šídák method.
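For readers unfamiliar with the Holm-Šídák correction named above, the step-down adjustment can be sketched in a few lines. This is a generic textbook formulation written for illustration; it is not taken from Prism and is not claimed to reproduce its output exactly. The function name and the example p-values are hypothetical.

```python
def holm_sidak(p_values):
    """Return Holm-Sidak step-down adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Sidak adjustment over the (m - rank) hypotheses still under test.
        adj = 1.0 - (1.0 - p_values[idx]) ** (m - rank)
        # Enforce monotonicity: an adjusted p-value never falls below an earlier one.
        running_max = max(running_max, adj)
        adjusted[idx] = min(running_max, 1.0)
    return adjusted


# Hypothetical raw p-values from three comparisons.
example_adjusted = holm_sidak([0.01, 0.04, 0.03])
```

A comparison is then declared significant when its adjusted p-value is below the chosen alpha level.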

Results
The Trooper strain emerged in an ENU mutagenesis screen that has produced numerous models of deafness 9,12,13 . G1 founder mice were identified using the ASR test and subsequently bred to validate heritability. The Trooper phenotype proved heritable in an autosomal dominant fashion.
The Trooper Phenotype is caused by a mutation in Chd7. The chromosomal location of the Chd7 mutation was first identified using meiotic mapping. A cohort of F1N1 mice was grouped as hearing impaired (threshold ≥45 dB SPL) or normal (hearing threshold ≤35 dB SPL) utilising the ABR click test. Twenty hearing impaired F1N1 mice were genotyped for 660 SNPs spaced every 5-10 Mbp throughout the genome. A shared common haplotype was observed proximally on the long arm of chromosome 4, between the centromere and rs13477553 (Fig. 1A). This localized the Trooper mutation to a 9 Mbp region.
Liver derived genomic DNA of two BALB/c-Chd7 +/Trooper N7 mice was subjected to exome enrichment and massively parallel DNA sequencing. Raw single nucleotide variant calls were filtered through a custom analysis pipeline to create a list of candidate SNVs for each mouse. However, no candidate SNVs were returned within the minimal linkage region on chromosome 4. Given that Chd7 fell within the linkage interval and that Trooper mice exhibited similar phenotypic attributes to the previously described Chd7 Looper mouse 9 , the exome sequence of Chd7 was re-inspected. Visual examination of sequence alignment data showed that a point mutation within intron 9 of Chd7 (c.3219-18 T > A, NCBI reference sequence NM_001277149.1) had been detected, but had subsequently been filtered out by the computational pipeline. Sanger sequencing of genomic DNA from hearing impaired and normal littermates validated the SNV (Fig. 1B).
The Trooper mutation creates a cryptic splice site. To investigate transcriptome changes, PCR amplification of Chd7 +/Trooper complementary DNA was performed. Gel electrophoresis indicated that three transcripts were present. For validation, molecular cloning of the transcripts was performed with blue/white screening. PCR sequencing of 44 colonies revealed three different transcripts (Fig. 2A): 14 had skipped exon 10, 6 retained part of intron 9 and 24 were wild type (Fig. 2B and C).

phenotypes associated with CHARGE syndrome, the middle and inner ears were imaged using µCT. This was done on Trooper mice with and without circling behavior. The Chd7 +/Trooper stapes was malformed in a small rounded shape and an unusual bone growth extended between the tubercle and otic capsule. In the non-circling mouse, µCT imaging showed that the stapes footplate was deformed and did not contact the oval window (Fig. 3B). However, a more severely affected Chd7 +/Trooper mouse (with circling behavior) had the stapedial footplate fused to the oval window (Fig. 3C). µCT also showed partial development of the lateral semicircular canal and varied levels of posterior and anterior canal hypoplasia. The intersection of the posterior and superior canals was enlarged and misshapen.
In CHARGE affected individuals, hypoplasia of the semicircular canals can cause vestibular areflexia, resulting in poor motor and speech development 14 . In mice, this impairment manifests in head bobbing, tilting and circling behaviors 15 . These traits were observed in Trooper mice indicating that their vestibulo-ocular reflex is impaired. However, a full vestibular evaluation was not performed.
Trooper mice are hearing impaired. ABR thresholds of Chd7 +/Trooper mice were significantly elevated between 4 and 16 kHz when compared with wild type controls (Fig. 4A). Hearing impairment was greatest at lower frequencies, with an average threshold shift of 35-40 dB SPL at 4 and 8 kHz. An average threshold shift of 20 dB was observed at 16 kHz. At 32 kHz, Chd7 +/Trooper mice tended to have a slight hearing threshold elevation; however, this was not significant.

Discussion and Conclusions
The Trooper mutation emerged from an ENU mutagenesis screen designed to identify hearing impaired mice. Linkage mapping, exome sequencing and Sanger sequencing were used to pinpoint a mutation (c.3219-18T > A) in the Chromodomain Helicase DNA binding 7 (Chd7) gene. Mutations in this gene cause a pattern of defects collectively known as CHARGE syndrome in humans. CHARGE is a complex disease, affecting multiple organs with extreme heterogeneity between individuals. Likewise, variable phenotypic penetrance is common amongst the numerous murine CHARGE models. CHD7-deficient mouse lines are typically named after their striking circling behavior. Examples such as Cyclone, Dizzy 16 , Whirligig 17 and Looper 9 have marked vestibular deformity, severe eye anomalies, choanal defects, cleft palate and multiple minor CHARGE features. On the other hand, tail chasing is essentially absent from the Trooper strain. Choanal atresia and heart defects are unlikely given the survival rates of Chd7 +/Trooper mice post weaning, and common pathologies such as microphthalmia and blepharoconjunctivitis were rarely observed (Table 1). Additionally, Chd7 +/Trooper hearing loss was less severe than in other strains (Fig. 5) 9 . Overall, the Trooper phenotype is particularly mild (Table 1). Like Trooper, the atypical CHARGE phenotype in humans lacks major diagnostic features such as heart defects and choanal atresia. However, hearing loss, growth deficiency and inner ear malformations remain common 18 . The consistency of inner ear malformation across all forms of CHARGE syndrome highlights that structural patterning in the ear is particularly sensitive to CHD7 expression levels. Indeed, CHD7 knockdown inhibits neural crest cell migration into the pharyngeal arches, impacting derived structures such as the ossicle chain 19 . To date, functional analysis of CHD7 has proven difficult, due to the exceptional size of the gene (185 kbp, with 42 exons, GRCh38.p7, NC_000008.11, protein ~336 kDa).
As a result, there is no clear explanation as to why some patients present with a milder form of the disease, although it is thought that CHD7 truncating mutations cause a more severe phenotype than missense mutations 20 . CHD7 deficiency does affect cellular differentiation, proliferation and migration in a dose-specific manner, which may account for some of the milder CHARGE phenotypes 19 . Yet, a genotype-phenotype correlation does not exist. This is most clearly evident when multiple sibling pairs (including monozygotic twins) carrying identical mutations present with significant phenotypic differences 5 .
The Trooper mutation creates an AG marker within a cryptic splice site that interrupts canonical splicing of Chd7. In eukaryotes, dinucleotides at the 5′ (GT) and 3′ (AG) ends of intronic sequence are essential for specifying splice site positions. These markers define exon boundaries and lead to the recruitment of the spliceosome for intron removal. Multiple cryptic splice sites exist within precursor mRNAs; however, they are not used unless the authentic splice site is interrupted by mutation. In rarer cases, such as the Trooper mutation, intronic de novo 3′ splice sites are activated by the creation of an AG marker in the polypyrimidine tract 21 . Generally, activated cryptic splice sites are within 100 nucleotides and upstream of the authentic splice site 22 . The Trooper cryptic splice site is consistent with these observations and is preceded by a pyrimidine-rich sequence, which likely initiates spliceosome assembly. Thus, the Chd7 +/Trooper mutation activates a cryptic splice site, prematurely signaling the end of intron 9. As a result, two aberrant isoforms are produced. The isoform we termed Chd7 x retains 16 nucleotides from the 3′ end of intron 9. This insertion causes a frame shift, leading to a premature stop codon in exon 10. CHD7 x terminates before the helicase, BRK and SANT domains and is presumed to be non-functional. The predicted truncation may induce nonsense-mediated mRNA decay. However, the presence of Chd7 x cDNA in the blue/white screen shows that the product was not completely degraded prior to reverse transcription.
The second transcript, Chd7 y , skips Exon 10. Skipping of Exon 10 removes a large portion of the second of two chromo domains in CHD7 required for histone tail binding. However, remaining in frame, Chd7 y likely avoids mRNA degradation. It is certainly possible that Chd7 y retains some function and given the Trooper phenotype, a dominant negative effect is unlikely.
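The reading-frame reasoning for the two aberrant isoforms can be checked with simple arithmetic. The sketch below assumes, as an inference from the c.3219-18 T > A notation, that the new A at intronic offset -18 pairs with a G at offset -17 to form the cryptic AG, so that offsets -16 to -1 remain in the transcript. The function names are hypothetical.

```python
def cryptic_acceptor_retained_nt(a_offset):
    """Intronic nucleotides retained when a cryptic AG begins at a_offset.

    a_offset is the position of the A relative to the first exonic base
    (-1 is the last intronic base), so the canonical AG begins at -2.
    """
    if a_offset > -2:
        raise ValueError("the A of an acceptor AG must lie at offset -2 or further upstream")
    # The AG occupies a_offset and a_offset + 1; every base after it, up to -1, is retained.
    return -a_offset - 2


def preserves_reading_frame(length_change_nt):
    """True when an insertion or deletion of this many nucleotides keeps the frame."""
    return length_change_nt % 3 == 0


retained = cryptic_acceptor_retained_nt(-18)           # 16 nt, matching Chd7 x
frame_kept_by_chd7_x = preserves_reading_frame(retained)
```

Sixteen is not a multiple of three, consistent with the frame shift described for Chd7 x. By the same logic, the in-frame status reported for Chd7 y implies that the length of exon 10 is a multiple of three; that length is not given here, so it is left out of the sketch.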
The atypical CHARGE pathology observed in Chd7 +/Trooper mice indicates that one of the alternate isoforms is retaining a measure of functionality, or that the spliceosome is occasionally creating enough wild type CHD7 to reach a critical threshold. As the wild type Chd7 transcript was not significantly over-represented in the blue/white screen, we predict that CHD7 y is retaining partial functionality with a single chromodomain.
Chromodomains are highly conserved, structural components of large chromatin remodeling proteins that regulate gene activity and genome organization. In a protein specific manner, chromodomains present in tandem, individually or with related 'chromo shadow domains' 23 . A defining characteristic of Chromodomain Helicase DNA binding proteins is the presence of dual chromodomains. Whilst the mechanism of H3 histone binding in chromatin remodelers differs significantly between proteins containing single and dual chromodomains, the efficacy of H3 binding is comparable 24 . Therefore CHD7 y will likely retain histone binding capabilities despite deletion of the C-terminal chromodomain. A human CHD7 splice variant (CHD7s) provides further evidence that CHD7 y may be functional without tandem chromodomains. CHD7s lacks the C-terminal chromodomain as well as the helicase/ATPase, DNA-binding, and BRK domains. CHD7s acts cooperatively with CHD7 in the nucleus and antagonizes CHD7 in the nucleoplasm 25 . The ability of CHD7s to drive 45S rRNA gene transcription indicates that the CHD7 N-terminal chromodomain is functional in the absence of the CHD7 C-terminal chromodomain. Further analysis of the Trooper CHD7 isoforms will likely establish the functional importance of chromodomains within a tandem repeat.
There is mounting evidence of the significance of RNA splicing with regard to human disease, which suggests the practice of filtering out intronic variants (of unknown significance) is limiting our ability to diagnose patients. Currently, as many as 30% of CHARGE patients lack a genetic diagnosis. Some chromosomal deletions, one SEMA3E mutation, one RERE duplication and one KMT2D mutation have reportedly caused CHARGE-like attributes 26-28 . However, CHD7 mutations are the preponderant cause of clinically diagnosed CHARGE 27 . We postulate that the poor genetic diagnosis rate is due to methodology in diagnostic laboratories, which tend to limit CHD7 analysis to coding exons and dinucleotide splice sites 3,5 . Certainly, the Trooper mutation would be missed with current diagnostic methods and, as the 48 nt leading to exon 10 in Chd7 are conserved between species, the mutation could be pathogenic in humans (NCBI ref seq. NG_007009.1). The sequence conservation of this region also indicates functional relevance beyond the canonical splice site.
After projects derived from the Encyclopedia of DNA Elements (ENCODE) contentiously suggested that 80% of the human genome had biochemical functions, a wealth of knowledge has been gained showing that the evolutionarily dynamic intronic sequence is biologically relevant in humans 29-31 . In an age of rapidly evolving diagnosis via genetic testing, the Trooper mouse is a timely reminder that as the methodology around exome and genome sequence analysis develops, it would be wise to reconsider ways to screen for mutations in introns, particularly in sequence with homology to cryptic splice sites.
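One simple way to screen intronic variants along these lines is to flag any substitution that creates a new AG dinucleotide within a fixed window upstream of an exon. The sketch below is illustrative only and is not the pipeline used in this study. The function name is hypothetical, the 100 nt default window is borrowed from the distance quoted in the Discussion, and the 30 nt sequence is an invented pyrimidine-rich toy intron tail, not the real Chd7 intron 9 sequence.

```python
def creates_intronic_ag(intron_tail, offset, alt_base, window=100):
    """Return True if a substitution creates a new AG upstream of an exon.

    intron_tail: 3' end of the intron (5'->3'); its last base is offset -1.
    offset:      negative variant position relative to the first exonic base.
    alt_base:    the substituted nucleotide.
    window:      only variants within this many nt of the exon are considered.
    """
    seq = intron_tail.upper()
    index = len(seq) + offset
    if offset >= 0 or index < 0:
        raise ValueError("offset must be negative and lie inside intron_tail")
    if -offset > window:
        return False
    mutated = seq[:index] + alt_base.upper() + seq[index + 1:]
    # Only the two dinucleotides that overlap the variant can change.
    for start in (index - 1, index):
        if start < 0 or start + 2 > len(seq):
            continue
        if mutated[start:start + 2] == "AG" and seq[start:start + 2] != "AG":
            return True
    return False


# Invented toy sequence: T at offset -18, G at offset -17, canonical AG at the end.
TOY_TAIL = "CTTTCTTTCCTT" + "TG" + "TTTCCTTTTCTCTC" + "AG"
flag_t_to_a = creates_intronic_ag(TOY_TAIL, -18, "A")
flag_t_to_c = creates_intronic_ag(TOY_TAIL, -18, "C")
```

In the toy sequence, a T > A change at offset -18 is flagged because it forms AG with the following G, whereas T > C at the same position is not. A real screen would also need to weigh the strength of the upstream polypyrimidine tract and branch point, which this sketch ignores.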

Conclusion
Trooper mice carry an intronic ENU-induced point mutation that results in alternate splicing of Chd7. The strain has a mild phenotypic presentation resembling atypical CHARGE syndrome, including hearing impairment, hypoplasia of the semicircular canals and stapes malformation. One of the Chd7 alternate transcripts detected in Trooper is predicted to be non-functional whilst the other is likely translated, with the resultant protein at least partially functional. The combined expression of CHD7 and CHD7 y in Trooper, relative to the haploinsufficiency observed in other models such as Looper, likely explains the milder presentation of the phenotype.
Further study of the Trooper Chd7 isoforms will provide insight into the functional domains of CHD7 and clarify the dosage effects of CHD7 expression on development and function in a variety of organ systems. In addition, the identification of this mutable cryptic splice site within the intronic sequence of Chd7 should prompt closer scrutiny of the non-coding genomic sequence of CHARGE patients for whom coding mutations of CHD7 have not been identified.