Introduction

Ataxias, including hereditary and sporadic forms, are a group of neurological disorders that demonstrate extreme clinical and genetic heterogeneity. These disorders may present as a pure cerebellar form or as part of a more complex neurological syndrome.1 There are over 70 known forms of spinocerebellar ataxia (SCA) and hundreds of additional genetic disorders that include ataxia as part of the clinical presentation. Ataxia may present at any age ranging from infancy to adulthood, and can manifest as dominant, recessive, or X-linked conditions.1,2,3 Nucleotide repeat expansions are estimated to be the basis of ~50–60% of dominant hereditary ataxias.1 Outside of these repeat expansion disorders, the mutational spectrum for most genes associated with ataxia is predominantly single-nucleotide variations and small insertion/deletion events.

Determining the genetic etiology of ataxia in affected patients is important for clinical care, disease prognosis, and in some instances targeted therapy.1,4 Exome sequencing is widely used as a diagnostic tool for disorders with broad clinical spectrum and genetic heterogeneity.5,6 To date a number of exome sequencing studies have been performed in patients with ataxia, with positive yields ranging from approximately 40–60%.3,7,8,9 These studies were either focused on specific populations of ataxia patients, such as pediatric forms of ataxia7,8 or adult-onset/sporadic patients,9 or had specific inclusion criteria and prior testing requirements.3

We assessed the utility of a targeted exome approach (exome sequencing followed by targeted analysis) in a cohort of unselected patients with ataxia-related phenotypes, representative of the general patient population seen at neurology and genetics clinics throughout North America. We present our findings on the first 170 patients studied. Patients ranged in age from 2 to 88 years, with age of onset ranging from congenital to adult, and presented with a wide clinical spectrum, ranging from isolated ataxia to syndromic presentations. In addition, some patients presented with other overlapping movement disorders, such as spastic paraplegia.

For our analysis, we studied 441 ataxia-related genes either known to be associated with ataxia as the predominant feature, as part of the clinical spectrum, or associated with a movement disorder with clinical overlap. We identified pathogenic and suspected diagnostic variants in 88 of the 170 patients, providing a positive molecular diagnostic rate of over 50%. Six genes accounted for >40% of positive cases. We report on a targeted exome approach for the largest cohort of patients with ataxia-related phenotypes to date and demonstrate it to be a high-yield tool for the genetic diagnosis of this group of disorders.

Materials and methods

Patients and collection of clinical information

Samples from a consecutive unselected set of 170 patients with ataxia referred for targeted exome sequencing between April 2014 and September 2016 were included in this study. Patients were referred by 76 physicians from 49 different institutions and included patients with pure ataxia or complex neurologic or multisystem disorders associated with ataxia. The study cohort included 105 patients from the United States, 63 patients from Canada, 1 patient from Mexico, and 1 patient from Turkey. Overall 56% (96/170) of the patients were female and 44% (74/170) were male. Fifty-four percent were >30 years old at the time of referral (91/170), and 46% were ≤30 years old (79/170). Patient age at time of referral ranged from 2 to 88 years. Information regarding previous testing was provided for 153 patients. Sixty-three percent (97/153) were reported to have previously had nucleotide repeat expansion testing while the remaining 37% (56/153) were reported not to have had any previous repeat expansion testing. Clinical information was collected using a standardized clinical checklist completed by the ordering physician that captured information such as age of onset of ataxia, type of ataxia, neurological features such as spasticity, seizures or movement abnormalities, cognitive/developmental status, brain anomalies, abnormalities of other systems, family history, and results of previous testing. Our laboratory did not systematically confirm clinical characteristics and prior laboratory investigations of patients reported by referring clinicians, but suggested exclusion of common trinucleotide repeat expansions causing hereditary cerebellar ataxia if suspected. All patients provided consent for exome-based sequencing.

Development of ataxia-related gene list

An ataxia-related gene list was generated by systematic review of databases such as PubMed, OMIM,10 Human Gene Mutation Database (HGMD),11 and the Human Phenotype Ontology (HPO).12 A list of genes curated from our searches was developed to include genes that had sufficient evidence associating them with ataxia-related conditions. This included 441 genes implicated in pure forms of cerebellar ataxia, genes associated with syndromes that have ataxia as part of the clinical presentation, and genes associated with spasticity and other movement abnormalities that may be misinterpreted as ataxia (Supplementary Table 1).

Exome sequencing and data analysis

Exome sequencing was performed using the Agilent SureSelect Clinical Research Exome kit (Agilent Technologies, Santa Clara, CA, USA) that targets the exome with improved capture of exons of medically important genes. Sequencing was performed using Illumina NextSeq technology with 150-bp paired-end reads (Illumina, San Diego, CA, USA). Variants within exons and canonical splice sites of the 441 genes were identified and evaluated using a validated, custom bioinformatic pipeline. Variants with a global population frequency of ≥1% in ExAC were excluded. Variants were interpreted by a team of board-certified geneticists, genetic counselors, and neurologists. The American College of Medical Genetics and Genomics (ACMG) guidelines for sequence variant interpretation were utilized to categorize variants.13 Variants with a frequency >0.1% in genes for dominant or X-linked conditions were classified as benign, while a frequency >0.5% in genes for recessive conditions was considered strong evidence for a benign classification. Data were assessed for quality and to confirm a minimum coverage of 30× for at least 90% of targeted regions. The mean depth of coverage per sample was over 150×, and on average more than 96% of the targeted regions were covered at a minimum of 30×. This analysis did not screen for repeat expansion disorders, which are a known cause of several forms of SCA. Variants considered likely related to the patient’s phenotype were confirmed by Sanger sequencing.
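The frequency thresholds and the coverage criterion described above can be sketched as follows. This is a minimal illustration and not the laboratory's pipeline: the `Variant` class, the inheritance labels ("AD", "XL", "AR"), the returned category strings, and the function names are all hypothetical, and the ACMG evidence beyond population frequency is not modeled.

```python
from dataclasses import dataclass

# Thresholds from the text, expressed as fractions rather than percentages.
EXCLUDE_AF = 0.01            # global frequency >= 1%: excluded from analysis
BENIGN_AF_DOMINANT = 0.001   # > 0.1% in dominant or X-linked genes: benign
BENIGN_AF_RECESSIVE = 0.005  # > 0.5% in recessive genes: strong benign evidence


@dataclass
class Variant:
    gene: str
    inheritance: str         # hypothetical labels: "AD", "XL", or "AR"
    allele_frequency: float  # global population frequency, 0.0 to 1.0


def frequency_triage(v: Variant) -> str:
    """Apply only the population-frequency rules stated in the text."""
    if v.allele_frequency >= EXCLUDE_AF:
        return "excluded"
    if v.inheritance in ("AD", "XL") and v.allele_frequency > BENIGN_AF_DOMINANT:
        return "benign"
    if v.inheritance == "AR" and v.allele_frequency > BENIGN_AF_RECESSIVE:
        return "strong_benign_evidence"
    return "retain_for_interpretation"


def passes_coverage_qc(depths, min_depth=30, min_fraction=0.90):
    """Check that at least min_fraction of targeted positions reach min_depth."""
    if not depths:
        return False
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths) >= min_fraction


# A rare missense variant (0.001% frequency) in a recessive gene stays in review.
example = frequency_triage(Variant("SYNE1", "AR", 0.00001))
```

Frequency alone never marks a variant as pathogenic in this sketch; it only removes or down-weights common variants before interpretation.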

RNA splicing analysis

RNA was extracted from blood using the PAXgene Blood RNA Kit (BD Biosciences, San Jose, CA, USA) according to the manufacturer’s instructions. Reverse transcription polymerase chain reaction (RT-PCR) was performed using OneStep RT-PCR Kit (Qiagen, MD, USA) according to the manufacturer’s instructions using specific primers targeting the potential splice sites. The c.16024-13C>G variant in intron 83 of the SYNE1 gene was targeted using primers in exon 82 (CATGCAGGAGAAAGTGAAGA) and exon 85 (TGGTCTGCTGGTGAAGTTCA). The c.1529C>T and c.2181+5G>A variants in exon 11 and intron 16 of the SPG7 gene, respectively, were targeted using a combination of primers in exon 10 (TTCATTGATCTCCCCACGCT), exon 15 (ACTCCATGGTGAAGCAGTTTG), exon 17 (CCCAAGTCCTGTTTCTCCCT), and 3’UTR (CAAACCTCAGCTGAAAAGCAA). Amplification products were sequenced using ABI’s dye-terminator chemistry (Applied Biosystems, Foster City, CA, USA).

Results

Clearly pathogenic and suspected diagnostic variants were identified in 88 of the 170 patients, leading to a positive molecular diagnosis rate of 52%. Table 1 lists the genetic variants identified and the clinical features of the patients. When analyzed by gender, 61% of the patients with pathogenic or suspected diagnostic variants were female (54/88) and 39% were male (34/88). When analyzed by age group, pathogenic and suspected diagnostic variants were observed in 57% of patients ≤30 years old (45/79), and 47% of patients >30 years old (43/91). Pathogenic or suspected diagnostic variants were identified in 54% of those reported to have previously tested negative for at least one ataxia-associated repeat expansion disorder (52/97), and in 34% of those who were reported to have had no previous repeat expansion testing performed (19/56). Of the 19 patients with pathogenic or suspected diagnostic variants in whom no prior repeat expansion testing was performed, all but 1 were in the younger age group, with the majority developing ataxia before 10 years of age. Repeat expansions are therefore not expected in these 19 patients. Of the 88 patients in whom a positive finding was made, 33% were reported as having a positive family history.
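The subgroup yields above can be recomputed from the stated counts, which makes the denominator of each percentage explicit. The sketch below is illustrative only. The per-sex yields use the cohort totals from Materials and methods (96 females, 74 males) and are derived arithmetic rather than figures reported in the text; the dictionary keys and function name are hypothetical.

```python
# (positive, total) counts taken from the text. The per-sex totals (96, 74)
# come from the cohort description; those two yields are derived here and
# are not reported in the text.
SUBGROUPS = {
    "overall": (88, 170),
    "age <=30 years": (45, 79),
    "age >30 years": (43, 91),
    "prior expansion testing negative": (52, 97),
    "no prior expansion testing": (19, 56),
    "female (derived)": (54, 96),
    "male (derived)": (34, 74),
}


def yield_pct(positive: int, total: int) -> float:
    """Diagnostic yield as a percentage of the subgroup total."""
    return 100.0 * positive / total


yields = {name: yield_pct(p, t) for name, (p, t) in SUBGROUPS.items()}

for name, pct in yields.items():
    p, t = SUBGROUPS[name]
    print(f"{name}: {p}/{t} = {pct:.0f}%")
```

Note that 54/88 (61%) is the share of positive patients who were female, whereas 54/96 is the yield among female patients; the two answer different questions.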

Table 1 Clinical features and exome sequencing results of 88 patients identified with pathogenic/suspected diagnostic variants

Pathogenic or suspected diagnostic variants were identified in 46 genes (Table 1; Supplementary Table 2), 25 of which were associated with autosomal recessive inheritance, 19 with autosomal dominant inheritance, 1 with X-linked inheritance, and 1 with both autosomal dominant and recessive inheritance. Fifteen genes were implicated in more than one patient (Table 1; Supplementary Table 2), and accounted for 65% (57/88) of positive cases. The six most common genes identified with pathogenic variants were SPG7 (8), SYNE1 (8), ADCK3 (6), CACNA1A (6), ATP1A3 (4), and SPTBN2 (4). Together pathogenic variants in these six genes accounted for >40% (36/88) of the positive cases (Supplementary Table 2).

Pathogenic variants in the SPG7 gene accounted for 9% of pathogenic variant-positive patients (5% of patients overall). Five of these patients carried the common pathogenic p.Ala510Val variant,14 which was observed in the compound heterozygous/homozygous state in four patients and in the heterozygous state in one patient. One of the compound heterozygous patients (059) had an intronic variant, c.2181+5G>A, which was predicted to affect the canonical splice donor site of intron 16. Testing of the patient’s father revealed absence of both variants; paternity was confirmed by microsatellite analysis. Two unaffected siblings were found to carry only the p.Ala510Val variant; the patient’s mother was unavailable for testing (Fig. 1).

Fig. 1: Reverse transcription polymerase chain reaction (RT-PCR) analysis in patient 059.

RT-PCR analysis across exons 11–16 of the SPG7 gene demonstrated the presence of a single allele containing c.1529T in exon 11 in patient 059. The two siblings who are heterozygous for c.1529C>T (p.Ala510Val) and do not carry the c.2181+5G>A intron 16 variant demonstrated the presence of two alleles at position c.1529 (c.1529C and c.1529T). The normal control demonstrated c.1529C at this position. This result indicates that the c.1529C>T and c.2181+5G>A variants are in trans in patient 059, with the c.2181+5G>A allele not being amplified.

To determine the impact of the c.2181+5G>A variant on SPG7 splicing, and to further evaluate the phase of the two variants identified in SPG7, we performed RT-PCR analysis on RNA from the affected patient and two unaffected siblings who were carriers of the c.1529C>T (p.Ala510Val) variant. RNA sequence analysis showed only the presence of a single mutated c.1529T allele in patient 059 and two alleles, the wild-type c.1529C and mutated c.1529T, in the unaffected siblings (Fig. 1). The presence of only one RNA sequence product in the patient suggests that the second allele carrying the c.2181+5G>A intronic variant either could not be amplified due to aberrant splicing or was subject to nonsense-mediated decay (NMD). NMD appears unlikely as the c.2181+5G>A variant is located within the final exon–intron junction of the SPG7 gene. These results indicate that the p.Ala510Val and the c.2181+5G>A variants are present in trans in this patient and that the c.2181+5G>A variant occurred de novo.

Pathogenic variants in the SYNE1 gene were also identified in eight patients (9% of mutation-positive patients and 5% of patients overall). Five had homozygous or compound heterozygous pathogenic variants while in three a heterozygous truncating variant was identified (075, 077, and 079). Because SYNE1-related spinocerebellar ataxia is inherited in an autosomal recessive manner, we hypothesize that the latter three patients have a second pathogenic variant in SYNE1 that was not identified. These three patients had a heterozygous novel missense variant in the SYNE1 gene (p.Val1070Ala, p.Gln5194Arg, and p.Ala1708Ser respectively) in addition to the truncating variant. The p.Gln5194Arg variant was determined to be in cis with the truncating SYNE1 variant by parental testing and classified as likely benign while the significance of the p.Val1070Ala and p.Ala1708Ser variants remains unknown.

Patient 078 was initially identified as carrying a SYNE1 c.6898del pathogenic variant in the heterozygous state, and a missense variant, p.Arg4373Gln, present at a frequency of 0.001% in ExAC. Further analysis of noncoding regions revealed an additional intronic variant in this patient, c.16024-13C>G. The c.16024-13C>G variant was predicted to create a de novo splice acceptor site in intron 83. RT-PCR analysis of the patient’s RNA demonstrated an aberrant isoform with an insertion of 12 bp into exon 84 that was not present in the normal control. This isoform was predicted to introduce a premature stop codon one base pair after the sequence change, and the variant was therefore classified as pathogenic (Fig. 2). Follow-up maternal testing revealed the c.6898del and p.Arg4373Gln variants to be in cis, indicating that the missense variant is likely benign.

Fig. 2: Reverse transcription polymerase chain reaction (RT-PCR) analysis in patient 078.

RT-PCR analysis across exons 83 and 84 of SYNE1 demonstrated aberrant splicing with the c.16024-13C>G variant being used as a cryptic splice acceptor site. Both the normal and aberrant splice products were observed in patient 078, compared with only a normal splice product in the normal control sample. Aberrant splicing at the c.16024-13C>G site results in the creation of a truncating TGA stop codon immediately after the splice site.

All 170 patient samples were analyzed as singleton exomes rather than parent–proband trios. In 23% of the positive cases (20/88), targeted variant analysis was performed on one or both of the patient’s parents or other selected family members to determine segregation of selected variants. In several cases, testing of family members showed segregation of the pathogenic or suspected diagnostic variant with disease, providing supporting evidence for the pathogenic nature of the variant identified in the proband (Table 1). For example, a number of extended family members of patient 031 were tested for a variant in the ELOVL4 gene (c.512T>C), associated with autosomal dominant spinocerebellar ataxia, type 34 (SCA34). The variant was also present in an affected first cousin, two affected second cousins, and one affected third cousin, and absent in an asymptomatic 63-year-old first cousin and asymptomatic 73-year-old third cousin. The variant was present in a 51-year-old asymptomatic sibling; because age of onset of SCA34 ranges from the second to sixth decade of life,15 it is possible this individual may become symptomatic later in life.

Discussion

In our cohort of 170 patients with ataxia, pathogenic or suspected diagnostic variants were identified in 52% of cases, with variants identified in 46 genes overall. Genetic approaches such as familial segregation for several variants or RNA splicing analysis for intronic variants strengthened our findings. For the majority of these genes (31/46), pathogenic or suspected diagnostic variants were each identified in only a single patient, highlighting the significant genetic heterogeneity in ataxia-related disorders. Together, the high number of genes implicated in our cohort, the rare nature of many of the associated disorders, and the large number of novel pathogenic variants identified demonstrate the benefit of a targeted exome approach in patients with ataxia.

The diagnostic yield was higher in younger patients, although the difference was not statistically significant; pathogenic or suspected diagnostic variants were identified in 57% of patients who were ≤30 years old, compared with 47% of those >30 years old (χ2 = 1.60, p = 0.20). We observed a significantly higher proportion of positive cases in the subset of patients who were reported to have previously tested negative for at least one ataxia-associated repeat expansion disorder (54%) compared with those who were reported to have had no previous repeat expansion testing performed (34%) (χ2 = 5.53, p = 0.02). Our results highlight the utility of targeted exome sequencing in patients who have had previous negative expansion testing.
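The two comparisons above can be checked with a 2×2 Pearson chi-square test on the counts given in Results (45/79 vs. 43/91 for age; 52/97 vs. 19/56 for prior testing). The sketch below assumes an uncorrected Pearson test with one degree of freedom; the text does not state which test variant was used, so this is an assumption, and the function and variable names are hypothetical.

```python
import math


def pearson_chi2_2x2(a: int, b: int, c: int, d: int):
    """Uncorrected Pearson chi-square for a 2x2 table.

    a, b: positive and negative counts in group 1
    c, d: positive and negative counts in group 2
    Returns (chi2, p) with p from the chi-square distribution, 1 df.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 df, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Age: 45 of 79 positive (<=30 years) vs. 43 of 91 positive (>30 years).
chi2_age, p_age = pearson_chi2_2x2(45, 79 - 45, 43, 91 - 43)

# Prior testing: 52 of 97 positive (tested negative) vs. 19 of 56 (not tested).
chi2_prior, p_prior = pearson_chi2_2x2(52, 97 - 52, 19, 56 - 19)

print(f"age: chi2 = {chi2_age:.2f}, p = {p_age:.3f}")
print(f"prior testing: chi2 = {chi2_prior:.2f}, p = {p_prior:.3f}")
```

Under this assumption the statistics come out at 1.60 and 5.53, matching the reported values, with p-values close to those reported.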

This study represents the general patient population with ataxia-related disorders seen in neurology and genetics clinics. Previous studies of the utilization of exome sequencing in patients with ataxia have focused on specific selected cohorts, such as pediatric patients, or had cohorts selected from a single clinic or referral center with specific inclusion criteria or prior testing requirements.3,7,8,9 In contrast, our study cohort is an unselected set of 170 consecutive patients with ataxia referred to our laboratory nationally for targeted exome sequencing regardless of age or the presence or absence of additional clinical features. Our finding of pathogenic or suspected diagnostic variants in 52% of our cohort therefore represents the likely diagnostic yield of a targeted exome approach for unselected patients with ataxia.

A molecular diagnosis was assigned when a pathogenic or highly suspicious variant was identified in a gene that could explain the patient’s phenotype. Patients in whom only a single heterozygous pathogenic/highly suspicious variant was detected in a gene for a recessive condition were included in the positive cohort only if their phenotype fit the disease associated with the gene in question. For example, patient 036 with a heterozygous highly suspicious variant in INPP5E had a “molar tooth” sign on brain MRI, which is a typical finding in Joubert syndrome; patient 060 with a heterozygous pathogenic variant in SPG7 (the common p.Ala510Val pathogenic variant) had spastic ataxia, which is typically associated with SPG7; and patients 075, 077, and 079 with heterozygous truncating variants in SYNE1 had progressive cerebellar ataxia and cerebellar atrophy, typical of patients with SCAR8. In such patients we speculate that a second pathogenic variant is present but not detected.

Our strategy of analyzing a large set of ataxia-related genes in all referred patients provides a high positive molecular diagnostic yield, and is particularly useful for making a molecular diagnosis in patients who have an atypical presentation for a particular disorder. An example is patient 013, a 2-year-old girl who presented with progressive episodic cerebellar ataxia with onset at age 16 months. In this subject, we identified two previously reported pathogenic missense variants (p.Asp337Val and p.Pro428Leu) in the ARSA gene, associated with autosomal recessive metachromatic leukodystrophy (MLD)/arylsulfatase A deficiency.16 Follow-up arylsulfatase A enzyme activity testing revealed loss of enzyme activity, confirming the diagnosis. Ataxia is not a typical presenting finding of late-infantile MLD, although there are recent case reports of other patients presenting with ataxia.17,18 Due to the atypical presentation, MLD was not originally included in the differential diagnosis for this patient. Based on the confirmed diagnosis of MLD, the patient underwent hematopoietic stem cell transplantation and at age 3 years was reported to have mild graft versus host disease and minimal neurological abnormalities.

The results from our cohort broaden the clinical spectrum of several gene-specific ataxia conditions. Six patients were identified with variants in the CACNA1A gene; two patients had frameshift variants, while the remainder had missense variants. CACNA1A is associated with a range of phenotypes including episodic ataxia-2 (EA2), SCA6, familial hemiplegic migraine, and early infantile epileptic encephalopathy; SCA6 is typically associated with a trinucleotide expansion in CACNA1A,19 while the other disorders are associated with single-nucleotide variants in this gene. Based on the reported clinical findings in each patient, including no reported hemiplegic migraine or epileptic encephalopathy, EA2 is the likely diagnosis for all six patients identified. Five of the six CACNA1A-positive patients were reported to have either developmental delays or cognitive impairment. Developmental delays and intellectual disability have previously been described in a minority of cases with CACNA1A variants;20,21 however, in our cohort developmental delays or cognitive impairment were observed in the majority of cases. Our results suggest that developmental delay and cognitive impairment may be more common for patients with EA2 than previously recognized.

Autosomal recessive spastic ataxia of Charlevoix–Saguenay (ARSACS) is associated with biallelic pathogenic variants in the SACS gene. ARSACS typically presents with cerebellar ataxia around age 12–24 months, lower limb spasticity, and peripheral neuropathy, although atypical presentations and later ages of onset have been reported.22 The majority of pathogenic variants described in SACS are nonsense or frameshift variants, which are predicted to be protein truncating.23 We identified biallelic variants in the SACS gene in 3 patients (1.7%). Two patients (054 and 055) had biallelic truncating variants and appear to have a typical ARSACS clinical presentation with childhood onset of symptoms. The third patient (056) had two missense variants in SACS, likely in trans based on the presence of only one of the variants in the patient’s father (Table 1). This patient exhibited an atypical disease presentation with spastic paraplegia and lower limb spasticity, with onset of symptoms in adulthood. No clear genotype–phenotype correlations have been determined for this gene to date; however, at least two other cases of late-onset disease associated with biallelic missense variants in SACS have been reported.24,25

Defects in SYNE1 are associated with autosomal recessive spinocerebellar ataxia (SCAR8) and autosomal dominant Emery–Dreifuss muscular dystrophy. To date, almost all described variants in SYNE1 that result in the recessive ataxia phenotype have been protein truncating, while almost all variants reported to be associated with dominant Emery–Dreifuss muscular dystrophy have been missense. In this study, 8 of the 88 positive patients (9%) identified had at least one truncating variant in SYNE1. Overall in this study cohort, missense variants in SYNE1 were frequently identified, with ~20% of ataxia patients sequenced in our laboratory carrying at least one low frequency (<1%) missense variant. Three patients (075, 078, 079) originally had only one truncating variant in SYNE1 detected, and all three also had rare missense variants in SYNE1. Patient 078 was subsequently identified to have a second truncating variant (c.16024-13C>G) in this gene. The overall high frequency of rare SYNE1 missense variants in our cohort and the finding of an intronic truncating variant in one patient emphasize the need for caution when interpreting missense variants in the SYNE1 gene, even in the context of a second pathogenic variant.

Eight patients were identified with at least one pathogenic variant in SPG7, the gene associated with spastic paraplegia-7 (SPG7).26 Hereditary spastic paraplegias are associated with progressive gait difficulties,27 and therefore may be clinically difficult to distinguish from ataxia. More importantly, patients with SPG7 pathogenic variants can present with spastic ataxia where ataxia is the most notable feature.28 The common pathogenic p.Ala510Val variant was detected in five patients: in the homozygous/compound heterozygous state in four patients and in the heterozygous state in one patient. This variant has previously been reported in the homozygous, compound heterozygous, and heterozygous state in patients with SPG7, suggesting its association with both autosomal dominant and recessive forms of SPG7.14,26,29,30 In our cohort, patient 059 was originally identified as carrying only the p.Ala510Val variant in the heterozygous state; however, through further analysis this patient was subsequently found to have a second, intronic variant in the SPG7 gene (c.2181+5G>A) determined to affect RNA stability. It is possible that other intronic or regulatory variants that have not been identified to date are present in the SPG7 gene. We believe it more likely that the previously reported association of SPG7, and of the p.Ala510Val variant, with autosomal dominant disease in fact reflects a subset of affected patients in whom the second SPG7 pathogenic variant has not been detected.

Pathogenic or suspected diagnostic missense variants were identified in the SPTBN2 gene, associated with autosomal dominant spinocerebellar ataxia 5 (SCA5),31 in four patients. The association of variants in the SPTBN2 gene and SCA5 was first reported in 2006;32 however overall only a limited number of publications have described pathogenic variants in this gene associated with SCA5.33,34,35 Our finding of four additional cases of SCA5 adds to the body of knowledge for this disorder. In our cohort, variants in SPTBN2 were identified in 4.5% of positive cases (2.4% of cases overall). Our results suggest that SPTBN2-related SCA5 may be more common than previously recognized, accounting for an estimated 2–3% of all ataxia cases. Biallelic truncating variants in SPTBN2 have been associated with autosomal recessive spinocerebellar ataxia 14 (SCAR14).36,37 In our cohort, all variants identified in SPTBN2 were heterozygous missense variants, consistent with the known mutational spectrum for autosomal dominant SCA5. Our results indicate that SPTBN2-related SCAR14 appears to be a rare cause of ataxia compared with SPTBN2-related SCA5.

Detailed clinical and family history information is important for guiding data interpretation. An example of the challenges of confirming the molecular diagnosis is patient 007, a 43-year-old woman with adult-onset cerebellar ataxia, hyperreflexia, and cerebellar atrophy who had two variants identified in genes that could potentially be related to the phenotype. The first is a novel missense p.Pro688Ala variant in AFG3L2, associated with adult-onset, slowly progressive, dominant spinocerebellar ataxia-28 (SCA28). This variant affects a highly conserved amino acid residue located in the M41-protease domain of the AFG3L2 protein, where pathogenic sequence changes are clustered,38 and multiple in silico predictions indicated a deleterious effect. The second is a splice variant, c.492+2T>C, affecting a canonical splice donor site in DARS2, which is implicated in childhood-onset, slowly progressive, recessive leukoencephalopathy with brain stem/spinal cord involvement and lactate elevation (LBSL).39 Upon further communication with the referring physician, the family history revealed that the patient’s mother and two siblings are similarly affected with ataxia, strongly supporting a dominant pattern of inheritance. Therefore, SCA28 was determined to be the most likely genetic diagnosis for this family.

In conclusion, our study reports the results of targeted exome analysis in the largest cohort of patients with ataxia described to date. Our diagnostic rate of 52% demonstrates the effectiveness of targeted exome sequencing as a diagnostic tool in a diverse group of patients with ataxia, with no additional inclusion restrictions such as age of onset, clinical presentation, or requirements for prior testing. Through our approach of analyzing the same large ataxia-related gene set in all referred patients, we have been able to identify patients with atypical presentations and broaden the clinical spectrum for some genes and their associated disorders.

Our targeted-analysis approach to exome sequencing allows new ataxia-related genes, as they become identified, to readily be included for analysis. Indeed this is the current process in our laboratory with the most recent version of the panel containing 484 genes. In addition, it also provides the opportunity for broader analysis of negative patients by opening up analysis to all genes outside of known ataxia-related genes for potential new gene and new disease–gene association discovery. Finally, complementary methods such as transcriptome sequencing may be useful for detecting regulatory or splicing variants not detected by exome sequencing.