Introduction

Spinocerebellar ataxia type 6 (SCA6, MIM 183086) is a late-onset, slowly progressive neurodegenerative disorder that characteristically presents with dysarthria, gait and limb ataxia.1 The disease is caused by an expansion of the CAG repeat in the α1A subunit of the voltage-dependent calcium channel gene CACNA1A.2 SCA6 appears to be a major cause of dominantly inherited ataxia, affecting at least 1.59/100 000 of the UK population3 and accounting for between 6 and 32% of families with autosomal dominant ataxia.4 Haplotype analysis in different geographical regions identified shared regions of chromosome 19p13 in affected individuals, suggestive of a common founder chromosome existing in Germany,4 Japan,5, 6 the Netherlands7 and the United Kingdom.3 This raises the possibility that all affected individuals inherited one or a few common founder chromosomes, as has been described for other nucleotide repeat disorders.8, 9, 10, 11 Against this hypothesis, a few proven de novo SCA6 CAG expansions have been described,12, 13 pointing toward a predisposing chromosome, rather than a founder effect. We previously identified a common CACNA1A haplotype present in 16 pedigrees with SCA6 from the northeast of England.3 To determine whether this haplotype was due to a founder effect, or was predisposing to CAG repeat expansion in CACNA1A, we carried out microsatellite analysis on SCA6 families throughout the world, including a de novo case.

Materials and methods

Subjects

Haplotype analysis was carried out on 96 individuals (95 affected and 1 unaffected) from 45 families with a molecular diagnosis of SCA6. Twenty-two families were from the northeast of England (n=37 subjects, 16 of these families have been described before3), 12 families were Japanese (n=31 subjects), 2 families were Brazilian (n=7 subjects), 2 families were Finnish (n=2 subjects) and 9 families were Taiwanese (n=18 subjects). Allele frequencies were determined in control subjects from corresponding geographical regions (northeast of England, n=50; Japan, n=100; Brazil, n=50; Taiwan, n=56).

Genetic analysis

All samples were genotyped in the same laboratory for the following (CA)n microsatellite markers: D19S912, D19S906, D19S221, D19S914, D19S1150, D19S840, D19S226, D19S899 and D19S414 (primer sequences and map positions were obtained from the NCBI UniSTS database; Figure 1; Supplementary Table 1, online). Primers were 5′ fluorescently labeled and PCR products for each allele were simultaneously analyzed on a single capillary DNA analyzer (Beckman CEQ 8000). Reaction conditions were 0.25 pM of each primer, 1 U of Promega or HotMaster™ Taq DNA polymerase with 1 × associated buffer, 2 mM dNTPs and 250 ng of DNA. Amplification was carried out at 94°C for 4 min (or 94°C for 2 min for HotMaster), followed by 30 cycles of 94°C for 1 min, the marker-specific annealing temperature for 1 min and 72°C for 1 min, with a final extension at 72°C for 10 min.

Figure 1

Schematic representation of microsatellite marker positions on chromosome 19. Black box=CACNA1A. (Mb=megabase, kbp=kilobase pairs).

Statistical analysis

Haplotypes were constructed manually for the familial samples and inferred using Phase v 2.1.1 for the control subjects.14 The frequencies of individual alleles and haplotypes in cases and controls were compared using Fisher's exact test. Linkage disequilibrium (LD) was estimated by the parameter δ, which is an approximation of the population attributable risk according to the equation δ=(Fd−Fc)/(1−Fc), where Fd is the frequency of the allele in carrier chromosomes and Fc is the frequency of the allele in noncarrier chromosomes.15
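The two statistics described above can be illustrated with a minimal sketch. This is not the software used in the study: the function names are hypothetical, the counts in the usage note are invented for illustration, and the two-sided p-value follows the common convention of summing the probabilities of all tables that are no more likely than the observed one.

```python
from math import comb


def delta(f_carrier: float, f_noncarrier: float) -> float:
    """LD parameter from the text: delta = (Fd - Fc) / (1 - Fc)."""
    if not (0.0 <= f_carrier <= 1.0 and 0.0 <= f_noncarrier < 1.0):
        raise ValueError("frequencies must lie in [0, 1], with Fc < 1")
    return (f_carrier - f_noncarrier) / (1.0 - f_noncarrier)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: carrier / noncarrier chromosomes (assumed layout).
    Columns: allele present / allele absent.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x allele copies in row 1,
        # with all margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7)
    )
```

For example, with a made-up allele seen on 18 of 20 carrier chromosomes and 10 of 100 control chromosomes, `delta(18 / 20, 10 / 100)` gives the LD estimate and `fisher_exact_two_sided(18, 2, 10, 90)` gives the association p-value.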

Results

Affected individuals from the 6 additional English families shared microsatellites with the 16 families previously reported (Figure 2a), with highly significant association and LD between the intragenic marker D19S1150 and flanking marker D19S840 (Table 1 and Supplementary Tables 2 and 3, online). A similar haplotype was also found in the Japanese, Brazilian and Finnish SCA6 families (Figures 2b–d). Shared flanking markers between the SCA6 families suggest minor differences in the core haplotype (D19S1150 and D19S840) between these regions that probably arise through single mutation events, which occur at rates between 0 and 7 × 10⁻³ per locus per gamete per generation.16 Again, specific alleles and haplotypes in each population were associated and in LD with mutated CACNA1A alleles (Tables 1 and 2, Supplementary Tables 2 and 3, online). The same centromeric marker alleles were also found in the Taiwanese families, as were the same telomeric alleles for D19S914/D19S906 (Figure 2e). Given that the same D19S914 alleles were strongly associated, and in LD with mutated SCA6 alleles, in both the Taiwanese and Japanese families (Table 1), this suggests that the different intragenic D19S1150 alleles in Taiwanese families are also due to mutation of the microsatellite, as described for other intragenic diallelic markers defining an ancient founder haplotype, which has spread throughout the world.17

Figure 2

CACNA1A haplotypes for the SCA6 subjects. (a) British, (b) Japanese, (c) Brazilian, (d) Finnish and (e) Taiwanese SCA6 patients. Numbers after the decimal place represent individuals within a family. Haplotypes were determined manually. Both alleles were considered where it was not possible to determine phase. The most common (core) haplotype is shown in dark green and the allele sizes are as follows: D19S912(178), D19S906(158), D19S221(198), D19S914(90), D19S1150(160), D19S840(204), D19S226(245), D19S899(112) and D19S414(163). Paler shading represents slightly different haplotypes that are closely related to the core haplotype (differing by one or two dinucleotide repeats) and probably differ due to microsatellite instability. Unshaded regions differ through recombination and are not related to the core haplotype.

Table 1 Frequency of microsatellite marker alleles in SCA6 families and control subjects from the same geographic region
Table 2 Frequency of CACNA1A haplotypes in SCA6 families and controls defined by microsatellite markers D19S914, D19S1150 and D19S840

Haplotype analysis of a Japanese de novo SCA6 patient and his parents was also carried out. Although CAG20 alleles have been associated with late-onset mild ataxia,18 the patient was previously reported as being de novo by Shimazaki et al,12 given that the parents were neurologically and radiologically normal at the time of the original study, and on follow-up prior to this study (Figure 3). The extended haplotype suggested that the CAG expansion occurred during paternal transmission of the CAG20 allele, which is characteristic of CAG repeat disorders.19 Although the core haplotype carrying the mutated SCA6 allele in this family is rare (D19S914(90)-D19S1150(160)-D19S840(208)) (Supplementary Table 3, online), markers both telomeric (D19S914(90) and D19S1150(160)) and centromeric to D19S840 that define the SCA6 chromosome in this family were also found in other Japanese families (Supplementary Table 2, online). Thus, although it is possible that the father will go on to develop a mild form of SCA6 late in life (and thus have a pathogenic allele), the haplotype analysis described here indicates that the unstable paternal chromosome is the likely origin of the de novo expansion, and that this occurred on a haplotype found in other Japanese SCA6 families.

Figure 3

Pedigree for the de novo SCA6 patient and their parents showing the haplotypes and the possible inherited haplotype from parent to offspring.

Discussion

The identification of a common CACNA1A haplotype in affected individuals with SCA6 from Europe, Brazil and Japan supports the hypothesis that all SCA6 patients descend from a small pool of founder individuals. However, the demonstration of a de novo SCA6 expansion on a similar genetic background raises the possibility of a predisposing haplotype leading to new mutation events in different populations across the globe. Recent evidence from SCA7 transgenic mice has shown that cis-acting elements 3′ to the repeat drive instability of the (CAG)n at the SCA7 locus,20 and a similar mechanism may operate for SCA6. Mice generally require a larger (CAG)n tract than humans to show instability of repeat lengths,21, 22, 23 but introducing large regions of flanking human sequence allows instability for moderate repeat lengths in some mouse models.20, 21

CpG methylation may also influence trinucleotide repeat tract stability through an effect on chromosomal structure.24, 25 We therefore applied a bioinformatic method previously used to study GC content flanking other trinucleotide repeat genes.26, 27 Genomic sequences for 5000 bp upstream and 5000 bp downstream of the CACNA1A (CAG)n repeat sequence were downloaded from NCBI and analyzed using cpgplot (available through EMBOSS) in the 5′–3′ orientation, with a moving window of 500 bp and a step of 100 bp.27 This revealed a large CpG island immediately upstream of the CACNA1A (CAG)n repeat sequence (Figure 4, 56 bp upstream of the CAG repeat spanning 611 bp). These sequence characteristics are associated with repeat instability in other disorders.26 At 77.94%, the percentage GC content of the chromosomal region immediately upstream of the CACNA1A (CAG)n repeat is greater than that found for other unstable pathogenic repeat sequences (including HD, SCA1 and SCA3) and just less than the region flanking the highly unstable SCA7 at 83.5%.26
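The sliding-window calculation behind this kind of analysis can be sketched as follows. This is an illustration only, not the cpgplot implementation: the function names are hypothetical, the window and step defaults simply mirror the values quoted above, and the thresholds (GC content above 50% and observed/expected CpG ratio above 0.6) are the Gardiner-Garden and Frommer criteria. Their minimum-length requirement of 200 bp is not enforced here.

```python
from typing import Iterator, Tuple


def window_stats(
    seq: str, window: int = 500, step: int = 100
) -> Iterator[Tuple[int, float, float]]:
    """Yield (start, gc_percent, cpg_obs_exp) for each full sliding window.

    Observed/expected CpG = (count of CG dinucleotides * window length)
    divided by (count of C * count of G) within the window.
    """
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        w = seq[start:start + window]
        c, g = w.count("C"), w.count("G")
        cpg = w.count("CG")  # "CG" cannot overlap itself, so count() is exact
        gc_percent = 100.0 * (c + g) / window
        # Define the ratio as 0 when C or G is absent from the window.
        obs_exp = cpg * window / (c * g) if c and g else 0.0
        yield start, gc_percent, obs_exp


def is_island_like(gc_percent: float, obs_exp: float) -> bool:
    """Per-window test using the Gardiner-Garden and Frommer thresholds."""
    return gc_percent > 50.0 and obs_exp > 0.6
```

Runs of consecutive windows for which `is_island_like` returns true would then be merged into candidate islands and filtered by length; that merging step is left out to keep the sketch short.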

Figure 4

Bioinformatic analysis of the 5000-bp of DNA sequence flanking either side of the CACNA1A (CAG)n repeat, based on the algorithm of Gardiner-Garden and Frommer (1987).27 Upper panel: CpG prediction plots (observed/expected). Middle panel: percentage of CG residues (% CG). Lower panel: predicted CpG islands. The (CAG)n repeat is indicated by an open arrow. Note that the numbering from left to right corresponds to the direction of the NCBI chromosomal sequence. The gene is transcribed from right to left (solid arrow).

It is intriguing that the region 5′ to the CAG repeat in the de novo expansion (Figure 3) was defined by markers strongly associated with other mutated SCA6 alleles (Table 1). Conversely, the region 3′ to the CAG repeat on the newly mutated chromosome contained alleles not associated with (CAG)n expansions (Table 1). This is in keeping with a common telomeric region predisposing to pathogenic repeat formation. Mammalian CpG islands are known sites for the initiation of transcription and may also act as origins of replication,28 and modification of repeat instability has been linked to both transcription and replication origin events.29 Methylation of CpG islands appears to stabilize CGG tracts in fragile X syndrome,30 although the mechanisms involved are likely to be complex,19 and may relate to GC content and chromatin structure.25 It is therefore possible that sequence variation within the CpG island immediately upstream of the (CAG)n alters the susceptibility of specific CACNA1A haplotypes to repeat expansion, as has been described for other repeat sequences.31 Identifying the underlying molecular mechanism will have important implications for our understanding of the onset and genetic anticipation of SCA6 in large dominant families.