Introduction

Microsatellites, also referred to as short tandem repeats (STRs), are composed of short motifs of one to six nucleotides tandemly repeated throughout the genome (Hancock 1999). These sequences present a unique mechanism of mutation characterised by variation in copy number, which has been called “dynamic mutation” (Richards and Sutherland 1992). Additional interest in the evolution of microsatellites came from the discovery that expansions of triplet repeats, a special class of microsatellites, are linked to various human genetic disorders (Pearson and Sinden 1998; Schlötterer 2000). Examples of triplet-repeat disorders include Huntington’s disease (HD), myotonic dystrophy type 1 (DM1) and some spinocerebellar ataxias (SCAs), among which is SCA3, also known as Machado-Joseph disease (MJD). MJD (MIM 109150) is an autosomal dominant neurodegenerative disorder of late onset (mean age at onset 40.5 years) (Coutinho 1992) characterised by a complex and pleomorphic phenotype. The main clinical manifestations include cerebellar ataxia, progressive external ophthalmoplegia, pyramidal and extrapyramidal signs, dystonia and distal muscular atrophies (Coutinho 1992). MJD was first described in North American patients who had emigrated from the Portuguese islands of the Azores Archipelago (Nakano et al. 1972; Woods and Schaumburg 1972).

The Azores archipelago comprises nine islands (Santa Maria, São Miguel, Terceira, Graciosa, Pico, São Jorge, Faial, Flores and Corvo) and is located in the North Atlantic Ocean, 1,500 km from the European mainland. The islands were discovered uninhabited by Portuguese navigators in the fifteenth century (Mendonça 1996). According to historical records, the first settlers came mainly from various regions of mainland Portugal and from Madeira Island. People of different origins, such as the Flemish, also made up part of the early settlers (Mendonça 1996). To date, 32 extended MJD families, with origins in the islands of Flores, São Miguel, Terceira and Graciosa, have been identified in this population. In the Azores, the estimated prevalence of the disease is 1:2,309. However, in Flores Island, the disease reaches the highest prevalence reported worldwide, with 1:103 individuals being affected (Lima et al. 1998), thus constituting a public health problem. Genealogical information indicates that there were at least two different introductions of the MJD mutation in the Azores, probably by Portuguese settlers before the seventeenth century (Lima et al. 1998). Molecular data also corroborate this finding by demonstrating the presence of two distinct haplotypes in the Azores, both of which are found in MJD families from mainland Portugal (Gaspar et al. 2001).

The MJD locus was mapped to 14q32.1 (Takiyama et al. 1993). An expansion of a triplet repeat with a CAG motif at exon 10 of the ATXN3 gene was identified as the causative mutation for this disorder (Kawaguchi et al. 1994; Ichikawa et al. 2001). The wild-type alleles present between 12 and 44 CAG units, whereas the expanded alleles contain between 61 and 87 CAG repeats (Maciel et al. 2001).

In triplet-repeat disorders, expanded alleles usually present high mutation rates and are relatively easy to study by analysing disease pedigrees. In contrast, because wild-type alleles have a much lower mutation rate, a very high number of normal pedigrees would be necessary to observe mutational events. For this reason, a population genetic approach is frequently used as a feasible alternative to analyse wild-type variation. Lima et al. (2005) previously used a population approach to study the behaviour of the wild-type MJD alleles in a large and representative sample of unrelated individuals from the Portuguese population. In that study, no evidence was found that large wild-type alleles provide a pool from which the expanded alleles might be continuously emerging. Martins et al. (2006) also studied the dynamics of the MJD triplet repeat using a multicontinental sample of individuals for whom haplotypes, which included the CAG repeat, were built. Their results strongly suggest that the evolution of the CAG alleles at the MJD locus has been shaped by a multistep mutation mechanism.

Similarly to other triplet-repeat disorders, MJD exhibits several non-Mendelian features. An important aspect of the non-Mendelian behaviour, which can have an impact on triplet-repeat loci evolution, is the putative segregation distortion of alleles. Meiotic drive, or segregation ratio distortion (SRD), occurs within a given locus when one of the alleles in a heterozygous individual is transmitted preferentially, resulting in an unequal representation of the different variants among the population of gametes or offspring. This might be caused by mitotic events occurring in proliferating germ cells, nonrandom segregation of chromosomes during meiosis, differential viability or functionality of gametes, or differential survival during development (Pardo-Manuel de Villena and Sapienza 2001). The occurrence of SRD has been linked to several triplet-repeat disorders (e.g. Gennarelli et al. 1994; Ikeuchi et al. 1996), including MJD (Ikeuchi et al. 1996; Riess et al. 1997; Takiyama et al. 1997; Iughetti et al. 1998).

Given the availability of extensive characterisation at the epidemiological level, as well as the detailed genealogical information concerning MJD families (Lima et al. 1998), the Azorean population provides an adequate background to test several aspects related to the behaviour of the (CAG)n repeats at the ATXN3 gene. We thus analysed, in the present work, the size of the (CAG)n tract in normal sibships of Azorean ancestry, representing 428 meioses, with the aim of characterising stability and segregation patterns of wild-type MJD alleles.

Subjects and methods

DNA samples

The study sample was composed of 398 normal individuals from the Azores islands (Portugal) belonging to 102 sibships and included the parents and at least one sibling, representing 428 meioses. Buccal swabs were collected after informed consent. Prior to DNA extraction, samples were anonymised. DNA was extracted using JETQUICK blood and cell culture DNA Mini Spin Kit (Genomed, Löhne, Germany), according to the manufacturer’s instructions.

Allele size determination

Fragments containing the CAG tract of the ATXN3 gene [198 bp + (CAG)n] were amplified using the following set of oligonucleotide primers: MJD52F (5′-CCA GTG ACT ACT TTG ATT CG-3′) (Kawaguchi et al. 1994) and MJD72R (5′-TTA CCT AGA TCA CTC CCA A-3′ labelled with the fluorescent tag 6-FAM). The amplification reaction was carried out in a total volume of 25 μl, with 1 μM of each primer, 300 μM of dNTPs, 2.5 mM of MgCl2, 10× reaction buffer [160 mM (NH4)2SO4, 670 mM Tris–HCl (pH 8.8 at 25°C), 0.1% Tween-20], 10% of dimethyl sulfoxide (DMSO), 1.25 U of BIOTAQ™ DNA polymerase (Bioline) and 100 ng of genomic DNA, using the following conditions: a first denaturation step of 5′ at 95°C; followed by 25 cycles of 1′ at 95°C, 1′ at 56°C and 1′ at 72°C; and a final extension step of 10′ at 72°C.

One microliter of the amplification products (diluted whenever necessary) was mixed with 0.3 μl of GeneScan 500-TAMRA size standard and 12.2 μl of Hi-Di Formamide (Applied Biosystems, Foster City, CA, USA), heated for 2 min at 90°C and immediately placed on ice for at least 5 min. Performance Optimized Polymer-4 (POP-4, Applied Biosystems) was used to separate the DNA fragments by automated capillary electrophoresis (CE) in an ABI-Prism 310 Genetic Analyzer (Applied Biosystems). Amplicon length was calculated by comparison with the GeneScan 500-TAMRA size standard using the GeneScan Analysis 3.1.2 software. Size standard fragments 250 bp and 340 bp were not considered for size estimation purposes due to their anomalous migration (Rosenblum et al. 1997). A size correction formula [(CE product size (bp) − 198)/3 × 1.0184 + 0.7062; adapted from Dorschner et al. 2002] was applied to allow comparison with data from previous works, in which a manual polyacrylamide gel electrophoresis (PAGE) assay was used instead of CE.
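The size correction can be written out as a short function. This is a minimal sketch, assuming the formula is read with standard operator precedence, i.e. the flanking 198 bp are subtracted, the result is divided by 3, and the linear correction is then applied. The function name and the example amplicon size are hypothetical and are not taken from the study.

```python
FLANKING_BP = 198          # non-repeat part of the amplicon, as given in the text
SLOPE = 1.0184             # multiplicative term of the correction formula
INTERCEPT = 0.7062         # additive term of the correction formula


def corrected_cag_repeats(ce_size_bp: float) -> float:
    """Convert a CE amplicon size (bp) into a PAGE-comparable CAG repeat number.

    Assumed reading of the formula: ((size - 198) / 3) * 1.0184 + 0.7062.
    """
    raw_repeats = (ce_size_bp - FLANKING_BP) / 3
    return raw_repeats * SLOPE + INTERCEPT


# Hypothetical example: a 264 bp CE product corresponds to 22 raw repeats.
example = corrected_cag_repeats(264)
print(f"{example:.3f}")
```

How the corrected value is converted into an integer allele call (for example by rounding) is not stated in the text, so the sketch returns the uncorrected floating-point result.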

Statistical analysis

Population data analysis was performed using 128 unrelated individuals from the parental generation. Allelic and genotypic frequencies were estimated. Conformity with the Hardy–Weinberg expectations was tested using the unbiased exact Hardy–Weinberg probability. An exact test was used to compare genotypic and allelic composition obtained in this study with that observed by Lima et al. (2005). All analyses were performed using the Arlequin package (Schneider et al. 2000), with a 5% significance level. For segregation analysis, all individuals from the 102 sibships were considered, comprising 428 transmissions. From the total transmissions, 103 were excluded, as they were noninformative due to: (a) homozygous transmitter(s) (mother and/or father) or (b) heterozygous transmitters sharing the same genotype and transmitting both alleles. The χ2 goodness-of-fit test was used to compare the proportions of transmission of the smaller and larger alleles, considering the expected segregation ratio of 1:1. Statistical analyses were performed using SPSS 15.0 (SPSS Inc. 2006). The Power and Precision 2.0 software (Borenstein et al. 2000) was used for power calculations.

Results

Population analysis

Allele composition analysis revealed the presence of 16 allelic variants for the CAG-repeat-containing segment of the ATXN3 gene on 256 chromosomes of unrelated individuals from the parental generation. All alleles were in the normal range [mean 21.98 ± 0.30 standard error (SE)], varying between 14 and 39 CAG repeats (Fig. 1). Alleles with 23 and 14 repeats were the most frequent (35.5% and 20.3%, respectively). Genotypes 23,23 and 14,23 were the most represented (both with 15.6%), followed by 14,27 (7.8%). Genotypic frequencies were in conformity with Hardy–Weinberg expectations (P = 0.109), with an observed heterozygosity value of 78.9%. No significant differences were detected between the results obtained and data previously published concerning the Portuguese population (Lima et al. 2005). The allele size distribution showed a negative skew [−0.249 ± 0.152 (SE)], with 73.4% of the chromosomes analysed having a CAG repeat size equal to or smaller than the mode (allele 23).

Fig. 1 Allele size distribution at the Machado-Joseph disease (MJD) locus in 256 chromosomes from unrelated Azorean individuals

Segregation analysis

No mutational events were detected in the 428 wild-type allele transmissions studied. A segregation analysis was performed (Table 1) to test the transmission proportions of the smaller and larger alleles, considering the expected segregation ratio of 1:1. When analysing all informative transmissions, the results showed a significant SRD in favour of the transmission of the smaller alleles (56.9% of transmissions; χ2 = 6.231; P = 0.013). When only paternal transmissions were considered, the preferential transmission of the smaller allele (58.1%) was still observed and significant (χ2 = 4.225; P = 0.040). However, when considering maternal transmissions alone, and although the smaller alleles were transmitted more often (55.7%), such preference was not statistically significant. Nevertheless, the power for this last analysis was 33.9% (α = 0.05; two-tailed), implying that it is highly probable that the absence of significance is due to a type II error.
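The goodness-of-fit test against a 1:1 segregation ratio can be reproduced with a few lines of code. This is a minimal sketch, not the SPSS procedure used in the study. The counts of 185 and 140 are an assumption: they are inferred here from the 325 informative transmissions (428 minus the 103 excluded) and the reported 56.9%, and are not read from Table 1. The function name is hypothetical.

```python
import math


def chi_square_one_to_one(smaller: int, larger: int):
    """Chi-square goodness-of-fit test (1 df) against an expected 1:1 ratio."""
    expected = (smaller + larger) / 2
    chi2 = ((smaller - expected) ** 2 + (larger - expected) ** 2) / expected
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


informative = 428 - 103                  # informative transmissions
smaller = round(0.569 * informative)     # inferred count, an assumption
larger = informative - smaller

chi2, p_value = chi_square_one_to_one(smaller, larger)
print(f"smaller={smaller}, larger={larger}, chi2={chi2:.3f}, P={p_value:.3f}")
```

Under these inferred counts, the statistic agrees with the reported value for all informative transmissions (χ2 = 6.231; P = 0.013).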

Table 1 Frequency of transmissions of smaller and larger Machado-Joseph disease (MJD) alleles by normal individuals

We raised the hypothesis that the transmitters’ genotypic composition, specifically the difference in length (D) between the larger (L) and the smaller (S) allele constituting the transmitters’ genotype (D = L − S), could influence the probability of preferential transmission of one of the alleles. The results from the total informative transmissions corroborate this hypothesis, showing a tendency for preferential transmission of the smaller allele, except when D = 1 and D = 2 (Table 2). Furthermore, when analysing the smaller allele transmission frequency by genotype, a positive correlation was found between D and the frequency of transmission of the smaller allele (Spearman’s r_sp = 0.291; P = 0.028; one-tailed). Indeed, when D is greater than 6, the percentage of transmission of the smaller allele tends to be greater than 50% (Fig. 2).
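The quantity D and the scoring of a single transmission can be illustrated as follows. This is a minimal sketch with a hypothetical function name. It covers only the exclusion of homozygous transmitters; the second exclusion criterion (heterozygous parents sharing the same genotype and transmitting both alleles) requires the full trio and is not handled here.

```python
def score_transmission(parent_genotype, transmitted_allele):
    """Return (D, smaller_transmitted) for one parent-to-child transmission.

    parent_genotype: pair of CAG repeat numbers carried by the transmitter.
    Returns None when the transmitter is homozygous (noninformative).
    """
    smaller, larger = sorted(parent_genotype)
    if smaller == larger:
        return None
    if transmitted_allele not in (smaller, larger):
        raise ValueError("transmitted allele is not part of the parental genotype")
    d = larger - smaller  # difference in length, in CAG repeats
    return d, transmitted_allele == smaller


# A 14,23 transmitter passing on allele 14: D = 9, smaller allele transmitted.
print(score_transmission((23, 14), 14))
```

Tabulating the second element of the result per value of D gives the kind of summary shown in Table 2.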

Table 2 Frequency of informative transmissions of smaller and larger Machado-Joseph disease (MJD) alleles versus the difference in length between the two alleles that constitute the transmitters’ genotypes

Fig. 2 Scatter plot of the relationship between frequency of transmission of the smaller allele and the difference in length (CAG repeat number) between the two alleles that constitute the transmitter’s genotypes, with dots labelled by genotype, comprising a total of 325 transmissions

If we exclude the transmissions involving genotypes with D = 1, the transmission of the smaller allele reaches 58% whether the total, paternal or maternal transmissions are considered. The SRD value is statistically significant for the total transmissions (χ2 = 6.964; P = 0.008). However, for paternal and maternal transmissions, although the frequency of transmission of the smaller allele is the same, it does not achieve statistical significance, probably due to type II error (power = 47% for both analyses; α = 0.05; two-tailed). Furthermore, if we exclude the transmissions involving D ≤ 2 (Table 3), then the segregation distortion in favour of the transmission of the smaller allele also becomes significant for maternal (58.9%) and paternal transmissions (59.5%).

Table 3 Frequency of transmission of smaller and larger Machado-Joseph disease (MJD) alleles by normal individuals excluding the informative transmissions that involve D ≤ 2 CAG repeats

Discussion

The allelic profile obtained was in agreement with what has been described for the Portuguese population. As mentioned previously, a negative skew was observed for the allele size distribution. Despite the lack of significance, this result is in agreement with our previous work studying a large representative sample of the Portuguese population [−0.185 ± 0.057 (SE)] (Lima et al. 2005). This behaviour was also reported for other Caucasian samples (Takano et al. 1998) and indicates an excess of wild-type MJD alleles with shorter repeats in the corresponding populations. This contrasts with what has been observed in other populations, such as the Japanese population (Takano et al. 1998), for which a positive skew was reported. For other triplet-repeat diseases, such as HD (Rubinsztein et al. 1994) and SCA7 (Stevanin et al. 1998), it has been suggested that the large majority of new mutations arise from the upper end of the normal allele distribution. However, as proposed previously (Lima et al. 2005) and in agreement with the results presented here, it seems that this behaviour does not apply to wild-type MJD alleles, at least in Caucasian populations.

Previous analyses of the MJD locus have consistently shown intergenerational instability of the expanded alleles, especially during paternal transmission, with a tendency for the CAG repeat to further expand (Maciel et al. 1995; Takiyama et al. 1995; Igarashi et al. 1996). However, to our knowledge, the normal alleles were stable upon transmission in all family studies carried out to date, including this work. The mean mutation rate for microsatellites is estimated at around 2 × 10⁻³ per generation (Ellegren 2000). No mutational events were observed in 428 transmissions, which is consistent with a mutation rate in the normal allele range below this mean.
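For orientation, the number of mutational events expected in a sample of this size can be worked out directly. This is an illustrative calculation only, not an analysis from the study. It assumes that the 428 transmissions are independent and that each carries the mean microsatellite mutation rate quoted above; the variable names are hypothetical.

```python
MEAN_MUTATION_RATE = 2e-3   # per locus per generation (Ellegren 2000), as quoted
TRANSMISSIONS = 428         # wild-type allele transmissions in this study

# Expected number of mutational events under the mean rate.
expected_events = TRANSMISSIONS * MEAN_MUTATION_RATE

# Probability of observing no event at all, assuming independent transmissions.
prob_zero_events = (1 - MEAN_MUTATION_RATE) ** TRANSMISSIONS

print(f"expected events: {expected_events:.3f}")
print(f"P(no events):    {prob_zero_events:.3f}")
```

Under these assumptions, fewer than one event is expected, which gives a sense of how many transmissions are needed before the absence of mutations becomes informative about the rate.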

Our data on segregation analysis strongly indicate the existence of SRD in favour of the transmission of the smaller allele. Previously published studies on patterns of segregation of wild-type MJD alleles (Rubinsztein and Leggo 1997; MacMillan et al. 1999; Wiezel et al. 2003) generated conflicting results. Nevertheless, analysis of the total informative transmissions showed a tendency for preferential transmission of the smaller allele (> 50% in all studies mentioned). Analysing by gender of the transmitters, MacMillan et al. (1999) also found a tendency for the preferential transmission of the smaller allele in both maternal and paternal transmissions (53%). However, these authors failed to detect a significant deviation from the 1:1 segregation assumption, which could be due to type II error (power < 25%). Rubinsztein and Leggo (1997) found SRD in favour of the transmission of the smaller alleles in the case of maternal transmissions (57.2%; χ2 = 6.083, P = 0.014), but in paternal transmissions, the tendency seemed to be the opposite (47.3%). Data from Wiezel et al. (2003) showed the same tendency as our results for paternal transmissions (52.1%; power = 18%), but their results were borderline for maternal transmissions (49.1%; power = 8%), with no significant deviation from the 1:1 segregation assumption in either case. Our results show a positive correlation between D values and transmission of smaller alleles. According to our data, it seems that small D values, such as differences of one or two CAG repeats, are not enough to modify the probability of preferential transmission, and this may condition the ability to detect SRD. Thus, the genotypic composition of the transmitters affects the ability to detect SRD, acting as a confounding factor. This could underlie the conflicting results obtained previously for the wild-type MJD alleles.

Although several studies using MJD patient data indicate that SRD favours the transmission of larger/expanded alleles during maternal (Riess et al. 1997) or paternal transmissions (Ikeuchi et al. 1996; Takiyama et al. 1997; Iughetti et al. 1998), others contested the existence of such a phenomenon (Grewal et al. 1999). In any case, expanded alleles may have a different segregation pattern that cannot be related to the data presented here.

In the case of wild-type alleles at the DM1 locus, Chakraborty et al. (1996) describe the occurrence of SRD favouring transmission of the larger alleles during maternal but not paternal meiosis. According to the authors, if contractions outnumber the expansions in mutations within the normal range, the occurrence of SRD with the preferential transmission of larger alleles might be responsible for maintaining disease frequency of DM1. However, according to our results, this is not happening in the case of MJD, since SRD is favouring the transmission of the smaller alleles.

In summary, the results presented here indicate that SRD affecting wild-type alleles at the MJD locus favours the transmission of smaller alleles, particularly if D is larger than two repeat units. Therefore, the ability to detect SRD may be influenced by the pool of genotypes in the parental generation.