Introduction

Dentatorubropallidoluysian atrophy (DRPLA) is an autosomal dominant neurodegenerative disorder belonging to the group of expanded polyglutamine diseases. It has a variable phenotype including ataxia, choreoathetosis, myoclonic epilepsy and dementia, with cerebellar and brain-stem atrophy on magnetic resonance imaging. The causative gene, ATN1, encodes atrophin-1 and is located on chromosome 12p13. Normal alleles have up to 35 CAG repeats, whereas pathologically expanded alleles range from 40 to 100. DRPLA is the second most frequent hereditary ataxia, after SCA3, in the Japanese population,1 with a prevalence of 0.2–0.7 per 100 000, but has only occasionally been reported in Caucasians: one family from Denmark,2 seven from Britain,3, 4, 5, 6 one from Malta,3 four from Portugal,7, 8 two from Spain,9, 10 one from France11 and five from Italy.12, 13, 14, 15, 16 Among the Italian families, four share an origin in western Sicily, in the province of Trapani, whereas for the family reported by Le Ber14 no regional origin was given. Two additional Italian DRPLA families were recently diagnosed by our group (see below for details), bringing the number of DRPLA families of Italian origin to seven.

Haplotype analysis with two intragenic ATN1 polymorphisms17 showed that all DRPLA-expanded alleles share a unique haplotype both in the Japanese population and in two non-Asian families (one from Denmark and one of African-American descent), suggesting a single origin for the CAG expansion. Subsequently, the same haplotype was also found in four Portuguese and two Italian DRPLA families,8, 16 raising the hypothesis that the mutation entered these populations on the same Japanese founder haplotype. This haplotype (hereafter called ‘narrow haplotype’) is defined by alleles A and T at two SNP markers (rs34199021 and rs2071075) located 3.25 and 2.39 kb telomeric to the CAG repeat segment, respectively. Interestingly, in the Japanese population this haplotype is also the most frequent in the general population (HapMap frequency 37.8%), whereas in Caucasian populations it is less common (8.5%). This favours an introduction of the DRPLA mutation into different Caucasian subgroups on the founder Asian haplotype.17 However, Aridon16 extended their analysis to markers flanking the CAG segment and indeed found evidence for recombination. As these markers span 8.8 Mb (positions 531652 to 9385634 in NCBI36/hg18), corresponding to 20 cM, the finding of recombinant haplotypes in two apparently unrelated families is well within expectations.
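The remark that recombinants over this interval are "well within expectations" can be illustrated with a rough calculation. The sketch below uses the coordinates and the 20 cM figure quoted above; the Haldane map function (which assumes no crossover interference) and the example of 10 meioses are assumptions made here for illustration, not choices stated in the cited work.

```python
import math

# Coordinates and genetic distance quoted in the text (NCBI36/hg18).
START_BP = 531_652
END_BP = 9_385_634
GENETIC_DISTANCE_CM = 20.0


def span_mb(start_bp, end_bp):
    """Physical span between two positions, in megabases."""
    return (end_bp - start_bp) / 1e6


def haldane_recombination_fraction(distance_cm):
    """Recombination fraction per meiosis under the Haldane map function.

    Assumption: no crossover interference (not stated in the source).
    """
    d_morgans = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))


def prob_any_recombination(r, n_meioses):
    """Probability of at least one recombinant among n independent meioses."""
    return 1.0 - (1.0 - r) ** n_meioses


if __name__ == "__main__":
    r = haldane_recombination_fraction(GENETIC_DISTANCE_CM)
    print(f"span: {span_mb(START_BP, END_BP):.2f} Mb")
    print(f"recombination fraction per meiosis: {r:.3f}")
    # Hypothetical example: 10 meioses separating two unrelated families.
    print(f"P(at least one recombinant in 10 meioses): "
          f"{prob_any_recombination(r, 10):.2f}")
```

Under these assumptions, even a handful of meioses makes a recombinant over 20 cM likely, which is why a much narrower interval is needed to say anything about the age of the mutation.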

To clarify the population history of DRPLA in Italy and to try to date the introduction of the monophyletic mutation, we analyzed an extended haplotype flanking the CAG repeat in 10 patients of Italian ancestry. Building on the notion that, over generations, recombination fragments a given DNA stretch, we scanned 155 kb of DNA surrounding the CAG repeat by selecting appropriately positioned markers in search of recombinant haplotypes. Our aim was to compare the hypothesis of a single, recent genealogy connecting all the observed haplotypes with the alternative hypothesis of multiple introductions by more distantly related haplotypes from external sources. Our approach was dictated by the lack of any obvious relatedness among the families in which the 10 DRPLA alleles were represented, as judged from their surnames and the reconstructed pedigrees.

Materials and methods

Patients

DNAs of two patients of the family reported by Le Ber (O54 and O55),14 of five patients from the family reported by Villani12 (P50, P52, P53, P54, P55) and one patient from the family reported by Brusco15 (P51) were kindly provided by the authors.

In this study, we have also included two additional recently identified families of Sicilian origin, both from the province of Trapani. In the first one, the proband (N95) was diagnosed in Rome and carried a 70 CAG repeat in the ATN1 gene (Figure 1). At age 10, he presented with learning difficulties, myoclonic seizures and gait instability. Five years later, he showed axial dystonia, cerebellar ataxia, myoclonic and, occasionally, generalized seizures. His father was diagnosed as having Huntington disease, but he was unavailable both for neurological examination and blood sampling. The healthy mother (P75) was sampled, upon informed consent, to reconstruct the affected haplotype (Figure 1).

Figure 1

Pedigree of patient N95. Haplotypes in the mother and the son are shown, with phased markers in the same order as in Table 1. Alleles that could not be phased are in brackets. The marker displaying recombination is arrowed.

In the second family, the proband (P56; male, 28 years of age) was diagnosed in Milan. He had presented with generalized myoclonic seizures, ataxia and cognitive decline since age 24. A family history positive for myoclonic epilepsy and ataxia was also reported: the mother died at 45 years of age from a severe neurodegenerative disorder characterized by ataxia, epilepsy and cognitive decline. The patient had three healthy brothers and a sister who had presented the same clinical phenotype since age 12. The proband carried 64 CAG repeats in the ATN1 gene, whereas his sister carried 66 CAG repeats. DNA was also available from two of the healthy brothers, both with normal CAG size.

Written informed consent for the research purposes was obtained from all individuals participating in this study.

Haplotype analysis

The number of CAG repeats in the ATN1 gene was determined by PCR as described in Nagafuchi et al.18 with one fluorescently labeled primer and separation in polyacrylamide denaturing gel in a LICOR 4200 apparatus. All subjects were typed for 12 polymorphic DNA markers, chosen to cover chromosome 12 from position 6818289 to 6971234 (NCBI36/hg18), that is, from 98 kb telomerically to 55 kb centromerically to the CAG repeat. An additional marker (rs12302749) was typed in subjects P75, N95 and O55, in order to possibly refine the location of a recombination event. All markers were typed by sequencing of the PCR products obtained with primer pairs designed with the program Primer-BLAST (http://www.ncbi.nlm.nih.gov/tools/primer-blast).19

Haplotype reconstruction in the nine unrelated patients (Table 1) was obtained with Arlequin,20 with both the ELB and EM algorithms and default settings, using their unphased genotypes as input and no additional information on haplotype frequencies over the same region in the general population. Phasing in patient N95 was obtained by tree reconstruction. Table 1 lists the full set of phased haplotypes that have been compared with the null distributions (see below).

Table 1 Results of typing in ten DRPLA patients (phased)

Generation of null distributions for recombinant haplotypes

A null distribution of haplotypes segregating in cis to a neutral mutation along genealogies, with recombination, was generated with the program SelSim.21 Our rationale was to search for the maximum DNA region in which the observation of a recombination event would testify to a relatedness between haplotypes significantly older than a neutral genealogy underlying 10 copies.

Haplotypes consisting of 10 markers on each side of a spreading neutral mutation were modeled in genealogies underlying 10 chromosome copies in the present (that is, the number of DRPLA haplotypes studied here). The deterministic mode was used conditioned on the number of markers, and the inter-marker distance was set by adjusting the corresponding genetic distance. Actual local recombination rates were obtained from the UCSC Genome Browser. Other settings were: effective population size=2000 and initial frequency of the simulated DRPLA mutation=0.003. These were chosen after an array of exploratory runs and turned out to be generally valid for haplotypes bearing a rare mutation, which acquire material by recombination from the vast majority of normal haplotypes. Ten thousand genealogies were generated, and the output file summarizing the location of recombination events was further analyzed in Excel by scoring the overall number of events at a given distance from the central position. A number of runs were performed to work out the span of the genomic stretch in which to select real markers for genotyping. These led to an expectation of 20 kb for the region relatively spared from recombination (see bracket in Figure 2), with markers beyond 100 kb bearing no information for our aims. This approach was not aimed at obtaining a point estimate of the mutation's age, but rather a threshold for rejecting the hypothesis of a single, recent founder.
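The scoring step described above (counting recombination events by distance from the central position) can be sketched as follows. This is only an illustration: the input layout, one list of signed distances in kb per simulated genealogy, is a hypothetical format chosen here and is not the actual SelSim output, and the function names are ours. Positive distances are taken to be centromeric, matching the orientation of Figure 2.

```python
from collections import Counter


def bin_recombination_events(distances_kb, bin_size_kb=10, max_kb=100):
    """Count events per bin of signed distance (kb) from the CAG repeat.

    Each key is the lower edge of a bin; e.g. -40 covers [-40, -30) kb.
    Events at or beyond max_kb on either side are ignored.
    """
    counts = Counter()
    for d in distances_kb:
        if abs(d) >= max_kb:
            continue
        bin_start = int(d // bin_size_kb) * bin_size_kb
        counts[bin_start] += 1
    return dict(sorted(counts.items()))


def fraction_with_event_within(genealogies, window_kb):
    """Fraction of genealogies with at least one event within +/- window_kb."""
    hits = sum(
        1 for events in genealogies
        if any(abs(d) <= window_kb for d in events)
    )
    return hits / len(genealogies)


if __name__ == "__main__":
    # Hypothetical toy input: four genealogies, distances in kb.
    toy = [[-35.0, 80.0], [], [12.5], [-98.0]]
    all_events = [d for events in toy for d in events]
    print(bin_recombination_events(all_events))
    print(fraction_with_event_within(toy, 20))
```

Applied to the full set of simulated genealogies, a tabulation of this kind gives the per-interval frequencies from which a region relatively spared from recombination can be read off.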

Figure 2

Summary of the null distribution of recombination events in 10 kb intervals in 10 000 simulated genealogies leading to 10 haplotypes each. The ATN1 CAG repeats are assumed at position 0 on the x axis. Note the effect of the higher recombination rate on the centromeric side (right). Bar color codes: black=all genealogies, white=quartile of the shortest genealogies, gray=quartile of the longest genealogies. The range in which the expected percentage of haplotypes displaying recombination falls below 5% is bracketed.

Results and discussion

The presence of an expanded CAG stretch was reconfirmed in all the patients, ranging from 64 to 72 repeats. Genotyping and haplotype phasing in nine patients revealed the arrangements reported in Table 1. Both inference methods produced identical results, with a Bayesian support >0.5 for all subjects and >0.7 for seven of them. A single haplotype surrounding the (CAG)n from position 6886176 to position 6946497 (marker rs12580543 at position 6971234 being uninformative) was shared by all these patients, in association with the expanded CAGs. Interestingly, this haplotype includes alleles A and T at markers rs34199021 and rs2071075, respectively. This coincides with the narrow haplotype described to be associated with the expanded CAGs in the Japanese17 and Portuguese8 populations. Patient N95, whose haplotype could be reconstructed based on the results in the unaffected mother (Figure 1), also inherited expanded (CAG)s onto the A-T narrow haplotype, but shared the extended haplotype with the other patients only from positions 6903643 to 6913747.

By contrast, non-expanded CAGs (range 10–19) were associated with 7 different haplotypes in the entire sample of 10 patients. Only one of these haplotypes (subject P56) included the A-T narrow haplotype, confirming its much lower frequency in the repertoire of Italian normal haplotypes as compared with DRPLA haplotypes.

The DRPLA haplotype of patient N95 could be unequivocally reconstructed to carry a G at rs2159887 (Figure 1), a marker located 30 kb centromerically to the CAG. This contrasted with all the remaining DRPLA haplotypes, which carried a C at this position (Table 1).

At the opposite end of the scanned region, patient O55 turned out to be homozygous for A at rs4963516, a position 98 kb distal to the CAG. This, too, contrasted with all the remaining DRPLA haplotypes, which carried C at this position (unequivocally confirmed by homozygosity in patient O54). We tried to refine the position of haplotype breakdown in this region by using marker rs12302749, which, however, turned out to be uninformative as patient O55 was homozygous. In summary, we observed the disruption of the main DRPLA haplotype in this set of patients, both proximally and distally to the CAG, with the introduction of alleles that are common on normal haplotypes.

We then assumed that recombination could be responsible for this reshuffling of alleles. Under the hypothesis of a shared ancestry of the DRPLA expansion among the gene copies examined here, we asked whether the occurrence of recombination events within 100 kb of the CAG could inform us on the length of the genealogy underlying the observed Italian DRPLA haplotypes. In a scenario of multiple introductions of the DRPLA expansion from external source(s) (for example, the Eastern Asian or the Portuguese populations, or both), a long genealogy of the DRPLA haplotypes would be expected, with more opportunities for recombination to strike close to the expanded CAGs and excise them from the surrounding haplotype background. Multiple introductions would thus be accompanied by greater haplotype diversity, reflecting a higher number of recombinations over a long genealogy. Conversely, in the scenario of a single introduction, the number of meioses available for recombination is limited to those connecting all extant copies through a relatively recent common founder. We then compared our results with those expected for the haplotypes surrounding the DRPLA mutation during its transmission along genealogies, from a single founder to 10 extant copies. In doing this, we assumed the mutation to be nearly neutral, in view of the late onset of the disease.

As expected from coalescent theory, in our simulation the length of the 10 000 genealogies averaged around 20 generations (21.4). The range in which recombination could be expected less than 5% of the time extended across 20 kb on either side of the CAG (Figure 2). Outside this interval, the chance of observing recombination increased, reaching 15% at a distance of 100 kb. Given that we detected recombinant alleles 30 and 98 kb away from the CAG, we cannot reject the null hypothesis of a single, recent genealogy underlying our DRPLA mutant alleles. However, one of our recombinations fell in a range (30 kb) expected less than 5% of the time in the 2500 shortest genealogies, whose average length was 9.1 generations. This makes an origin of the mutation more recent than 270 years ago unlikely (assuming 30 years per generation).
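The conversion from genealogy length to calendar time is simple arithmetic, shown here as a minimal sketch. The generation counts (9.1 and 21.4) and the 30 years per generation are the values quoted in the text; the function name is ours.

```python
YEARS_PER_GENERATION = 30  # generation time assumed in the text


def generations_to_years(generations, years_per_generation=YEARS_PER_GENERATION):
    """Convert a genealogy length in generations to years."""
    return generations * years_per_generation


if __name__ == "__main__":
    # Mean length of the 2500 shortest simulated genealogies.
    print(round(generations_to_years(9.1)))
    # Mean length over all 10 000 simulated genealogies.
    print(round(generations_to_years(21.4)))
```

The two products, about 273 and 642 years, correspond to the rounded bounds of roughly 270 and 600 years used in the discussion.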

In conclusion, we used recombination as a form of molecular clock to bound the number of generations required to produce a given degree of haplotype breakage. Our results do not support the hypothesis of multiple introductions of the DRPLA mutation in the Italian population. They are compatible with introduction by a single founder in the last 600 years, most likely before the last 270 years. In this context, we note that there is no surname sharing among the examined families. Also, the above time window leaves little chance of reconstructing a possible relatedness through historical records (for example, parish records).

Reconstructing the history of the Italian DRPLA mutation

Our observation of a focus of DRPLA in Western Sicily is compatible with multiple scenarios. Two refer to the de novo occurrence of the mutation event in this population. The de novo rate of such an event is hard to estimate. However, the fact that the A-T narrow haplotype, shared with all Far Eastern DRPLA alleles, is uncommon among normal European haplotypes makes this possibility even less likely. Also, the possibility that the A-T narrow haplotype actually marks a haplotype prone to expansion needs to be reconciled with the lack of reports of recurrent mutations anywhere in the world and with the patchy geographic distribution of the disease.

Two other scenarios refer to the gene flow of already expanded DRPLA alleles, either directly from the Far East or through intermediate steps. Trade of silk and other goods between Eastern and Western countries has been active for millennia, through a variety of land and maritime routes. In the case of direct gene flow from East Asia, however, the landing place of the mutation should have been Eastern rather than Western Sicily, where Trapani is located. A more likely hypothesis is that the disease arrived in Trapani from Portugal. From the beginning of the 15th century, the Portuguese empire expanded to Asia, trading Far Eastern goods to the Mediterranean, including Trapani, which was an important free port. For instance, among the goods that arrived in Italy from the Far East on Portuguese ships were oranges, still identified today by words that share the same root as ‘Portugal’ (for example, ‘Portogalli’ in Sicily and most of Southern Italy). Portugal appears to have one of the highest frequencies of the DRPLA mutation among European countries, again associated with the same Japanese narrow haplotype.22 Our estimates for the antiquity of the mutation in the Sicilian population largely overlap a period in which the Japanese haplotype with the DRPLA mutation could have been introduced into Trapani from the West by Portuguese maritime travelers.