Introduction

The fragile X mental retardation syndrome is caused by the expansion of an unstable CGG repeat sequence located in a 5′ exon of the FMR1 gene [1–6]. The (CGG)n repeat is polymorphic in the normal population (n = 5–50). In fragile X families, one can in successive generations follow the expansion of the repeat from a premutation present in normal carriers (n = 50–200) to a full mutation (n > 200) present in fragile X patients. The transition from premutation to full mutation occurs only when transmitted from a carrier female to her offspring. Local DNA hypermethylation is associated with the full mutation [3], leading to a shutdown of FMR1 expression [7, 8].
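The size classes quoted above can be written as a small lookup. The following is only an illustrative sketch: the function name is hypothetical, and because the quoted ranges share the boundary n = 50, the sketch assumes that n = 50 counts as normal.

```python
def classify_cgg_repeat(n: int) -> str:
    """Classify an FMR1 (CGG)n allele by repeat count, using the ranges quoted in the text."""
    if n < 1:
        raise ValueError("repeat count must be a positive integer")
    if n <= 50:
        # Assumption: the shared boundary n = 50 is treated as normal.
        return "normal"
    if n <= 200:
        return "premutation"
    return "full mutation"


for n in (29, 58, 100, 250):
    print(n, classify_cgg_repeat(n))
```

Under this convention, the 55–60 repeat allele reported in the Results falls in the premutation class.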

The fragile X syndrome is a common disease, affecting about 1 in 1,500 newborn males in Caucasian populations [9]. As affected males very rarely reproduce, one would a priori expect a high rate of new mutations to compensate for this loss of chromosomes carrying a full mutation [10]. However, no mutations from a normal allele to premutation or full mutation have been observed in studies of large sets of families [4, 11–13]. This suggested that initial mutational events may create alleles carried silently for many generations (S alleles) before leading to highly unstable CGG repeats (Z alleles) and a full mutation [14, 15]. This hypothesis was supported by the identification of fragile X patients with common ancient ancestors [12, 16]. Indeed, it was shown that the transition frequency from premutation to full mutation from a carrier female depends on the size of the premutation [1, 11, 17], and thus small premutations (in the range n = 55–80) may be carried for several generations before a full mutation is observed.
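The a priori expectation of a high mutation rate can be illustrated with the classical mutation-selection balance for an X-linked recessive disorder, in which one third of the mutant chromosomes are carried by males. This model is an assumption introduced here for illustration and is not taken from the text. Fragile X departs from it, since expansion proceeds in several steps and carrier females may be affected. The function name and the fitness parameter are hypothetical.

```python
def xlinked_mutation_rate(male_incidence: float, male_fitness: float = 0.0) -> float:
    """Equilibrium mutation rate per X chromosome per generation (Haldane-type balance).

    Assumes a simple X-linked recessive model: a fraction (1 - male_fitness) of the
    mutant chromosomes in males is lost each generation, and males carry one third
    of all X chromosomes in the population.
    """
    return (1.0 - male_fitness) * male_incidence / 3.0


# Incidence of about 1 in 1,500 newborn males, as quoted in the text.
mu = xlinked_mutation_rate(1 / 1500)
print(f"expected new-mutation rate: {mu:.1e} per X chromosome per generation")
```

With affected males assumed not to reproduce at all, the sketch gives a rate of the order of 10⁻⁴, which is very high for a single locus.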

Analysis of linkage disequilibrium with very close markers may be a powerful way to investigate the origin of fragile X mutations. Using the microsatellite markers FRAXAC1 and FRAXAC2, which are within 10 kb of the CGG repeat, Richards et al. [18] found that three haplotypes were significantly overrepresented in fragile X patients from Australia and the USA, accounting for 58% of the fragile X chromosomes, and 18% of the normal ones. We reported similar findings in a French population using FRAXAC2 and DXS548, a microsatellite located 150 kb proximal to the CGG repeat. Five haplotypes appeared to be preferentially associated with the fragile X mutation (54 versus 14% in normal chromosomes) [19]. These observations suggested that a small number of founder chromosomes more susceptible to expansion of the CGG repeat may account for a large proportion of fragile X patients in Caucasian populations [18–20].

We hypothesized that within a more homogeneous population, a putative founder effect responsible for the fragile X syndrome might be even stronger. We show this to be the case in the Finnish population, as 19 of 26 independent patients originating from various parts of the country carried the same FRAXAC2-DXS548 haplotype, present in only 1 of 34 normal chromosomes. This finding is best explained by assuming that initial mutation events that create a susceptibility to further expansion are very rare, as already pointed out for myotonic dystrophy [21], where linkage disequilibrium is even more striking [22, 23].

Families and Methods

Families

A fragile X male and his mother were selected from each of 26 independent families. The carrier ancestors were all from Finnish-speaking families and were ascertained through a proband living in the Helsinki area. First- and second-cousin relationships were formally excluded for 21 families. In the 5 remaining families, due to lack of detailed information, a second-cousin relationship cannot be ruled out but is unlikely, as carrier ancestors lived in different rural areas where migrations were rare before the Second World War.

All patients had a typical fragile X phenotype, including mental retardation. Chromosome analysis was performed and found to be positive in 23 cases. Two cases were not studied cytogenetically, but a full mutation was shown in the FRAXA locus with the probe StB 12.3 [4], as in the other patients. In 1 fragile X negative case, only a premutation (Δ of 300 bp) was detected in lymphocyte DNA in spite of the typical phenotype. All mothers were phenotypically normal: 3 had a full mutation, and 23 had premutations ranging from Δ = 90–500 bp.

Methods

Analysis of CA repeats at DXS548 and FRAXAC2 was carried out by PCR amplification with oligonucleotide primers described by Richards et al. [24] and Verkerk et al. [5] for DXS548 and FRAXAC2, respectively. The reaction conditions were slightly modified to run both amplifications concomitantly in the same tube, as reported in Oudet et al. [19].

The PCR analysis of the CGG repeat length was carried out according to the method reported by Fu et al. [1], using a 32P-end-labeled primer; fresh deaza-dGTP was added to the reaction mixture, as freezing may adversely affect the PCR reaction [S. Warren, pers. commun.].

Results

The FRAXAC2 and DXS548 microsatellites that flank the fragile X locus [5, 24] were analyzed in 26 unrelated pairs of carrier mother and affected son. The haplotypes carried by both the fragile X and normal maternal chromosomes could thus be determined. Additional data on normal chromosomes originating from the same population were provided by analysis of healthy fathers.

Five alleles (145–155 bp) for the FRAXAC2 polymorphism (table 1) were observed on the normal chromosomes and only three on the fragile X chromosomes. There was a major allele (153 bp), both in the normal and the fragile X population, representing 54 and 73%, respectively, of each population. The 151-bp allele, which accounts for 26% of the normal alleles, was absent from the fragile X chromosomes studied. Four alleles (194–204 bp) for the DXS548 polymorphism were detected on the normal chromosomes (table 2), and only two (196 and 204 bp) were present on the fragile X chromosomes studied. The 194-bp allele, which accounted for 72% of the normal chromosomes, was not observed on the fragile X chromosomes. The 196-bp allele, carried on only 17% of the normal chromosomes, was manifestly the major fragile-X-associated allele (92%). For both markers, the allele distribution on normal chromosomes was remarkably similar to that observed in other more heterogeneous Caucasian populations [18, 19, 25]. In contrast, the distribution on fragile X chromosomes was very different from that observed in Australian, US or French patients.

Table 1 Distribution of FRAXAC2 alleles in Finnish normal and fragile X chromosomes, and comparison with two other populations
Table 2 Distribution of DXS548 alleles in Finnish normal and fragile X populations

Nine FRAXAC2-DXS548 haplotypes were observed on the normal chromosomes (table 3). There was a slight deviation in the frequencies of some of them compared with the expected distribution. The population was, however, too small to test the significance of such a disequilibrium. The haplotype 153–194 was the major haplotype in the normal population (52%), and was not observed in fragile X patients. The haplotype 153–196 was the major haplotype in the fragile X population (73%) and was observed only once in the normal population (table 3). The four haplotypes observed in the fragile X patients accounted for only 18% of those observed on the normal chromosomes. A comparison with the results reported for the French population [19] reveals a great similarity for the normal chromosomes (the two major haplotypes are the same, with frequencies of 52 and 17% in Finland, and 44 and 11% in France), and a striking difference for the fragile X chromosomes (the two haplotypes that account for 88% of Finnish patients were found in only 10% of French patients).
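The deviation from the expected distribution mentioned above can be made concrete for the major normal haplotype. Under linkage equilibrium, the expected haplotype frequency is the product of the two allele frequencies. The sketch below uses only the percentages quoted in the text for normal chromosomes and assumes that the allele and haplotype frequencies refer to comparable sets of chromosomes, which the tables may not guarantee. The function names are hypothetical.

```python
def expected_haplotype_freq(p_fraxac2: float, p_dxs548: float) -> float:
    """Expected haplotype frequency if the two markers are in linkage equilibrium."""
    return p_fraxac2 * p_dxs548


def disequilibrium(observed: float, p_fraxac2: float, p_dxs548: float) -> float:
    """Linkage disequilibrium coefficient D = observed - expected haplotype frequency."""
    return observed - expected_haplotype_freq(p_fraxac2, p_dxs548)


# Values quoted in the text for normal Finnish chromosomes.
p_153 = 0.54          # FRAXAC2 153-bp allele
p_194 = 0.72          # DXS548 194-bp allele
observed_153_194 = 0.52

expected = expected_haplotype_freq(p_153, p_194)
d = disequilibrium(observed_153_194, p_153, p_194)
print(f"expected {expected:.3f}, observed {observed_153_194:.3f}, D = {d:+.3f}")
```

A positive D indicates that the haplotype is more frequent than expected from the allele frequencies alone. As stated above, the sample is too small to test whether such a deviation is significant.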

Table 3 Distribution of FRAXAC2-DXS548 haplotypes in Finnish and French normal and fragile X populations

Richards et al. [18] analyzed the length of the CGG repeat in normal individuals carrying FRAXAC1-FRAXAC2 haplotypes over-represented on fragile X chromosomes. They found a higher frequency (38%) of longer repeats (n > 39) in this subset than in the average normal population (< 5%). We performed a similar analysis of the six normal independent chromosomes carrying the 196-bp DXS548 allele which is prevalent on Finnish fragile X chromosomes (fig. 1). Although only three of these chromosomes had a fragile-X-associated haplotype, one of them, with the haplotype 147–196 (fig. 1, lane 8), carried an allele of 55–60 CGG repeats (i.e. in the lower premutation range). This was unexpected given the very low frequency of such alleles [1]. Five of the six chromosomes were associated with alleles with 29–35 CGG repeats.

Fig. 1 Length of the FMR1 CGG repeat in normal chromosomes and fragile-X-associated FRAXAC2-DXS548 haplotypes. The repeat length of the PCR product was analyzed on sequencing gels as described in Fu et al. [1]. The FRAXAC2-DXS548 haplotypes are indicated using a two-letter code (see table 3). Those carried by the fragile X chromosome are underlined. The mutated allele was not amplified, except in the carrier female 8, where it reaches about 100 CGGs. Lanes 1, 3–6 = normal males; lanes 2, 7–12 = carrier females; lane 13 = control female with 32 and 42 CGG repeats; lane 14 = 19(CGG) control fragment.

The geographical origin of the oldest known carrier in each family is indicated in figure 2. These individuals were either obligate carriers or their mutation was detected by DNA analysis. The oldest carrier included was born in 1890, and the youngest in 1949. Although the index cases were from the Helsinki area, only one family had lived there for more than two generations. The places of origin of all the others are rather evenly spread over the southern half of Finland, including the eastern part lost to the Soviet Union in 1945.

Fig. 2 Birthplaces of the oldest known carrier ancestors of fragile X patients. = Haplotype 153–196 bp; = haplotype 147–196 bp; ▲ = haplotype 155–204 bp; ■ = haplotype 155–196 bp.

Discussion

The population history of Finland has favored founder effects accounting for a higher incidence of about 30 hereditary diseases [26]. Other diseases such as Huntington disease, phenylketonuria or cystic fibrosis have a much lower incidence than in Caucasian populations in general [27, 28]. The population is thought to derive largely from a small number of settlers who immigrated from the south and the east to the southwest 2,000 years ago, and has remained isolated as a result of geographical and linguistic barriers. Migration towards the central, northern and eastern parts started in the 16th century, and regional isolation contributed to the formation of consanguineous subpopulations. Furthermore, the southern and western coastal regions have also been settled by people of northern European origin about 1,000–1,500 years ago. Molecular analysis has shown that single mutations account for 98% of aspartylglucosaminuria alleles [29] or 100% of gelsolin amyloidosis alleles [30, 31], two diseases very frequent in Finland. Extreme linkage disequilibrium found for diastrophic dysplasia also implies allelic homogeneity [32].

We have now shown another striking founder effect which concerns the fragile X syndrome in this population. This was unexpected a priori for a recessive X-linked disease which severely affects reproductive fitness of affected males and which is present at high frequency in many populations throughout the world [9]. The linkage disequilibrium between the fragile X mutation and the flanking FRAXAC2-DXS548 haplotype reported here is much stronger than that observed in a French population [19] or, with a different combination of markers, in Australia and the USA [18]. The major haplotype (153–196) which accounts for 73% of the fragile X chromosomes studied was found only once in 34 normal chromosomes, and thus carries a relative risk of about 90 compared to its absence, and of much more compared to the major normal haplotype (153–194). Almost identical results have been obtained independently by Leisti et al. [submitted] on a population of fragile X families originating mostly from northern Finland, where the same two fragile-X-associated major haplotypes (153–196 and 147–196) were found at frequencies very similar to those reported in the present study. The broad geographical distribution of the 153–196 fragile X haplotype all over Finland indicates that it is of very ancient origin, arising long before the migrations towards the inland and eastern regions began in the 16th century. For comparison, a choroideremia mutation originating from a common ancestor born around 1645 shows a very restricted distribution in northeastern Finland [33]. According to the calculations of Morton and Macpherson [15], the 153–196 haplotype carrying a CGG repeat susceptible to further amplification (S allele) would have a frequency of about 1–2% and was thus most likely present in the founder population of Finland about 100 generations ago. It is of interest to note that the same haplotype was found in 35% of fragile X patients from Sweden, but in only one of 102 French patients (who was in fact of Danish origin), while its frequency in normal chromosomes was very similar (10 and 5%, respectively) in the two populations [19; H. Malmgren and N. Dahl, pers. commun.].
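The relative risk of about 90 quoted above for the 153–196 haplotype can be reproduced as an odds ratio from the counts given in the text (19 of 26 fragile X chromosomes versus 1 of 34 normal chromosomes). The sketch below also adds a one-sided Fisher exact test. That test is an illustration introduced here and is not an analysis reported in the study. The function names are hypothetical.

```python
from math import comb


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact p-value: probability of a count >= a in the first
    cell, given fixed row and column totals (hypergeometric distribution)."""
    row1, col1, total = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(total - col1, row1 - k) for k in range(a, upper + 1))
    return tail / comb(total, row1)


# Rows: fragile X, normal chromosomes. Columns: 153-196 haplotype present, absent.
fra_with, fra_without = 19, 26 - 19
norm_with, norm_without = 1, 34 - 1

print(f"odds ratio: {odds_ratio(fra_with, fra_without, norm_with, norm_without):.1f}")
print(f"one-sided Fisher p-value: {fisher_one_sided(fra_with, fra_without, norm_with, norm_without):.2e}")
```

With a single observation in the normal group, the point estimate is imprecise, so the odds ratio is best read as an order of magnitude.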

The second most common haplotype (147–196) was found four times in our series, in families from the coastal region. The same haplotype is present in the families from the Oulu region analyzed by Leisti et al. [submitted], and appears at a very low frequency in Swedish families [H. Malmgren and N. Dahl, pers. commun.]. This haplotype may be the result of an independent mutational event at the CGG repeat (from an N to an S allele), or may be derived from the major haplotype which shares the same DXS548 allele, by an ancient recombination between FRAXAC2 and the CGG repeat (only 10 kb apart), or by a slippage mutation at the FRAXAC2 marker (although such mutations most often change length by only 1 or 2 repeat units [34, 35]). Haplotype 155–204 is rather frequent in the French fragile X chromosomes [19] (the 155 allele also being overrepresented in the Australian and US patients [18]). The linkage disequilibrium we observed may be of diagnostic interest, as 96% of carrier females in Finland should be informative for the FRAXAC2-DXS548 combination.

Richards et al. [18] found an excess of large (n > 39) CGG repeats in normal individuals carrying the haplotype most frequently associated with fragile X in the population studied. We could analyze only three normal chromosomes with the 153–196 or 147–196 haplotypes. In one of them we actually found a small premutation (n = 55–60). Further analysis is needed to see whether large CGG repeat alleles really are frequent on chromosomes with these haplotypes, or whether the sequence of these alleles, notably with respect to interspersion of AGG repeats [2, 3, 5], differs from that of alleles carried on haplotypes with a very low risk of fragile X syndrome in the same population.

Our finding supports the multistep progression of fragile X mutations, starting from a chromosome carrying an allele more susceptible to further expansion, which can be carried silently, as in the present case, for as many as 100 generations. Given the similarity between the mutation mechanism in myotonic dystrophy and fragile X syndrome, one may speculate why linkage disequilibrium appears much more extreme in myotonic dystrophy, as patients all over the world carry the same allele of a nearby insertion-deletion polymorphism [22, 23]. For fragile X, although linkage disequilibrium has been observed, a large number of different haplotypes have been found associated with the disease. The pattern in Finland is very different from that seen in France, although the haplotype frequencies are very similar for the normal populations. A trivial partial explanation for the multiplicity of haplotypes in the case of fragile X is the much greater informativeness of the markers used (two multiallelic microsatellite markers in linkage equilibrium in the normal population), and there is also the possibility for the generation of haplotype diversity through slippage mutations in the markers, or through recombination (DXS548 being 150 kb proximal to the CGG repeat) [5]. However, a more important cause of the difference may lie in the length and frequency distribution of normal alleles at the two loci. In myotonic dystrophy, the most frequent alleles accounting for about 90% of Caucasian chromosomes have 5 or 11–13 CTG repeats, while alleles in the 20–30 range account for 10%. We have recently shown that all alleles in the latter range are most likely derived from a single initial event, from a chromosome with n = 5. Linkage disequilibrium analysis strongly suggested that myotonic dystrophy mutations are derived through successive steps from chromosomes with n = 20–30 repeats [21]. On the other hand, most major alleles in the FMR1 gene have 28–30 CGG repeats [1]. These would be more likely, through rare but not unique events, to mutate to longer alleles, in the 40–45 range (the rate suggested by Morton and Macpherson [15] is about 2.5 × 10⁻⁴). This would generate different at-risk haplotypes in different populations as shown in the present study.