Main

Galactokinase (EC 2.7.1.6) catalyzes the first step of galactose phosphorylation in the Leloir pathway of intermediate metabolism. Galactokinase deficiency (MIM 230200), one of the three autosomal-recessive inborn errors of galactose metabolism leading to hypergalactosemia, is clinically characterized by cataract formation during the first weeks to months of life in homozygous affected children (1, 2). Heterozygosity for this trait has been implicated as a factor predisposing to the development of presenile cataracts (3, 4).

Galactokinase deficiency was first described in a Roma (Gypsy) patient in Switzerland (1, 2), and a number of subsequent reports, based on clinical observations as well as on newborn screening data (5–8), have suggested that the disorder is common in that ethnic group.

In 1999, we reported the identification of a missense mutation in the GALK1 gene, P28T, that was shared by six Romani families from Bulgaria, originating from three socially distinct but linguistically and genetically related groups of Vlax Roma, whose historical migrations can be traced back to the northern part of the Balkans (9). Since that initial description, we have detected the P28T mutation in affected individuals from different countries, suggesting that it is the major founder mutation responsible for galactokinase deficiency in Romani patients throughout Europe.

In this report, we present evidence of the common origin of the mutation, an estimate of its age, and data on its frequency in different Romani populations.

SUBJECTS AND METHODS

Subjects

Mutation analysis was conducted in eight galactokinase-deficient subjects, including seven Roma: one from Bulgaria, one from Romania, three from Hungary, and two from Spain (belonging to the same consanguineous family, Fig. 1), as well as one patient of declared Turkish ethnicity, resident in Switzerland.

Figure 1. Pedigree of a consanguineous Spanish Gypsy family from Barcelona. Asterisks indicate individuals available for mutation and haplotype analysis. Despite consanguinity, three different haplotypes were observed in this family (see Fig. 2).

Haplotype analysis was performed in the affected families from Hungary, Spain, and Switzerland and compared with the original findings in families from Bulgaria (9).

Population screening for the P28T mutation was conducted among 803 unrelated control individuals of Romani ethnicity, including 332 from Bulgaria, 310 from Hungary, and 161 from Spain. Romani group identity was known for the Bulgarian Roma, among whom 263 individuals were Vlax Roma (10, 11), and 69 originated from non-Vlax groups speaking different Balkan dialects of the Romani language. The participating individuals from Hungary and Spain were classified on the basis of area of residence as follows: northeastern Hungary 159, southern Hungary 95, Budapest region 56, Madrid area 77, Zaragoza 46, and Barcelona 38.

Informed consent was obtained from all individuals included in the investigations. The study complies with the ethics guidelines of the institutions involved.

Detection of the P28T Mutation

Screening for the P28T mutation was performed using a PCR-based Ava I restriction assay. A 400-bp DNA fragment spanning the region between nucleotides 418 and 818 in the GALK1 sequence (GenBank accession number L76927) was amplified from genomic DNA and Fitzco-Whatman-FTA cards (Whatman Bioscience, Maidstone, Kent, U.K.) by use of primers 5′-CCCGAGCATCCCGCGCCGAC-3′ and 5′-GACAGGCTGTTCCCCACGT-3′. PCR volumes were made up to 20 μL and consisted of 20 ng of DNA for genomic samples, or a 1-mm² section of FTA card, 1.25 mM of MgCl₂, 2 mM of each deoxyribonucleoside triphosphate, 1× PCR buffer, 4% DMSO, 0.5 units of Taq polymerase, and distilled H₂O. PCR conditions included initial denaturation at 94°C for 5 min, 35 cycles of 94°C for 30 s, 63°C for 30 s, and 72°C for 40 s, and a final extension step at 72°C for 7 min. PCR products were ethanol-precipitated, resuspended in 5 μL of distilled H₂O, and digested with 3 units of Ava I, according to the manufacturer's recommendations. Products were electrophoresed in 4% agarose gels and visualized by ethidium-bromide staining.

Haplotype Analysis

Genotyping was performed for six microsatellite markers spanning a 9-cM region surrounding the GALK1 gene. The markers included D17S1797, D17S1602, D17S1864, D17S929, D17S1839, and D17S1817 (Fig. 2). PCR protocols and haplotype construction were performed as previously described (9).

Figure 2. Haplotypes in the 17q24 region of disease chromosomes in patients from Bulgaria (Bu), Hungary (H), Spain (Sp), and Switzerland (Sw). Allele 2 of marker D17S1839 is conserved in all chromosomes carrying the P28T founder mutation. Shaded regions indicate conserved haplotype.

Dating of the P28T Mutation

Standard Luria-Delbruck estimation.

The age of the mutation was estimated using the formula $P = (1 - \theta)^{g} \approx e^{-\theta g}$, which implies $g = -\log(P)/\theta$, where $\theta$ is the length of the haplotype and $P$ is the proportion of the most frequent haplotype of this size. This value of $g$ ($g_{app}$) has to be corrected according to the model of Luria and Delbruck (12). The basic principle is that when we evaluate $g$ using the formula above, we omit a number $g_0$ of generations during which one recombinant is expected. This number of generations is the time necessary for one recombination to occur. For a given $\theta$, the number of meioses ($M$) such that only one recombination is expected is $1/\theta$. When the population is growing at a rate $d$, the number of meioses is [equation not recovered]

Labuda et al. (12) made a simplification: $e^{dg} - 1 \approx e^{dg}$. This simplification is accurate in the case of a fast-growing population, like the one they studied. To be more general, we do not make this simplification. Then [equation not recovered]

with $r = e^{d}$. Thus, the time since the introduction of the mutation in the population becomes $g = g_0 + g_{app}$.
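The two display equations of this subsection are not preserved in the present text. One reconstruction that is consistent with the surrounding definitions assumes that the meioses accumulate as a geometric series, $M = \sum_{i=1}^{g} e^{di} = \frac{e^{d}}{e^{d}-1}\,(e^{dg}-1)$. Setting $M = 1/\theta$ and writing $r = e^{d}$ then gives $g_0 = \ln\!\left(1 + \frac{r-1}{r\,\theta}\right)/\ln r$, and with the simplification $e^{dg} - 1 \approx e^{dg}$ it gives $g_0 \approx -\ln\!\left(\theta\,\frac{r}{r-1}\right)/\ln r$. The sketch below implements these assumed forms; the function names and the numerical inputs are hypothetical and are not taken from the study.

```python
import math


def g_apparent(p_conserved: float, theta: float) -> float:
    """Apparent age in generations from P ~ exp(-theta * g)."""
    return -math.log(p_conserved) / theta


def g0_exact(theta: float, d: float) -> float:
    """Correction term without the e^{dg} - 1 ~ e^{dg} simplification.

    Assumed form: solves (r / (r - 1)) * (r**g0 - 1) = 1 / theta, with r = e^d.
    """
    r = math.exp(d)
    return math.log(1.0 + (r - 1.0) / (r * theta)) / d


def g0_simplified(theta: float, d: float) -> float:
    """Correction term with the simplification e^{dg} - 1 ~ e^{dg}."""
    r = math.exp(d)
    return -math.log(theta * r / (r - 1.0)) / d


def corrected_age(p_conserved: float, theta: float, d: float) -> float:
    """Corrected age g = g0 + g_app, using the unsimplified g0."""
    return g_apparent(p_conserved, theta) + g0_exact(theta, d)


if __name__ == "__main__":
    # Hypothetical inputs chosen only for illustration.
    theta = 0.01            # recombination fraction spanned by the haplotype
    d = math.log(1.32)      # growth rate per generation on the log scale
    p = 0.8                 # proportion of chromosomes with the conserved haplotype
    print("g_app:", g_apparent(p, theta))
    print("g0 (exact):", g0_exact(theta, d))
    print("g0 (simplified):", g0_simplified(theta, d))
    print("g:", corrected_age(p, theta, d))
```

Under these assumptions the unsimplified correction is always slightly larger than the simplified one, and the two converge as the growth rate increases or as $\theta$ decreases.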

Multipoint Luria-Delbruck estimation.

We designed a new method, which we refer to as multipoint Luria-Delbruck estimation, that takes into account the whole information on haplotypes, not only the longest one. The basic principle is to calculate the likelihood of the whole set of data, taking into account the Luria-Delbruck correction. We then estimate g by maximum likelihood, and provide the confidence interval by the standard Max-2 rule.
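The multipoint likelihood itself is not specified in this text. The general procedure of maximizing a likelihood over g and reading the interval from a 2-unit drop in log-likelihood can nevertheless be illustrated on a deliberately simplified, single-interval case. In the sketch below, each of n chromosomes is assumed to retain the ancestral haplotype over a recombination fraction θ with probability (1 − θ)^g, the Luria-Delbruck correction is omitted, the "Max-2" drop is assumed to be measured in natural-log units, and all names and counts are hypothetical.

```python
import math


def log_likelihood(g: float, n_total: int, n_ancestral: int, theta: float) -> float:
    """Binomial log-likelihood of g (constant term dropped).

    Assumes 0 < n_ancestral < n_total and g > 0.
    """
    p = (1.0 - theta) ** g  # probability that a chromosome is still non-recombinant
    return n_ancestral * math.log(p) + (n_total - n_ancestral) * math.log(1.0 - p)


def ml_age(n_total: int, n_ancestral: int, theta: float,
           g_max: float = 200.0, step: float = 0.1):
    """Grid-search maximum-likelihood estimate of g with a Max-2 support interval."""
    grid = [step * i for i in range(1, int(g_max / step) + 1)]
    values = [log_likelihood(g, n_total, n_ancestral, theta) for g in grid]
    best = max(values)
    g_hat = grid[values.index(best)]
    # Keep every g whose log-likelihood lies within 2 units of the maximum.
    support = [g for g, v in zip(grid, values) if v >= best - 2.0]
    return g_hat, support[0], support[-1]


if __name__ == "__main__":
    # Hypothetical data: 8 of 20 disease chromosomes carry the ancestral haplotype.
    g_hat, lower, upper = ml_age(n_total=20, n_ancestral=8, theta=0.03)
    print(f"g_hat = {g_hat:.1f} generations, Max-2 interval {lower:.1f} to {upper:.1f}")
```

A multipoint version would replace the single binomial term by a product over all markers and haplotype classes, which is why it can use the information in recombinant haplotypes as well as in the longest conserved one.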

RESULTS

P28T analysis in galactokinase-deficient subjects and controls.

All eight affected individuals, referred for analysis from different European countries, were homozygous for the C→A transversion in exon 1 of the GALK1 gene, resulting in a threonine-for-proline substitution at amino acid position 28 of the galactokinase protein.

Screening for the P28T mutation (Table 1) among 803 control Roma individuals from Bulgaria, Hungary, and Spain identified 17 carriers, giving an overall carrier rate of 1:47 and an expected incidence of affected births of about 1:10,000.
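The relation between the carrier rate and the expected incidence can be checked with a short calculation. The sketch below assumes Hardy-Weinberg proportions and the rare-allele approximation (carrier frequency ≈ 2q); the text does not state which approximation was used, and the function names are illustrative.

```python
def carrier_frequency(carriers: int, screened: int) -> float:
    """Observed proportion of heterozygous carriers."""
    return carriers / screened


def expected_incidence(carrier_freq: float) -> float:
    """Expected frequency of affected (homozygous) births.

    Assumes Hardy-Weinberg proportions and a rare allele, so that
    carrier_freq ~ 2q and the affected frequency is q**2.
    """
    q = carrier_freq / 2.0
    return q * q


if __name__ == "__main__":
    freq = carrier_frequency(17, 803)
    print(f"carrier rate: 1:{1 / freq:.0f}")
    print(f"expected incidence: 1:{1 / expected_incidence(freq):.0f}")
```

Under these assumptions the calculation gives roughly 1 in 9,000 births, which is of the same order as the figure of about 1:10,000 quoted above.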

Table 1 Population screening for P28T and carrier frequency

The distribution of the mutation was nonrandom: 14/17 carriers belonged to the Bulgarian sample and, among those, 12 individuals were Vlax Roma and 2 originated from non-Vlax groups; 3/17 were Spanish Gypsies; no carriers were identified in the Hungarian screening sample. Carrier rates can therefore be estimated at about 1:22 among the Vlax Roma and 1:35 in non-Vlax groups in Bulgaria, and 1:54 in Spain.
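The group-specific carrier rates follow directly from the counts given in the text. The sketch below recomputes them; the dictionary layout and the function name are illustrative only.

```python
# (carriers, individuals screened) per group, as reported in the text.
screening = {
    "Bulgaria, Vlax Roma": (12, 263),
    "Bulgaria, non-Vlax Roma": (2, 69),
    "Hungary": (0, 310),
    "Spain": (3, 161),
}


def one_in(carriers: int, screened: int):
    """Return N such that the carrier rate is 1:N, or None if no carriers were found."""
    if carriers == 0:
        return None
    return screened / carriers


if __name__ == "__main__":
    for group, (carriers, screened) in screening.items():
        n = one_in(carriers, screened)
        rate = "no carriers detected" if n is None else f"1:{n:.1f}"
        print(f"{group}: {carriers}/{screened} ({rate})")
```

The unrounded values are 1:21.9 for the Vlax Roma, 1:34.5 for the non-Vlax groups, and 1:53.7 for Spain. With only 2 and 3 carriers in the latter two groups, these estimates carry wide sampling uncertainty.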

Haplotype analysis and age of the P28T mutation.

Polymorphic haplotypes on disease chromosomes were studied in the affected families from Spain, Hungary, and Switzerland. The data were compared with those obtained in the original study of galactokinase-deficient Romani families from Bulgaria (9). The analysis revealed complete homozygosity for allele 2 of marker D17S1839 (Fig. 2), previously shown to be the conserved allele in strong linkage disequilibrium with the P28T mutation (9). Five disease chromosomes (four Hungarian and one Spanish) carried haplotypes that were closely related to the full ancestral haplotype found in the Bulgarian families (Fig. 2) (9). The remaining haplotypes provided evidence of different historical recombinations that have occurred in the divergent Romani groups. In general, haplotype diversity in the new sample was greater than that observed among disease chromosomes in Bulgaria.

Estimates of the age of the P28T mutation, based on the diversity of polymorphic haplotypes observed in the Bulgarian families and in the families sampled in other European countries, are shown in Table 2. The younger age calculated for the Bulgarian Roma is the result of the recent splits of the proto-Romani population, where diversity within each group is less than that in the entire population. The dating method incorporates the growth rate of the population, and the values obtained for the European Roma range between 18 generations (the lower confidence limit for a fast-growing population) and 54 generations (the upper confidence limit for a slow-growing population). If we take 17,000 as the population size in the 14th century [a reasonable approximation given the available information on the historical demography of the Roma (10)] and 8 million as the current Romani population of Europe (13), the overall growth rate per generation, for a generation time of 25 y, is 1.32. The estimated time of the origin of the P28T mutation is then around 30 generations, or 750 y, ago.
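The growth-rate arithmetic can be sketched as follows, assuming simple exponential growth between the two population sizes. The number of generations separating the two sizes is not stated in the text, so the sketch back-calculates it from the reported rate; the function names are illustrative.

```python
import math


def growth_rate_per_generation(n_start: float, n_end: float, generations: float) -> float:
    """Per-generation growth factor r such that n_end = n_start * r**generations."""
    return (n_end / n_start) ** (1.0 / generations)


def generations_for_rate(n_start: float, n_end: float, rate: float) -> float:
    """Number of generations needed to grow from n_start to n_end at factor `rate`."""
    return math.log(n_end / n_start) / math.log(rate)


if __name__ == "__main__":
    n_start, n_end = 17_000, 8_000_000
    gens = generations_for_rate(n_start, n_end, 1.32)
    print(f"generations implied by r = 1.32: {gens:.1f}")
    print(f"r over 22 generations: {growth_rate_per_generation(n_start, n_end, 22):.2f}")
    print(f"age of mutation: {30 * 25} y for 30 generations of 25 y")
```

Under this model, a factor of 1.32 corresponds to roughly 22 generations of growth from 17,000 to 8 million, and the conversion of 30 generations at 25 y each gives the 750 y quoted above.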

Table 2 Multipoint estimates of the age of the P28T mutation in Europe

DISCUSSION

The P28T mutation was identified in a study of Romani families with galactokinase deficiency from Bulgaria (9). The results of the present study suggest that it is shared by galactokinase-deficient patients of Romani descent throughout Europe. A common origin of the mutation and its inheritance, identical by descent, is strongly suggested by the closely related polymorphic haplotypes carried by all disease chromosomes and by the complete homozygosity for a single conserved allele of marker D17S1839, which was shown in our previous study to be at a small physical distance from the GALK1 gene (9).

The limited number of affected families where haplotype analysis has been performed, and the great variety observed in these families, mean that we have sampled only a fraction of the haplotype diversity that may exist in Romani populations across Europe. The age of the P28T mutation, based on the current observations, is therefore likely to be an underestimate. Nonetheless, even the calculated 750 y since the origin of P28T are sufficient to explain its wide geographical distribution: comparative linguistic studies of Romani dialects suggest that the major splits occurred around the 13th century, after the arrival of the proto-Roma in the Balkans (14). An older age, compatible with our present data, could mean that the mutation originated in the ancestral population before the exodus from India, around 1000 y ago.

An uneven distribution of the mutation frequency among different Romani groups, and thus the existence of groups at particularly high risk, has been suggested by our previous observations (9) and is supported by the present results. Within Bulgaria, carrier rates among the Vlax Roma are roughly 1.6 times those among other groups (1:22 versus 1:35). In Hungary, where the occurrence of P28T is clearly documented by the analysis of affected families, the screening of 310 Romani control individuals failed to detect a single carrier, suggesting that a high-risk Romani group may not have been included in the screening sample. The reasons for such nonrandom distribution of the galactokinase deficiency mutation, as well as of other mutations (15–18) among the Roma, are not clear. The early historical descriptions of small Gypsy groups traveling in different parts of Europe (13) suggest that the differences observed today could easily result from a founder effect/genetic drift scenario, leading to high frequencies in some groups and loss of the mutant allele in others. In Spain, whose Romani population is considered genetically homogeneous (de Pablo R, unpublished data), the average carrier rate is 1:54.

Galactokinase deficiency is a rare disorder, with a reported incidence of affected births around 1 per million in America, Japan, and western Europe (19; Gitzelmann R, Steinmann B, unpublished data). Predictably, out of 21 mutations in the GALK1 gene identified in these populations (20–23), 20 are private defects confined to individual families, and one, Q382X, is common to affected families of Costa Rican descent (22).

A high incidence of the disorder among the Roma has been suggested by a number of reports (5–8), as well as by the gradient in the incidence of galactokinase deficiency in Europe, from the low figures reported in the West, to 1:150,000 in central Europe, and 1:52,000 in Bulgaria (9, 24), coincident with the distribution of Romani minorities in the different parts of Europe (13, 14). However, the traditional demographic pattern has undergone significant changes during the last decade, as a result of the political and economic changes in eastern and central Europe and the wars in former Yugoslavia. The shift in the demographic profile, with significant numbers of Romani refugees moving to the West (25, 26), can be expected to result in an equalization of the frequencies of “private” Romani disorders between different countries.

Our studies have shown that galactokinase deficiency among the Roma is genetically homogeneous and, similar to other mendelian disorders (15–18), is caused by a single founder mutation. The mutation, P28T, is found in patients with galactokinase deficiency along the historical migration routes of the Roma, from the Balkans to the Atlantic, and our estimates of the age of the mutation are compatible with its occurrence among socially divergent and geographically dispersed Romani groups. Superimposed on this historical spread will be the migration wave of the last decade. P28T is thus likely to account for a very high proportion of cases of galactokinase deficiency across Europe. The low cost, high specificity, and relatively high sensitivity of the P28T detection assay appear to justify its inclusion in pilot screening programs, to determine the mutation frequency among the newborn populations of different European countries and design the longer-term public health strategy for prevention.