Introduction

Recent studies have estimated that at least 10% of the human genome is polymorphic for structural rearrangements. New developments in genome-scanning array and comparative DNA sequencing technologies have made it possible to characterize and assess all classes of genomic variation, including structural variation. These developments provide an opportunity to generate geographical maps of the frequency of structural variants, some of which are typical of particular ethnic groups or populations (Gu et al., 2007; O'Hara, 2007; Spielman et al., 2007). Such variation is thought to be an important contributor to the evolution of human genetic diversity and can generate differences in disease susceptibility (Feuk et al., 2006).

One of the most notable structural variants found to date is a 900 kb inversion located at 17q21.3 (Stefansson et al., 2005). This region includes the gene encoding microtubule-associated protein tau (MAPT), which is widely studied because it contributes to several human diseases (Hardy et al., 2006). These extensive investigations have revealed that two main nonrecombining MAPT locus haplotypes (H1 and H2) can be distinguished. The H1 haplotype and/or its subhaplotypes play a general role in the development of sporadic tauopathies (Laws et al., 2007; Myers et al., 2007). It has also become clear that there are notable differences in the geographical distribution of the MAPT-related haplotypes. The H2 haplotype is rare in Africans and almost absent in East Asians and Native Americans, but very frequent (20–30%) in populations of Caucasian origin (Evans et al., 2004; Stefansson et al., 2005). It has even been postulated that the H2 haplotype was contributed to the human genome by Homo neanderthalensis (Hardy et al., 2005).

In this study, we examined the distribution of these haplotypes in two historically and ethnically different populations (Gypsies and Caucasian Hungarians) from the same geographical area. The Caucasian Hungarians belong to the Uralic linguistic family, a diverse group of people related by an ancient common linguistic heritage, distinct from that of the Indo-European speakers who surround them. Of the approximately 25 million Finno-Ugrians, the best known are the Estonians, the Finns and the Hungarians. Around the 5th century BC, the ancient Hungarians were caught up in a wave of migrations that swept the steppes and were displaced from their western Siberian homeland. Migrating westwards, the Hungarians arrived in 895 in the Carpathian Basin, an area where the overwhelming majority of the indigenous population was Slavic (Figure 1). Various genetic appraisals have estimated that the newly arrived Hungarians accounted for 10–50% of the total population of the Carpathian Basin (Cavalli-Sforza, 1994). During the turbulent history of present-day Hungary, the mixing process has continued, and Hungarians can now be regarded as members of a mixed European population (Semino et al., 2000).

Figure 1

Migration routes of Gypsies from the Punjab region and of Hungarians from the Ural Mountains.

In contrast to Hungarians, Gypsies are a conglomerate founder population with Asian roots, embedded in a genetically different Caucasian population. The social sciences and comparative linguistic studies have hinted at the Asian origin, and this has been supported by population genetic studies of single-locus polymorphisms, of multi-locus STR Y chromosome haplotypes and of mtDNA haplotypes (Gresham et al., 2001; Kalaydjieva et al., 2001a; Morar et al., 2004).

The combined evidence suggests that Gypsies migrated from the Punjab region of northwest India 1000–1500 years ago and traveled through Asia (via Persia and present-day Armenia and Turkey). The main stream moved into the Balkans and Greece, and some groups moved into eastern Europe ahead of the Turks (Figure 1). An early diaspora appeared in western Europe in the fourteenth and fifteenth centuries. Another wave of migration to western Europe started after the abolition of serfdom in the Habsburg Empire in 1841, and the most recent one began in 1989 after the disappearance of the Iron Curtain (Kalaydjieva et al., 2005).

At present, 8–10 million Gypsies live in fragmented subisolates in Europe, approximately 600 000 of them in Hungary. In Gypsy society, the primary unit is the group, and groups are members of metagroups. They live in a closed society structure, with rare admixture with other populations and a relatively high rate of consanguinity (Assal et al., 1991). There appear to have been population bottlenecks, both when they left India and during the European segregation. A high intragroup diversity can be observed (Gresham et al., 2001).

Hungarian Gypsies were not classified in previous publications or were included among western European Gypsies (Morar et al., 2004). However, we consider the Hungarian Gypsy population an appropriate choice for genetic investigations because the ethnic diversity in Hungary is not as high as in the Balkans, and it is possible to distinguish three well-described metagroups among Hungarian Gypsies. Carpathian Gypsies, or Romungros, are the least well characterized and least intact metagroup. Their language consists of elements from Beas, Lovari and Hungarian. They represent 70% of the Gypsies living in Hungary. The two smaller metagroups are more closed and cohesive; they typically live in separate parts of smaller villages or towns. They preserve their traditions and language; as a consequence, their assimilation with other metagroups or with the Caucasoid Hungarian population is low. The Beas represent 10% of the Hungarian Gypsy population; they migrated to the Carpathian Basin from the Central-West Balkans and speak the Beas language. The Olahs, who make up 20% of the Hungarian Gypsy population, arrived in the Carpathian Basin from the territory of today's Romania and speak the Lovari language. They are the descendants of the Walachian/Vlax Gypsies, the most studied Gypsy population (Kalaydjieva et al., 2001a).

In this study, our goal was to evaluate the H1–H2 haplotype frequencies in the populations mentioned above by using a polymorphism of the MAPT gene as a marker.

Materials and methods

Sample characteristics

In this study, 118 healthy Gypsies of the Olah/Vlax metagroup and 184 healthy Caucasian Hungarians were genotyped (Figure 2). The Gypsy participants were recruited from three villages in the same geographical area in northeastern Hungary. The Hungarians were employees and students of the Department of Psychiatry, University of Szeged, and of the Department of Hungarian Congenital Abnormality and Rare Disease Registry of the National Centre For Healthcare Audit and Improvement, together with their acquaintances; they were matched with the Gypsy volunteers for age and gender. After a complete description of the study had been given to the subjects, written informed consent was obtained.

Figure 2

The distribution pattern of MAPT genotypes.

DNA isolation

The genomic DNA of Gypsy and control subjects was extracted from peripheral blood according to a standard method (Davies, 1986).

PCR amplification

The selected region was amplified by means of PCR. The inverted chromosome region was screened by using the commonly used biallelic intron 9 deletion-inversion polymorphism (Baker et al., 1999). The following primer pair was used: forward: 5′-GAAGACGTTCTCACTGATCTG-3′; reverse: 5′-AGGAGTCTGGCTTCAGTCTC-3′.

Polymerase chain reaction amplification was carried out in a 20 μl reaction volume containing 2 μl of 10 × ZenonBio reaction buffer, 50 nM of each of the primers, 0.5 μM of each of the dNTPs, 4 mM MgCl2, 100 ng of DNA extract and 0.3 U of ZenonBio TaqPolymerase. The amplification protocol was as follows: 3 min at 93 °C, 30 cycles of 93 °C for 60 s, 60 °C for 60 s and 72 °C for 60 s, and final extension at 75 °C for 5 min. A volume of 7 μl of PCR product was run on a 6% native polyacrylamide gel and visualized by UV transillumination after ethidium bromide staining; the size of the products was determined with the GelBase gel documentation system (UVP).

Statistical analysis

Departure from Hardy–Weinberg equilibrium was tested with the ‘HWE.test’ function (P-value calculated by the exact method) of the genetics R package (R version 2.4.0, R Development Core Team, 2006; Warnes, 2008). Fisher's exact tests, carried out in R, were used to determine the significance of differences in genotype and allele frequencies.
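As an illustration of how an exact test for Hardy–Weinberg equilibrium works, the following Python sketch enumerates every heterozygote count that is possible for the observed allele counts and sums the probabilities of the outcomes that are no more likely than the observed one. It is not the code of the genetics R package and may differ from ‘HWE.test’ in detail. The function name hwe_exact_p is hypothetical, and the H2/H2 counts in the example (4 and 10) are not reported directly but are obtained by subtracting the H1/H1 and H1/H2 counts given in the Results from the sample sizes.

```python
from math import exp, lgamma, log


def hwe_exact_p(n_aa, n_ab, n_bb):
    """Exact Hardy-Weinberg test P-value for a biallelic marker.

    n_aa, n_bb: homozygote counts; n_ab: heterozygote count.
    Hypothetical helper, not the genetics R package implementation.
    """
    n = n_aa + n_ab + n_bb   # number of individuals
    a = 2 * n_aa + n_ab      # copies of allele A
    b = 2 * n_bb + n_ab      # copies of allele B

    def log_prob(het):
        # Log-probability of `het` heterozygotes given allele counts a and b.
        hom_a = (a - het) // 2
        hom_b = (b - het) // 2
        return (lgamma(n + 1) - lgamma(hom_a + 1) - lgamma(het + 1)
                - lgamma(hom_b + 1) + het * log(2.0)
                + lgamma(a + 1) + lgamma(b + 1) - lgamma(2 * n + 1))

    # The heterozygote count has the same parity as the allele counts.
    probs = [exp(log_prob(h)) for h in range(a % 2, min(a, b) + 1, 2)]
    observed = exp(log_prob(n_ab))
    # Sum the outcomes that are as likely as, or less likely than, the observed one.
    p_value = sum(p for p in probs if p <= observed * (1.0 + 1e-9))
    return min(1.0, p_value)


if __name__ == "__main__":
    # Argument order: H2/H2, H1/H2, H1/H1 (H2/H2 counts derived by subtraction).
    print("Gypsy sample:    ", hwe_exact_p(4, 16, 98))
    print("Caucasian sample:", hwe_exact_p(10, 70, 104))
```

The small relative tolerance in the comparison guards against floating-point ties between outcomes of equal probability.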

Results

The MAPT genotype frequencies in the Caucasian sample were in Hardy–Weinberg equilibrium (P=0.842), whereas a deviation from Hardy–Weinberg equilibrium was observed in the Gypsy population sample (P=0.017).
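The direction of such a deviation can be seen by comparing the observed genotype counts with those expected under Hardy–Weinberg proportions (p², 2pq, q²). The Python sketch below is only an illustration: the function name is hypothetical, and the H2/H2 count of 4 in the Gypsy sample is not reported directly but is derived from the sample size of 118 minus the 98 H1/H1 and 16 H1/H2 individuals.

```python
def hwe_expected_counts(n_h1h1, n_h1h2, n_h2h2):
    """Return expected (H1/H1, H1/H2, H2/H2) counts under Hardy-Weinberg."""
    n = n_h1h1 + n_h1h2 + n_h2h2
    p = (2 * n_h1h1 + n_h1h2) / (2 * n)  # H1 allele frequency
    q = 1.0 - p                           # H2 allele frequency
    return (n * p * p, n * 2 * p * q, n * q * q)


if __name__ == "__main__":
    observed = (98, 16, 4)  # H2/H2 = 118 - 98 - 16, derived rather than reported
    expected = hwe_expected_counts(*observed)
    labels = ("H1/H1", "H1/H2", "H2/H2")
    for label, obs, exp_count in zip(labels, observed, expected):
        print(f"{label}: observed {obs}, expected {exp_count:.1f}")
```

With these counts, fewer heterozygotes are observed than expected (16 against roughly 21.6), which is the pattern that consanguinity or population substructure would produce.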

The distribution of MAPT genotypes is presented in Figure 2. H1/H1 homozygotes were overrepresented in the Gypsies as compared with the Caucasians (83.0% (n=98) vs 56.5% (n=104); one-tailed P<0.001). H1/H2 heterozygotes were more frequent in the Caucasian population (38.0%, n=70 in the Caucasians vs 13.6%, n=16 in the Gypsies; one-tailed P<0.001). The calculated frequency of the H1 allele in the Gypsy population was greater than that in the Caucasians (89.8% (n=212) vs 75.5% (n=278); one-tailed P<0.001), whereas the H2 allele was more frequent in the Caucasian population.
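The allele counts quoted above follow from the genotype counts by simple gene counting, and a one-tailed Fisher's exact test on the resulting 2 × 2 table is a hypergeometric tail probability. The following Python sketch illustrates both steps under stated assumptions: the function names are hypothetical, the H2/H2 counts (4 and 10) are derived by subtraction from the sample sizes, and the analysis reported here was carried out in R rather than with this code.

```python
from math import comb


def allele_counts(n_h1h1, n_h1h2, n_h2h2):
    """Return (H1, H2) allele counts from genotype counts."""
    return 2 * n_h1h1 + n_h1h2, 2 * n_h2h2 + n_h1h2


def fisher_one_tailed(a, b, c, d):
    """P(top-left cell >= a) for the 2x2 table [[a, b], [c, d]].

    Hypergeometric upper tail with all margins fixed; hypothetical helper.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    tail = sum(comb(col1, k) * comb(total - col1, row1 - k)
               for k in range(a, min(row1, col1) + 1))
    return tail / comb(total, row1)


if __name__ == "__main__":
    gypsy = allele_counts(98, 16, 4)          # (H1, H2) alleles
    caucasian = allele_counts(104, 70, 10)    # (H1, H2) alleles
    for name, (h1, h2) in (("Gypsy", gypsy), ("Caucasian", caucasian)):
        print(f"{name}: H1 = {h1}/{h1 + h2} = {100 * h1 / (h1 + h2):.1f}%")
    # Is H1 more frequent among Gypsy than among Caucasian chromosomes?
    print("one-tailed P =", fisher_one_tailed(*gypsy, *caucasian))
```

The same helper can be applied to genotype counts, for example H1/H1 homozygotes against all other genotypes in the two samples. Treating the two alleles of one individual as independent observations is a simplification when a sample deviates from Hardy–Weinberg equilibrium.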

Discussion

Our results indicate that the frequency of the inversion at 17q21.3 differs between Hungarian Olah Gypsies and Caucasian Hungarians. This study has revealed that Hungarian Olah Gypsies, who are related to the Asian population, carry the H1 allele at a higher frequency than Caucasian populations. This supports the notion that 17q21.3 structural variation and tau haplotypes are suitable markers for demonstrating the degree of admixture in a well-characterized non-Caucasian population.

The 24.5% H2 allele frequency in the Hungarian population (the complement of the 75.5% H1 allele frequency) accords well with the frequency of 25% in Middle Eastern and European populations (Evans et al., 2004). The previously reported H2 allele frequency of 8% in the Finnish population (Evans et al., 2004) stands closer to the Asian genotype distribution. These results suggest that the Finnish population experienced less admixture than the Caucasian population of Hungary, and that the Asian descent of the latter is not detectable by this method.

In our Gypsy sample, the frequency of the H1 allele was lower than previous estimates from populations of Asian origin (only populations from South Pakistan were similar) (Evans et al., 2004). The lower frequency of the H1 haplotype in the Gypsy population may be a consequence of their coexistence for centuries and partial admixture with H2 carrier Caucasian populations. This effect is likely to have been strengthened by the fact that the Olah/Vlax metagroup traditionally tolerates marriages with non-Gypsy women, whereas some other Gypsy groups do not. The deviation from the Hardy–Weinberg equilibrium in the Gypsy group can be explained by the population genetic effect of their closed society structure and the higher rate of consanguineous mating.

The Gypsy ethnic group was ignored for centuries by Western society and medicine. The United Nations Development Programme (www.undp.org) and the Decade of Roma Inclusion 2005–2015 (www.romadecade.org) recognized the importance of medical and social studies. In the past decade, various Mendelian diseases with a carrier rate of 5–15% have been identified in the Gypsy population (Kalaydjieva et al., 2001b), but multifactorial tauopathies have not been well described in Gypsies. This can be explained by their social and medical neglect and by the fact that tauopathies are typically late-onset neurodegenerative diseases, whereas the average life expectancy of Gypsies is 10–15 years lower than the European standard (Sepkowitz, 2006).

H1 carriers are under negative selection, both because H2 carrier women have more children (Stefansson et al., 2005; Voight et al., 2006) and because of the possible role of the H1 allele in tauopathies. Alzheimer's disease (Myers et al., 2005; Laws et al., 2007), Parkinson's disease (Skipper et al., 2004), progressive supranuclear palsy (Pittman et al., 2004), argyrophilic grain disease (Fujino et al., 2005), corticobasal degeneration (Buee and Delacourte, 1999) and the Parkinson–dementia complex of Guam (Sundar et al., 2007) are all associated with MAPT H1. In addition, it seems that factors other than carrying the H1 allele influence disease onset. Differences in gene expression or in alternative splicing or both may lead to enhanced tangle formation and the development of the disease (Avila, 2006; Caffrey et al., 2006; Hardy et al., 2006; O'Hara, 2007). Exposure to different (European) environmental factors implies differences in epigenetic effects on gene expression (Spielman et al., 2007). For instance, a recent study (Winkler et al., 2007) indicated the H1/H1 genotype as an ethnically dependent risk factor for Parkinson's disease, and another raised remarkable suggestions in this field (Fung et al., 2005). An early work also observed an association between tau variants and progressive supranuclear palsy that differed between Asian and Caucasian populations (Conrad et al., 1998). Thus, the higher H1 frequency in Gypsies might be a risk factor for multifactorial disorders and might be manifested as an elevated susceptibility to tauopathies among the Gypsy population in Europe. Further investigations are needed in populations with a high H1 frequency in which social and medical conditions and the average life expectancy are better.