Introduction

Huntington's disease (HD) is a dominantly inherited neurodegenerative disorder caused by a mutation at the HD locus, which was mapped to 4p16.3 in 1983 (Gusella et al. 1983) and later cloned (HDCG 1993). The unmutated gene contains 6–34 CAG trinucleotides in exon 1; an expansion of the CAG trinucleotide repeats beyond this normal range has been identified as the cause of HD. The mutated gene encodes a longer polyglutamine tract with deleterious effects on certain neural cells. The clinical features include motor disturbance, progressive cognitive loss and psychiatric manifestations, eventually leading to death. The age of onset of symptoms is quite variable, and the disease shows a tendency to be transmitted predominantly through the paternal line (Snell et al. 1993). Expanded alleles of more than 39 CAG trinucleotides show complete penetrance, but the symptoms still appear mostly late in life.

An assessment of the size of the expanded alleles may enable an accurate diagnosis of the carrier condition and the prediction of the approximate age of onset in asymptomatic individuals, since an inverse correlation is present between the number of CAG repeats and the age of onset. One approach to ascertaining the genetic origin of a mutation(s) involved in any inherited disease is to detect concurrent intragenic markers in a population for that disease allele and the geographic distribution of the population.

Huntington's disease shows a striking geographic prevalence and aggregation among residents of the western coast of Lake Maracaibo, Zulia State, Venezuela (Negrette 1962; Avila-Girón 1973; Wexler et al. 2004). This population has the dubious honor of having the highest known frequency of carriers worldwide. However, all other epidemiologic aspects of HD are unknown among other Venezuelan populations. Here, we describe the results of a molecular analysis of the mutated HD gene among 60 families living in Venezuela, of which only four (6.7%) have ancestors that came from within the endemic area in Zulia State.

Materials and methods

DNA samples

Between 1988 and 2006, 279 individuals from 60 unrelated families affected with HD were studied for confirmation of the diagnosis and genetic counseling. All of the individuals had spontaneously given consent for the tests or had been referred by neurologists to the Laboratory of Human Genetics at Instituto Venezolano de Investigaciones Científicas (IVIC). The exact place of birth of the grandparents and great-grandparents, a family history and the age of onset of symptoms (psychiatric, involuntary movements, ataxia) were recorded in all index cases and in carriers. Once voluntary informed consent had been obtained from each family, in accordance with the Ethics Committee of the institute, we collected 5 mL of peripheral blood with EDTA as an anticoagulant. DNA was extracted by a saline method (Lahiri and Nurnberger 1991) and either kept at room temperature (23 ± 1°C) in 70% ethanol for as long as 2 years, in families diagnosed with HD prior to 1995, or amplified within 4 weeks of collection.

DNA analyses

DNA amplification of CAG repeats was achieved with primers F-5′-ATG AAG GCC TTC GAG TCC CTC AAG TCC TTC (HDCG 1993) and R-5′-GGC GGC GGT GGC TGT TG, which were designed to exclude the CCG tract adjacent to the CAG repeats. A PCR analysis was performed in a volume of 15 μL containing 10 pmol of each primer, 3.2 mM MgCl2 and 200 μM dNTPs at 65°C (annealing temperature) with a hot start protocol (80°C for 3 min before thermal cycling). The products were resolved on 8% polyacrylamide gels (14 × 14 cm, thickness 0.8 mm; acrylamide:bisacrylamide, 29:1) in TAE buffer. The gels were stained with silver nitrate (Brandt et al. 1992).

Polymorphisms used to define haplotypes

The polymorphic markers used to ascertain ancestral origins (Arias 1994) are two variable number of tandem repeats (VNTRs) and two single nucleotide polymorphisms (SNPs) located in the promoter region, the CCG repeats in exon 1 and the GAG trinucleotide insertion/deletion in exon 58 (Δ2642), as depicted in Fig. 1. When the affected member of a family was a heterozygote for a number of the polymorphisms, a segregation analysis in the family was carried out to assess the phase. The promoter region (−303 to −1 bp) was PCR amplified with the primers Pf 5′-CAG CGG CTT GCT GTG TGA GG and Pr 5′-AGC TTT TCC AGG GTC GCC AT, which were modified from Coles et al. (1997). The PCR analysis was performed as reported, with the addition of 10% dimethylsulfoxide (DMSO) and annealing at 58°C. The products were resolved on 8% polyacrylamide gels (acrylamide:bisacrylamide, 29:1). The amplified fragments were sequenced for the SNPs at −148 G > A (GenBank: Y07982) and −103 C > T (GenBank: Y07983) by the Macrogen Sequencing Service, Seoul, Korea.

Fig. 1

Schematic illustration of the six polymorphisms along the Huntington's disease HD gene which were used to construct haplotypes

For the two VNTR polymorphisms, a combined system was used for allele assignment. The first VNTR consists of a 6-bp tract (one or two copies) and the second of a 20-bp stretch (one to three copies), as shown in Fig. 1; we observed only four combinations of the two. Thus, the combined allele 1 was defined as having one and two copies of the respective VNTRs; allele 2, as having two and two copies; allele 3, as having one copy of each VNTR; allele 4, as having one and three copies.
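The combined allele coding just described can be written as a small lookup table. The following Python sketch is only an illustration; the function and variable names are hypothetical and do not come from the original analysis.

```python
# Combined VNTR allele codes as defined in the text.
# Key: (copies of the 6-bp VNTR, copies of the 20-bp VNTR) -> combined allele number.
# Names below are illustrative assumptions, not part of the original study.
COMBINED_VNTR_ALLELE = {
    (1, 2): 1,
    (2, 2): 2,
    (1, 1): 3,
    (1, 3): 4,
}


def combined_vntr_allele(copies_6bp: int, copies_20bp: int) -> int:
    """Return the combined allele number (1-4) for a pair of VNTR copy numbers."""
    key = (copies_6bp, copies_20bp)
    if key not in COMBINED_VNTR_ALLELE:
        # Only four combinations were observed; anything else is left undefined.
        raise ValueError(f"combination {key} has no combined allele code")
    return COMBINED_VNTR_ALLELE[key]
```

For example, `combined_vntr_allele(1, 2)` returns 1, the VNTR component of the 1;7;(A) partial haplotype discussed in the Results.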

Primers used for the CCG polymorphism were HD419X and HD482X (Andrew et al. 1994). The PCR analysis was performed as reported above, with the addition of 3.5% formamide, using a touchdown protocol for thermal cycling that begins at 68°C and decreases at a rate of 0.5°C/cycle until 65°C. Products were resolved in 10% polyacrylamide gels (19:1). Primers used for the Δ2642 polymorphism were d2642F 5′-CAG TGC CCC GTT TCT GTG and d2642R 5′-TGT TAA AAA TAA AAA GGC CATC, which were designed with the Primer3 program available at http://frodo.wi.mit.edu/. The PCR analysis was performed as above with 5 pmol of each primer at 55°C. The products were resolved on 8% polyacrylamide gels (acrylamide:bisacrylamide, 29:1), and the structure and size of the allele were confirmed by sequencing. The assigned alleles correspond to the number of CCG repeats; 'A' indicates deletion and 'B' indicates insertion for the Δ2642 polymorphism (Almqvist et al. 1995; Rubinsztein et al. 1995).
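The touchdown protocol used for the CCG polymorphism can be illustrated by listing the annealing temperature of each touchdown cycle. This is a minimal Python sketch. It assumes that 65°C is itself reached as the last touchdown step; the text does not say how many further cycles are run at 65°C. The function name is hypothetical.

```python
def touchdown_annealing_temps(start_c: float = 68.0,
                              stop_c: float = 65.0,
                              step_c: float = 0.5) -> list:
    """List the annealing temperature (in °C) of each touchdown cycle.

    Assumption: the temperature drops by step_c per cycle from start_c
    down to and including stop_c.
    """
    if step_c <= 0 or start_c < stop_c:
        raise ValueError("expected start_c >= stop_c and step_c > 0")
    n_steps = round((start_c - stop_c) / step_c)
    return [start_c - i * step_c for i in range(n_steps + 1)]
```

With the default values the list runs from 68.0 to 65.0 in 0.5°C decrements, that is, seven annealing temperatures.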

Statistical analyses of CAG repeat expansion

The electropherogram in Fig. 2 shows the normal and expanded alleles in an HD-affected family. The number of CAG repeats was calculated by subtracting 50 bp, which corresponds to the sequence outside the tract of CAG repeats, from the detected fragment size and dividing the result by three. Thus, the maximum estimation error would be less than two CAG repeats, even in the case of increased retardation during electrophoresis. The statistical analysis was carried out using the Excel software program (Microsoft, Seattle, WA). The χ² test was done based on Roff and Bentzen’s methodology (1989). The regression plot was obtained using the SPSS software program (SPSS, Chicago, IL).

Fig. 2

Polyacrylamide electrophoresis of normal and expanded alleles in a single HD family. (CAG repeats): 1 Index case (22/41), 2 her 91-year-old father (17/36), 3 her mother (22/24), 4 her daughter (16/22), 5 86-bp amplicon (CFTR exon 10) according to Friedman et al. (1990), 6 DNA bp ladder (pBR322 MspI digest)

Results

Allele size and its transmission to offspring

Of the 279 individuals tested, 139 had an expanded allele at HD, of whom 51 (36.7%) were still asymptomatic (ages 7–91) during the study period. The size of the allele in 40 independent control individuals (normal husbands and wives of affected cases) varied between 11 and 31 CAG repeats (Fig. 3), with 17 (21.3%), 18 (16.3%) and 19 (15.0%) CAG repeats appearing the most frequently; heterozygosity was h = 0.877. The number of CAG repeats of all these control subjects therefore fell within the normal size range – i.e. there were no major expansions or contractions. As such, the numbers of CAG repeats found in the control subjects are similar to those reported for other normal populations (Kremer et al. 1994; Masuda et al. 1995; Alonso et al. 1997; Raskin et al. 2000). The expanded alleles varied between 35 and 112 repeats, with 41 (14.4%), 42 (10.1%) and 43 (10.1%) CAG repeats appearing the most frequently (Fig. 3), which is also similar to values reported previously for other populations (Kremer et al. 1994; Masuda et al. 1995; Alonso et al. 1997; Brinkman et al. 1997; Wexler et al. 2004).

Fig. 3

Distribution of the number of CAG repeats in 40 independent control individuals (open bars) and in 139 expanded allele carriers (shaded bars). HD alleles with >55 CAG repeats are beyond the limits of the graph

We observed eight shorter expanded repeats (35–39 repeats) with incomplete penetrance. Interestingly, some of these shorter expanded repeats were unstable when passed on to the next generation. We observed three such cases: (1) a man with 36 repeats had an affected daughter with 41 repeats (Fig. 2); (2) an asymptomatic 76-year-old father with 39 repeats had an offspring with 46 CAG repeats; (3) a mother had two sons carrying 40 and 43 repeats, respectively. These findings contrast with those recently reported in 80 Zulian carriers with very stable incompletely penetrant alleles (Wexler et al. 2004) that did not show expansions beyond 40 CAG repeats in their offspring.

The transmission of the mutated allele from parents to offspring was assessed in 55 cases – 30 maternal and 25 paternal transmissions. The repeat numbers changed during the transmission from parents to offspring, as shown in Fig. 4. In paternal transmissions, there were 17 instances (68.0%) of a gain of a further 2–66 repeats; in maternal transmissions, there were 12 instances (40.0%) of expansions of between one and four repeats. Any expansion of more than four repeats was always of paternal origin, consistent with the greater meiotic instability in fathers (Ranen et al. 1995). In our series, a single father with 46 CAGs had several offspring with between 52 and 112 repeats. As the extent of further expansion was not directly correlated with the expanded repeat size, the haplotype around the mutated HD locus, or even the haplotype on another chromosome, may influence the instability. Further systematic study will be required to clarify the mechanism of transmission instability.
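The transmission figures above follow from simple arithmetic on parent and offspring repeat numbers. The Python sketch below reproduces the reported percentages and the size of the largest paternal expansion; the helper names are hypothetical and the code is not the original analysis, which was done in Excel.

```python
def repeat_change(parent_repeats: int, offspring_repeats: int) -> int:
    """Change in CAG repeat number in one transmission (positive = expansion)."""
    return offspring_repeats - parent_repeats


def expansion_rate(n_expansions: int, n_transmissions: int) -> float:
    """Percentage of transmissions in which the allele expanded."""
    if n_transmissions <= 0:
        raise ValueError("n_transmissions must be positive")
    return 100.0 * n_expansions / n_transmissions


# Counts reported in the text.
paternal_rate = expansion_rate(17, 25)   # paternal transmissions
maternal_rate = expansion_rate(12, 30)   # maternal transmissions

# Largest expansion reported: father with 46 CAGs, offspring with 112.
largest_gain = repeat_change(46, 112)
```

Here `paternal_rate` and `maternal_rate` correspond to the 68.0% and 40.0% quoted above, and `largest_gain` to the upper limit of the 2–66 repeat range.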

Fig. 4

Variation in allele size in 55 transmissions of the mutated HD chromosomes from parent to offspring. Variation is clearly greater in paternal inheritance

Age of onset

The age of onset of symptoms was ascertained in 75 carriers. The earliest age of onset was, based on family accounts, 3 months – in a young female who carried an expanded allele of 112 repeats and who died at 23 years of age after being severely affected since puberty. In 58.7% of the patients, age of onset occurred between the fourth and fifth decades of life (ages 30–50); 21.3% of the patients had symptoms as early as the third decade, and one 91-year-old father with 36 repeats has not shown symptoms and most probably will not display them during his lifetime, as reported previously (Brinkman et al. 1997; Penney et al. 1997).

A log-transformed regression equation, y = −0.0238x + 2.6616 (Fig. 5), where y is the log of the age of onset and x is the number of CAG repeats, was developed on the basis of the symptomatic cases, i.e. within the range 39–54 CAG repeats (71/75 carriers). The variance in the age of onset seems to be lower in our patient cohort for those having more than 46 repeats and those having fewer than 41 repeats (Fig. 5). The obvious variability in the age of onset in patients with 39–46 CAG repeats may reflect the joint action of genetic and environmental factors influencing the HD gene phenotype, as suggested by Wexler et al. (2004). The number of CAG repeats, individual differences in male meiotic instability and additional, as yet unclear genetic factors seem to influence the age of onset (Rubinsztein et al. 1997; Andresen et al. 2007).
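The regression equation can be turned into an approximate age-of-onset estimate. The Python sketch below assumes that the logarithm is base 10. The text does not state the base, but a natural logarithm would give ages of only about 5 years for 40 repeats, which is inconsistent with the reported onset ages. The function name is hypothetical, and the estimate is restricted to the 39–54 repeat range on which the equation was fitted.

```python
SLOPE = -0.0238      # coefficient of the number of CAG repeats (from the text)
INTERCEPT = 2.6616   # intercept (from the text)
FITTED_RANGE = (39, 54)  # repeat range of the symptomatic cases used in the fit


def predicted_onset_age(cag_repeats: int) -> float:
    """Estimate age of onset (years) from the regression equation.

    Assumption: y is the base-10 logarithm of the age of onset.
    """
    low, high = FITTED_RANGE
    if not low <= cag_repeats <= high:
        raise ValueError("equation was fitted only for 39-54 CAG repeats")
    log_age = SLOPE * cag_repeats + INTERCEPT
    return 10 ** log_age
```

Under this assumption the equation gives an onset age of roughly 51 years for 40 repeats and roughly 24 years for 54 repeats. These are point estimates only; the observed ages scatter widely around the line, particularly for 39–46 repeats.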

Fig. 5

Log of age of onset regression equation estimated from 71 symptomatic cases. Dotted lines show the limits of the 95% confidence interval. Numbers indicate the number of observations when >1

Geographic distribution of the disease

The most frequent geographic origins for the first known affected ancestor were the states of Táchira with seven families (11.7%), Miranda and Lara each with six families (10.0%), followed by Guárico, Zulia and Distrito Capital, each with four families (6.7%) (Fig. 6); in six families the geographic origin was unknown. This distribution conforms to three main geographic foci (circles 55 km in diameter = 2376 km²) (Arias 1994), with centers in Seboruco (Táchira), Curarigua (Lara) and Charallave (Miranda), respectively. The four families from Caracas (Distrito Capital) have an uncertain remote heterogeneous origin and may not constitute an actual focus according to the original definition (Arias 1994). Huntington's disease is present in 15 of 22 states, but it is primarily focal in the states of Táchira, Miranda and Lara. An additional conjoint focus may be present in the southern part of Guárico State and the northern part of Apure State, since there are two families in the San Fernando (Apure) focus.
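The area quoted for a focus follows directly from the area of a circle 55 km in diameter. A short Python check, with a hypothetical function name:

```python
import math


def focus_area_km2(diameter_km: float = 55.0) -> float:
    """Area (km²) of a circular focus with the given diameter."""
    radius_km = diameter_km / 2
    return math.pi * radius_km ** 2
```

For a 55-km diameter this gives π × 27.5² km², which rounds to the 2376 km² stated above.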

Fig. 6

Political map of Venezuela showing the number of HD families in the state of birth of the first known affected ancestor. The states of Táchira, Lara and Miranda had the highest number of HD families. Of 60 families, 11 had grandparents born in other countries (Spain, three; Portugal, two; France, two; Colombia, two; Peru, two). Six families had an unknown origin

Ancestors and origin

Two VNTRs in the promoter region, the CCG repeat in exon 1 and the Δ2642 polymorphism were ascertained in 90 members of the HD families (75 affected and 15 normal individuals) plus 25 control persons whose ancestors came mainly from the states of Táchira, Lara and Zulia. Our analyses of the partial haplotypes (VNTR;CCG;Δ2642) in both groups (HD and controls) revealed that the most frequent combination was allele 1 of the VNTRs with seven repeats of the CCG stretch (Table 1). The deletion at position 2,642 was found in only one of the 50 control chromosomes tested. One affected individual (from Piura, in northern Peru) of the two Peruvian families was homozygous for this marker and thus may have a different genetic origin than those from the known southern foci in Peru (Cuba 1986; Cuba and Torres 1989; Torres et al. 2006). There is a significant (p < 0.001) difference in the frequency of haplotype 1;7;(A) between the control and HD groups. The partial haplotype 1;10;(A) was almost sevenfold more frequent in the control group than in the homologous non-mutated chromosomes of the HD mutation carriers (Table 1).

Table 1 Observed partial haplotype (VNTR;CCG;Δ2642) frequencies in Huntington's disease (HD) families and in control individuals

Using the SNPs at −148 G > A and −103 C > T, which were ascertained through sequencing an affected member in each Venezuelan family, two Colombian families, two Peruvian families and affected members from European countries, we constructed haplotypes for each family in each state and compared these with those with a Caucasoid origin (Spain, Portugal and France). With the exception of two families (French and Peruvian origins), those tested had the same 1;G;C;7;(A) haplotype in phase with the Huntington mutation (Table 2). The normal chromosomes in HD families had several different haplotypes, although the 1;G;C;7;(A) was also the most frequent (62.2%) (Table 2).

Table 2 Complete haplotypes in carriers according to geographic origin

Discussion

Huntington's disease is a genetic disorder that occurs worldwide, but the prevalence varies significantly among ethnic groups. The incidence is highest among European Caucasoids [1/11,905 in Canada (Manitoba), 1/13,333 in England (Bedfordshire), 1/18,587 in Spain (Valencia)] and lower among other ethnic groups, such as the Finnish (1/250,000), Chinese (1/270,270, Hong Kong), Japanese (1/138,889, San-in) and African Negroids (1/270,270, Cape, South Africa) (Frequency of Inherited Disorders Database, FIDD 2007).

To date, the published prevalence figures from Venezuela (1/143 and 1/23,000) (Negrette 1962; Avila-Girón 1973) deal exclusively with subsets of a single community inhabiting the western coast of Lake Maracaibo in the state of Zulia (Gusella et al. 1984; Young et al. 1986; Wexler et al. 1987; Penney et al. 1990; Bonilla 1991). The carriers were postulated originally to be descendants of a single male ancestor (Negrette 1962), although a female origin – neither documented nor empirically supported – has recently been proposed for the mutation (Okun and Thommi 2004; Wexler et al. 2004). However, the extremely high frequency and widespread distribution of the mutation in the state of Zulia within a period of only 100–120 years (Negrette 1962) are an unlikely outcome for any single female founder. Published data on the incidence of HD in areas of Venezuela outside of Zulia are limited to a congress abstract (Arias and Paradisi 1996).

The Genetic Counseling Clinic at IVIC has been – and still is – the single national reference center for the molecular diagnosis of HD in Venezuela. Since HD is a disease that is rarely found outside of Zulia State, it can be hypothesized, without much bias, that it should be possible to assess the geographic distribution and ethnic origin(s) of the mutation(s). However, the prevalence of HD in the “general population” of different Venezuelan regions cannot yet be estimated accurately because detection is definitely incomplete, although it does seem to be low, with a frequency of approximately 1/200,000. This figure is estimated from data gathered during the last two decades, assuming an under-detection rate of 50% since only single families have been found in 30.4% of all the 22 nation states. Intrafocal prevalence in the three main foci described here (Táchira, Lara and Miranda) lies between 1/18,315 and 1/31,803, which is 9.1- to 5.2-fold higher than that for the “general” Venezuelan population.

The complete haplotype that we constructed using six intragenic polymorphisms, five of which lie within a stretch of less than 500 bases, was identical in almost all of the chromosomes carrying the mutation. This haplotype was observed in the families inside the states (within a focus) and in all others with which it was compared, including the affected Spanish, Portuguese, Colombian and one Peruvian family, suggesting a common and very old Caucasoid origin. According to published data, the 7/10 CCG polymorphism enables affected Mongoloid (Japanese, Chinese) and Negroid populations to be differentiated from European Caucasoid (including Finnish) individuals (Squitieri et al. 1994); the frequency of ten CCGs is lower than 6% in the latter group compared to 61% among the former two ethnic groups (Squitieri et al. 1994). All of our HD chromosomes tested had seven CCG repeats, which again supports a plausible European Caucasoid common source for the mutation in Venezuela. In contrast, the study of the CCG polymorphism in other populations, such as the Colombian and the Spanish ones (Valencia), clearly revealed at least two different origins for their mutations (Alfonso et al. 1996; García-Planells et al. 2005).

The Δ2642 deletion has been reported to be in allelic disequilibrium with some Huntington chromosomes, showing marked ethnic differences in its frequency (Almqvist et al. 1995). It is absent in Negroid, Chinese or Japanese chromosomes but has a frequency between 0.103 and 0.32 in European Caucasoid HD chromosomes (Squitieri et al. 1994; Almqvist et al. 1995). No Δ2642 deletion was found in 47 out of 48 independent HD families tested. The homologous non-mutated chromosomes of the Venezuelan HD patients also showed a lower haplotype diversity than the chromosomes of the control population (four versus seven different VNTR;CCG;Δ2642 haplotypes). This lower diversity may reflect a very remote common ancestor within the foci with the same ethnic origin in the HD families for both chromosomes but a different ethnic origin for the homologous (out-of-phase) ones in the control population.

There have been no published reports on the frequencies of the haplotype with six intragenic markers that was used in this study; therefore, any comparison is impossible at this time. Of the four promoter region polymorphisms for which frequencies are available (Coles et al. 1997; Kartsaki et al. 2006), the 1;G;C haplotype appears to be the most frequent in all studied populations, including our sample, although with interethnic differences. The 4;G;C haplotype is almost absent in normal Caucasoids, but its frequency in control Japanese chromosomes is around 10% (Coles et al. 1997); in our control sample, it is 12%. While this may indicate an Amerindian (Mongoloid) origin for the control chromosomes (Kolman et al. 1996; Tokunaga et al. 2001), it rules out this possibility for the homologous chromosome in carriers from the Venezuelan populations.

The wide distribution of affected families throughout Venezuela may suggest more than one founder phenomenon for the HD subset of European Caucasoid chromosomes. Conversely, the haplotype findings do indicate a common origin for at least the families from the states of Zulia and Táchira, and it is possible that the same interpretation could be extended to other foci. There has been demographic exchange between the southern Zulia and northern Táchira populations from the eighteenth century onwards (Osorio 1996); consequently, the ancestral identity of the mutation in these two nearby regions must be considered likely for the HD gene, which has a low mutation rate within the normal size range (Rubinsztein et al. 1995). The fact that the identical haplotype appears in all mutated chromosomes in states far away from Zulia also points toward a probable Caucasoid origin of the mutation. A recent report on the Zulian cluster of HD carriers (Andresen et al. 2007) described different alleles in the mutated chromosomes; there were two (CCG)6 homozygotes (0.65%), and 4% of the carriers in their sample, some of them homozygotes, had the Δ2642 marker. These findings indicate heterogeneity of origin in at least 4% of a subset of the Zulian kindred. Thus, the hypothesis of a single HD founder chromosome of either sex (Negrette 1962; Okun and Thommi 2004; Wexler et al. 2004) seems to be untenable for all the carriers, although it might still be admissible for the remaining 96% of the whole subset.

A multiple ethnic origin for some of the mutated chromosomes in our sample cannot be entirely excluded, but it does not seem likely. The results of the haplotype analyses suggest that the mutation has a common ethnic descent (although not necessarily a single origin) in all Venezuelan states with HD families. This origin would appear to be European Caucasoid in Venezuela, including Zulia State, and this may be the case in some population subsets of other Ibero-American countries.