Main

Phenylketonuria (PKU; MIM# 261600) is a common autosomal recessive inborn error of amino acid metabolism that results mainly from mutations in the phenylalanine hydroxylase gene (PAH; MIM# 612349). So far, >500 different mutant alleles have been identified at the PAH locus and listed in the PAH Mutation Analysis Consortium database (http://www.pahdb.mcgill.ca/). These alleles reduce the catalytic activity of the enzyme to different degrees, generating a wide spectrum of biochemical and clinical phenotypes (1,2). Data (3–18) on the distribution and relative frequencies of the mutations have been described for various populations and show great variability in the mutational spectrum and in the degree of heterogeneity; such data are useful for further understanding both the clinical features and the population genetics of the disorder.

To date, there has been no comprehensive population genetic study of PKU focused on the Han Chinese. This population, the largest ethnic group in China, is naturally divided by the Yangtze River into two groups, the Southern and Northern Han (19,20), resulting in different founder populations that are relatively isolated from each other. It is believed that the difference between the two groups is greater than that between given subpopulations and ethnic minorities at the same location, and that this stratification would affect the mutational spectrum. Previous studies (21–23) give some indication that mutation frequencies vary between southern and northern China. However, the evidence is limited, because the existing studies are selective and of small sample size (7,24–30).

We undertook this study with the objectives of reaching full ascertainment of PKU mutations in the Chinese Han and of investigating regional differences in the mutation spectrum within China, looking in greater depth at the possible explanations for the geographic distribution of the common mutations.

MATERIALS AND METHODS

Subjects.

A total of 212 unrelated Chinese Han patients with PKU, corresponding to 424 independent alleles, were investigated. The geographical distribution of mutant alleles within China was defined on the basis of the origin of the birth parents of each case studied. Therefore, only patients whose parents both originated from the same native place were recruited; of these, 112 came from southern China and 100 from northern China.

Most of the patients (150; 71%) were identified when they showed mental retardation between 6 mo and 3 y of age; the remaining 62 (29%) were identified by neonatal screening. Their phenotypes were classified based on the pretreatment plasma phenylalanine (phe) levels or the phe level at diagnosis. Accordingly, 113 patients had classic PKU (phe >1200 μM/L) (113/212); 66, mild PKU (phe 600–1200 μM/L) (66/212); and 15, MHP (phe 120–600 μM/L) (15/212); 18 cases were unclassified because their phe levels were unavailable. The study was explained to the participating patients, and standard informed consent, which was reviewed and approved by the Shanghai Ethical Committee of Human Genetic Resources, was obtained from all subjects.

Polymerase chain reaction condition and DNA sequencing.

Systematic mutation screening was performed by direct sequencing. Genomic DNA was isolated from peripheral blood using the standard procedure (31). PCR primers were designed to amplify all 13 exons and the surrounding intronic sequences of the PAH gene. The primers used in this study are shown in Table 1. The PCRs were carried out on the GeneAmp PCR System 9700 (Applied Biosystems, CA), with a cycling protocol that consisted of denaturation at 95°C for 30 s, 50–65°C for 1 min, and 72°C for 10 min. Amplified DNAs were incubated with shrimp alkaline phosphatase (Roche, Basel, Switzerland) and exonuclease (New England Biolabs Inc., MA) at 37°C for 45 min. The products were sequenced using an ABI Prism BigDye Terminator Cycle Sequencing Kit, version 3.1 (Applied Biosystems) on an ABI Prism 3100 sequencer.

Table 1 PAH primers and amplifications

Nomenclature.

It has been the convention in the PAH Mutation Analysis Consortium to use “trivial names” (32,33). The corresponding systematic names are given in the database as well (http://www.pahdb.mcgill.ca/). These two types of nomenclature have been used throughout our study.

Calculation of homozygosity.

Homozygosity ($j$) at the PAH locus in the population was determined by $j = \sum_i x_i^2$, where $x_i$ was the frequency of the $i$th allele. Here each of the uncharacterized alleles was defined as having a frequency of $1/N$, where $N$ was the total number of mutant chromosomes investigated (15).
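As an illustration of the calculation described above, the following minimal sketch computes the sum of squared allele frequencies and treats each uncharacterized chromosome as a distinct allele with frequency 1/N. The function name `homozygosity` and the toy allele counts are hypothetical and are not the study data.

```python
def homozygosity(allele_counts, n_uncharacterized=0):
    """Return j = sum(x_i^2) over all mutant chromosomes.

    allele_counts: mapping of mutation name -> number of chromosomes carrying it.
    n_uncharacterized: chromosomes with no identified mutation; each one is
    treated as its own allele with frequency 1/N, as described in the text.
    """
    n_total = sum(allele_counts.values()) + n_uncharacterized
    if n_total == 0:
        raise ValueError("no chromosomes supplied")
    # Contribution of the characterized alleles.
    j = sum((count / n_total) ** 2 for count in allele_counts.values())
    # Each uncharacterized chromosome contributes (1/N)^2.
    j += n_uncharacterized * (1 / n_total) ** 2
    return j


# Hypothetical toy counts (NOT the study data): 10 chromosomes in total.
toy_counts = {"A": 5, "B": 3}
j_toy = homozygosity(toy_counts, n_uncharacterized=2)
print(f"j = {j_toy:.2f}")  # 0.25 + 0.09 + 2 * 0.01 = 0.36
```

Because every uncharacterized chromosome is counted as a separate allele, this convention yields the smallest homozygosity value compatible with the unidentified alleles.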

Statistical analysis.

Mutation and genotype frequencies were calculated by the counting method. Frequencies were compared between the two geographic populations using χ² tests or Fisher's exact tests, with the significance level set at 0.05. All statistical calculations were performed with scripts running in SPSS 13.0.
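The study itself used SPSS; the sketch below is only a standard-library illustration of a two-sided Fisher's exact test on a 2 × 2 table of allele counts. The example counts for R413P (16 of 200 northern and 6 of 224 southern chromosomes) are an assumption back-calculated from the relative frequencies of 8% and 2.7% and the numbers of patients per region reported in this article. They are not taken from Table 2, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Probability that the top-left cell equals k given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in map(prob, range(k_min, k_max + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Assumed counts (back-calculated, see lead-in): R413P versus all other alleles.
north_r413p, north_other = 16, 184   # 200 northern chromosomes
south_r413p, south_other = 6, 218    # 224 southern chromosomes
p_r413p = fisher_exact_two_sided(north_r413p, north_other,
                                 south_r413p, south_other)
print(f"two-sided p = {p_r413p:.4f}")
```

A p value below 0.05 from this comparison would be read, as in the text, as a significant difference in allele frequency between the two regions.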

Automated splice site analysis.

Reference and variant genomic sequences were used to predict splice sites to evaluate potential splice site variants (http://www.fruitfly.org/seq_tools/splice.html).

RESULTS

Mutational spectrum.

Mutation analysis was performed on 212 individuals with PKU, representing 424 independent mutant chromosomes, and a potential disease-causing mutation was identified on 405 of 424, corresponding to a mutation detection rate of 95%. The spectrum comprised 79 different mutations. The majority were missense mutations (54 of 79, 68.4%); the remainder were 12 splice-site mutations, 10 nonsense mutations, and three deletions, the latter three types being classified as null mutations. These mutations were distributed across the PAH coding sequence except exons 1 and 13, with 30% (24 of 79) occurring in exon 7. A substantial proportion of mutant alleles (62%) was accounted for by R243Q (26%), Ex6 − 96A>G (9%), IVS4 − 1G>A (6%), R413P (5%), Y356X (5%), R111X (3.7%), R241C (3.7%), and V399V (3.2%). All other mutant alleles were present at relative frequencies of 2.5% or less. The relative frequencies of mutations found in our study are summarized in Table 2. We also investigated the geographic distribution of mutant alleles on the Chinese mainland. Seven of the prevalent mutations in China (R243Q, IVS4 − 1G>A, Y356X, R241C, V399V, Ex6 − 96A>G, and R111X) were uniformly distributed; only one common mutation, R413P, although present in both regions, clustered in northern China.

Table 2a Spectrum of PAH mutations detected in Chinese Han population

Novel mutations.

We found 15 novel mutations (F121L, Y154C, A156P, E183G, L227Q, E228X, R270G, P275A, I324N, C357Y, C357X, P362L, Y414X, IVS4 + 2T>A, and IVS5 − 2A>G) and two novel polymorphisms (IVS6 − 59C>G and IVS3 + 164T>A) that are not recorded in the PAH Mutation Analysis Consortium database. Four of the novel sequence variations were intronic; the other 13 were in the coding region. Splice-site prediction on the reference and variant genomic sequences (http://www.fruitfly.org/seq_tools/splice.html) indicated that IVS4 + 2T>A and IVS5 − 2A>G alter the existing splice sites. The 13 novel single-nucleotide changes in the coding region (F121L, Y154C, A156P, E183G, L227Q, E228X, R270G, P275A, I324N, C357Y, C357X, P362L, and Y414X) alter the original amino acid sequence (34), and we assume that these variants are functionally relevant, although this needs to be confirmed by additional studies. Apart from the two novel polymorphisms, each of the 15 novel mutations occurred at a low frequency (0.2–0.4%).

Genotype.

Complete PAH genotypes were established for 194 of 212 patients (91.5%). A further 17 patients had one mutant allele identified, and no mutation was detected in one patient. Among the fully genotyped patients, 22 were homozygous for a single mutation, in each case one of the most frequent in the total pool of Chinese PAH mutant alleles, while the remaining 172 patients were compound heterozygotes for two mutations. The homozygosity value in the Chinese population is only 8.9%, and the genotypic homozygosity frequency is 11.3% (22/194); both reflect the relatively low frequency of the prevalent mutations and point to high heterogeneity in the Chinese population. In total, 127 genotypes were detected; R243Q × R243Q, R243Q × IVS4 − 1G>A, and R243Q × Ex6 − 96A>G were the most common, with frequencies of 7.3, 4.4, and 2.9%, respectively. We also compared genotype frequencies between the two regions; only two genotypes, R111X × R243Q and R243Q × R413P, differed significantly.
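The patient and allele counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in this article; the variable names are illustrative.

```python
# Counts as reported in the text.
patients_total = 212
fully_genotyped = 194       # both mutant alleles identified
one_allele_found = 17       # one mutant allele identified
no_allele_found = 1         # no mutation detected
homozygous_patients = 22

# The three genotyping categories should cover every patient.
assert fully_genotyped + one_allele_found + no_allele_found == patients_total

chromosomes_total = 2 * patients_total                        # 424
alleles_identified = 2 * fully_genotyped + one_allele_found   # 405

detection_rate = alleles_identified / chromosomes_total
genotype_rate = fully_genotyped / patients_total
homozygote_fraction = homozygous_patients / fully_genotyped

print(f"alleles identified: {alleles_identified} of {chromosomes_total}")
print(f"mutation detection rate: {detection_rate:.1%}")
print(f"fully genotyped patients: {genotype_rate:.1%}")
print(f"homozygous among fully genotyped: {homozygote_fraction:.1%}")
```

The counts are internally consistent: 194 fully genotyped patients and 17 patients with one identified allele give the 405 characterized chromosomes reported in the mutational spectrum.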

DISCUSSION

We have systematically investigated the variety of genetic defects underlying PKU in a sample of 212 patients representing a cross-section of the Chinese Han population. A point mutation or a microdeletion was identified in the coding region or immediately adjacent intronic regions of the PAH gene on 405 of 424 independent chromosomes. Among the 79 different mutations found, 68.4% were missense mutations, 15.2% were splice mutations, 12.7% were nonsense mutations, and 3.8% were frameshift deletions. Eight mutations, each with a relative frequency of 3% or more, accounted for nearly two thirds (62%) of the mutant alleles. A total of 15 previously unknown mutations were identified in the Chinese population, all of them rare. These data indicate that the total pool of mutant PAH alleles, at least in patients of Chinese descent, consists of a small number of frequent mutations and a very large number of rare mutations. This distribution of mutation types is very similar to that observed in European and other Asian populations (3–5,11).

Our main findings are as follows:

1. Molecular studies over the last 18 y have elucidated the spectrum of mutations in PKU patients from a few Asian populations and indicated that mutations are not randomly distributed and that particular mutations show regional associations. In our study, eight mutations (R243Q, Ex6 − 96A>G, IVS4 − 1G>A, R413P, Y356X, R111X, R241C, and V399V), each with a relative frequency of 3% or more, account for 62% of all mutant alleles. By comparison, the most prevalent mutations among Japanese PKU alleles are R413P, IVS4 − 1G>A, R241C, R243Q, T278I, Ex6 − 96A>G, Y356X, and R111X, accounting for 74.4% of all alleles (5). In Korean patients (3), the most common mutations are R243Q, IVS4 − 1G>A, and Ex6 − 96A>G, each with a frequency of 10% or more. However, R111X, a frequent mutation in Japanese and Chinese patients, is very rare in Koreans, whereas R413P, the most prevalent mutation in Japanese patients, accounts for only a small proportion of Korean alleles. T278I, which has a relatively high frequency in Japanese and Korean patients, is very rare (0.5%) in Chinese patients. For Y356X, by contrast, the relative frequency is similar in Chinese, Japanese, and Korean patients (5.2, 4.9, and 5.7%, respectively). In summary, eight mutations (R243Q, Ex6 − 96A>G, R111X, Y356X, R413P, IVS4 − 1G>A, R241C, and T278I) may be regarded as prevalent in Asian populations, having reached relative frequencies of at least 3% in at least two countries, the same criterion used by Zschocke in his systematic review of the PAH mutation spectrum in Europe (11).
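The prevalence criterion described above (a relative frequency of at least 3% in at least two countries) can be expressed as a simple filter. The sketch below is illustrative only: the function name and data layout are assumptions, the frequencies for Y356X and R413P are those quoted in this article, and the last entry is a purely hypothetical rare allele added to show a negative case.

```python
def is_prevalent(freq_by_population, threshold=3.0, min_populations=2):
    """Return True if the allele reaches `threshold` percent in at least
    `min_populations` of the populations listed."""
    hits = [pop for pop, freq in freq_by_population.items()
            if freq >= threshold]
    return len(hits) >= min_populations


# Relative allele frequencies in percent, as quoted in this article.
reported = {
    "Y356X": {"Chinese": 5.2, "Japanese": 4.9, "Korean": 5.7},
    "R413P": {"Chinese": 5.4, "Japanese": 30.5, "Korean": 3.2,
              "Taiwanese": 4.0},
}

# Hypothetical rare allele (NOT from the study), for contrast.
hypothetical_rare = {"Chinese": 0.5, "Japanese": 1.0}

for name, freqs in reported.items():
    print(name, "prevalent:", is_prevalent(freqs))
print("hypothetical rare allele prevalent:", is_prevalent(hypothetical_rare))
```

Applying the same threshold to every population keeps the classification comparable with the European review cited in the text, although the outcome depends on which populations have published frequencies.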

Comparison of the PKU mutational spectrum among East Asian populations has identified a set of specific prevalent mutations. Interestingly, some rare mutations also cluster in particular regions: for instance, 442-706delE5/E6 accounts for 2.4% of alleles in Japanese patients (5), E286K for 4% in Taiwanese patients (4), and A259T for 5.7% in Korean patients (3). These observations agree with the view that mutation profiles and frequencies vary among populations, with many alleles being specific to particular regions.

Although the 424 independent chromosomes analyzed here represent only a small fraction of the total pool of mutant PAH alleles in the Chinese Han population, the cross-sectional results provide some insight into the extent of mutational heterogeneity. The homozygosity value at the PAH locus in the Chinese population, calculated as previously described (15), is 8.9%; this low value might be due in part to the strict discouragement or even prohibition of consanguineous marriages. Moreover, the value is lower than that of other Asian populations or regions, such as Japan and Taiwan, indicating that the Chinese population is more heterogeneous with respect to PAH mutations and suggesting gene flow between Asian populations.

2. To map the distribution of PKU mutations in the Chinese Han population and explore the underlying origins and mechanisms, we stratified the allele frequencies by geographic region (southern and northern China). Only R413P differed significantly between the two regions (p = 0.0155), with a relative allele frequency of 8% in northern Chinese PKU patients but 2.7% in southern patients. The seven other common mutations (R243Q, Ex6 − 96A>G, IVS4 − 1G>A, Y356X, R111X, V399V, and R241C) reached at least 3% in both regions without a statistically significant difference (p > 0.05) (Table 2).


Furthermore, we compared the data from our northern group with those of Song et al. (7). Ten mutations were included because they reached a relative allele frequency of at least 3% in either study: R243Q, Ex6 − 96A>G, IVS4 − 1G>A, Y356X, R111X, R241C, R413P, V399V, IVS7 + 2T>A, and R53H. Apart from R111X (p = 0.0322), these common mutations did not differ notably between the two data sets, which suggests that our sample is representative of the northern group. We then combined the alleles of our northern group with those of Song et al. and compared the resulting allele frequencies with those of our southern group; only R413P, R111X, and R53H (p = 0.0189, 0.0373, and 0.036, respectively) differed between the two regions (Table 3). Do these findings contradict the evidence for local mutation clustering reported in previous studies (21–23,28)? Or can the uniform distribution of common PAH mutations other than R413P and R53H be attributed to migration and mixture between regions? We can offer no obvious explanation for these data. Limitations of the earlier data are one possibility, because the existing studies were selective, had small sample sizes, and involved patients of ambiguous ethnic background. Migration and admixture between regions, or a founder effect, are also plausible alternatives.

To explore these possibilities with the data available, we attempted to link R413P and R241C to the range expansion and migration of early historic populations before they became mixed into present-day populations. Published reports on R413P in East Asian populations indicate that it accounts for 30.5% of Japanese PKU alleles (5), 6.5% in northern Chinese (7), 3.2% in Korean (3), and 4% in Taiwanese (4). In our study, it occurred at a relative allele frequency of 5.4% on average, with 8.3% in northern China and 2.8% in southern China. These data suggest that this allele might have spread throughout East Asia by a founder effect, supporting the hypothesis that “northern Mongoloids” (35) were a founding population in Asia: the PKU mutation might have arisen in northern Mongoloids and subsequently spread to the Chinese and Japanese populations. R241C, which shows a strong founder effect in the Taiwanese (4), who are mainly of Chinese descent, has never been described as a common mutation (7,22,26,36). In our study, however, R241C had a relative frequency of 3% or more in both regions. This discrepancy could be attributed to multiple founding populations of PKU in East Asia and to genetic drift. Furthermore, these ambiguities led us to question the traditional geographical stratification (19,20) of the Chinese Han population, because a geographical label that is usually adequate for the overall classification of samples may represent only a certain proportion of the actual underlying population genetic structure; the real information about human history is hidden in the genome (37,38).

Table 3 Geographic distribution of the mutant PAH alleles within China. Allele counts for ten common mutations are given together with their relative frequencies expressed as percentage values (in brackets)

3. Our study could not establish the geographical and ethnic source of the common mutations in the Chinese population. It would be interesting to extend this work to an investigation of haplotypes to clarify the origins of these mutations and to assess genetic admixture and the effects of migration, expansion, and founder events. We hope to address these questions in future research.