Introduction

Human prion protein, when conformationally transformed into another state, is responsible for Creutzfeldt–Jakob disease (CJD) and other prion diseases. Among detected mutations in PRNP, the gene encoding this protein, a methionine to valine mutation at the 129th amino acid (M129V) was found to be prevalent in all ethnic groups. M129V heterozygotes have been shown to be intrinsically protected from complete induction of the prion conformation change and therefore might be resistant to the disease (Rujescu et al. 2002; Lee et al. 2001).

By studying the common genetic variations in a 4.3 kb segment flanking M129V, Mead et al. (2003) showed positive Tajima’s D (Tajima 1989) in worldwide populations using a selected group of variations in PRNP. They proposed that balancing selection played a critical role in shaping the worldwide distribution of 129V, including Fore-speaking New Guinean, African, European, and East Asian populations. In addition, it has been shown that the frequency of 129V in East Asians is low (Erginel-Unaltuna et al. 2001; Jeong et al. 2004; Yu et al. 2004) and the Tajima’s D is also lower, though positive, in East Asians than in other populations (Mead et al. 2003).

Kreitman and Di-Rienzo (2004) challenged the conclusions of Mead et al. (2003) by pointing out that excluding low-frequency polymorphisms may introduce ascertainment bias, which may skew the estimated allele-frequency spectrum towards the strongly positive Tajima’s D values reported by Mead et al. By analyzing a 2.5 kb segment of PRNP in samples selected from the Human Genome Diversity Project [Centre d’Etude du Polymorphisme Humain (HGDP-CEPH); Cann et al. 2002; Soldevila et al. 2005], Kreitman and Di-Rienzo (2004) showed that Tajima’s D values are generally negative in worldwide populations, and that the smallest D was found in East Asians. However, their conclusion may be compromised by the fact that the East Asian individuals in their study were sampled from multiple populations with different ethnic backgrounds, and population substructure may cause D values to deviate from the neutral expectation (Simonsen et al. 1995; Hammer et al. 2003).

In this report, we reevaluated the presence of balancing selection on PRNP based on sequence data from the entire genomic region (15 kb) in a large sample (92 chromosomes) from a natural Han Chinese population in Boxing County, Shandong, PR China. A negative Tajima’s D was observed when all genetic variations were included in the analysis, which is inconsistent with the hypothesis of balancing selection on PRNP. Furthermore, a systematic population study in ten Han Chinese populations confirmed the previous observation of low 129V frequency in East Asians.

Materials and methods

Genomic DNA was extracted by standard techniques from the blood of 436 unrelated healthy donors sampled from Shanghai, Shandong, Liaoning, Yunnan, Sichuan, Xinjiang, and Beijing. The codon 129 polymorphism was genotyped by PCR-RFLP using the enzyme BsaAI. The entire genomic locus of the PRNP gene (GenBank: AL133396) was sequenced in 46 individuals from Boxing County, Shandong, using an ABI PRISM 3100 Genetic Analyzer (Foster City, CA). Primer sequences are available upon request. We used Tajima’s D test (Tajima 1989) and Fu and Li’s test (Fu and Li 1993) to examine departure from the assumption of neutral evolution.

Results

Three (0.7%) heterozygotes and 433 (99.3%) homozygotes for the 129M allele were observed among 436 Chinese samples from ten populations; no 129V homozygotes were found. The frequency of the 129V allele was about 0.3% (3 of 872 chromosomes). The frequencies of genotypes and of 129V in the populations tested are presented in Table 1.

Table 1 Frequency of human PRNP M129V polymorphism in ten Chinese populations
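The overall 129V allele frequency follows directly from the pooled genotype counts given in the text (433 M/M, 3 M/V, 0 V/V). The short Python sketch below is only an illustration of that arithmetic and of the genotype counts expected under Hardy–Weinberg proportions; the function names are hypothetical and are not part of the original analysis.

```python
def allele_frequencies(n_mm, n_mv, n_vv):
    """Return (freq_M, freq_V) from genotype counts at codon 129."""
    n_chromosomes = 2 * (n_mm + n_mv + n_vv)
    freq_v = (n_mv + 2 * n_vv) / n_chromosomes
    return 1.0 - freq_v, freq_v


def hardy_weinberg_expected(n_mm, n_mv, n_vv):
    """Expected (M/M, M/V, V/V) counts under Hardy-Weinberg proportions."""
    n_individuals = n_mm + n_mv + n_vv
    p, q = allele_frequencies(n_mm, n_mv, n_vv)
    return (n_individuals * p * p,
            n_individuals * 2 * p * q,
            n_individuals * q * q)


# Pooled genotype counts reported in the Results (436 individuals).
freq_m, freq_v = allele_frequencies(433, 3, 0)
expected = hardy_weinberg_expected(433, 3, 0)
print(f"129V allele frequency: {freq_v:.4f}")
print("Expected M/M, M/V, V/V:", [round(x, 3) for x in expected])
```

With an allele this rare, the expected number of V/V homozygotes is far below one individual, so their absence from the sample is unsurprising.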

We sequenced a 15 kb segment encompassing the entire PRNP gene in 46 samples from Boxing County, Shandong. A total of eight polymorphic sites were found, including M129V. Unlike M129V, the other seven variants are all located in non-coding regions. These eight PRNP variations are in complete linkage disequilibrium, with all pairwise |D′| values equal to 1, and therefore form a single haplotype block. Overall, eight haplotypes were observed; their frequencies were estimated using PHASE 2.0 (Stephens et al. 2001) and are listed in Table 2.

Table 2 PRNP haplotype frequencies in Boxing County population
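For readers unfamiliar with the |D′| measure, the following Python sketch shows how Lewontin's D′ can be computed for one pair of biallelic sites from phased haplotype counts. It is a minimal illustration only: the function name and the example counts are hypothetical and are not the Boxing County data. |D′| equals 1 whenever at least one of the four possible two-site haplotypes is absent from the sample.

```python
def lewontin_d_prime(n00, n01, n10, n11):
    """Lewontin's D' for two biallelic sites.

    nXY is the number of chromosomes carrying allele X at site 1 and
    allele Y at site 2 (the 0/1 labels are arbitrary).
    """
    total = n00 + n01 + n10 + n11
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p1 = (n10 + n11) / total          # frequency of allele 1 at site 1
    q1 = (n01 + n11) / total          # frequency of allele 1 at site 2
    d = n11 / total - p1 * q1         # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p1 * (1 - q1), (1 - p1) * q1)
    else:
        d_max = min(p1 * q1, (1 - p1) * (1 - q1))
    if d_max == 0:
        raise ValueError("at least one site is monomorphic")
    return d / d_max


# Hypothetical counts for 92 chromosomes; the fourth haplotype is absent.
complete_ld = lewontin_d_prime(80, 10, 2, 0)
# Hypothetical counts with all four haplotypes present.
partial_ld = lewontin_d_prime(40, 10, 10, 32)
print(abs(complete_ld), abs(partial_ld))
```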

Balancing selection is characterized by a deep split in the genealogy and an excess of intermediate-frequency polymorphisms flanking the selected sites (Mead et al. 2003). Such an excess results in large positive values of Tajima’s D (Tajima 1989) and of a related statistic, Fu and Li’s D (Fu and Li 1993). In the Boxing County population, Tajima’s D was −1.78 (0.10 > P > 0.05) and Fu and Li’s D was −1.92 (0.10 > P > 0.05); both are inconsistent with the hypothesis of balancing selection, under which positive D values are expected. Instead, the negative D values may reflect positive selection or population expansion, although the departure from neutrality is not statistically significant. In addition, population structure in this sample appears unlikely, given the lack of deviation from Hardy–Weinberg expectation at all eight sites studied (data not shown).
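To make the sign convention of Tajima’s D concrete, the sketch below implements the statistic as defined by Tajima (1989) from per-site allele counts. It is an illustrative implementation under stated assumptions, not the software used in this study: the function name is hypothetical, and because the per-site counts for the Boxing County sample are not given in the text, the two example inputs are invented extremes and are not expected to reproduce the reported value of −1.78.

```python
import math


def tajimas_d(allele_counts, n):
    """Tajima's D (Tajima 1989).

    allele_counts: copies of one allele at each segregating site (1..n-1)
    n: number of sampled chromosomes
    """
    s = len(allele_counts)
    if n < 4:
        raise ValueError("need at least four chromosomes")
    if s == 0:
        raise ValueError("need at least one segregating site")
    if any(k < 1 or k > n - 1 for k in allele_counts):
        raise ValueError("every site must be segregating in the sample")

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # Mean number of pairwise differences, summed over sites.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in allele_counts)
    # Watterson's estimator based on the number of segregating sites.
    theta_w = s / a1
    return (pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical extremes for 92 chromosomes and eight segregating sites.
d_rare = tajimas_d([1] * 8, 92)           # every variant is a singleton
d_intermediate = tajimas_d([46] * 8, 92)  # every variant at 50% frequency
print(f"all singletons: D = {d_rare:.2f}")
print(f"all intermediate: D = {d_intermediate:.2f}")
```

An excess of rare variants drives the statistic negative, whereas an excess of intermediate-frequency variants, as expected under balancing selection, drives it positive.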

The reduced median network of the haplotypes was inferred using NETW4101 (Forster et al. 2001) (http://www.fluxus-engineering.com) (see Fig. 1). In Fig. 1, the area of each circle is proportional to the frequency of the haplotype it represents. The haplotype carrying the 129V allele (haplotype G) was derived from the M-carrying haplotype C by a single mutation (M129V). A deep split between the 129V-carrying and 129M-carrying lineages, as expected under the assumption of balancing selection, was not observed in our samples. Haplotype A is the most frequent in our Chinese sample and is probably the founding haplotype in East Asian populations. The star-like structure of the genealogy of M-carrying lineages and the negative D statistics are consistent with population expansion in East Asia. We also examined the mismatch distribution of PRNP haplotypes in our samples (data not shown); in contrast to the distribution reported by Mead et al. (2003; see their Supporting Online Material, Fig. S1), it again does not support the presence of balancing selection.

Fig. 1

Reduced median network of PRNP haplotypes. The size of each circle is proportional to the frequency of that haplotype within the sample. Haplotype G represents the haplotype with the 129V allele, while the other circles represent haplotypes with 129M alleles. Each line connecting two haplotypes represents mutation steps. The circle and the line in the lower right of the figure indicate the size that represents one individual and one step distance, respectively.

Discussion

Because Tajima’s D and related statistics for detecting selection are based on the frequency spectrum of all genetic variations, Kreitman and Di-Rienzo (2004) pointed out that analyzing only a subset of variations may introduce an ascertainment bias and lead to a wrong conclusion, as exemplified by the study of Mead et al. (2003) on PRNP, in which only the more frequent variations were included. This notion was supported by the study of PRNP by Soldevila et al. (2005), in which all variations within a 2.4 kb region were included. In that study, Tajima’s D values were generally negative and the smallest D was found in the East Asian population. However, the samples used by Soldevila et al. (2005) were obtained from the HGDP-CEPH collection. It should be noted that the sample sizes for each population in the collection are small (http://www.cephb.fr/HGDP-CEPH-Panel), especially for East Asians (ten lymphoblastoid cell lines in most populations). Pooling samples from different ethnic groups may generate a substructured sample in which negative D values can be observed (Hammer et al. 2003). Therefore, using the data of Soldevila et al. (2005) to support the notion of Kreitman and Di-Rienzo (2004) requires careful examination.
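The ascertainment argument can be illustrated numerically. The Python sketch below uses a compact version of Tajima’s D (Tajima 1989) and compares the statistic computed from all segregating sites with the value obtained after rare variants are discarded. Everything in the example is hypothetical: the sample of 20 chromosomes, the allele counts, and the cutoff were invented for illustration and do not come from any of the studies cited.

```python
import math


def tajimas_d(counts, n):
    """Tajima's D from per-site minor-allele counts in n chromosomes."""
    s = len(counts)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    c1 = (n + 1) / (3.0 * (n - 1)) - 1.0 / a1
    c2 = (2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
          - (n + 2) / (a1 * n) + a2 / a1 ** 2)
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in counts)
    variance = (c1 / a1) * s + (c2 / (a1 ** 2 + a2)) * s * (s - 1)
    return (pi - s / a1) / math.sqrt(variance)


N_CHROMOSOMES = 20                      # hypothetical sample size
ALL_SITES = [1, 1, 1, 1, 1, 2, 8, 10]   # hypothetical minor-allele counts
MIN_COUNT = 2                           # hypothetical cutoff: drop singletons

common_only = [k for k in ALL_SITES if k >= MIN_COUNT]
d_all = tajimas_d(ALL_SITES, N_CHROMOSOMES)
d_common = tajimas_d(common_only, N_CHROMOSOMES)
print(f"all sites: D = {d_all:.2f}")
print(f"common sites only: D = {d_common:.2f}")
```

In this toy data set the full set of sites gives a negative D, while the same sample restricted to common variants gives a positive D, which is the direction of bias described by Kreitman and Di-Rienzo (2004).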

In this study, we showed a negative Tajima’s D value in a 15 kb segment encompassing the entire genomic region of PRNP in a natural population. Our result therefore provides stronger support for the argument of Kreitman and Di-Rienzo (2004), since (1) we studied a natural population instead of a substructured sample, (2) our sample size is larger, and (3) a much larger segment was surveyed. However, our observation does not preclude the possibility that a driving force for balancing selection is simply absent in East Asia. Nevertheless, the difference between our observation and the marginally significant evidence of balancing selection reported by Mead et al. (2003) underscores the necessity of using larger samples and all segregating sites in the analysis.

We also examined the possibility of selection using several different approaches, including Fu and Li’s D (Fu and Li 1993) and reduced median networks (Forster et al. 2001). In both cases, our observations are inconsistent with the hypothesis of balancing selection, at least in the Chinese population.