Main

DNA methylation at the carbon-5 position of cytosine in CpG dinucleotides is a common chemical modification of DNA in eukaryotes. It has an important role in the transcriptional regulation of multiple physiological processes. Normal DNA methylation changes are involved in embryogenesis, cell differentiation and aging,1, 2 whereas aberrant DNA methylation is closely associated with several chronic diseases, such as cancer and diabetes.3, 4, 5 Simple and rapid detection of aberrant DNA methylation is therefore attracting growing interest. Such detection not only enhances our understanding of how DNA methylation is regulated but also aids investigation of the relationship between methylation and disease development.6, 7

Many approaches for the quantitative assessment of DNA methylation that rely on sodium bisulfite treatment of genomic DNA followed by PCR amplification have recently been developed and are widely used.8, 9, 10 In these methods, treatment of DNA with a high concentration of bisulfite converts unmethylated cytosine into uracil, whereas methylated cytosine remains intact. When a target gene is then amplified by PCR, the product contains thymine at positions that were unmethylated cytosines and cytosine at positions that were methylated cytosines. On the basis of this principle, several methods are used to analyze the methylation status of a target gene with high sensitivity and specificity. Standard methylation-specific PCR (MSP) is one of the simplest methods, but it is not quantitative. Several quantitative variations of MSP, such as MethyLight,11 HeavyMethyl12 and MethylQuant,13 improved the quantification of DNA methylation at CpG dinucleotides. However, these strategies can determine the methylation level only for the one or two CpGs overlapped by the PCR primers, leaving other sites unexplored. As such, their application and quantitative utility are restricted. Analysis of all CpG dinucleotides within a given sequence provides a broader view of DNA methylation levels. Quantitative methylation analysis of every CpG site in a promoter has already shown, in many studies, that methylation changes at very few CpGs, or even a single CpG, are sufficient to epigenetically alter the expression of a gene.14, 15, 16, 17 At present, the most popular methods for quantifying methylation at every CpG in a specific region are pyrosequencing and DNA sequencing.
Pyrosequencing is a newly emerging, unbiased method, although its intrinsically short reads (normally only up to 30 bp at a time) are a disadvantage in comparison with DNA sequencing.6, 18 Among sequencing-based DNA methylation techniques, the cloning-based sequencing protocol is one of the most widely accepted strategies.19, 20 However, this approach involves multiple laborious steps (bisulfite treatment, cloning of PCR fragments, construction of recombinant vectors, identification of positive clones and DNA sequencing). Because the PCR product must be cloned before sequencing to achieve adequate sensitivity, the technique is labor-intensive, time-consuming and unsuitable for high-throughput sample analysis. Usually, 10 clones are examined to determine the degree of methylation from the mixture of PCR fragments in a sample, and such a small number inevitably limits the statistical power of the sequencing data. A strategy that forgoes cloning and directly sequences PCR products would be an ideal high-throughput platform. At present, however, direct sequencing has failed to gain acceptance in epigenetic studies as a reliable method for quantifying methylation, because the sequencing chromatograms are of poor quality, with high background noise and overscaled cytosine signals.20, 21 Building on direct bisulfite sequencing, Paul and Clark20 and Lewin et al22 independently developed modified approaches that use an innovative protocol and a novel algorithm to rectify distortions in sequencing trace files. Despite the richness of the information they provide, these two methods are too laborious, time-consuming and expensive for use in most clinical laboratories.
Therefore, simple, rapid and inexpensive alternative methods based on bisulfite genomic sequencing that allow accurate quantitative assessment of DNA methylation, especially in clinical samples, are desirable.

In this study, we introduce a novel method for evaluating DNA methylation status using peak height information obtained from four-dye trace files in direct sequencing of bisulfite-treated PCR products. In a previous study, we quantified methylation levels of the glucokinase (Gck) promoter in rats of various ages using this new strategy and classical cloning-based sequencing of PCR products, and found that the results of the two methods were highly similar.23 To assess the feasibility and reliability of this approach in detail, we examined it in parallel with cloning-based sequencing of PCR products from the bisulfite-treated Gck promoter in rat liver tissues and the BRL cell line. In addition, to examine linearity and precision, we compared the method with the pyrosequencing assay using a dilution series of mixed methylated and unmethylated Gck promoter fragments in known proportions.

MATERIALS AND METHODS

Cell Lines

The normal rat liver cell line BRL was obtained from the Cell Bank of the Chinese Academy of Sciences (Shanghai, China). BRL cells were cultured in high-glucose Dulbecco's modified Eagle's medium (Invitrogen, Carlsbad, CA, USA) containing 10% fetal bovine serum (Gibco, USA).

Animals

Male, 3-week-old Wistar rats (Animal Developmental Centre, Chinese Academy of Sciences) were given free access to commercial laboratory animal chow. Experiments were carried out after 16 h of fasting. Experimental procedures involving the use of animals were conducted in accordance with the NIH Guidelines and were reviewed and approved by the Animal Use and Care Committee of Fudan University.

DNA Isolation and Bisulfite Sequencing PCR Amplification

Genomic DNA was isolated from rat liver tissues and BRL cells by digestion with proteinase K and phenol–chloroform extraction as described previously.19 The DNA was then further purified using the MagExtractor Genome nucleic acid purification kit (Toyobo, Osaka, Japan) following the manufacturer's instructions.

Bisulfite modification of genomic DNA was carried out according to previously reported methods.23 In general, 2 μg of genomic DNA was treated with sodium bisulfite and extracted to a final volume of 40 μl of modified genomic DNA. The basal promoter region of the rat hepatic Gck gene, spanning a 601-nucleotide (nt) fragment with 11 CpG sites from nt −518 to nt +83, was amplified with nested primers (Table 1) under amplification conditions as described previously.23, 24

Table 1 Primer sequences for bisulfite sequencing PCR of hepatic Gck, L-PK and Glut2 and pyrosequencing of Gck promoter

A schematic diagram of primers and CpG dinucleotide positions within the bisulfite-converted Gck gene promoter sequence (all C converted to T, except at CpG sites) is shown in Figure 1. The first-round reaction mixture (10 μl) contained 2 μl of modified DNA template and 2 × TaqPCRMaster (Tiangen Biotech, Shanghai, China). The first-round PCR products (2 μl) were subjected to a second round of PCR (10 μl). The liver-type pyruvate kinase (LPK) promoter regions essential for basal transcriptional activity lie between nt −280 and nt −191 and include 14 CpG sites.25 The 471-bp region was amplified by two rounds of PCR with primers designed using the MethPrimer software (http://www.ucsf.edu/urogene/methprimer/index1.html) (Table 1). Bisulfite sequencing PCR (BSP) amplification of the LPK promoter was performed using our novel program (Table 2). In brief, the first-round PCR reaction (9 μl) contained 0.7 μl of modified genomic DNA and was overlaid with mineral oil to form a vapor barrier. Cycling began with an initial denaturation at 96°C for 5 min, at which point 0.5 μl of reverse primer (10 μM) was added to the PCR mixture, followed by 2 cycles of 96°C for 1 min, 60°C for 2 min and 72°C for 2 min. When the temperature returned to 96°C at the start of the next phase of 8 cycles (96°C for 1 min, 60°C for 2 min and 72°C for 2 min), 0.5 μl of the forward primer was added to the PCR mixture. Thereafter, 30 cycles of 96°C for 30 s, 53°C for 45 s and 72°C for 45 s were performed, followed by a final extension at 72°C for 7 min. The first-round products (0.5 μl) were subjected to a second round of PCR consisting of 35 cycles of 96°C for 1 min, 58°C for 2 min and 72°C for 1 min 30 s. The glucose transporter type 2 (Glut2) promoter was amplified using the same optimized program with a slight modification, as described in the Supplementary material.
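The first-round program described above can be written down as data, which makes the phase structure and the two primer-addition points easier to check. The following is only an illustrative sketch: the class and function names are hypothetical, the hold times are taken from the text, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass
class Phase:
    """One phase of a thermal-cycling program (illustrative structure)."""
    name: str
    cycles: int
    step_seconds: tuple  # hold times of the steps within one cycle


# First-round BSP program as stated in the text (hold times only).
FIRST_ROUND = [
    Phase("initial denaturation, reverse primer added", 1, (300,)),
    Phase("single-primer cycles, 60C annealing", 2, (60, 120, 120)),
    Phase("forward primer added, 60C annealing", 8, (60, 120, 120)),
    Phase("bulk amplification, 53C annealing", 30, (30, 45, 45)),
    Phase("final extension", 1, (420,)),
]


def hold_time_seconds(program):
    """Total programmed hold time, excluding ramping between steps."""
    return sum(p.cycles * sum(p.step_seconds) for p in program)


def amplification_cycles(program):
    """Number of three-step (denature/anneal/extend) cycles."""
    return sum(p.cycles for p in program if len(p.step_seconds) == 3)


if __name__ == "__main__":
    print(amplification_cycles(FIRST_ROUND), "cycles")
    print(hold_time_seconds(FIRST_ROUND) / 60, "min of hold time")
```

Under these assumptions the first round comprises 40 three-step cycles and about 122 min of programmed hold time; actual run time on a given instrument will be longer because of ramping.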

Figure 1

Diagrammatic representation of primer alignment and CpG dinucleotide positions on the Gck promoter region. (a) Bisulfite-treated Gck promoter sequence (all C converted to T, except for CpGs). CpG dinucleotides in gray boxes indicate nucleotides detected by BSP and pyrosequencing assays. (b) Schematic diagram of primers (indicated by arrows) and 11 CpG sites (indicated by up arrows) from nt −518 to nt +83 within the Gck promoter. BSP-F, bisulfite sequencing PCR forward primers; BSP-R, bisulfite sequencing PCR reverse primers; PMA-F, pyrosequencing methylation assay forward primers; PMA-R, pyrosequencing methylation assay reverse primers.

Table 2 Optimization program used for BSP amplification

After amplification, the size and quality of the PCR products (2 μl) were checked on a 2% agarose gel. For direct sequencing, another 2 μl of PCR product was purified with exonuclease I (ExoI) and shrimp alkaline phosphatase (SAP) (United States Biochemical, Cleveland, OH, USA) to eliminate unincorporated dNTPs and primers. Enzymatic purification was carried out in a 7 μl reaction by adding 3 units of ExoI, 1 unit of SAP and 10 × SAP reaction buffer, and incubating for 60 min at 37°C, followed by 15 min at 80°C to inactivate the enzymes. The sequencing reaction contained 1 μl of ExoI/SAP-purified PCR product, 0.5 μl of Big Dye Terminator Kit (Applied Biosystems, Foster City, CA, USA) and 1 μl (3.2 μM) of the reverse primer from the second-round BSP in a total volume of 5 μl. Cycle sequencing was performed as follows: 96°C for 1 min, then 25 cycles of 96°C for 10 s, 50°C for 5 s and 60°C for 4 min. Finally, the sequencing reaction products were purified using an EDTA/ethanol protocol and sequenced on an ABI PRISM 3730 Genetic Analyzer (Applied Biosystems) with Dye terminators (Perkin-Elmer, Foster City, CA, USA).

Methylation Quantification of DNA Sequencing Data

The percentage of methylation at each CpG site was calculated as the ratio of the peak height of C to the sum of the peak heights of C and T, that is, C/(C+T), using peak heights read from the sequencing chromatogram in the Chromas program (Version 2.32, Technelysium) (Supplementary Figure 1). A single C peak at a CpG site was taken as 100% methylation, a single T peak as no methylation, and overlapping C and T peaks as partial methylation (0–100%).
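The peak-height calculation can be sketched in a few lines. This is a minimal illustration rather than the software used in the study: the function name is hypothetical, and the peak heights are assumed to have been read manually from the chromatogram in arbitrary fluorescence units.

```python
def methylation_percent(c_height, t_height):
    """Percent methylation at one CpG site from C and T peak heights.

    Implements 100 * C / (C + T). Heights are in arbitrary units and
    must be non-negative, with at least one of them above zero.
    """
    if c_height < 0 or t_height < 0:
        raise ValueError("peak heights must be non-negative")
    total = c_height + t_height
    if total == 0:
        raise ValueError("no C or T signal at this position")
    return 100.0 * c_height / total


if __name__ == "__main__":
    # Hypothetical peak heights for three CpG sites: (C, T)
    for c, t in [(500, 0), (0, 500), (300, 100)]:
        print(c, t, methylation_percent(c, t))
```

A site with only a C peak evaluates to 100%, a site with only a T peak to 0%, and overlapping peaks to an intermediate value, matching the three cases described above.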

Cloning and Sequencing

PCR products were purified using a PCR purification kit (Qiagen, Hilden, Germany) and were cloned into the pGEM-T vector (Promega, Madison, WI, USA) according to the manufacturer's protocol. Plasmid DNA from 10 positive clones per sample was isolated using a kit (QiaPrep Spin Plasmid Miniprep; Qiagen) and sequenced. The methylation level at each CpG site was calculated by dividing the number of clones methylated at that site by the total number of clones sequenced.
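The clone-counting calculation can be illustrated as follows. The input encoding is an assumption made for this sketch: each clone is a string with one character per CpG site, "C" where the clone reads as methylated and "T" where it reads as unmethylated.

```python
def clone_methylation(clones):
    """Percent methylation per CpG site from sequenced clones.

    clones: list of equal-length strings, one per clone, with 'C'
    (methylated) or 'T' (unmethylated) at each CpG site.
    """
    if not clones:
        raise ValueError("at least one clone is required")
    n_sites = len(clones[0])
    if any(len(clone) != n_sites for clone in clones):
        raise ValueError("all clones must cover the same CpG sites")
    return [
        100.0 * sum(clone[i] == "C" for clone in clones) / len(clones)
        for i in range(n_sites)
    ]


if __name__ == "__main__":
    # Four hypothetical clones covering three CpG sites
    print(clone_methylation(["CCT", "CTT", "CCT", "CTT"]))
```

With 10 clones per sample, each site can only take values in steps of 10%, which is one reason clone counts give a coarse estimate of the methylation level in the template mixture.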

Generation of DNA Methylation Standards

To further assess the reliability of BSP sequencing for quantification of methylation levels, we prepared control DNA standards with various methylation percentages. Two pGEM-T subclones carrying Gck promoter fragments, one methylated and the other unmethylated at all CpG sites, were selected from the bisulfite clones of rat liver tissues (as described above) and amplified with the Gck pyrosequencing primers Pyr-F and Pyr-R (Table 1), which were designed around the CpG-dense sites within the 601-bp fragment. The new PCR product (201 bp) contained seven CpG sites (Figure 1). The PCR reaction was carried out at 95°C for 10 min, followed by 30 cycles of 95°C for 1 min, 60°C for 1 min and 72°C for 1 min. The amplified methylated (M) and unmethylated (U) PCR products were then adjusted to a concentration of 50 ng/μl and mixed in proportions (M:U) of 0:6, 1:5, 2:4, 3:3, 4:2, 5:1 and 6:0 to yield samples with methylation levels of 0, 16.7, 33.3, 50, 66.7, 83.3 and 100%, respectively. Each DNA mixture was diluted to a final volume of 20 μl with ddH2O and was both sequenced as described above and pyrosequenced in parallel.
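The expected methylation level of each standard follows directly from the mixing proportions. The short sketch below reproduces the seven stated levels; it assumes that equal volumes of the two products contain equal numbers of molecules, which is reasonable here because both were adjusted to the same concentration and have the same length. The function name is illustrative.

```python
def expected_methylation(m_parts, u_parts):
    """Expected percent methylation of a mixture of fully methylated (M)
    and fully unmethylated (U) products mixed m_parts : u_parts."""
    total = m_parts + u_parts
    if total <= 0:
        raise ValueError("mixture must contain some DNA")
    return 100.0 * m_parts / total


# Mixing proportions (M:U) stated in the text
MIXES = [(0, 6), (1, 5), (2, 4), (3, 3), (4, 2), (5, 1), (6, 0)]
LEVELS = [round(expected_methylation(m, u), 1) for m, u in MIXES]

if __name__ == "__main__":
    print(LEVELS)
```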

Pyrosequencing Assay

Single-stranded DNA from 8 μl of each PCR mixture sample was purified according to the PSQ 96-sample preparation guide using a vacuum filtration sample device as per the manufacturer's directions (Biotage, Charlottesville, VA, USA). The single-stranded product was annealed to 0.3 μM concentration of the sequencing primer Pyrs, placed at 85°C for 2 min and thereafter cooled to room temperature. Pyrosequencing was then performed on a PSQ HS 96 system using the Biotage reagent kit (Biotage) according to the manufacturer's instructions. Raw data were analyzed using the methylation quantification algorithm of the provided software.

Statistical Analysis

Data are given as mean ± s.d. Linear regression, calculation of s.d. and evaluation of experimental data were performed using Microsoft Excel software. Statistical comparisons were made using Student’s t-test. Significance was defined as P<0.05.
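For readers who wish to reproduce the summary statistics outside a spreadsheet, a minimal sketch using the Python standard library is given here. It assumes the sample standard deviation (n − 1 denominator); the text does not state which form was used. The coefficient of variation is included as a convenience, and the input values are hypothetical.

```python
from statistics import mean, stdev


def summarize(values):
    """Return (mean, sample s.d., coefficient of variation in %).

    Assumes the sample standard deviation (n - 1 denominator).
    """
    if len(values) < 2:
        raise ValueError("at least two replicate values are required")
    m = mean(values)
    sd = stdev(values)
    cv = 100.0 * sd / m if m != 0 else float("nan")
    return m, sd, cv


if __name__ == "__main__":
    # Hypothetical replicate methylation percentages at one CpG site
    m, sd, cv = summarize([40.0, 50.0, 60.0])
    print(f"{m:.1f} +/- {sd:.1f} (CV {cv:.1f}%)")
```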

RESULTS

Direct BSP Amplification and Quantification of DNA Methylation

DNA collected from all samples was treated with bisulfite, and then the region of interest was amplified by PCR with optimized PCR programs for increasing the specificity and yield of PCR products. This process converts the originally unmethylated CpG dinucleotides to TpG, while conserving the originally methylated CpGs. When the PCR products were visualized, unique bands of desired sizes were treated with ExoI/SAP and other procedures before direct sequencing.

In this study, we amplified the Gck, LPK and Glut2 promoters from bisulfite-treated genomic DNA of rat liver tissues and BRL cells. The specific PCR bands are shown on a 2.0% agarose gel (Supplementary Figure 2A–C). Because sequences were read with a reverse primer, the trace files were reverse-complemented to make potential CpGs easier to read. Representative sequencing chromatograms of Gck, LPK and Glut2 (Figure 2a and c, Supplementary Figure 3) showed high-quality data with sharp, evenly distributed peaks and almost no background noise. In addition, all non-CpG cytosines appeared as thymines in these figures, which implied that the bisulfite-induced conversion of unmethylated cytosines to uracil was complete in our DNA samples. Complete conversion is the basis for the subsequent calculation of methylation rates from the relative heights of the cytosine (C) and thymine (T) peaks at each CpG in the sequencing traces. To examine the feasibility of our strategy, we assessed the methylation status of 11 CpG sites located in the promoter of the Gck gene (Figure 1).

Figure 2

Validation of the present method with cloning-based sequencing of the Gck promoter amplified from DNA isolated from rat livers and BRL cells. A representative comparative result of methylation states between direct bisulfite-PCR sequencing chromatograms (upper) and cloning-based sequencing (lower) of the same fragment from a rat liver (a) and the BRL cell line (c). Open circles show no methylation, and closed circles indicate methylated cytosine. Comparison of methylation levels of the Gck promoter from rat livers (n=5) (b) and the BRL cells (n=3) (d) using the two methods.

The Gck promoter was amplified from rat liver tissues (n=5) and BRL cells (n=3), and the percentage of methylation at every CpG site was calculated as C/(C+T). The degree of methylation in rat liver samples was relatively even (45–95%) across the 11 CpG sites (Figure 2b), whereas that in BRL cells ranged from 10 to 100% (Figure 2d).

Comparison Between Direct BSP Sequencing and Cloning-Based Sequencing for the Quantification of DNA Methylation

To further validate our findings, we compared these direct sequencing outcomes with those of sequencing from cloned PCR products obtained from the same samples. We sequenced 10 clones from each individual PCR reaction. As shown in Figure 2b and d, the methylation states of individual CpG sites determined by cloning-based sequencing were similar to those measured by the direct BSP sequencing method, and there were no significant differences (P>0.05) in methylation levels between the two methods. We also compared the two methods for quantification of DNA methylation in the hepatic Gck promoter of three different age groups of rats from a previous study.23 Our method showed a very good overall correlation with cloning-based sequencing. The results in Figure 2 also showed high consistency between the two methods, except at several of the analyzed sites of the Gck promoter. However, we could not determine which approach caused the differences. The clones sequenced might not be fully representative of the sequences in the template mixture; thus, a more precise assessment of direct BSP sequencing was required.

Accuracy Determination of BSP Sequencing for DNA Methylation Quantification

To estimate the sensitivity and linearity of our method more precisely, samples with different methylation rates were prepared by mixing fully methylated and unmethylated Gck promoter fragments amplified from two clones (electrophoresis in Supplementary Figure 2D) in different proportions (representing 0, 16.7, 33.3, 50, 66.7, 83.3 and 100% methylation) and were analyzed by both direct sequencing and pyrosequencing. Figure 3a shows typical pyrograms of these samples. Unmethylated DNA was calculated automatically as <4% methylation. As the expected methylation rate increased, the value calculated as C/(C+T) increased proportionally from 0 to 100% methylation. A relatively good linear relationship (R²=0.9739) was observed (Figure 4a).

Figure 3

Representative pyrograms and sequencing chromatograms for different methylation levels of the Gck promoter region by pyrosequencing and our method. (a) Pyrosequencing assay. The expected and experimental values of methylation levels are shown at the two sides of each pyrogram or on the top of each CpG site. Shaded bars encompass C/T pairs. (b) Direct BSP sequencing. The expected values are shown on the left side of each chromatogram. The open boxes indicate the CpG sites.

Figure 4

Linear relationship between observed and expected methylation levels by the pyrosequencing assay and our method. Quantitative results from multiple mixing experiments were compiled for both methods. The average quantification value for the same M/U mixture from 3 repeated sequences (a sequenced DNA fragment spanning 4 CpG sites, a total of 12 CpG sites) was plotted against its expected value. Linear regressions are shown for both pyrosequencing (a) and our method (b).

Similarly, PCR products from the same mixed samples used in the pyrosequencing assay were measured by our method. As shown in Figure 3b, completely unmethylated sites showed only a T peak, whereas completely methylated sites showed only a C peak at the CpG sites in the representative sequencing chromatograms. The proportion of C increased gradually with the expected methylation rate. Figure 4b shows a good correlation (R²=0.9673) between the expected values and the data obtained with our system. This indicated that peak height ratios in these standard samples increased in proportion to DNA methylation, allowing different methylation states to be determined in test samples. In addition, our method showed a smaller systematic deviation than did the pyrosequencing method (slope of 0.9297 vs 0.8796). However, the standard curve did not pass through the origin (intercept of 4.5), which implied a background of <5% in the observed samples. This background might result from a failure to distinguish a real peak from background noise when only a small C (or T) peak is present in a sequencing trace.
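The slope, intercept and R² of such a standard curve come from an ordinary least-squares fit of observed against expected methylation. The sketch below shows the calculation in plain Python. The observed values used in the example are synthetic: they are generated exactly on the line with the slope and intercept reported above, purely to show that the fit recovers those parameters, and they are not the measured data.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Expected levels of the seven standards
EXPECTED = [0.0, 16.7, 33.3, 50.0, 66.7, 83.3, 100.0]
# Synthetic "observed" values lying exactly on the reported line
SYNTHETIC_OBSERVED = [0.9297 * e + 4.5 for e in EXPECTED]

if __name__ == "__main__":
    print(fit_line(EXPECTED, SYNTHETIC_OBSERVED))
```

A slope below 1 together with a positive intercept means that low methylation levels are slightly overestimated and high levels slightly underestimated relative to the expected values.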

In addition, to evaluate the interassay variability of the test, we analyzed mixtures of LPK promoter PCR products with different methylation rates using the same procedure as described above. The resulting standard curve (Supplementary Figure 4) showed a correlation and systematic deviation similar to those in Figure 4b, which implied that our method could accurately quantify DNA methylation levels at multiple independent CpG sites in different target sequences.

DISCUSSION

DNA methylation is a crucial epigenetic modification of the genome that is implicated in many human diseases. As the biological and clinical importance of DNA methylation has become increasingly apparent, the technology for analyzing DNA methylation patterns has developed swiftly in recent years.

This report describes a simple and rapid DNA methylation quantification method, which consists of treating genomic DNA extracted from a sample with bisulfite, amplifying a target gene by PCR, purifying the BSP products to remove dNTPs and primers, sequencing the purified products directly and finally measuring the methylcytosine content on the basis of the trace file data from the sequencing chromatograms. Unlike conventional bisulfite sequencing, this method does not require the complicated procedures of cloning PCR products into individual colonies and sequencing each clone. We made a special effort to optimize the protocols for bisulfite conversion and PCR amplification to render them suitable for direct DNA methylation analysis by bisulfite-PCR sequencing. We first improved the bisulfite treatment of purified genomic DNA to ensure efficient conversion. In addition, to increase the specificity and yield of PCR products, we devised a PCR program for BSP amplification of various genes (Table 2). First, we designed primers for bisulfite-converted DNA using the MethPrimer software. Primer selection is one of the most critical steps in methylation analysis; it is therefore advisable to design two or three primer pairs and then choose the most specific one or two to amplify target genes using our novel PCR program. PCR was performed in three cycle programs with two different annealing temperatures. In the first cycle program, one of the primers (usually the reverse primer) was added during the denaturation phase and extended for two cycles at an annealing temperature of 60°C (2–4°C higher than the Tm of most 20-nt primers), so that a single primer in a one-directional reaction anneals only to perfectly matched template at a sufficiently high temperature.
Subsequently, in the denaturation phase at the start of the second cycle program, the other primer (usually the forward primer) was added, and amplification proceeded for eight cycles at the same annealing temperature to produce highly specific PCR products. This was followed by the third cycle program, in which the annealing temperature was set at 53°C (2–4°C lower than the Tm of most 20-nt primers) for 30 cycles to maximize the yield of PCR products. If nonspecific bands or weak products appear on the gel, the annealing temperatures can be changed by 2–5°C, or the products can be carried into a second round of PCR. The second round is a typical PCR reaction with a 58°C annealing temperature, which can also be adjusted by 2–5°C, and uses nested primers, semi-nested primers or even the same primer pair as the first round.

Using the above-mentioned improved multi-step PCR program, we have successfully amplified various highly specific BSP products, such as LPK promoters (Supplementary Figure 2A), Glut2 promoters (Supplementary Figure 2B), Gck promoters (Supplementary Figure 2C) and other DNA sequences of interest (Supplementary Figure 2E), from various samples.

The traditional approach to methylation analysis of the resulting PCR products is to clone them and sequence at least 10 individual clones. However, this is time-consuming. Because bisulfite treatment followed by PCR converts unmethylated cytosines to thymines and leaves methylcytosines unchanged, each cytosine within a CpG dinucleotide can give two peaks at one site (C/T) and can be regarded as an SNP. Although direct quantification of SNPs by measuring peak height ratios in direct PCR sequencing data is regarded as accurate and sensitive,26 quantification of cytosine methylation by direct BSP sequencing has been considered impossible because of several challenges, such as poor signal quality, overscaled cytosine signals and base-caller artifacts.20, 21 These are key confounding factors that obscure the real peaks at CpG sites and so impair accurate assessment of DNA methylation. Recent studies have described several newly developed analysis methods20, 22 based on direct sequencing technology for methylation studies. However, their algorithms and workflows seem too complicated to be widely used.

To investigate whether direct PCR sequencing can accurately quantify DNA methylation from four-dye sequencing trace files once the corresponding procedures are optimized to eliminate interfering factors, we first established a novel PCR program as described above. In addition, we used ExoI/SAP enzymatic purification to remove single-stranded DNA, dNTPs and unincorporated primers effectively; this is easier, less expensive and requires smaller amounts of PCR product than commercial PCR purification kits. Using these improved strategies, purified PCR products were sequenced with reverse primers (which often perform better than forward primers), and the resulting sequencing chromatograms displayed evenly spaced peaks with uniform peak heights and negligible baseline noise (Figures 2a, c and 3b, Supplementary Figure 3). We measured the peak heights of C and T, and quantified the amount of methylation using the formula: % methylated C=100% × peak height C/(peak height C+peak height T).

The patterns of CpG methylation obtained by our method and by cloning-based sequencing in this study were quite similar, with minor differences at some of the CpG sites (Figure 2). We selected another reliable method for quantification of DNA methylation, the pyrosequencing assay, to estimate the feasibility and precision of our method. Pyrosequencing is a sensitive and background-free assay, but it requires the CpG sites of interest to be very close to the sequencing primer, and the sequence read cannot be very long. Therefore, quantification of more CpG sites requires multiple primers and multiple reactions.18, 27 We analyzed different methylation levels of the same PCR products with a length of 30 bp using our method and pyrosequencing. The two methods showed similar linearity and accuracy (Figure 4). Our strategy does generate minor background effects, which are unavoidable in automated sequencing. Nevertheless, given that the aim of quantitative evaluation of DNA methylation patterns is to determine, by comparing DNA methylation levels among different samples, whether DNA methylation has an essential role in the regulation of gene expression, our method is sufficient for this purpose. In addition, the mean coefficient of variation of our method appeared relatively high compared with that of the pyrosequencing assay. One possible explanation is that peak heights in trace files generated by direct PCR sequencing are affected by neighboring bases; therefore, the methylation rates calculated by our method varied among CpG sites with the same methylation level. Furthermore, systematic biases in the test system would lead to deviations from expected values and to a higher variance in the complete data, but would still allow detection of relative differences in methylation rates at individual CpG positions.

In conclusion, we developed an attractive method for rapid analysis of DNA methylation through a series of optimization strategies and techniques aimed at solving the problems of direct bisulfite-PCR sequencing. The accuracy of our method is comparable to that of conventional methods such as pyrosequencing and cloning-based sequencing, while its speed and simplicity are superior. Accordingly, the method is expected to be useful in screening large numbers of clinical samples across multiple genes.