Genome-wide profiling of promoter methylation in human

Abstract

DNA methylation in the promoter region of a gene is associated with a loss of that gene's expression and plays an important role in gene silencing. The inactivation of tumor-suppressor genes by aberrant methylation in the promoter region is well recognized in carcinogenesis. However, genome-wide profiling of promoter methylation has received little study. Here, we developed a genome-wide profiling method called Microarray-based Integrated Analysis of Methylation by Isoschizomers to analyse the DNA methylation of promoter regions of 8091 human genes. With this method, resistance to both the methylation-sensitive restriction enzyme HpaII and the methylation-insensitive isoschizomer MspI was compared between samples by using a microarray with promoter regions of the 8091 genes. The reliability of the difference in HpaII resistance was judged using the difference in MspI resistance. We demonstrated the utility of this method by finding epigenetic mutations in cancer. Aberrant hypermethylation is known to inactivate tumour suppressor genes. Using this method, we found that the frequency of aberrant promoter hypermethylation in cancer is higher than previously hypothesized. Aberrant hypomethylation is known to induce activation of oncogenes in cancer. Genome-wide analysis of hypomethylated promoter sequences in cancer demonstrated a low CG/GC ratio in these sequences, suggesting that CpG-poor genes are sensitive to demethylation activity in cancer.

Main

Microarray-based methods that use methylation-sensitive restriction enzymes to compare DNA methylation between the genomes of two samples (Yan et al., 2001; Hatada et al., 2002) have two problems. The first is that the microarrays contain clones from libraries of CpG islands. CpG islands are CpG-rich regions of the genome originally thought to be associated with the 5′ region of genes. Several approaches have used CpG island libraries for microarrays (Yan et al., 2001; Hatada et al., 2002; Heisler et al., 2005; Weber et al., 2005). Although 60% of human genes have CpG islands in the promoter or first exon, more than 80% of all CpG islands have no relation to genes and are unlikely to regulate gene expression (Takai and Jones, 2002). To solve this problem, we used a microarray with 60-mer oligonucleotides derived from promoter regions of 8091 human genes, because DNA methylation in promoter regions is most important for the regulation of gene expression. The second problem is the risk of false positives resulting from restriction site polymorphisms and/or incomplete digestion of DNA. To resolve this issue, we developed a new method called Microarray-based Integrated Analysis of Methylation by Isoschizomers (MIAMI). We used resistance to a methylation-insensitive restriction enzyme, MspI, to identify false-positive results for resistance to the methylation-sensitive isoschizomer HpaII (Figure 1a). If two samples have a restriction site polymorphism at an HpaII site and/or one of the samples is incompletely digested at an HpaII site, they will differ in resistance to HpaII. However, in this case the resistance to methylation-insensitive MspI at this site will also differ between the samples because both enzymes recognize the same site, CCGG. Therefore, we can treat such changes as false positives based on MspI resistance (MR).

Figure 1

MIAMI. (a) Schematic flowchart of the MIAMI method for comparing sample A and sample B. Details are described in the text. (b) Application of the MIAMI method to a lung cancer cell line (1–87, abbreviated as LC) and a normal lung (abbreviated as C). Values of log(HRLC/HRC) are plotted on the x-axis and values of log(MRLC/MRC) on the y-axis. Green lines lie at a horizontal distance of log5 from the y-axis. The regression line is shown in yellow, and red lines lie at a horizontal distance of log5 from it. Points located more than log5 to the right of the regression line are judged as hypermethylated, and points located more than log5 to the left of it are judged as hypomethylated. Genes indicated by red and green circles lie more than log5 from the y-axis. Genes indicated by red circles met the criteria, and their hypermethylation was confirmed. Genes indicated by green circles did not meet the criteria, and hypermethylation was not confirmed for them although they lie more than log5 from the y-axis. Genes indicated by orange and blue circles meet the criteria but lie less than log5 from the y-axis; hypermethylation was confirmed for the genes indicated by orange circles but not for those indicated by blue circles. (c) Summary of methylation changes in a lung cancer cell line (1–87, abbreviated as LC) compared to a normal lung (abbreviated as C). The methylation change (horizontal distance from the regression line) for each gene is plotted on the y-axis. Red broken lines indicate the threshold we used (log5). Genes are ordered by position along the x-axis. (d) The average ratio of CpG content to GC content (CG/GC) and the average ratio of AT content to TA content (AT/TA) were calculated for hypermethylated, unchanged, and hypomethylated genes.

We constructed a 60-mer oligonucleotide microarray containing portions of HpaII fragments located in the promoter regions of 8091 genes. We targeted the region from 600 base pairs upstream to 200 base pairs downstream of the transcriptional start sites for genes whose start sites were characterized on the basis of the National Center for Biotechnology Information annotation and/or the Database of Transcriptional Start Sites. The probe nearest to a transcriptional start site was selected on the condition that it had neither self-complementarity (Primer 3 program; Rozen and Skaletsky, 2000) nor homology to other regions of the human genome (megaBlast program; Altschul et al., 1990). Microarrays were made using an ink-jet oligonucleotide synthesizer as described (Hughes et al., 2001). The average position of the 8091 probes was 36 base pairs upstream of the transcription start sites. The average GC content of the probes was 65%. All probes lay within HpaII fragments shorter than 600 base pairs. The average length of the probe-containing HpaII fragments was 194 base pairs.
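The fragment-length condition can be illustrated with a small in-silico digest. This is only a sketch under stated assumptions, not the probe-design pipeline used in the study: the function name `ccgg_fragments` and the toy sequence are hypothetical, and the only facts relied on are that HpaII and MspI both recognize CCGG (cutting after the first C) and that probes were required to lie in fragments shorter than 600 base pairs.

```python
import re

SITE = "CCGG"  # recognition site shared by HpaII and MspI (cut: C^CGG)


def ccgg_fragments(seq, max_len=600):
    """Return (start, end) pairs for fragments bounded by consecutive CCGG sites.

    Coordinates are 0-based cut positions with an exclusive end.
    Only fragments shorter than max_len base pairs are kept.
    """
    seq = seq.upper()
    # Both enzymes cut after the first C of the site, hence the +1.
    cuts = [m.start() + 1 for m in re.finditer(SITE, seq)]
    return [(a, b) for a, b in zip(cuts, cuts[1:]) if b - a < max_len]


if __name__ == "__main__":
    toy = "AAACCGGTTTTCCGGAA"  # hypothetical sequence with two CCGG sites
    print(ccgg_fragments(toy))
```

A candidate probe would then be kept only if its coordinates fall entirely inside one of the returned intervals.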

We defined resistance as the reciprocal of sensitivity. Therefore, HpaII-sensitive (cleavable) DNA and MspI-sensitive (cleavable) DNA were amplified and used for calculating the HpaII resistance (HR) and MR, respectively. For HR, HpaII-cleavable unmethylated DNA was amplified (I). HpaII-cleaved DNA fragments were ligated to an adaptor and subjected to a first PCR (Figure 1a). At this stage, only DNA fragments that had methylated internal HpaII sites before the PCR retained HpaII (MspI) sites. Therefore, subsequent MspI digestion made it impossible to amplify these methylated fragments, and in the second, main PCR only unmethylated DNA fragments were amplified. Amplified unmethylated HpaII-cleaved DNA fragments from the two samples were labeled with Cy3 and Cy5, respectively, and co-hybridized to the microarray with 60-mer oligonucleotides from the promoter regions of 8091 genes. After hybridization, the microarray was scanned, and fluorescence intensities on the scanned image were quantified, corrected for background noise, and normalized with the software DNASIS Array (Hitachi Software Engineering). Spots with both Cy3 and Cy5 signals less than 0.001% of total signals were removed before analysis. HR was defined as 1/(normalized HpaII intensity). Therefore, the ratio of the HR values of two samples (HRB/HRA) can be represented by (normalized HpaII intensity of sample A)/(normalized HpaII intensity of sample B). For MR, all MspI-cleavable DNA (unmethylated plus methylated) was amplified (II). MspI-cleaved DNA fragments were amplified and labeled in the same way as HpaII-cleaved DNA fragments and then co-hybridized to another microarray with the same 8091 genes. MR was defined as 1/(normalized MspI intensity). Therefore, the ratio of the MR values of two samples (MRB/MRA) can be represented by (normalized MspI intensity of sample A)/(normalized MspI intensity of sample B). Details of all procedures are described in Supplementary information 1.
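The definitions above reduce to simple arithmetic on normalized intensities, sketched below. The function names and example intensities are hypothetical. Two points are assumptions rather than statements from the text: the logarithm is taken as base 10, and "total signals" in the spot filter is interpreted as the per-channel sum.

```python
import math


def keep_spots(cy3, cy5, min_fraction=1e-5):
    """Return indices of spots retained for analysis.

    A spot is dropped only if BOTH channels fall below min_fraction
    (0.001% = 1e-5) of the total signal of their own channel.
    Using per-channel totals is an assumption.
    """
    t3 = min_fraction * sum(cy3)
    t5 = min_fraction * sum(cy5)
    return [i for i, (a, b) in enumerate(zip(cy3, cy5))
            if not (a < t3 and b < t5)]


def log_resistance_ratio(intensity_a, intensity_b):
    """log10(R_B / R_A), where resistance R = 1 / normalized intensity.

    Because R is a reciprocal, R_B / R_A equals intensity_A / intensity_B.
    """
    return math.log10(intensity_a / intensity_b)


if __name__ == "__main__":
    # Hypothetical normalized intensities for one probe.
    x = log_resistance_ratio(500.0, 100.0)  # HpaII arrays: log(HR_B/HR_A)
    y = log_resistance_ratio(210.0, 200.0)  # MspI arrays:  log(MR_B/MR_A)
    print(round(x, 3), round(y, 3))
```

A fivefold drop in the HpaII signal of sample B relative to sample A thus corresponds to a value of log5 on the x-axis of Figure 1b.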

We applied the MIAMI method to a lung cancer cell line (1–87, abbreviated as LC) and a normal lung (abbreviated as C). Values of log(HRLC/HRC) and log(MRLC/MRC) are plotted on the x-axis and the y-axis, respectively, of Figure 1b. Several genes whose HR changed more than fivefold (abs[log(HRLC/HRC)]>log5, that is, points more than log5 from the y-axis in the horizontal direction) were selected as candidates (indicated by red and green circles in Figure 1b and Supplementary Figure 1). These genes were tested for methylation differences between the cancer and the normal lung by combined bisulfite restriction analysis (COBRA). The genes indicated by red circles were hypermethylated in the cancer cells, whereas the genes indicated by green circles showed no methylation-based changes (Figure 2a). To characterize these false positives without changes in methylation (green circles), PCR was conducted followed by digestion with HpaII to test for site polymorphisms. We found that these false positives have site polymorphisms between the cancer and the normal lung (Figure 2b). All of these false positives were close to the regression line (yellow line in Figure 1b and Supplementary Figure 1), along which changes in HR and MR are postulated to be equal. Therefore, we set threshold criteria that judge points located more than log5 from the regression line in the horizontal direction as altered genes (Figure 1b and Supplementary Figure 1). Points located more than this distance to the right of the regression line were judged as hypermethylated, and points located more than this distance to the left of the regression line were judged as hypomethylated. Using our criteria, we could exclude all false positives (green circles), and all genes meeting the criteria (red circles) had methylation changes, indicating that our threshold is reasonable for selecting genes with methylation changes.
Next we chose six genes that were located more than log5 from the regression line in the horizontal direction but less than log5 from the y-axis (indicated by orange and blue circles in Figure 1b and Supplementary Figure 1). These genes are judged as hypermethylated by our criteria, but their changes in HR are less than fivefold. COBRA analysis indicated that five of the six actually had methylation-based changes (Figure 2a), indicating again that our threshold criteria are useful for selecting genes with methylation changes (orange circles indicate positives and the blue circle indicates a false positive). Conventional, independent COBRA experiments using gene-specific primers confirmed 17 of 18 hypermethylations that were identified by integrated analysis of HR and MR at a threshold of log5. This suggests that our empirical rate of false positives is 6% (1/18). We used our threshold criteria to calculate the proportion of changed genes and found that 5.7% of the gene promoters were hypermethylated and 0.6% were hypomethylated in lung cancer (Figure 1c and Supplementary information 2). This frequency is much higher than a previous result in lung cancers (Yan et al., 2001), suggesting high sensitivity. Further improvements such as linear amplification could make this method more efficient, because a proportion of fragments is expected not to amplify by PCR and therefore to give no signal. Indeed, we removed 14% of spots, those with both Cy3 and Cy5 signals less than 0.001% of total signals, before analysis to obtain reproducible results.
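The threshold criteria can be sketched as a horizontal-distance test against the regression line. This is an illustration under assumptions, not the authors' analysis code: the text does not state how the regression line was fitted, so an ordinary least-squares fit of y on x is assumed, the logarithm is assumed to be base 10, and the names `fit_line` and `classify` are hypothetical.

```python
import math

THRESHOLD = math.log10(5)  # "log5"; base 10 is an assumption


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept (assumed method)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def classify(x, y, slope, intercept, threshold=THRESHOLD):
    """Classify one gene from x = log(HR ratio) and y = log(MR ratio).

    The horizontal distance is x minus the x-coordinate of the
    regression line at the same height y. The slope must be non-zero.
    """
    distance = x - (y - intercept) / slope
    if distance > threshold:
        return "hypermethylated"
    if distance < -threshold:
        return "hypomethylated"
    return "unchanged"


if __name__ == "__main__":
    slope, intercept = 1.0, 0.0  # idealized line: equal changes in HR and MR
    print(classify(1.0, 0.0, slope, intercept))  # HR changed, MR did not
    print(classify(1.0, 1.0, slope, intercept))  # HR and MR changed equally
```

In the second call the point lies on the line, as expected for a site polymorphism or incomplete digestion, so it is not called as a methylation change even though its HR changed tenfold.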

Figure 2

Characterization of genes detected by MIAMI in a lung cancer. The PCR primers used are indicated in Supplementary information 3. (a) COBRA analysis of the indicated genes. Genes indicated by red, orange, and blue circles met our criteria, whereas those indicated by green circles did not. COBRA analysis confirmed hypermethylation for the genes indicated by red and orange circles but not for the genes indicated by green and blue circles. U indicates bands originating from unmethylated DNA. Other bands originated from methylated DNA. (b) Characterization of genes not meeting our criteria. MspI (HpaII) polymorphisms were detected by PCR followed by digestion with MspI. All the genes not meeting the criteria have MspI (HpaII) polymorphisms. (c) COBRA and RT–PCR analysis of the CIDEB and MLH3 genes in six lung cancer cell lines (1–87, RERF-LCMS, EBC-1, LK-2, VMRC-LCP and LK79). U indicates bands originating from unmethylated DNA. Hypermethylation was observed in five of six lung cancer cell lines for CIDEB and in two of six for MLH3. RT–PCR analysis showed that expression was reduced in all of these hypermethylated cell lines. G3PDH was used as a control. (d) COBRA analysis of the CIDEB gene in primary tumours. Eight adenocarcinomas, eight squamous cell carcinomas and five small cell carcinomas were used for analysis. Seventy-one per cent (15/21) of primary tumours were hypermethylated.

Next we analysed the character of the 5′ sequences (from 1000 base pairs upstream to 200 base pairs downstream of the transcriptional start sites) of these hypermethylated and hypomethylated genes. The average ratio of CpG content to GC content (CG/GC) was calculated for hypermethylated, unchanged, and hypomethylated genes (Figure 1d). We found that hypomethylated genes had a low CG/GC ratio compared to genes without methylation change (P=4.0 × 10^−15, Figure 1d). However, the AT/TA ratio showed no such tendency (Figure 1d). This suggests that CpG-poor genes are more easily demethylated than CpG-rich genes, that is, they are more sensitive to demethylation activity. This could be explained by protection from demethylation activity by a methyl-CpG binding protein. Promoters with a low density of methyl-CpGs bind MeCP-1 less strongly than those with a high density of methyl-CpGs (Boyes and Bird, 1992). Therefore, it is intriguing to speculate that CpG-poor genes are less protected by MeCP-1 from demethylation activity. Aberrant hypomethylation is related to the activation of oncogenes. Therefore, our finding of the unique character of hypomethylated genes will help us to understand the mechanism of carcinogenesis.
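The ratios compared in Figure 1d can be sketched as dinucleotide counts. The text does not give the exact formula, so this sketch assumes that CG/GC means the number of CG dinucleotides divided by the number of GC dinucleotides on one strand, and likewise for AT/TA. The function name and the toy sequence are hypothetical.

```python
def dinucleotide_ratio(seq, numerator, denominator):
    """Ratio of two dinucleotide counts in seq, e.g. CG/GC or AT/TA.

    Returns None when the denominator dinucleotide is absent.
    str.count is exact here because a dinucleotide made of two
    different bases cannot overlap with itself.
    """
    seq = seq.upper()
    den = seq.count(denominator)
    if den == 0:
        return None
    return seq.count(numerator) / den


if __name__ == "__main__":
    toy = "CGCGCGTT"  # hypothetical sequence: three CG, two GC
    print(dinucleotide_ratio(toy, "CG", "GC"))
    print(dinucleotide_ratio(toy, "AT", "TA"))
```

Averaging this value over the 5′ sequences of each gene class would give one bar per class, with AT/TA serving as a comparison that is not expected to depend on methylation.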

Aberrant hypermethylation is known to inactivate tumour suppressor genes (Baylin et al., 1997; Ushijima, 2005). Among the hypermethylated genes we identified, CIDEB and MLH3 were analysed further. CIDEB (cell death-inducing DFFA-like effector b) activates apoptosis in mammalian cells (Inohara et al., 1998) and is located at 14q11, where LOH frequently occurs in lung cancers (Abujiang et al., 1998). MLH3 (MutL homolog 3) is a DNA mismatch repair gene associated with mammalian microsatellite instability (Lipkin et al., 2000). MLH1, from the same family, is frequently mutated in hereditary nonpolyposis colon cancer (Papadopoulos et al., 1994) and is involved in microsatellite instability in colon cancers (Jager et al., 1997). Methylation analysis of an additional five lung cancer cell lines using COBRA revealed hypermethylation in five of the six cell lines for CIDEB and in two of the six for MLH3 (Figure 2c). RT–PCR analysis showed that expression was reduced in all hypermethylated cell lines (Figure 2c), indicating that the expression profile of these genes correlated completely with their methylation profile. Further methylation analysis was performed for CIDEB in primary tumours using COBRA. We found that 71% (15/21) of primary lung cancers were hypermethylated in the promoter of CIDEB (Figure 2d). The present study was approved by the Ethics Committees of Tohoku University School of Medicine and Gunma University. Following a complete description of the research protocol, written informed consent was obtained from each participant.

In conclusion, MIAMI is a powerful method for genome-wide profiling of promoter methylation in the human genome. This method is useful for epigenetic studies of cancers.

References

  1. Abujiang P, Mori TJ, Takahashi T, Tanaka F, Kasyu I, Hitomi S et al. (1998). Oncogene 17: 3029–3033.

  2. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. (1990). J Mol Biol 215: 403–410.

  3. Baylin SB, Herman JG, Graff JR, Vertino PM, Issa J-P. (1997). In: Vande W and Klein G (eds). Advances in Cancer Research. Academic Press: San Diego. pp 141–196.

  4. Boyes J, Bird A. (1992). EMBO J 11: 327–333.

  5. Hatada I, Kato A, Morita S, Obata Y, Nagaoka K, Sakurada A et al. (2002). J Hum Genet 47: 448–451.

  6. Heisler LE, Torti D, Boutrous PC, Watson J, Chan C, Winegarden N et al. (2005). Nucleic Acids Res 33: 2952–2961.

  7. Hughes TR, Mao M, Jones AR, Burchard J, Marton MJ, Shannon KW et al. (2001). Nat Biotechnol 19: 342–347.

  8. Inohara N, Koseki T, Chen S, Wu X, Nunez G. (1998). EMBO J 17: 2526–2533.

  9. Jager AC, Bisgaard ML, Myrhoj T, Bernstein I, Rehfeld JF, Nielsen FC. (1997). Am J Hum Genet 61: 129–138.

  10. Lipkin SM, Wang V, Jacoby R, Banerjee-Basu S, Baxevanis AD, Lynch HT et al. (2000). Nat Genet 24: 27–35.

  11. Papadopoulos N, Nicolaides NC, Wei YF, Ruben SM, Carter KC, Rosen CA et al. (1994). Science 263: 1625–1629.

  12. Rozen S, Skaletsky HJ. (2000). In: Krawetz S and Misener S (eds). Bioinformatics Methods and Protocols: Methods in Molecular Biology. Humana Press: Totowa. pp 365–386.

  13. Takai D, Jones PA. (2002). Proc Natl Acad Sci USA 99: 3740–3745.

  14. Ushijima T. (2005). Nat Rev Cancer 5: 223–231.

  15. Weber M, Davies JJ, Wittig D, Oakeley EJ, Haase M, Lam WL et al. (2005). Nat Genet 37: 853–862.

  16. Yan PS, Chen CM, Shi H, Rahmatpanah F, Wei SH, Caldwell CW et al. (2001). Cancer Res 61: 8375–8380.

Acknowledgements

This work was supported in part by grants from the Japanese Science and Technology Agency (IH), the Ministry of Education, Culture, Sports, Science and Technology of Japan (IH), and the Ministry of Health, Labour and Welfare of Japan (IH). We thank the Cancer Cell Repository (Institute of Development, Aging and Cancer, Tohoku University) for providing cancer cell lines and Miss Asano for technical assistance.

Author information

Correspondence to I Hatada.

Additional information

Supplementary Information accompanies the paper on the Oncogene website (http://www.nature.com/onc).

Keywords

  • DNA methylation
  • genome-wide profiling
  • epigenetics
  • microarray
