Large-scale human promoter mapping using CpG islands

Abstract

Vertebrate genomic DNA is generally CpG depleted [1,2], possibly because cytosines are methylated at 80% of CpG dinucleotides and methylated cytosines frequently mutate to thymine, converting CpG to TpG dinucleotides [3]. There are, however, genomic regions of high G+C content (CpG islands) in which CpGs occur significantly more often, close to the expected frequency, and in which the level of methylation is significantly lower than in the genome overall [4]. CpG islands [5] are longer than 200 bp, have a G+C content above 50% and have a CpG frequency of at least 0.6 of that statistically expected. Approximately 50% of mammalian gene promoters are associated with one or more CpG islands [6]. Although biologists often intuitively use CpG islands to identify the 5′ ends of genes [7,8], this use has not been rigorously quantified [9]. We have determined the features that discriminate promoter-associated from non-associated CpG islands. This led to an effective algorithm for large-scale promoter mapping (with 2-kb resolution) with a rate of false-positive promoter predictions much lower than previously obtained. Using this algorithm, we correctly discriminated approximately 85% of the CpG islands within an interval (−500 to +1500) around a transcriptional start site (TSS) from those that lie further away from TSSs. We also correctly mapped approximately 93% of the promoters containing CpG islands.
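
The three island criteria quoted in the abstract (length over 200 bp, G+C content over 50%, CpG frequency at least 0.6 of that expected) can be illustrated with a minimal Python sketch. It is not the authors' CpG_promoter program. The abstract does not spell out how the expected CpG frequency is computed, so the sketch assumes the commonly used form, expected = (#C × #G) / N, for a sequence of length N. The function names `cpg_stats`, `is_cpg_island` and `in_tss_window` are hypothetical, and treating an island as "within" the (−500 to +1500) interval whenever it overlaps that interval is likewise an assumption.

```python
from typing import NamedTuple


class CpGStats(NamedTuple):
    length: int
    gc_fraction: float
    cpg_obs_exp: float


def cpg_stats(seq: str) -> CpGStats:
    """Return length, G+C fraction and observed/expected CpG ratio."""
    s = seq.upper()
    n = len(s)
    if n == 0:
        return CpGStats(0, 0.0, 0.0)
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n
    # Assumption: expected CpG count = (#C * #G) / N,
    # so observed/expected = #CpG * N / (#C * #G).
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return CpGStats(n, gc_fraction, obs_exp)


def is_cpg_island(seq: str, min_len: int = 200, min_gc: float = 0.5,
                  min_obs_exp: float = 0.6) -> bool:
    """Apply the three thresholds quoted in the abstract."""
    st = cpg_stats(seq)
    return (st.length > min_len
            and st.gc_fraction > min_gc
            and st.cpg_obs_exp >= min_obs_exp)


def in_tss_window(island_start: int, island_end: int, tss: int,
                  strand: str = "+", upstream: int = 500,
                  downstream: int = 1500) -> bool:
    """True if the island overlaps the (-500, +1500) window around a TSS.

    Assumption: overlap (not full containment) counts as "within".
    On the minus strand the window is mirrored around the TSS.
    """
    if strand == "+":
        lo, hi = tss - upstream, tss + downstream
    else:
        lo, hi = tss - downstream, tss + upstream
    return island_start <= hi and island_end >= lo
```

In practice such a check would be applied to sliding windows along a genomic sequence; the sketch only evaluates one candidate sequence or one island/TSS pair at a time.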

References

  1. Bird, A.P. DNA methylation and the frequency of CpG in animal DNA. Nucleic Acids Res. 8, 1499–1504 (1980).

  2. Jones, P.A., Rideout, W.M. 3d, Shen, J.C., Spruck, C.H. & Tsai, Y.C. Methylation, mutation and cancer. Bioessays 14, 33–36 (1992).

  3. Bird, A. DNA methylation de novo. Science 286, 2287–2288 (1999).

  4. Antequera, F. & Bird, A. CpG islands. EXS 64, 169–185 (1993).

  5. Gardiner-Garden, M. & Frommer, M. CpG islands in vertebrate genomes. J. Mol. Biol. 196, 261–282 (1987).

  6. Antequera, F. & Bird, A. Number of CpG islands and genes in human and mouse. Proc. Natl Acad. Sci. USA 90, 11995–11999 (1993).

  7. Cross, S.H. & Bird, A.P. CpG islands and genes. Curr. Opin. Genet. Dev. 5, 309–314 (1995).

  8. Dunham, I. et al. The DNA sequence of human chromosome 22. Nature 402, 489–495 (1999).

  9. Pedersen, A.G., Baldi, P., Chauvin, Y. & Brunak, S. The biology of eukaryotic promoter prediction—a review. Comput. Chem. 23, 191–207 (1999).

  10. Venables, W.N. & Ripley, B.D. Modern Applied Statistics with S-Plus (Springer, New York, 1994).

  11. McLachlan, G.J. Discriminant Analysis and Statistical Pattern Recognition (Wiley, New York, 1992).

  12. Prestridge, D.S. Predicting Pol II promoter sequences using transcription factor binding sites. J. Mol. Biol. 249, 923–932 (1995).

  13. Toyota, M. & Issa, J.P. CpG island methylator phenotypes in aging and cancer. Semin. Cancer Biol. 9, 349–357 (1999).

  14. Baylin, S.B. & Herman, J.G. DNA hypermethylation in tumorigenesis: epigenetics joins genetics. Trends Genet. 16, 168–174 (2000).

  15. Barlow, D.P. Gametic imprinting in mammals. Science 270, 1610–1613 (1995).

  16. Singer-Sam, J. & Riggs, A.D. X chromosome inactivation and DNA methylation. EXS 64, 358–384 (1993).

  17. Larsen, F., Gundersen, G., Lopez, R. & Prydz, H. CpG islands as gene markers in the human genome. Genomics 13, 1095–1107 (1992).

  18. Cross, S.H., Charlton, J.A., Nan, X. & Bird, A.P. Purification of CpG islands using a methylated DNA binding column. Nature Genet. 6, 236–244 (1994).

  19. Cross, S.H., Clark, V.H. & Bird, A.P. Isolation of CpG islands from large genomic clones. Nucleic Acids Res. 27, 2099–2107 (1999).

  20. Zhang, M.Q. in Proceedings of Pacific Symposium on Biocomputing 1998 (eds Altman, R.B. et al.) 240–251 (World Scientific, Singapore, 1998).

  21. Zhang, M.Q. Identification of protein coding regions in the human genome based on quadratic discriminant analysis. Proc. Natl Acad. Sci. USA 94, 565–568 (1997).

  22. Zhang, M.Q. Statistical features of human exons and their flanking regions. Hum. Mol. Genet. 7, 919–932 (1998).

Acknowledgements

We thank R. Bari for assistance in sequence annotation; T. Zhang for assistance in testing CpG_promoter; S.H. Cross for discussions; P. Rice and R. Lopez for consultations about the EMBOSS project and CpGPlot program; and J. Locker and S. Emmons for editing of the text. This work was supported by National Institutes of Health Grant HG01696 to M.Q.Z.

Author information

Corresponding author

Correspondence to Michael Q. Zhang.

About this article

Cite this article

Ioshikhes, I., Zhang, M. Large-scale human promoter mapping using CpG islands. Nat Genet 26, 61–63 (2000). https://doi.org/10.1038/79189

  • DOI: https://doi.org/10.1038/79189
