Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web are being constructed: biological process, molecular function and cellular component.





We thank K. Fasman and M. Rebhan for useful discussions, and Astra Zeneca for financial support. SGD is supported by a P41, National Resources, grant from the National Human Genome Research Institute (NHGRI), grant HG01315; MGD by a P41 from NHGRI, grant HG00330; GXD by National Institute of Child Health and Human Development grant HD33745; and FlyBase by a P41 from NHGRI, grant HG00739, and the Medical Research Council, London.

Author information

Author notes

  • FlyBase

  • Berkeley Drosophila Genome Project

  • Saccharomyces Genome Database

  • Mouse Genome Database and Gene Expression Database

  • *The Gene Ontology Consortium


  1. Michael Ashburner
  2. Catherine A. Ball
  3. Judith A. Blake
  4. David Botstein
  5. Heather Butler
  6. J. Michael Cherry
  7. Allan P. Davis
  8. Kara Dolinski
  9. Selina S. Dwight
  10. Janan T. Eppig
  11. Midori A. Harris
  12. David P. Hill
  13. Laurie Issel-Tarver
  14. Andrew Kasarskis
  15. Suzanna Lewis
  16. John C. Matese
  17. Joel E. Richardson
  18. Martin Ringwald
  19. Gerald M. Rubin
  20. Gavin Sherlock
