Original Article

Oncogene (2007) 26, 1517–1521. doi:10.1038/sj.onc.1209952; published online 4 September 2006

The p53 knowledgebase: an integrated information resource for p53 research

Y P Lim1, T T Lim1,3, Y L Chan1,3, A C M Song1,3, B H Yeo1,3, B Vojtesek2, D Coomber2, G Rajagopal1 and D Lane2

  1. Bioinformatics Institute, Matrix, Singapore, Singapore
  2. Institute of Molecular and Cell Biology, Proteos, Singapore, Singapore

Correspondence: Professor Sir David Lane, Institute of Molecular and Cell Biology, 61 Biopolis Street, Proteos, Singapore 138673, Singapore. E-mail: d.p.lane@imcb.a-star.edu.sg; Dr G Rajagopal, Bioinformatics Institute, 30 Biopolis Street, #07-01 Matrix, Singapore 138671, Singapore. E-mail: guna@bii.a-star.edu.sg

3These authors contributed equally to this work.

Received 16 May 2006; Accepted 27 July 2006; Published online 4 September 2006.

Abstract

The p53 tumor suppressor protein plays a central role in maintaining genomic integrity by occupying a nodal point in the DNA damage control pathway. Here it integrates a wide variety of signals and responds in one of several ways: cell cycle arrest, senescence or programmed cell death (apoptosis). Mutations in the tumor suppressor gene tp53, which affect the key transcriptional regulatory processes in cell growth and death, occur frequently in cancer and help explain why p53 has been called the guardian of the genome. There is a vast body of published knowledge on all aspects of p53's role in cancer. To facilitate research, it would be helpful if this information could be collected, curated and updated in a format that is easily accessible to the user community. To this end, we initiated the p53 knowledgebase project (http://p53.bii.a-star.edu.sg). The p53 knowledgebase is a user-friendly web portal that incorporates visualization and analysis tools and integrates information from the published literature with other manually curated information to facilitate knowledge discovery. This includes curated information on the sequences, structures, mutations, polymorphisms, protein–protein interactions, transcription factors, transcriptional targets, antibodies and post-translational modifications that involve p53. The goal is to collect and maintain all relevant data on p53 and present it in an easily accessible format that will be useful to researchers in the field.

Keywords: database, p53, web portal

Introduction

The p53 tumor suppressor protein (Lane and Crawford, 1979; Linzer and Levine, 1979) plays a central role in maintaining genomic integrity by occupying a nodal point in the DNA damage control pathway. Here it integrates a wide variety of signals and responds in one of several ways: cell cycle arrest, senescence or programmed cell death (apoptosis). It does this through its role as a key transcriptional regulator of cell growth and death during times of cellular stress such as genotoxic shock (Kastan et al., 1991; Lu and Lane, 1993; Fritsche et al., 1993), hypoxia (Graeber et al., 1996), nucleotide pool reduction (Linke et al., 1996), thermal shock (Sugano et al., 1995; Nitta et al., 1997) and low pH (Williams et al., 1999). Mutations in the tumor suppressor gene tp53, which affect the key transcriptional regulatory processes in cell growth and death, occur frequently in cancer (Vogelstein, 1990; Hollstein et al., 1991) and help explain why p53 has been called the guardian of the genome (Lane, 1992).

There has been extensive analysis of the role of p53 in the cell and of its clinical relevance to cancer. Numerous reviews have been written in an effort to collate and order this data. In fact, a PubMed search with p53 as the search term identifies over 37 000 articles to date. Furthermore, large mutation databases have been constructed, such as the IARC TP53 Database (Olivier et al., 2002) and the p53 web site by T Soussi (Beroud and Soussi, 2003), with about 1000–1500 new mutations being added to the IARC TP53 Database annually.

The p53 knowledgebase described in this publication is a new resource that provides an extensive overview of publicly available information on all aspects of the tp53 gene and its protein. It includes data such as the isoforms, mutations, polymorphisms, protein–protein interactions, transcription factors, transcriptional targets, antibodies and post-translational modifications that involve the tp53 gene or its protein product p53. The p53 knowledgebase aims to integrate existing information about tp53, to summarize it through periodic curation that combines manual review with natural language processing, and to provide it to the user community through user-friendly visualization and analysis tools that aid in mining new knowledge and drive both discovery-based and hypothesis-based investigations. In the near future, we plan to expand on this effort to include additional information on other key players within the p53 pathway, such as p63/73, the mdm2 family and the Rb pathway. The ultimate goal is to build a comprehensive resource that integrates all relevant information on p53 in one user-friendly web portal, actively maintained and curated with the support and collaboration of the user community. The knowledgebase is publicly available via the web at http://p53.bii.a-star.edu.sg.

Results

Organization and features of the p53 knowledgebase

The knowledgebase is organized to facilitate easy access to the relevant information and has benefited from feedback from members of the user community who are currently using it for their research. The p53 knowledgebase is a relational database built in MySQL, and the data is organized into 11 sections (isoforms, mutations, polymorphisms, haplotypes, p53-related molecules, transcription factors, transcriptional targets, protein interactions, monoclonal antibodies, modifications and Protein Data Bank (PDB) structures). Information in each of these sections can be accessed directly through a set of search pages and viewers. For instance, mutation records can be accessed directly through a search page located in the mutation section of the p53 knowledgebase. Interactive viewers are also available for some sections, such as the isoforms, transcriptional targets, protein interactions and monoclonal antibodies sections, to aid searching of the data and visualization of search results (Figure 1). To ensure continuity and accuracy of the data incorporated in the various sections, a curator has been assigned responsibility for each section, and the curators' names and contact details can be found at the top of each page. This ensures that the data is periodically updated, and the curators serve as a point of contact for members of the user community to provide feedback and new results.
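As an illustration of how a section of a relational store can sit behind a search page, the following minimal sketch builds a single mutation table and queries it by codon range. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name, column names and function names are assumptions made for this example; they do not describe the actual schema of the p53 knowledgebase. The four rows are well-known p53 hotspot substitutions, included only as demonstration data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with one hypothetical 'mutation' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE mutation (
            id           INTEGER PRIMARY KEY,
            codon        INTEGER NOT NULL,
            wild_type_aa TEXT    NOT NULL,
            mutant_aa    TEXT    NOT NULL
        )
        """
    )
    # Demonstration rows only; a real section would hold curated records.
    conn.executemany(
        "INSERT INTO mutation (codon, wild_type_aa, mutant_aa) VALUES (?, ?, ?)",
        [(175, "R", "H"), (248, "R", "W"), (248, "R", "Q"), (273, "R", "H")],
    )
    conn.commit()
    return conn


def search_mutations(conn, codon_start, codon_end):
    """Return mutations whose codon lies in [codon_start, codon_end]."""
    rows = conn.execute(
        "SELECT codon, wild_type_aa, mutant_aa FROM mutation "
        "WHERE codon BETWEEN ? AND ? ORDER BY codon, mutant_aa",
        (codon_start, codon_end),  # parameters are bound, not interpolated
    )
    return [f"{wt}{codon}{mt}" for codon, wt, mt in rows]


if __name__ == "__main__":
    db = build_demo_db()
    print(search_mutations(db, 240, 280))
```

Binding the search values as query parameters, rather than formatting them into the SQL string, is the usual way to keep a web-facing search page safe from malformed input.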

Figure 1.

The viewers in the (a) transcriptional targets, (b) protein interactions and (c) monoclonal antibodies sections of the p53 knowledgebase. The search pages and viewers for each section in the knowledgebase have been designed to allow quick searches for specific records. (a) The transcriptional targets viewer allows the researcher to visualize the chromosomal position of a p53 target gene, color coded by its transcriptional activity. The researcher can obtain more detailed information by clicking on the gene name. (b) The protein interactions viewer allows the researcher to view all direct protein interactions with p53. The molecular action of each interaction is represented as a set of cartoons. Basic information, such as the name of the interacting protein, its synonyms and the number of pieces of supporting experimental evidence, is presented in a summary section at the bottom. The researcher can navigate further to obtain more detailed information such as the gene ontology, function and structure of the interacting protein.

A unique feature in the p53 knowledgebase is the sequence viewers – the DNA sequence viewer and the protein sequence viewer (Figure 2). The sequence viewers provide an overview of available information about the genomic and amino-acid sequences of p53, such as the polymorphism sites, protein-binding sites and antigenic determinants (or epitopes) in a graphical, user-friendly format. From the sequence viewers, the user can navigate further to obtain more detailed information, such as the binding sequence of a specific protein-binding site and annotations of the binding protein.

Figure 2.

The protein sequence viewer component of the p53 knowledgebase, which consolidates information about the p53 protein found in the knowledgebase. A diagram of the p53 protein with its domains is located at the top and when a researcher clicks on any part of the diagram, the sequence information for that section of the p53 protein (indicated by the red box on the diagram) is displayed at the bottom, with other details such as mutation sites and protein-binding sites on the p53 protein aligned to the sequence.
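The behaviour described for the protein sequence viewer, selecting a region and showing the features aligned to it, can be sketched as an interval-overlap lookup. The sketch below is an assumption about one way to do this, not the viewer's actual implementation: the Annotation class, the function names, the placeholder sequence and the feature labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    label: str
    start: int  # 1-based, inclusive residue position
    end: int    # 1-based, inclusive residue position


def annotations_in_window(annotations, win_start, win_end):
    """Keep annotations that overlap the residue window [win_start, win_end]."""
    return [a for a in annotations if a.start <= win_end and a.end >= win_start]


def render_window(sequence, annotations, win_start, win_end):
    """Return the windowed sequence with one aligned text track per annotation."""
    lines = [sequence[win_start - 1:win_end]]
    for a in annotations_in_window(annotations, win_start, win_end):
        lo = max(a.start, win_start)  # clip the feature to the window
        hi = min(a.end, win_end)
        track = " " * (lo - win_start) + "-" * (hi - lo + 1)
        lines.append(f"{track} {a.label}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder sequence and features, NOT the real p53 sequence.
    demo_sequence = "ACDEFGHIKLMNPQRSTVWY"
    demo_features = [
        Annotation("site_A", 3, 6),
        Annotation("site_B", 10, 15),
        Annotation("site_C", 18, 20),
    ]
    print(render_window(demo_sequence, demo_features, 5, 12))
```

Clipping each feature to the window keeps the track aligned with the displayed residues even when a feature extends beyond the selected region.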

The p53 knowledgebase aims to assist in information mining and knowledge generation by providing an alternative presentation of publicly available information. For example, the single nucleotide polymorphism (SNP) allele frequencies for various populations are categorized according to the geographical locations of these populations. This allows the researchers to visualize whether specific SNP allele frequencies are associated with physical factors such as the geographical location of the population.
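The grouping of allele frequencies by geographical location can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout (population, region, frequency), the function names, and all population names, region names and frequency values are placeholders invented for the example and are not data from the knowledgebase.

```python
from collections import defaultdict
from statistics import mean


def frequencies_by_region(records):
    """Group (population, region, allele_frequency) records by region."""
    grouped = defaultdict(list)
    for population, region, freq in records:
        grouped[region].append((population, freq))
    return dict(grouped)


def mean_frequency_by_region(records):
    """Unweighted mean allele frequency per region."""
    return {
        region: mean(freq for _, freq in entries)
        for region, entries in frequencies_by_region(records).items()
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    demo_records = [
        ("pop_1", "region_X", 0.25),
        ("pop_2", "region_X", 0.75),
        ("pop_3", "region_Y", 0.5),
    ]
    for region, value in mean_frequency_by_region(demo_records).items():
        print(region, value)
```

The mean here is unweighted; when populations differ in sample size, weighting each frequency by the number of chromosomes genotyped would usually give a more representative regional summary.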

Data entry

The information available in the p53 knowledgebase is extracted from existing databases and supplemented with manual curation of the literature. Currently, there are over 20 000 records in the knowledgebase (Table 1), and ongoing efforts are in place to increase the number of records and keep the information up to date. A systematic workflow for the curation process has been put in place for convenience and efficiency. This includes a basic quality control mechanism, but we strongly encourage members of the user community to participate in checking the information presented, to ensure the correctness of the data incorporated into the knowledgebase. Feedback can be provided via the web portal or through email (see below). A set of web-based maintenance pages allows data to be entered into the p53 knowledgebase using a web browser. Access to these maintenance pages is password protected and is available only to a group of curators. However, the growth of the knowledgebase is envisioned to be a collective effort from the scientists and curators in the p53 community. A feedback form is available at http://p53.bii.a-star.edu.sg/feedback.php for contributing new data or updating existing data.


Analysis tools

In addition to the data repository and visualization components, the p53 knowledgebase also contains a set of frequently used analysis tools, which are summarized in Table 2. These tools, some obtained from external parties and others developed by our bioinformaticians to meet specific needs of this project, have been incorporated within the web portal and run on our backend compute platforms.


Discussion

Given the tremendous interest in p53 and the massive amount of literature published on it each year, it is difficult for a single investigator to review the literature comprehensively and analyse disparate data effectively. To address these concerns, the p53 knowledgebase was constructed to enable researchers at all levels to easily access data relating to p53. The p53 knowledgebase is a unique resource that integrates information from a wide range of sources and makes it accessible from a single web portal. The authors have aimed to design a resource that integrates analysis tools with a user-friendly graphical interface. As such, it enables detailed analysis of large databases quickly and also serves as a learning platform for researchers who are new to the field of p53 research. To ensure that the data is current, a curation process involving curators assigned to oversee each section has been put in place. The curators will also serve as a point of contact to gather feedback, as well as a conduit to expand the information content in the respective sections with new data that is validated by, and useful to, the wider user community.

Pioneering efforts by existing resources on p53-related information, such as the IARC TP53 Database and the p53 web site by T Soussi, have inspired us. The p53 knowledgebase is designed to supplement and enhance the valuable contribution that the IARC and T Soussi databases have made to the community, to facilitate knowledge discovery and to drive both discovery-based and hypothesis-based investigation by researchers working on p53. In addition, other sources of information, such as p53 interactions, structures and post-translational modifications, have been integrated through literature-based manual curation and automated annotation to provide a more complete understanding of this vital protein. It is envisaged that the p53 knowledgebase will evolve with increasing input from the p53 community, keeping information up to date and relevant. As such, it will be a valuable research tool for all in the field. In the near future, we plan to expand on this effort to include additional information on other key players such as p63/73, the mdm2 family and members of the Rb pathway, with the ultimate goal of building a comprehensive resource that will aid and facilitate research on cancer.

Materials and methods

The p53 knowledgebase contains mutation records obtained from the IARC TP53 Mutation Database. The polymorphism records are manually curated and supplemented by data from the International HapMap Project (The International HapMap Consortium, 2003), Perlegen Sciences Genotype Data (http://genome.perlegen.com), National Institute of Environmental Health Sciences SNPs (http://egp.gs.washington.edu) and IARC. The data in the haplotype section of the knowledgebase is obtained from the International HapMap Project and analysed using the HapBlock tool (Zhang et al., 2005). The annotations of the molecules in the p53-related molecules section are extracted automatically from the UCSC Genome Browser (http://genome.ucsc.edu), Homologene (Wheeler et al., 2003) and the Biomolecular Interaction Network Database (BIND; Bader et al., 2001). The records in the transcription factor and transcriptional target sections are manually curated and supplemented by data from the Transcriptional Regulatory Element Database (Zhao et al., 2005), TRANSFAC (Wingender et al., 2000), BIND, BioCarta (http://www.biocarta.com) and the Kyoto Encyclopedia of Genes and Genomes (Kanehisa and Goto, 2000). The protein interaction records are manually curated and supplemented by records from the AfCS-Nature Signaling Gateway (http://www.signaling-gateway.org) and BIND. The monoclonal antibody records are obtained from manually curated records provided by Borivoj Vojtesek from the Institute of Molecular and Cell Biology. The PDB structures are obtained from the RCSB PDB (Berman et al., 2000). The isoforms and post-translational modifications are manually curated from the literature.

References

  1. Bader GD, Donaldson I, Wolting C, Ouellette BFF, Pawson T, Hogue CWV. (2001). BIND: The Biomolecular Interaction Network Database. Nucleic Acids Res 29: 242–245.
  2. Berman HM, Westbrook J, Feng X, Gilliland G, Bhat TN, Weissig H et al. (2000). The Protein Data Bank. Nucleic Acids Res 28: 235–242.
  3. Beroud C, Soussi T. (2003). The UMD-p53 database: new mutations and analysis tools. Hum Mutat 21: 176–181.
  4. Combet C, Blanchet C, Geourjon C, Deleage G. (2000). NPS@: network protein sequence analysis. Trends Biochem Sci 25: 147–150.
  5. Fritsche M, Haessler C, Brandner G. (1993). Induction of nuclear accumulation of the tumor-suppressor protein p53 by DNA-damaging agents. Oncogene 8: 307–318.
  6. Graeber TG, Osmanian C, Jacks T, Housman DE, Koch CJ, Lowe SW et al. (1996). Hypoxia-mediated selection of cells with diminished apoptotic potential in solid tumours. Nature 379: 88–91.
  7. Hollstein M, Sidransky D, Vogelstein B, Harris CC. (1991). p53 mutations in human cancers. Science 253: 49–53.
  8. Kanehisa M, Goto S. (2000). KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res 28: 27–30.
  9. Kastan MB, Onyekwere O, Sidransky D, Vogelstein B, Craig RW. (1991). Participation of p53 protein in the cellular response to DNA damage. Cancer Res 51: 6304–6311.
  10. Lane DP. (1992). Cancer. p53, guardian of the genome. Nature 358: 15–16.
  11. Lane DP, Crawford LV. (1979). T antigen is bound to a host protein in SV40-transformed cells. Nature 278: 261–263.
  12. Linke SP, Clarkin KC, Di Leonardo A, Tsou A, Wahl GM. (1996). A reversible, p53-dependent G0/G1 cell cycle arrest induced by ribonucleotide depletion in the absence of detectable DNA damage. Genes Dev 10: 934–947.
  13. Linzer DI, Levine AJ. (1979). Characterization of a 54 K dalton cellular SV40 tumor antigen present in SV40-transformed cells and uninfected embryonal carcinoma cells. Cell 17: 43–52.
  14. Lu X, Lane DP. (1993). Differential induction of transcriptionally active p53 following UV or ionizing radiation: defects in chromosome instability syndromes? Cell 75: 765–778.
  15. Nitta M, Okamura H, Aizawa S, Yamaizumi M. (1997). Heat shock induces transient p53-dependent cell cycle arrest at G1/S. Oncogene 15: 561–568.
  16. Olivier M, Eeles R, Hollstein M, Khan MA, Harris CC, Hainaut P. (2002). The IARC TP53 database: new online mutation analysis and recommendations to users. Hum Mutat 19: 607–614.
  17. Sugano T, Nitta M, Ohmori H, Yamaizumi M. (1995). Nuclear accumulation of p53 in normal human fibroblasts is induced by various cellular stresses which evoke the heat shock response, independently of the cell cycle. Jpn J Cancer Res 86: 415–418.
  18. The International HapMap Consortium (2003). The International HapMap Project. Nature 426: 789–796.
  19. Thompson JD, Higgins DG, Gibson TJ. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22: 4673–4680.
  20. Vogelstein B. (1990). A deadly inheritance. Nature 348: 681–682.
  21. Wheeler DL, Church DM, Federhen S, Lash AE, Madden TL, Pontius JU et al. (2003). Database resources of the National Center for Biotechnology. Nucleic Acids Res 31: 28–33.
  22. Williams AC, Collard TJ, Paraskeva C. (1999). An acidic environment leads to p53 dependent induction of apoptosis in human adenoma and carcinoma cell lines: implications for clonal selection during colorectal carcinogenesis. Oncogene 18: 3199–3204.
  23. Wingender E, Chen X, Hehl R, Karas H, Liebich I, Matys V et al. (2000). TRANSFAC: an integrated system for gene expression regulation. Nucleic Acids Res 28: 316–319.
  24. Zhang K, Qin Z, Chen T, Liu JS, Waterman MS, Sun F. (2005). HapBlock: haplotype block partitioning and tag SNP selection software using a set of dynamic programming algorithms. Bioinformatics 21: 131–134.
  25. Zhao F, Xuan Z, Liu L, Zhang MQ. (2005). TRED: a Transcriptional Regulatory Element Database and a platform for in silico gene regulation studies. Nucleic Acids Res 33: 103–107.

Acknowledgements

We acknowledge colleagues in the Cancer Biology Group at the Bioinformatics Institute (Chung Cheuk Wang, Chua Gek Huey, Erwin Tantoso, Yang Yuchen, Wong Sum Thai, Yeo Zhenxuan and Felicia Ng) for their curation effort, as well as advice and support from the p53 Focus Group made up of colleagues from our sister institutes within and outside the Biopolis. We are grateful for the advice, guidance and support from Professors Arnold Levine from the IAS/Princeton and Sidney Brenner. We acknowledge the efforts of Danny Chuon from the Web Services team for infrastructure support and various teams within the BioComputing Center at the Bioinformatics Institute for their excellent technical support throughout this project. We are grateful for the generous help and support of members of the computational biology community who have provided codes and advice for incorporation into our visualization and analysis tools. This work is supported by the Biomedical Research Council of the Agency for Science, Technology and Research of Singapore.
