Synopsis

Subject Categories: Bioinformatics | Proteomics

Molecular Systems Biology 4 Article number: 180  doi:10.1038/msb.2008.19
Published online: 15 April 2008
Citation: Molecular Systems Biology 4:180

A map of human protein interactions derived from co-expression of human mRNAs and their orthologs

Arun K Ramani1,4,a, Zhihua Li1,4, G Traver Hart1, Mark W Carlson2,a, Daniel R Boutz1 & Edward M Marcotte1,3,4

  1. Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas, Austin, TX, USA
  2. Department of Biomedical Engineering, University of Texas, Austin, TX, USA
  3. Department of Chemistry & Biochemistry, University of Texas, Austin, TX, USA
  4. These authors contributed equally to this work

Correspondence to: Edward M Marcotte1,3,4 Department of Chemistry & Biochemistry, University of Texas, Austin, TX 78712, USA. Tel.: +512 471 5435; Fax: +512 232 3432; Email: marcotte@icmb.utexas.edu

Received 20 August 2007; Accepted 20 February 2008; Published online 15 April 2008

aPresent address: The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK

aPresent address: Division of Cancer Biology and Tissue Engineering, Department of Oral and Maxillofacial Pathology, School of Dental Medicine, Tufts University, Boston, MA 02111, USA

Article highlights

  • The scale of the human interactome appears to be beyond any individual technique; a combination of complementary approaches will be needed to map the complete human protein-protein interaction network.
  • Although current methods for mapping interactions focus largely on direct experimental observations, sufficient functional genomics data exist that physical protein associations can also be indirectly identified from these data.
  • We develop and apply such a method for inferring physical associations among human proteins based upon the co-expression of their mRNAs and that of their orthologs in five organisms (mustard plant, mouse, fly, nematode, and yeast). By this approach, we mapped 7,000 human protein physical associations, of which 5,589 are novel. From a panel of benchmark tests, including direct assay by quantitative mass spectrometry, we estimate the true-positive rate of these associations to be 54 ± 10%. This indirect approach is thus comparable in scale and quality, both in terms of false-positive and false-negative rates, to the current largest-scale experimental screens.

Extended Synopsis

Although considerable progress has been made in mapping the protein interaction network of yeast (Reguly et al, 2006), only minimal progress has been made on the interaction networks of higher eukaryotes, due primarily to their scale. For the approximately 20–25 000 human proteins, we expect a network of roughly 1–400 000 interactions, but the human protein interaction map is currently perhaps only 10–30% complete (Hart et al, 2006). It is therefore important to identify and employ methods for discovering interacting proteins that do not require exhaustive experimental measurement of all pairs of proteins.
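
To give a sense of why exhaustive pairwise measurement is impractical, the short sketch below counts the unordered protein pairs implied by the 20–25 000 protein estimate and compares them with the upper figure of 400 000 interactions quoted above. The function names are illustrative, and the calculation assumes one protein per gene and ignores self-interactions; it is a back-of-the-envelope estimate, not part of the published analysis.

```python
from math import comb


def candidate_pairs(n_proteins: int) -> int:
    """Number of unordered protein pairs (no self-pairs) among n proteins."""
    return comb(n_proteins, 2)


def interacting_fraction(n_interactions: int, n_proteins: int) -> float:
    """Fraction of all candidate pairs that would be true interactions."""
    return n_interactions / candidate_pairs(n_proteins)


if __name__ == "__main__":
    # Proteome sizes taken from the text; 400 000 is the upper interaction estimate.
    for n in (20_000, 25_000):
        pairs = candidate_pairs(n)
        frac = interacting_fraction(400_000, n)
        print(f"{n} proteins: {pairs} pairs, interacting fraction {frac:.4%}")
```

Under these assumptions, even the upper interaction estimate corresponds to roughly 0.1–0.2% of the 2–3 × 10^8 candidate pairs.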

In this paper, we developed such a method for large-scale mapping of physical associations (i.e. membership in the same physical complex) among human proteins. Rather than assaying interactions directly, we infer physical associations indirectly from mRNA transcriptional evidence and the evolutionary conservation of co-regulatory relationships. Previous studies have shown that the evolutionary conservation of mRNA co-expression (CCE) patterns of orthologous genes can be used to transfer functional information across species (Teichmann and Babu, 2002; Stuart et al, 2003; van Noort et al, 2003; Bergmann et al, 2004; Snel et al, 2004). Similarly, transcriptional co-expression patterns have proved useful for inferring physical protein interactions (e.g. Deane et al, 2002; Jansen et al, 2003), with strongly co-expressed mRNAs more likely to indicate long-lived interactions (Ge et al, 2001; Jansen et al, 2002; Simonis et al, 2006). We took advantage of this signal to develop a supervised approach for identifying physical associations among human proteins based upon the co-expression of their mRNAs and that of their orthologs in five organisms (the mustard plant Arabidopsis thaliana, the mouse Mus musculus, the fly Drosophila melanogaster, the nematode Caenorhabditis elegans, and the yeast Saccharomyces cerevisiae). By this approach, we mapped 7000 predicted human protein physical associations, of which 5589 are novel.
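
The idea of conserved co-expression can be illustrated with a minimal sketch. It is not the supervised scoring scheme used in the paper: it simply counts, for each co-expressed human gene pair, the number of other species in which the orthologs are also co-expressed. The function names, the data layout (dictionaries of expression profiles and ortholog maps), the toy gene identifiers, and the fixed Pearson threshold of 0.8 are all assumptions made for illustration.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / math.sqrt(vx * vy)


def conserved_coexpression(human, others, ortholog_map, threshold=0.8):
    """For each co-expressed human pair, count species where the orthologs
    are also co-expressed.

    human:        {gene: [expression values]}
    others:       {species: {gene: [expression values]}}
    ortholog_map: {species: {human_gene: ortholog_gene}}
    """
    support = {}
    for g1, g2 in combinations(sorted(human), 2):
        if pearson(human[g1], human[g2]) < threshold:
            continue  # not co-expressed in human; skip
        count = 0
        for species, profiles in others.items():
            o1 = ortholog_map.get(species, {}).get(g1)
            o2 = ortholog_map.get(species, {}).get(g2)
            # Both orthologs must exist and have measured profiles.
            if o1 in profiles and o2 in profiles:
                if pearson(profiles[o1], profiles[o2]) >= threshold:
                    count += 1
        support[(g1, g2)] = count
    return support


if __name__ == "__main__":
    # Toy data with hypothetical gene names.
    human = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8.5], "C": [4, 1, 3, 2]}
    others = {
        "yeast": {"yA": [1, 3, 5, 7], "yB": [2, 6, 10, 15]},
        "fly": {"fA": [1, 2, 3, 4], "fB": [4, 3, 2, 1]},
    }
    ortholog_map = {
        "yeast": {"A": "yA", "B": "yB"},
        "fly": {"A": "fA", "B": "fB"},
    }
    print(conserved_coexpression(human, others, ortholog_map))
```

In this toy example only the pair (A, B) is co-expressed in human, and its orthologs are co-expressed in yeast but anti-correlated in fly, so the pair receives a support count of 1. A supervised method such as the one described here would instead weight this kind of evidence against a benchmark set of known interactions.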

As this assay relies upon indirect evidence, it is critical that putative physical associations discovered by this approach be carefully evaluated. Beyond testing directly for the recovery of known human protein interactions, we devised measures of the enrichment of true physical associations, including sharing of their functional annotation, direct assay of co-sedimentation of putative interaction partners in sucrose density gradients (for which we quantified the biochemical separation profiles of 3013 human proteins via the use of shotgun proteomics mass spectrometry), recovery of known interactions among orthologs in other organisms, and connectivity of orthologs in a yeast gene network. By these tests, we estimate the true-positive rate of the CCE-based associations to be 54 ± 10%, comparable to direct large-scale interaction assays. We further demonstrate the specific functions of four proteins based upon the interactions, specifically implicating the proteins in biogenesis of the 40S and 60S ribosomal subunits.

The scale of the human interactome appears to be beyond any individual technique; a combination of complementary approaches will be needed to map the complete human protein–protein interaction network. Although current methods for mapping interactions focus largely on direct experimental observations, sufficient functional genomics data exist that physical protein associations can also be indirectly identified from these data. We demonstrate that these approaches can be comparable in scale and quality, both in terms of false-positive and false-negative rates, to the current largest-scale experimental screens. Finally, as CCE-based physical protein association mapping is based on conserved in vivo phenomena, this approach is likely to specifically discover associations relevant to in vivo biology.

Acknowledgements

We thank Insuk Lee for critical comments, Vishy Iyer and members of his lab for help with DNA microarray analysis, Arlen Johnson and Nai Jung Hung for help with polysome profile analysis, and Scott Stevens for critical comments. This study was supported by grants from the NSF (IIS-0325116, EIA-0219061), NIH (GM06779-01, GM076536-01), Welch (F1515), and a Packard Fellowship (EMM).

References

  1. Bergmann S, Ihmels J, Barkai N (2004) Similarities and differences in genome-wide expression data of six organisms. PLoS Biol 2: E9
  2. Deane CM, Salwinski L, Xenarios I, Eisenberg D (2002) Protein interactions: two methods for assessment of the reliability of high throughput observations. Mol Cell Proteomics 1: 349–356
  3. Ge H, Liu Z, Church GM, Vidal M (2001) Correlation between transcriptome and interactome mapping data from Saccharomyces cerevisiae. Nat Genet 29: 482–486
  4. Hart GT, Ramani AK, Marcotte EM (2006) How complete are current yeast and human protein-interaction networks? Genome Biol 7: 120
  5. Jansen R, Greenbaum D, Gerstein M (2002) Relating whole-genome expression data with protein–protein interactions. Genome Res 12: 37–46
  6. Jansen R, Yu H, Greenbaum D, Kluger Y, Krogan NJ, Chung S, Emili A, Snyder M, Greenblatt JF, Gerstein M (2003) A Bayesian networks approach for predicting protein–protein interactions from genomic data. Science 302: 449–453
  7. Simonis N, Gonze D, Orsi C, van Helden J, Wodak SJ (2006) Modularity of the transcriptional response of protein complexes in yeast. J Mol Biol 363: 589–610
  8. Snel B, van Noort V, Huynen MA (2004) Gene co-regulation is highly conserved in the evolution of eukaryotes and prokaryotes. Nucleic Acids Res 32: 4725–4731
  9. Stuart JM, Segal E, Koller D, Kim SK (2003) A gene-coexpression network for global discovery of conserved genetic modules. Science 302: 249–255
  10. Teichmann SA, Babu MM (2002) Conservation of gene co-regulation in prokaryotes and eukaryotes. Trends Biotechnol 20: 407–410; discussion 410
  11. van Noort V, Snel B, Huynen MA (2003) Predicting gene function by conserved co-expression. Trends Genet 19: 238–242
