  • Review Article
Collecting and organizing systematic sets of protein data

Key Points

  • Systems biology aims to develop experimentally validated quantitative models of complex biochemical pathways. The data needed for this task cannot be generated by 'omics'-type methods alone; they require a hypothesis-driven approach that integrates multiple experimental techniques.

  • There are trade-offs to consider when choosing the types of assay to use in a systems biology study because throughput, multiplexing, sample size, sampling density and ease of use cannot all be simultaneously maximized. Experimental studies should be designed to be compatible with the chosen modelling approach.

  • For monitoring protein-signalling events, affinity-based methods, mass spectrometry and protein-activity assays each have their strengths and weaknesses. Recent advances continue to improve the usefulness of these assays for systems biology.

  • Single-cell measurements are another important element in developing accurate models of biochemical events. Because only a few signals can be monitored at the single-cell level, these data are most effective when combined with population-level biochemical assays.

  • Appropriate data validation and normalization techniques are crucial for constructing consistent data sets to inform modelling approaches. Several methods of data scaling can be used to highlight different features of a data set.
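
The last key point refers to several scaling methods without naming them in this summary. As a rough sketch, the following Python shows three commonly used choices (scaling to the maximum, fold change over a baseline, and z-scoring). The choice of these three methods, the function names and the example values are all assumptions made for illustration and are not taken from the article.

```python
from statistics import mean, pstdev


def max_scale(values):
    """Divide by the largest value so that the peak equals 1."""
    peak = max(values)
    return [v / peak for v in values]


def fold_change(values, baseline_index=0):
    """Express each value relative to a baseline measurement (for example, time zero)."""
    baseline = values[baseline_index]
    return [v / baseline for v in values]


def z_score(values):
    """Centre on the mean and scale to unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical time course of one signal, in arbitrary units.
signal = [2.0, 4.0, 8.0, 6.0]

scaled_to_max = max_scale(signal)
relative_to_start = fold_change(signal)
standardized = z_score(signal)
```

Scaling to the maximum emphasizes the shape of a time course, fold change emphasizes the magnitude of induction over the starting state, and z-scoring puts signals with very different dynamic ranges on a common footing.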

Abstract

Systems biology, particularly of mammalian cells, is data starved. However, technologies are now in place to obtain rich data, in a form suitable for model construction and validation, that describes the activities, states and locations of cell-signalling molecules. The key is to use several measurement technologies simultaneously and, recognizing each of their limits, to assemble a self-consistent compendium of systematic data.

Figure 1: The scope of systems biology.
Figure 2: Merits of different assay technologies.

Acknowledgements

This work was funded by a systems biology centre grant from the National Institutes of Health.

Author information

Corresponding author

Correspondence to Douglas A. Lauffenburger.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Related links

FURTHER INFORMATION

Homepage of the Center for Cell Decision Processes

Glossary

Systematic set

A data set in which all data are collected from the same experimental system in such a way that all data can be directly compared, regardless of when measurements were made.

Signal

Any type of biomolecule that transfers information in a signalling network.

Perturbation

Any experimental condition that is applied to a cell that causes a shift in the cell's behaviour away from the basal state. This includes extracellular stimulation by physiological ligands, inhibition of protein activities by small molecule inhibitors, or alterations in protein-expression levels by RNA interference or overexpression.

Immunoblot

Also known as a western blot. Following gel-based separation by mass, charge or both, proteins are transferred to a membrane and probed with target-specific antibodies.

Enzyme-linked-immunosorbent assay

(ELISA). ELISAs involve adsorbing or coupling capture antibodies to a 96-well plate. Following protein capture, a target protein is detected, either directly (if it was labelled in the sample) or indirectly, through a labelled detection antibody.

Flow cytometry

A method in which fluorescence-intensity data are recorded from particles in solution as they flow past a detector.

Protein profiling

A method that assesses the expression level of a large set of proteins in a specific tissue or cell type. It is analogous to transcriptional profiling by DNA microarrays.

Protein-interaction microarray

A protein microarray that is used to assay protein interactions. In such arrays, the capture reagents are purified proteins or protein domains, and the analyte solution contains a potential binding partner. Detection strategies are the same as in antibody microarrays (direct labelling or sandwich).

Substrate-protein microarray

A protein microarray that is used to identify substrates of enzymes, such as kinases. In this format, the array consists of potential substrates, and the analyte contains a purified enzyme. Modification of the substrates on the array (for example, phosphorylation) by the analyte is detected by radiolabel incorporation or other labelling strategies.

Microfluidic device

A device for fluid handling in which the smallest dimensions of the features (channels, valves and so on) are on the scale of a few to a few hundred micrometres.

Stable-isotope labelling with amino acids in cell culture

(SILAC). This method labels proteins from different samples with heavy atoms, yielding mass differences of several Daltons between the same peptide from different samples.
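
To make the "several Daltons" concrete, the sketch below computes the mass shift produced by replacing light atoms with heavy isotopes. The two labelled amino acids used as examples (arginine with six 13C atoms, lysine with six 13C and two 15N atoms) are common choices but are assumptions here, as are the function names; the glossary entry itself specifies only "heavy atoms".

```python
# Mass gained per substituted atom (heavy isotope minus light isotope), in Da.
C13_MINUS_C12 = 1.0033548
N15_MINUS_N14 = 0.9970349


def label_shift(n_carbon, n_nitrogen=0):
    """Mass difference between heavy and light forms of one labelled residue."""
    return n_carbon * C13_MINUS_C12 + n_nitrogen * N15_MINUS_N14


def mz_spacing(shift_da, charge):
    """Spacing between light and heavy peaks on the m/z axis for a given charge."""
    return shift_da / charge


# Illustrative labels (assumed, not specified in the text above).
arg6_shift = label_shift(n_carbon=6)                # 13C6-arginine
lys8_shift = label_shift(n_carbon=6, n_nitrogen=2)  # 13C6,15N2-lysine

# A doubly charged peptide containing one labelled arginine.
arg6_spacing_2plus = mz_spacing(arg6_shift, charge=2)
```

A peptide carrying one such residue therefore appears as a light and heavy pair separated by roughly 6 or 8 Da, and by half that distance on the m/z axis when doubly charged.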

Isobaric tags for relative and absolute quantification

(iTRAQ). iTRAQ labels are initially isobaric, ensuring that the same peptides from different samples behave identically in the full mass spectrum (MS) mode, but they fragment to generate marker ions that differ by a single Dalton in tandem MS mode during peptide identification.
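
As an illustration of how marker ions are turned into relative quantities, the sketch below divides each marker-ion intensity by that of a reference channel. It assumes a four-channel reagent set with marker ions at nominal m/z 114, 115, 116 and 117; the channel set, the function name and the intensities are hypothetical values for illustration and are not stated in the glossary entry.

```python
# Nominal marker-ion m/z values for an assumed four-channel reagent set.
REPORTER_MZ = (114, 115, 116, 117)


def relative_abundance(intensities, reference_mz=114):
    """Ratio of each marker-ion intensity to the reference channel."""
    reference = intensities[reference_mz]
    if reference <= 0:
        raise ValueError("reference channel has no signal")
    return {mz: intensities[mz] / reference for mz in REPORTER_MZ}


# Hypothetical marker-ion intensities from one tandem MS spectrum,
# one channel per sample (for example, four time points).
spectrum = {114: 1000.0, 115: 2000.0, 116: 500.0, 117: 1000.0}

ratios = relative_abundance(spectrum)
```

Because the intact labelled peptides are isobaric, all four samples contribute to a single precursor peak, and the sample-specific information appears only in these marker-ion ratios.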

Marker ion

An ion that carries the isotope label in the breakdown of a peptide during tandem mass spectrometry analysis.

Chemosensor

In the context of kinase assays, a chemosensor is a substrate peptide that contains the non-natural amino acid Sox, which displays chelation-enhanced fluorescence when the peptide is phosphorylated.

Activity-based protein profiling

(ABPP). A method that uses reactive probes carrying a label that will covalently bind specifically to active enzymes of a certain class. The label is often a fluorophore, enabling visualization and quantification of coupled enzymes on gels, antibody microarrays or in cells. Recently, reactive probes have been labelled with an affinity tag for capture of the coupled enzymes, quantification and identification by mass spectrometry.

Image cytometry

A method that uses microscope optics to collect low-resolution data from cells that are adhered to a slide.

Bayesian network inference

A statistical method for inferring the probable relationships between measured variables.

Data validation

The process of verifying assay accuracy.

Data normalization

The adjustment of measured values to account for possible run-to-run and day-to-day variability in the assays.
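
One simple way to make such an adjustment is to include the same reference sample in every run and divide each measurement by it. The sketch below assumes that strategy; it is one option among several, not necessarily the approach used by the authors, and the sample names and values are hypothetical.

```python
def normalize_to_reference(run, reference_key="reference"):
    """Divide every measurement in a run by that run's shared reference sample."""
    reference = run[reference_key]
    return {
        name: value / reference
        for name, value in run.items()
        if name != reference_key
    }


# Hypothetical raw signals from the same samples assayed on two days.
# The day-2 run reads 1.5-fold higher overall (for example, a longer exposure).
day1 = {"reference": 100.0, "untreated": 50.0, "treated": 200.0}
day2 = {"reference": 150.0, "untreated": 75.0, "treated": 300.0}

day1_normalized = normalize_to_reference(day1)
day2_normalized = normalize_to_reference(day2)
```

After normalization the two runs agree, so values collected on different days can be compared directly, which is the property required of a systematic set.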

Fluorescence speckle microscopy

Speckles that form by the random association of fluorophores with macromolecular structures are tracked by live-cell imaging. The information in the dynamic behaviour of these speckles is converted into a quantitative spatio-temporal readout of cytoskeleton-polymer transport and turnover.

Orthogonal design

A method of validation in which conditions that were previously varied from experiment to experiment in the course of collecting a full data set are varied in a single experiment such that what was previously separated in time now becomes contemporaneous.

About this article

Cite this article

Albeck, J., MacBeath, G., White, F. et al. Collecting and organizing systematic sets of protein data. Nat Rev Mol Cell Biol 7, 803–812 (2006). https://doi.org/10.1038/nrm2042

  • DOI: https://doi.org/10.1038/nrm2042
