
Functional annotation and network reconstruction through cross-platform integration of microarray data


The rapid accumulation of microarray data creates a need for methods that effectively integrate data generated on different platforms. Here we introduce an approach, 2nd-order expression analysis, that addresses this challenge by first extracting expression patterns as meta-information from each data set (1st-order expression analysis) and then analyzing those patterns across multiple data sets. Using yeast as a model system, we demonstrate two distinct advantages of our approach: it identifies genes that share a function but show no coexpression pattern, and it elucidates the cooperativity between transcription factors for regulatory network reconstruction by overcoming a key obstacle, namely the quantification of transcription factor activities. Experiments reported in the literature and performed in our lab support a significant number of our predictions.
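The two-step idea can be illustrated with a minimal Python sketch. It assumes that the 1st-order pattern of a gene pair is its Pearson correlation within one data set, and that the 2nd-order step correlates those per-data-set values between two gene pairs. The choice of Pearson correlation, the function names and the toy data are all assumptions made for illustration; they are not taken from the article or its supplementary methods.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def first_order_profile(datasets, gene_a, gene_b):
    """1st-order step (assumed form): one correlation per data set.

    Each data set is handled on its own, so data sets may differ in
    platform and in number of conditions.
    """
    return [pearson(ds[gene_a], ds[gene_b]) for ds in datasets]


def second_order_correlation(datasets, pair_1, pair_2):
    """2nd-order step (assumed form): correlate two 1st-order profiles."""
    profile_1 = first_order_profile(datasets, *pair_1)
    profile_2 = first_order_profile(datasets, *pair_2)
    return pearson(profile_1, profile_2)


# Hypothetical toy data: three data sets with 4, 3 and 5 conditions.
# Gene names A-D are placeholders, not real yeast genes.
datasets = [
    {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8],
     "C": [1, 3, 2, 5], "D": [2, 6, 4, 10]},
    {"A": [1, 2, 3], "B": [3, 2, 1],
     "C": [5, 1, 3], "D": [1, 5, 3]},
    {"A": [1, 2, 3, 4, 5], "B": [1, 3, 2, 5, 4],
     "C": [2, 4, 6, 8, 10], "D": [2, 6, 4, 10, 8]},
]

profile_ab = first_order_profile(datasets, "A", "B")
profile_cd = first_order_profile(datasets, "C", "D")
score = second_order_correlation(datasets, ("A", "B"), ("C", "D"))
print(profile_ab, profile_cd, score)
```

In this toy example the pairs A-B and C-D are correlated and anticorrelated in the same data sets, so their 1st-order profiles match and the 2nd-order score is high. Because only within-data-set correlations are compared, raw expression values from different platforms never have to be placed on a common scale.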


Figure 1: The expression profiles and 1st-order expression correlation profiles of gene pairs POG1-MPT5 and SDA1-CDC5 over six microarray data sets (data set details in Supplementary Methods online).
Figure 2: Northern blot analysis showing the abundance of different cellular rRNAs in wild-type and ΔYOR309C cells.
Figure 3: Reconstruction of regulatory networks by 2nd-order expression analysis.
Figure 4: Hierarchical clustering of transcription modules based on their average 1st-order expression correlation profiles.




We thank Robert Gentleman for making his computer resources available for part of this project, Timothy Hughes for technical advice and Michelle Arbeitman for sharing her lab space. We also thank two anonymous reviewers for their helpful comments. The work of X.J.Z. was supported by the National Science Foundation grant DMS0090166 to W.H.W., the Faculty Setup Grant from USC and the National Institutes of Health (NIH) grant R01GM067243 to Simon Tavaré. The work of M.-C.J.K. was supported by a Howard Hughes Pre-doctoral Fellowship. The work of H.H. was supported by the NIH grant P20CA96470 to W.H.W. and the Faculty Setup Grant from UC Berkeley. The work of W.H.W. was supported by the NIH grant R01HG02341.

Author information

Corresponding authors

Correspondence to Xianghong Jasmine Zhou or Wing Hung Wong.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Table 1

Functional prediction of unknown yeast genes (PDF 18 kb)

Supplementary Table 2

60 derived transcription modules (PDF 15 kb)

Supplementary Methods (PDF 33 kb)

Supplementary Notes (PDF 64 kb)


About this article

Cite this article

Zhou, X., Kao, MC., Huang, H. et al. Functional annotation and network reconstruction through cross-platform integration of microarray data. Nat Biotechnol 23, 238–243 (2005).
