  • Review Article

Microarray data normalization and transformation

Abstract

Underlying every microarray experiment is an experimental question that one would like to address. Finding a useful and satisfactory answer relies on careful experimental design and the use of a variety of data-mining tools to explore the relationships between genes or reveal patterns of expression. While other sections of this issue deal with these lofty issues, this review focuses on the much more mundane but indispensable tasks of 'normalizing' data from individual hybridizations to make meaningful comparisons of expression levels, and of 'transforming' them to select genes for further analysis and data mining.

Figure 1: An R-I plot displays the log2(Ri/Gi) ratio for each element on the array as a function of the log10(Ri*Gi) product intensities and can reveal systematic intensity-dependent effects in the measured log2(ratio) values.
Figure 2: Application of local (pen group) lowess can correct for both systematic variation as a function of intensity and spatial variation between spotting pens on a DNA microarray.
Figure 3: The use of replicates can help eliminate questionable or inconsistent data from further analysis.
Figure 4: Local variation as a function of intensity can be used to identify differentially expressed genes by calculating an intensity-dependent Z-score.
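
The quantities named in the captions for Figures 1 and 4 can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not the MIDAS implementation: the function names, the default window size, and the sliding-window estimate of local variation are all hypothetical choices made here for clarity. The first function computes the two R-I plot coordinates for each array element; the second scores each log2(ratio) against the mean and standard deviation of elements with similar log10(Ri*Gi) intensity.

```python
import math
import statistics


def ri_coordinates(red, green):
    """Return one (log10(R*G), log2(R/G)) pair per array element.

    These are the x and y axes of an R-I plot. Intensities must be
    positive, since the logarithms are otherwise undefined.
    """
    points = []
    for r, g in zip(red, green):
        if r <= 0 or g <= 0:
            raise ValueError("intensities must be positive")
        points.append((math.log10(r * g), math.log2(r / g)))
    return points


def local_z_scores(points, window=5):
    """Z-score each log2 ratio against neighbours of similar intensity.

    Assumption: 'local' means a sliding window of `window` elements
    taken after sorting by log10(R*G). The window is truncated at the
    low and high ends of the intensity range.
    """
    order = sorted(range(len(points)), key=lambda i: points[i][0])
    half = window // 2
    scores = [0.0] * len(points)
    for rank, idx in enumerate(order):
        lo = max(0, rank - half)
        hi = min(len(order), rank + half + 1)
        ratios = [points[j][1] for j in order[lo:hi]]
        mean = statistics.fmean(ratios)
        # Sample standard deviation needs at least two values.
        sd = statistics.stdev(ratios) if len(ratios) > 1 else 0.0
        scores[idx] = (points[idx][1] - mean) / sd if sd > 0 else 0.0
    return scores
```

Calling `local_z_scores(ri_coordinates(red, green))` on two equal-length lists of channel intensities returns one score per element, in the original element order. Elements whose absolute score exceeds a chosen cutoff would be candidates for differential expression; the cutoff itself is a judgment call and is not fixed by this sketch.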

Acknowledgements

The work presented here evolved from looking at a large body of data and would have been much less useful without the contributions of Norman H. Lee, Renae L. Malek, Priti Hegde, Ivana Yang, Shuibang Wang, Yonghong Wang, Simon Kwong, Heenam Kim, Wei Liang, Vasily Sharov, John Braisted, Alex Saeed, Joseph White, Jerry Li, Renee Gaspard, Erik Snesrud, Yan Yu, Emily Chen, Jeremy Hasseman, Bryan Frank, Lara Linford, Linda Moy, Tara Vantoai, Gary Churchill and Roger Bumgarner. J.Q. is supported by grants from the US National Science Foundation, the National Heart, Lung, and Blood Institute, and the National Cancer Institute. The MIDAS software system used for the normalization and data filtering presented here is freely available as either executable or source code from http://www.tigr.org/software, along with the MADAM data-management system, the Spotfinder image-processing software, and the MeV clustering and data-mining tool.

Ethics declarations

Competing interests

The author declares no competing financial interests.

About this article

Cite this article

Quackenbush, J. Microarray data normalization and transformation. Nat Genet 32 (Suppl 4), 496–501 (2002). https://doi.org/10.1038/ng1032

  • DOI: https://doi.org/10.1038/ng1032
