Review Article | Published:

Microarray data normalization and transformation

Nature Genetics volume 32, pages 496–501 (2002)



Underlying every microarray experiment is an experimental question that one would like to address. Finding a useful and satisfactory answer relies on careful experimental design and the use of a variety of data-mining tools to explore the relationships between genes or reveal patterns of expression. While other sections of this issue deal with these lofty issues, this review focuses on the much more mundane but indispensable tasks of 'normalizing' data from individual hybridizations to make meaningful comparisons of expression levels, and of 'transforming' them to select genes for further analysis and data mining.





The work presented here evolved from looking at a large body of data and would have been much less useful without the contributions of Norman H. Lee, Renae L. Malek, Priti Hegde, Ivana Yang, Shuibang Wang, Yonghong Wang, Simon Kwong, Heenam Kim, Wei Liang, Vasily Sharov, John Braisted, Alex Saeed, Joseph White, Jerry Li, Renee Gaspard, Erik Snesrud, Yan Yu, Emily Chen, Jeremy Hasseman, Bryan Frank, Lara Linford, Linda Moy, Tara Vantoai, Gary Churchill and Roger Bumgarner. J.Q. is supported by grants from the US National Science Foundation, the National Heart, Lung, and Blood Institute, and the National Cancer Institute. The MIDAS software system used for the normalization and data filtering presented here is freely available as either executable or source code from, along with the MADAM data-management system, the Spotfinder image-processing software, and the MeV clustering and data-mining tool.

Author information


  1. The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, Maryland 20850, USA

    • John Quackenbush



Competing interests

The author declares no competing financial interests.
