Abstract
Underlying every microarray experiment is an experimental question that one would like to address. Finding a useful and satisfactory answer relies on careful experimental design and the use of a variety of data-mining tools to explore the relationships between genes or reveal patterns of expression. While other sections of this issue deal with these lofty issues, this review focuses on the much more mundane but indispensable tasks of 'normalizing' data from individual hybridizations to make meaningful comparisons of expression levels, and of 'transforming' them to select genes for further analysis and data mining.
Acknowledgements
The work presented here evolved from looking at a large body of data and would have been much less useful without the contributions of Norman H. Lee, Renae L. Malek, Priti Hegde, Ivana Yang, Shuibang Wang, Yonghong Wang, Simon Kwong, Heenam Kim, Wei Liang, Vasily Sharov, John Braisted, Alex Saeed, Joseph White, Jerry Li, Renee Gaspard, Erik Snesrud, Yan Yu, Emily Chen, Jeremy Hasseman, Bryan Frank, Lara Linford, Linda Moy, Tara Vantoai, Gary Churchill and Roger Bumgarner. J.Q. is supported by grants from the US National Science Foundation, the National Heart, Lung, and Blood Institute, and the National Cancer Institute. The MIDAS software system used for the normalization and data filtering presented here is freely available as either executable or source code from http://www.tigr.org/software, along with the MADAM data-management system, the Spotfinder image-processing software, and the MeV clustering and data-mining tool.
Ethics declarations
Competing interests
The author declares no competing financial interests.
Cite this article
Quackenbush, J. Microarray data normalization and transformation. Nat Genet 32, 496–501 (2002). https://doi.org/10.1038/ng1032
DOI: https://doi.org/10.1038/ng1032