Repeatability of published microarray gene expression analyses

Abstract

Given the complexity of microarray-based gene expression studies, guidelines encourage transparent design and public data availability. Several journals require public data deposition and several public databases exist. However, not all data are publicly available, and even when available, it is unknown whether the published results are reproducible by independent scientists. Here we evaluated the replication of data analyses in 18 articles on microarray-based gene expression profiling published in Nature Genetics in 2005–2006. One table or figure from each article was independently evaluated by two teams of analysts. We reproduced two analyses in principle and six partially or with some discrepancies; ten could not be reproduced. The main reason for failure to reproduce was data unavailability, and discrepancies were mostly due to incomplete data annotation or specification of data processing and analysis. Repeatability of published microarray studies is apparently limited. Stricter publication rules enforcing public data availability and explicit description of data processing and analysis should be considered.

Accession codes

Accessions

Gene Expression Omnibus

Author information

Contributions

The protocol was designed through discussion among all authors. All authors except V.v.N. participated in evaluations of the eligible articles and their analyses. V.v.N. collected all the evaluations and examined whether there were discrepancies between teams. J.P.A.I. wrote the manuscript, which was critically revised by all other coauthors. After the first author, the author order is alphabetical.

Corresponding author

Correspondence to John P A Ioannidis.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1 and 2, Supplementary Table 1 (PDF 323 kb)

About this article

Cite this article

Ioannidis, J., Allison, D., Ball, C. et al. Repeatability of published microarray gene expression analyses. Nat Genet 41, 149–155 (2009). https://doi.org/10.1038/ng.295

  • DOI: https://doi.org/10.1038/ng.295
