The concordance between RNA-seq and microarray data depends on chemical treatment and transcript abundance

Abstract

The concordance of RNA-sequencing (RNA-seq) with microarrays for genome-wide analysis of differential gene expression has not been rigorously assessed using a range of chemical treatment conditions. Here we use a comprehensive study design to generate Illumina RNA-seq and Affymetrix microarray data from the same liver samples of rats exposed in triplicate to varying degrees of perturbation by 27 chemicals representing multiple modes of action (MOAs). The cross-platform concordance in terms of differentially expressed genes (DEGs) or enriched pathways is linearly correlated with treatment effect size (R²≈0.8). Furthermore, the concordance is also affected by transcript abundance and biological complexity of the MOA. RNA-seq outperforms microarray (93% versus 75%) in DEG verification as assessed by quantitative PCR, with the gain mainly due to its improved accuracy for low-abundance transcripts. Nonetheless, classifiers to predict MOAs perform similarly when developed using data from either platform. Therefore, the endpoint studied and its biological complexity, transcript abundance and the genomic application are important factors in transcriptomic research and for clinical and regulatory decision making.
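
To make the two quantities in the abstract concrete, the following is a minimal Python sketch of how a cross-platform DEG concordance and the R² of a linear fit against treatment effect size might be computed. It is an illustration under stated assumptions, not the authors' pipeline: the overlap formula (twice the shared DEGs divided by the total DEGs called on both platforms) and the function names `deg_concordance` and `r_squared` are hypothetical choices, and the gene identifiers and numbers in the demo are made-up toy values.

```python
def deg_concordance(degs_a, degs_b):
    """Percent overlap between two DEG lists.

    Assumed metric (not taken from the article):
        100 * 2 * |A intersect B| / (|A| + |B|)
    """
    a, b = set(degs_a), set(degs_b)
    if not a and not b:
        return 0.0  # no DEGs on either platform: define overlap as 0
    return 100.0 * 2 * len(a & b) / (len(a) + len(b))


def r_squared(x, y):
    """R^2 of a simple least-squares line y ~ x.

    For a one-predictor linear fit, R^2 equals the squared
    Pearson correlation between x and y.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    return sxy ** 2 / (sxx * syy)


if __name__ == "__main__":
    # Toy DEG calls for one hypothetical treatment (made-up gene IDs).
    rnaseq_degs = {"g1", "g2", "g3", "g5"}
    array_degs = {"g2", "g3", "g4"}
    print(deg_concordance(rnaseq_degs, array_degs))

    # Toy per-treatment values: effect size vs. concordance (made up).
    effect_size = [0.5, 1.0, 1.5, 2.0, 2.5]
    concordance = [20.0, 35.0, 41.0, 60.0, 66.0]
    print(r_squared(effect_size, concordance))
```

In this sketch each treatment contributes one point (effect size, concordance), and `r_squared` summarizes how well a straight line describes the relationship across treatments. Any other per-treatment effect-size measure or overlap definition could be substituted without changing the structure.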

Figure 1: Overview of study design.
Figure 2: Concordance between RNA-seq and microarray.
Figure 3: Transcript abundance–dependent concordance between RNA-seq and microarray.
Figure 4: Concordance of RNA-seq and microarray data with qPCR data.
Figure 5: Cross-platform comparisons of prediction results between two platforms.
Figure 6: Systemic trends of differentially expressed RNA elements.

Accession codes

Primary accessions

Gene Expression Omnibus

Sequence Read Archive

Acknowledgements

We thank M. Arana and D. Mendrick for their critical review of the manuscript. This research was supported, in part, by the Intramural Research Program of the National Institutes of Health (NIH), National Institute of Environmental Health Sciences (NIEHS) (ES102345-04 and ES023026) and National Library of Medicine. P.P.Ł. and D.P.K. acknowledge support by the Vienna Scientific Cluster (VSC), the Vienna Science and Technology Fund (WWTF), Baxter AG, Austrian Research Centres (ARC) Seibersdorf and the Austrian Centre of Biopharmaceutical Technology (ACBT).

Author information

Contributions

W.T. coordinated the consortium study and manuscript preparation. W.T., S.S.A. and C.W. designed the study. C.W. conducted sequencing and qPCR experiments. S.S.A. provided rat tissue samples and gene expression data and contributed to the data analysis. P.R.B. contributed extensively to manuscript preparation and data analysis. B.G. and J.X. conducted the majority of the data analysis and prepared various figures and supplementary materials. J.T.M. and D.T.M. constructed the mapping table between microarray and RNA-seq and contributed to other data analysis and interpretation. All co-authors contributed to various components of the study, including data analysis and preparation of text, figures, tables and supplementary materials.

Corresponding authors

Correspondence to Pierre R Bushel, Scott S Auerbach or Weida Tong.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–8, Supplementary Tables 1, 2, 4–9, 12, 13 and Supplementary Notes 1–6 (PDF 3022 kb)

Supplementary Table 3

RNA-seq data and mapping status summary based on data analysis pipeline P1 (XLSX 31 kb)

Supplementary Table 10

List of transcripts with shortened 3' UTRs detected from the samples treated by chemicals PHE and PIR (XLSX 156 kb)

Supplementary Table 11

List of differentially spliced isoforms detected in samples treated by chemicals PHE and PIR (XLSX 205 kb)

Supplementary Table 14

Master table for mapping Affymetrix probesets to RNA-seq gene annotations (XLS 8113 kb)

Cite this article

Wang, C., Gong, B., Bushel, P. et al. The concordance between RNA-seq and microarray data depends on chemical treatment and transcript abundance. Nat Biotechnol 32, 926–932 (2014). https://doi.org/10.1038/nbt.3001

  • DOI: https://doi.org/10.1038/nbt.3001
