The concordance between RNA-seq and microarray data depends on chemical treatment and transcript abundance

Abstract

The concordance of RNA-sequencing (RNA-seq) with microarrays for genome-wide analysis of differential gene expression has not been rigorously assessed using a range of chemical treatment conditions. Here we use a comprehensive study design to generate Illumina RNA-seq and Affymetrix microarray data from the same liver samples of rats exposed in triplicate to varying degrees of perturbation by 27 chemicals representing multiple modes of action (MOAs). The cross-platform concordance in terms of differentially expressed genes (DEGs) or enriched pathways is linearly correlated with treatment effect size (R²≈0.8). Furthermore, the concordance is also affected by transcript abundance and biological complexity of the MOA. RNA-seq outperforms microarray (93% versus 75%) in DEG verification as assessed by quantitative PCR, with the gain mainly due to its improved accuracy for low-abundance transcripts. Nonetheless, classifiers to predict MOAs perform similarly when developed using data from either platform. Therefore, the endpoint studied and its biological complexity, transcript abundance and the genomic application are important factors in transcriptomic research and for clinical and regulatory decision making.
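As an illustration only, the kind of relationship the abstract describes (DEG concordance between two platforms, regressed on treatment effect size) might be sketched as follows. This is not the authors' pipeline: the overlap measure (shared DEGs as a percentage of the union), the function names `deg_overlap` and `r_squared`, and the toy gene symbols are all assumptions chosen for the example.

```python
from statistics import mean


def deg_overlap(degs_a, degs_b):
    """Percent of DEGs shared by two platforms, relative to their union.

    Assumption: concordance is measured as |A and B| / |A or B| * 100.
    The paper's own definition may differ.
    """
    a, b = set(degs_a), set(degs_b)
    union = a | b
    if not union:
        return 0.0  # no DEGs on either platform: report zero overlap
    return 100.0 * len(a & b) / len(union)


def r_squared(x, y):
    """Coefficient of determination for a simple least-squares line.

    For a one-predictor linear fit, R^2 equals the squared Pearson
    correlation between x and y.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical DEG lists for one treatment (gene symbols are placeholders).
rnaseq_degs = ["GeneA", "GeneB", "GeneC"]
array_degs = ["GeneB", "GeneC", "GeneD"]
concordance = deg_overlap(rnaseq_degs, array_degs)
```

In practice one would compute `deg_overlap` once per chemical, pair each value with that chemical's effect size (for example, its number of DEGs), and pass the two lists to `r_squared` to see how much of the variation in concordance a straight line explains.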

Figure 1: Overview of study design.
Figure 2: Concordance between RNA-seq and microarray.
Figure 3: Transcript abundance–dependent concordance between RNA-seq and microarray.
Figure 4: Concordance of RNA-seq and microarray data with qPCR data.
Figure 5: Cross-platform comparisons of prediction results between two platforms.
Figure 6: Systemic trends of differentially expressed RNA elements.

Accession codes

Primary accessions

Gene Expression Omnibus

Sequence Read Archive

Acknowledgements

We thank M. Arana and D. Mendrick for their critical review of the manuscript. This research was supported, in part, by the Intramural Research Program of the National Institutes of Health (NIH), National Institute of Environmental Health Sciences (NIEHS) (ES102345-04 and ES023026) and National Library of Medicine. P.P.Ł. and D.P.K. acknowledge support by the Vienna Scientific Cluster (VSC), the Vienna Science and Technology Fund (WWTF), Baxter AG, Austrian Research Centres (ARC) Seibersdorf and the Austrian Centre of Biopharmaceutical Technology (ACBT).

Author information

Contributions

W.T. coordinated the consortium study and manuscript preparation. W.T., S.S.A. and C.W. designed the study. C.W. conducted sequencing and qPCR experiments. S.S.A. provided rat tissue samples, gene expression data and contributed to the data analysis. P.R.B. was involved heavily in manuscript preparation and data analysis. B.G. and J.X. conducted the majority of data analysis and prepared various figures and supplementary materials. J.T.M. and D.T.M. constructed the mapping table between microarray and RNA-seq along with other data analysis and interpretation. All the co-authors contributed to various components of the study, including data analysis and preparation of text, figures, tables and supplementary materials.

Corresponding authors

Correspondence to Pierre R Bushel or Scott S Auerbach or Weida Tong.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–8, Supplementary Tables 1, 2, 4–9, 12, 13 and Supplementary Notes 1–6 (PDF 3022 kb)

Supplementary Table 3

RNA-seq data and mapping status summary based on data analysis pipeline P1 (XLSX 31 kb)

Supplementary Table 10

List of transcripts with shortened 3' UTRs detected from the samples treated by chemicals PHE and PIR (XLSX 156 kb)

Supplementary Table 11

List of differentially spliced isoforms detected in samples treated by chemicals PHE and PIR (XLSX 205 kb)

Supplementary Table 14

Master table for mapping Affymetrix probesets to RNA-seq gene annotations (XLS 8113 kb)

About this article

Cite this article

Wang, C., Gong, B., Bushel, P. et al. The concordance between RNA-seq and microarray data depends on chemical treatment and transcript abundance. Nat Biotechnol 32, 926–932 (2014). https://doi.org/10.1038/nbt.3001
