  • Research Briefing

Normalizing cancer RNA-seq data for library size, tumor purity and batch effects

Accurate identification and effective removal of unwanted variation is essential to derive meaningful biological results from large and complex RNA-seq studies. Technical replicates together with negative and positive control genes are key tools for carrying out this task. We show how to proceed when technical replicates are unavailable.

Fig. 1: RUV-III-PRPS improves normalization of TCGA RNA-seq data.

References

  1. Gauss, C. F. & Stewart, G. W. Theory of the combination of observations least subject to errors, Part One, Part Two, Supplement. Classics in Applied Mathematics https://doi.org/10.1137/1.9781611971248 (SIAM, 1995). English translation of Gauss’s classic 1823 work in which, amongst much else, systematic errors are noted.

  2. Ku, H. H. Precision Measurement and Calibration. Volume 1. Statistical Concepts and Procedures (National Bureau of Standards, 1969). A collection of papers dealing with random and systematic errors in the context of the art and science of measurement.

  3. Leek, J. T. et al. Tackling the widespread and critical impact of batch effects in high-throughput data. Nat. Rev. Genet. 11, 733–739 (2010). A review article whose title says it all.

  4. Molania, R. et al. A new normalization for Nanostring nCounter gene expression data. Nucleic Acids Res. 47, 6073–6083 (2019). This paper presents RUV-III and includes some examples using technical replicates and others using pseudo-replicates.

  5. Vallejos, C. A. et al. Normalizing single-cell RNA sequencing data: challenges and opportunities. Nat. Methods 14, 565–571 (2017). A critical review of the task of normalization in the context of single cell RNA-seq, with much relevance to bulk RNA-seq.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This is a summary of: Molania, R. et al. Removing unwanted variation from large-scale RNA sequencing data with PRPS. Nat. Biotechnol. https://doi.org/10.1038/s41587-022-01440-w (2022).

About this article

Cite this article

Normalizing cancer RNA-seq data for library size, tumor purity and batch effects. Nat Biotechnol 41, 27–28 (2023). https://doi.org/10.1038/s41587-022-01441-9

  • DOI: https://doi.org/10.1038/s41587-022-01441-9
