Accounting for technical noise in single-cell RNA-seq experiments

A Corrigendum to this article was published on 30 January 2014

This article has been updated

Abstract

Single-cell RNA-seq can yield valuable insights about the variability within a population of seemingly homogeneous cells. We developed a quantitative statistical method to distinguish true biological variability from the high levels of technical noise in single-cell experiments. Our approach quantifies the statistical significance of observed cell-to-cell variability in expression strength on a gene-by-gene basis. We validate our approach using two independent data sets from Arabidopsis thaliana and Mus musculus.
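The idea of separating technical noise from biological variability can be illustrated with a small sketch. It assumes, rather than takes from the text above, that technical noise is summarized by the squared coefficient of variation (CV²) of spike-in read counts and that CV² depends on the mean count as a1/mean + a0. The function names and the plain least-squares fit are illustrative choices; the sketch only flags genes above the fitted line and does not compute the gene-by-gene statistical significance that the abstract describes.

```python
from statistics import mean, variance


def cv2(counts):
    """Squared coefficient of variation of one gene's counts across cells."""
    m = mean(counts)
    return variance(counts) / (m * m)


def fit_technical_noise(spike_counts):
    """Least-squares fit of cv2 = a1 / mean + a0 over spike-in genes.

    spike_counts: one list of per-cell counts per spike-in gene.
    The functional form is an assumption made for this sketch.
    """
    xs = [1.0 / mean(c) for c in spike_counts]
    ys = [cv2(c) for c in spike_counts]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a1 = sxy / sxx
    a0 = y_bar - a1 * x_bar
    return a1, a0


def exceeds_technical_noise(counts, a1, a0):
    """True if a gene's observed CV² lies above the fitted technical level."""
    return cv2(counts) > a1 / mean(counts) + a0
```

A gene whose counts vary no more than the spike-ins at the same mean would be attributed to technical noise under this sketch; a real analysis would add a formal test and multiple-testing correction.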

Figure 1: Dilution series of total A. thaliana RNA.
Figure 2: Technical noise fit and inference of highly variable genes using total HeLa RNA as a spike-in in GL2 cells.
Figure 3: Technical noise fit and inference of highly variable genes using ERCC spike-ins.

Accession codes

Primary accessions

ArrayExpress

Change history

  • 11 October 2013

    In the version of this article initially published online, the dilution for the ERCC Spike-In Control Mix added to the lysis mix was given as 1:40,000 in the Online Methods. The actual dilution used was 1:400. The error has been corrected for the PDF and HTML versions of this article.

Acknowledgements

We thank E. Furlong and W. Huber for helpful discussions. We also acknowledge K. Birnbaum (New York University) for kindly providing pWOX5::GFP and pGl2::GFP seed. S.A. acknowledges partial funding from the European Union (FP7-Health, project Radiant); M.G.H. acknowledges the Australian Research Council for present funding. The EMBL Genomics Core Facility provided technical support for this work. We acknowledge A. Surani for the use of the C1 Single-Cell Auto Prep System in his lab and B. Jones for performing the experiment. We also acknowledge A. McKenzie (Medical Research Council Laboratory of Molecular Biology) for the Il13-GFP reporter mice and the Sanger-EBI Single Cell Centre for technical support. We acknowledge the support of European Research Council Starting Grant no. 260507, ThSWITCH.

Author information

Contributions

P.B. designed plant cell experiments, carried out experiments, interpreted results and wrote the paper; S.A. developed the statistical method, performed bioinformatics analyses and wrote the paper; J.K.K. performed bioinformatics analyses and helped write the paper; A.A.K. designed and carried out mouse cell experiments and helped write the paper; X.Z. designed and analyzed mouse cell experiments and helped write the paper; V.P. designed and carried out mouse cell experiments and helped write the paper; B.B. adapted an Illumina sequencing library preparation protocol; V.B. contributed to adapting the Illumina sequencing library preparation protocol and gave advice; S.A.T. designed mouse cell experiments and helped write the paper; J.C.M. contributed to the development of the statistical method, performed bioinformatics analyses, supervised the project and wrote the paper; M.G.H. initiated the project, designed plant cell experiments, interpreted results, supervised the project and wrote the paper.

Corresponding authors

Correspondence to John C Marioni or Marcus G Heisler.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–14 and Supplementary Notes 1–9 (PDF 8682 kb)

Supplementary Software

Accounting for technical noise in single-cell RNA-seq experiments – Supplement II. This file contains the R code used to perform the analysis described in the manuscript. (PDF 1938 kb)

Supplementary Table 1

Estimated amounts of total RNA in single plant cells. The total RNA content of single cells was estimated from the 50 pg HeLa total RNA spike-in. The calculation assumes that the fraction of polyadenylated RNA is comparable between HeLa and A. thaliana input material. For a detailed description, refer to the Methods section. (XLSX 8 kb)
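As a rough illustration of this kind of estimate, the sketch below scales the known 50 pg spike-in by the ratio of reads mapped to plant genes to reads mapped to HeLa genes. The proportionality between read ratio and RNA mass is an assumption of this sketch (it follows from the stated assumption of comparable polyadenylated fractions); the Methods section gives the actual calculation. The function and parameter names are hypothetical.

```python
SPIKE_IN_PG = 50.0  # HeLa total RNA added per cell, as stated in the legend


def estimate_cell_rna_pg(plant_reads, hela_reads, spike_in_pg=SPIKE_IN_PG):
    """Estimate total RNA (pg) in one cell from read counts.

    Assumes reads are proportional to input RNA mass for both the
    cell and the spike-in (an assumption of this sketch).
    """
    if hela_reads <= 0:
        raise ValueError("need at least one spike-in read")
    return spike_in_pg * plant_reads / hela_reads


# Example with made-up read counts: one fifth as many plant reads as
# HeLa reads implies one fifth of 50 pg.
print(estimate_cell_rna_pg(200_000, 1_000_000))
```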

Supplementary Table 2

List of highly variable genes in GL2 cells. The columns give the gene ID, the gene name, the normalized average read count, the cell with the strongest expression, and, for each cell, the log2 ratio of the cell's expression to the average. (XLSX 134 kb)
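The per-gene columns described here can be sketched as follows. The input layout (a dictionary from cell name to normalized count) and the function name are assumptions made for illustration, as is the choice to report negative infinity for cells with a zero count; the table itself may handle such cases differently.

```python
from math import log2


def summarize_gene(normalized_counts):
    """Return (average, strongest cell, log2 ratios) for one gene.

    normalized_counts: dict mapping cell name -> normalized read count.
    """
    average = sum(normalized_counts.values()) / len(normalized_counts)
    strongest = max(normalized_counts, key=normalized_counts.get)
    ratios = {
        # log2 of zero is undefined, so zero counts map to -inf here
        cell: log2(value / average) if value > 0 else float("-inf")
        for cell, value in normalized_counts.items()
    }
    return average, strongest, ratios
```

A ratio of 1 means the cell expresses the gene at twice the average level, and a ratio of -2 means one quarter of the average.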

Supplementary Table 3

List of highly variable genes in QC cells. Columns are the same as in Supplementary Table 2. (XLSX 16 kb)

Supplementary Table 4

List of GO categories that are significantly enriched for highly variable genes (Online Methods). (XLSX 10 kb)

Supplementary Table 5

Read counts for the 91 mouse immune cells spiked with ERCC spike-ins. Each column corresponds to a cell, each row to a gene or an ERCC spike-in molecule. Mouse gene names have been replaced by randomized identifiers (column 1). The second column contains the transcript lengths used for the analysis in Supplementary Note 5. The transcript lengths are computed from Ensembl annotation by taking the union of all exons within a gene, where the exons annotated as “retained introns” and “nonsense mediated decay” are excluded. (XLSX 14402 kb)
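The union-of-exons length can be sketched as an interval merge. The tuple layout, the 1-based inclusive coordinates, and the biotype label strings below are assumptions made for this illustration and are not taken from the supplementary software.

```python
EXCLUDED_BIOTYPES = ("retained_intron", "nonsense_mediated_decay")  # assumed labels


def union_exon_length(exons, excluded=EXCLUDED_BIOTYPES):
    """Length of the union of a gene's exons.

    exons: iterable of (start, end, biotype) with 1-based inclusive
    coordinates. Exons whose biotype is in `excluded` are skipped.
    """
    intervals = sorted((s, e) for s, e, biotype in exons if biotype not in excluded)
    total = 0
    cur_start = cur_end = None
    for start, end in intervals:
        if cur_end is None or start > cur_end:
            # No overlap with the running interval: close it, start a new one
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlap: extend the running interval
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total
```

Overlapping exons from different transcripts are counted once, so the result is the number of distinct exonic positions in the gene.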

Supplementary Table 6

Full list of barcoded Illumina PE adapters used for multiplexing of cDNA libraries. The set of adapters used depends on the degree of multiplexing applied to the samples. 4-plex is the lower limit for multiplexing. (XLSX 10 kb)

Supplementary Table 7

List of qPCR primers used. (XLSX 8 kb)

Supplementary Table 8

Read counts for A. thaliana experiments. Raw number of reads mapped to each gene for all samples. (XLSX 6101 kb)

Supplementary Table 9

Transcript lengths for the human genome used for the analysis described in Supplementary Note 5, computed as described above, in the legend for Supplementary Table 5. (XLSX 1227 kb)

About this article

Cite this article

Brennecke, P., Anders, S., Kim, J. et al. Accounting for technical noise in single-cell RNA-seq experiments. Nat Methods 10, 1093–1095 (2013). https://doi.org/10.1038/nmeth.2645
