Single-cell RNA-seq can yield valuable insights about the variability within a population of seemingly homogeneous cells. We developed a quantitative statistical method to distinguish true biological variability from the high levels of technical noise in single-cell experiments. Our approach quantifies the statistical significance of observed cell-to-cell variability in expression strength on a gene-by-gene basis. We validate our approach using two independent data sets from Arabidopsis thaliana and Mus musculus.
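As a rough illustration of what comparing a gene's cell-to-cell variability against technical noise can look like, the sketch below computes each gene's squared coefficient of variation (CV²) and divides it by a technical-noise level estimated from spike-in rows. This is a simplified stand-in, not the authors' published procedure (their R code is in the supplement). The noise model CV² ≈ a1/mean + a0, the plain least-squares fit, the function names, and the toy counts are all assumptions made for this example, and no significance test is performed.

```python
from statistics import mean, variance


def cv2(counts):
    """Squared coefficient of variation of one gene's normalized counts across cells.

    Assumes at least two cells and a non-zero mean; filter all-zero genes first.
    """
    m = mean(counts)
    return variance(counts) / (m * m)


def fit_technical_noise(spike_counts):
    """Least-squares fit of CV^2 = a1 / mean + a0 over spike-in rows.

    The functional form is an assumption for this sketch. Needs at least two
    spike-ins with different mean counts.
    """
    xs = [1.0 / mean(row) for row in spike_counts]
    ys = [cv2(row) for row in spike_counts]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a1 = sxy / sxx
    a0 = y_bar - a1 * x_bar
    return a0, a1


def excess_variability(gene_counts, a0, a1):
    """Ratio of a gene's observed CV^2 to the fitted technical CV^2 at its mean."""
    m = mean(gene_counts)
    return cv2(gene_counts) / (a1 / m + a0)


# Hypothetical toy data: two spike-ins and one gene, each measured in two cells.
spike_ins = [[8, 12], [90, 110]]
a0, a1 = fit_technical_noise(spike_ins)
ratio = excess_variability([0, 40], a0, a1)
print(f"a0={a0:.4f} a1={a1:.4f} excess ratio={ratio:.1f}")
```

A ratio well above 1 only suggests variability beyond the fitted technical level. Deciding whether it is statistically significant, gene by gene, requires a formal test with multiple-testing correction, which this sketch leaves out.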
We thank E. Furlong and W. Huber for helpful discussions. We also acknowledge K. Birnbaum (New York University) for kindly providing pWOX5::GFP and pGl2::GFP seed. S.A. acknowledges partial funding from the European Union (FP7-Health, project Radiant); M.G.H. acknowledges the Australian Research Council for present funding. The EMBL Genomics Core Facility provided technical support for this work. We acknowledge A. Surani for the use of the C1 Single-Cell Auto Prep System in his lab and B. Jones for performing the experiment. We also acknowledge A. McKenzie (Medical Research Council Laboratory of Molecular Biology) for the Il13-GFP reporter mice and the Sanger-EBI Single Cell Centre for technical support. We acknowledge the support of European Research Council Starting Grant no. 260507, ThSWITCH.
The authors declare no competing financial interests.
Supplementary Figures 1–14 and Supplementary Notes 1–9 (PDF 8682 kb)
Accounting for technical noise in single-cell RNA-seq experiments – Supplement II. This file contains the R code used to perform the analysis described in the manuscript. (PDF 1938 kb)
Estimated amounts of total RNA in single plant cells. The total RNA content of single cells was estimated from the 50 pg HeLa total RNA spike-in. The calculation assumes that the fraction of polyadenylated RNA is comparable between HeLa and A. thaliana input material. For a detailed description, refer to the Methods section. (XLSX 8 kb)
List of highly variable genes in GL2 cells. The columns give the gene ID, the gene name, the normalized average read count, the cell with the strongest expression, and, for each cell, the log2 ratio of the cell's expression to the average. (XLSX 134 kb)
List of highly variable genes in QC cells. Columns are the same as in Supplementary Table 2. (XLSX 16 kb)
List of GO categories that are significantly enriched for highly variable genes (Online Methods). (XLSX 10 kb)
Read counts for the 91 mouse immune cells spiked with ERCC spike-ins. Each column corresponds to a cell, each row to a gene or an ERCC spike-in molecule. Mouse gene names have been replaced by randomized identifiers (column 1). The second column contains the transcript lengths used for the analysis in Supplementary Note 5. The transcript lengths are computed from the Ensembl annotation by taking the union of all exons within a gene, excluding exons annotated as “retained introns” and “nonsense mediated decay”. (XLSX 14402 kb)
Full list of barcoded Illumina PE adapters used for multiplexing of cDNA libraries. The set of adapters used depends on the degree of multiplexing applied to the samples. 4-plex is the lower limit for multiplexing. (XLSX 10 kb)
List of qPCR primers used (XLSX 8 kb)
Read counts for A. thaliana experiments. Raw number of reads mapped to each gene for all samples. (XLSX 6101 kb)
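The transcript-length definition given above for the mouse read-count table (the union of all exons within a gene, leaving out exons annotated as retained introns or nonsense-mediated decay) can be sketched as an interval merge. The input layout, the function name, and the biotype strings below are assumptions chosen for illustration. The actual annotation parsing used by the authors is not described here.

```python
def union_exon_length(exons, excluded=("retained_intron", "nonsense_mediated_decay")):
    """Total number of bases covered by the union of a gene's exons.

    exons: iterable of (start, end, biotype) tuples with 1-based inclusive
    coordinates on one chromosome (hypothetical layout for this sketch).
    Exons whose biotype is listed in `excluded` are ignored.
    """
    intervals = sorted((start, end) for start, end, biotype in exons
                       if biotype not in excluded)
    total = 0
    cur_start = cur_end = None
    for start, end in intervals:
        if cur_end is None or start > cur_end:
            # No overlap with the running interval: close it and open a new one.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlapping exon: extend the running interval if needed.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


# Hypothetical gene with two overlapping exons, one separate exon,
# and one exon from an excluded biotype.
example_exons = [
    (1, 10, "protein_coding"),
    (5, 15, "protein_coding"),
    (20, 29, "protein_coding"),
    (100, 109, "retained_intron"),
]
print(union_exon_length(example_exons))
```

Sorting first means each exon is compared only with the running merged interval, so overlapping exons shared between isoforms are counted once.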
Cite this article
Brennecke, P., Anders, S., Kim, J. et al. Accounting for technical noise in single-cell RNA-seq experiments. Nat Methods 10, 1093–1095 (2013). https://doi.org/10.1038/nmeth.2645