Abstract
Mutation analysis in single-cell genomes is prone to artifacts associated with cell lysis and whole-genome amplification. Here we addressed these issues by developing single-cell multiple displacement amplification (SCMDA) and a general-purpose single-cell-variant caller, SCcaller (https://github.com/biosinodx/SCcaller/). By comparing SCMDA-amplified single cells with unamplified clones from the same population, we validated the procedure as a firm foundation for standardized somatic-mutation analysis in single-cell genomics.
Accession codes
Primary accessions
Sequence Read Archive
Referenced accessions
Sequence Read Archive
Change history
13 October 2017
In the version of this article initially published, Lodato, M.A. et al. Science 350, 94–98 (2015) (reference 2) was cited as an example of a single-cell sequencing study with high CG-to-TA transitions that applies heat lysis. However, that work used alkaline lysis on ice (Walsh, C.A. and Lodato, M.A., personal communication); therefore, we have changed the third sentence of the paper from "This pathway may explain the observed excess of such mutations in single neurons² compared with unamplified neuronal clones³" to "Amplification artifacts could, in general, explain the observed excess of such mutations in single neurons² compared with unamplified DNA from neuronal clones³." The error has been corrected in the HTML and PDF versions of the article.
01 December 2017
Nat. Methods 14, 491–493 (2017); published online 20 March 2017; corrected after print 13 October 2017. In the version of this article initially published, Lodato, M.A. et al. Science 350, 94–98 (2015) (reference 2) was cited as an example of a single-cell sequencing study with high CG-to-TA transitions that applies heat lysis.
References
Fryxell, K.J. & Zuckerkandl, E. Mol. Biol. Evol. 17, 1371–1383 (2000).
Lodato, M.A. et al. Science 350, 94–98 (2015).
Hazen, J.L. et al. Neuron 89, 1223–1236 (2016).
Lasken, R.S. Biochem. Soc. Trans. 37, 450–453 (2009).
Gundry, M., Li, W., Maqbool, S.B. & Vijg, J. Nucleic Acids Res. 40, 2032–2040 (2012).
Fu, Y. et al. Proc. Natl. Acad. Sci. USA 112, 11923–11928 (2015).
Zong, C., Lu, S., Chapman, A.R. & Xie, X.S. Science 338, 1622–1626 (2012).
McKenna, A. et al. Genome Res. 20, 1297–1303 (2010).
Zafar, H., Wang, Y., Nakhleh, L., Navin, N. & Chen, K. Nat. Methods 13, 505–507 (2016).
Cibulskis, K. et al. Nat. Biotechnol. 31, 213–219 (2013).
Koboldt, D.C. et al. Genome Res. 22, 568–576 (2012).
Behjati, S. et al. Nature 513, 422–425 (2014).
Hanawalt, P.C. & Spivak, G. Nat. Rev. Mol. Cell Biol. 9, 958–970 (2008).
Gundry, M. & Vijg, J. Mutat. Res. 729, 1–15 (2012).
Dong, X. et al. Protocol Exchange http://dx.doi.org/10.1038/protex.2017.061 (2017).
Park, C.H. et al. J. Invest. Dermatol. 123, 1012–1019 (2004).
Falanga, V. et al. J. Invest. Dermatol. 105, 27–31 (1995).
Li, H. & Durbin, R. Bioinformatics 25, 1754–1760 (2009).
Abecasis, G.R. et al. Nature 491, 56–65 (2012).
Acknowledgements
This research was supported by the NIH (grants AG017242, AG047200 and AG038072 to J.V.) and the Glenn Foundation for Medical Research (J.V.). We thank H. Choi (Seoul National University) for providing materials.
Author information
Contributions
J.V., L.Z. and X.D. conceived this study and designed the experiments. L.Z., B.M. and M.L. performed the experiments. X.D. developed the software. X.D. and T.W. analyzed the data. J.V., X.D., L.Z., B.M., T.W. and A.Y.M. wrote the manuscript.
Ethics declarations
Competing interests
X.D., L.Z., M.L., A.M. and J.V. are cofounders of SingulOmics Corp.
Integrated supplementary information
Supplementary Figure 1 Schematic representation of enrichment of amplification errors through allelic bias.
(a) When there is no allelic amplification bias, an amplification error arising in the first round of amplification of a diploid genome (the worst-case scenario) would appear in only 12.5% of the sequencing reads, corresponding to 1 out of 8 strands; a true heterozygous SNV (or SNP) would appear in about 50% of the reads. (b) When there is allelic amplification bias, the fraction of reads carrying the amplification error is enriched and could easily reach 50% or more of the reads.
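The arithmetic behind both panels can be sketched in a few lines of Python. This is a toy model, not part of the published analysis: it assumes that each of the 8 strands present after the first round founds a lineage, and that the share of reads from a lineage is proportional to a relative amplification weight. The function name and the weights are hypothetical.

```python
def error_read_fraction(lineage_weights, error_lineage=0):
    """Fraction of reads that descend from the strand carrying the error.

    lineage_weights: relative amplification of each strand lineage
    (hypothetical values; equal weights mean no amplification bias).
    """
    total = sum(lineage_weights)
    if total <= 0:
        raise ValueError("lineage weights must sum to a positive value")
    return lineage_weights[error_lineage] / total


# Diploid genome after one round: 8 strands, one of which carries the error.
unbiased = [1.0] * 8
print(error_read_fraction(unbiased))  # 0.125

# Assumed bias: the error-carrying lineage is amplified 7-fold over the others.
biased = [7.0] + [1.0] * 7
print(error_read_fraction(biased))  # 0.5
```

With equal weights the error reaches 1 in 8 reads; a sufficiently strong bias toward the error-carrying lineage brings it to the 50% expected of a true heterozygous variant.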
Supplementary Figure 2 Isolation of a single cell by using the CellRaft system.
(a) The raft to the left of the number “0804” contains a single fibroblast (red circle). (b) The target raft containing the single cell next to “0804” was collected into a PCR tube using the collection wand, which left the well empty (red circle). The two scratches in the target raft were caused by the needle of the release device, a part of the CellRaft system, which was used to dislodge the raft containing the cell from the CellRaft array. This system essentially precludes the capture of more than one cell.
Supplementary Figure 3 Validation of raft number in the PCR tube.
(a) A 0.2-ml PCR tube with 2.5 μl PBS containing one raft collected by the CellRaft system (magnified). The small brown dot at the bottom of the PCR tube is the raft (arrow). (b) Two rafts (arrows) were collected into the same PCR tube, as is clear from the two magnified brown dots in the same tube. Using this system, it is easy to check visually whether more than one cell was captured in the same tube.
Supplementary Figure 4 Lorenz curves to assess coverage uniformity.
The diagonal line represents completely uniform amplification. Whole-genome-amplified cells showed modest locus bias compared with unamplified bulk DNA.
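As an illustration of how such a curve is built, the sketch below computes Lorenz-curve points from a list of per-position (or per-bin) read depths. The Gini coefficient is added here only as a convenient one-number summary and is not part of the original figure; the function names and toy depths are assumptions.

```python
def lorenz_curve(depths):
    """Return (x, y) points: cumulative fraction of positions vs. of reads."""
    ordered = sorted(depths)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("total depth must be positive")
    n = len(ordered)
    xs, ys = [0.0], [0.0]
    running = 0
    for i, depth in enumerate(ordered, start=1):
        running += depth
        xs.append(i / n)
        ys.append(running / total)
    return xs, ys


def gini(depths):
    """1 - 2 * (area under the Lorenz curve), by the trapezoid rule."""
    xs, ys = lorenz_curve(depths)
    area = sum(
        (xs[i] - xs[i - 1]) * (ys[i] + ys[i - 1]) / 2 for i in range(1, len(xs))
    )
    return 1.0 - 2.0 * area


uniform = [10, 10, 10, 10]  # every position covered equally: curve on the diagonal
skewed = [0, 0, 0, 40]      # all reads at one position: curve far below the diagonal
print(gini(uniform), gini(skewed))
```

The further the curve sags below the diagonal, the larger the coefficient and the less uniform the coverage.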
Supplementary Figure 5 Allelic fraction of heterozygous SNPs as a percentage of sequencing reads affected in kindred clone and single cells.
We took hSNPs identified in the unamplified kindred clone (IL1C) in a randomly selected region (chr1:100,000,000-110,000,000) and examined what percentage of the sequencing reads in the two kindred single cells (IL11 and IL12) contained these hSNPs. Ideally, 50% of the reads would contain the variant at every hSNP tested. Allelic fractions of these hSNPs are plotted as a percentage of the sequencing reads for IL1C itself (a) and for kindred cells IL11 (b) and IL12 (c). There is little bias in the clone, moderate bias in cell IL11 and severe bias in cell IL12. The “densities” on the y axis are kernel density estimates computed with the “density” function in R with default parameters.
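A rough Python equivalent of this calculation is sketched below. It assumes per-hSNP (reference, variant) read counts as input and approximates R's default settings with a Gaussian kernel and a rule-of-thumb bandwidth of the form 0.9 × min(sd, IQR/1.34) × n^(-1/5); R's own implementation also bins the data, so values will not match exactly. All names and the example counts are hypothetical.

```python
import math
import statistics


def allele_fractions(counts):
    """Percentage of reads carrying the variant at each covered hSNP."""
    return [100.0 * alt / (ref + alt) for ref, alt in counts if ref + alt > 0]


def rule_of_thumb_bandwidth(values):
    """0.9 * min(sd, IQR / 1.34) * n ** (-1/5), falling back to sd if IQR is 0."""
    sd = statistics.stdev(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    spread = min(sd, (q3 - q1) / 1.34) or sd
    if spread == 0:
        raise ValueError("values have no spread; bandwidth is undefined")
    return 0.9 * spread * len(values) ** -0.2


def gaussian_kde(values, grid, bandwidth):
    """Kernel density estimate evaluated at each grid point."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
        for x in grid
    ]


# Hypothetical (reference reads, variant reads) at five hSNPs.
counts = [(10, 10), (15, 5), (2, 18), (8, 12), (12, 8)]
fractions = allele_fractions(counts)
density = gaussian_kde(fractions, range(0, 101), rule_of_thumb_bandwidth(fractions))
```

A sample with little allelic bias gives a density concentrated near 50%, whereas a strongly biased cell spreads the density toward 0% and 100%.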
Supplementary Figure 6 Correlation of observed and estimated major allele fraction by ‘leave-100-out cross-validations’.
The cross-validations were performed with cells IL11 and IL12 at two settings of λ (a-d) for a randomly selected region, chr1:100,000,000-110,000,000. λ denotes half of the window width used for smoothing (Methods). The amplification bias of IL12 was predicted more accurately than that of IL11 because there is more amplification bias in IL12; hence, the known heterozygous SNPs of IL12 are more informative about bias.
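The idea of this cross-validation can be illustrated with the sketch below. It assumes that "leave-100-out" means holding out 100 hSNPs and predicting each from the remaining ones, and it uses a triangular kernel purely for illustration; the kernel actually used by SCcaller is described in Methods and is not reproduced here. Function names, the `seed` parameter and the data layout are hypothetical.

```python
import random


def smoothed_fraction(position, hsnps, half_width):
    """Kernel-weighted mean allele fraction of hSNPs within half_width bp.

    hsnps: list of (position, observed major-allele fraction) pairs.
    Returns None when no hSNP lies inside the window.
    """
    weighted_sum = weight_total = 0.0
    for pos, fraction in hsnps:
        distance = abs(pos - position)
        if distance < half_width:
            weight = 1.0 - distance / half_width  # triangular kernel (assumption)
            weighted_sum += weight * fraction
            weight_total += weight
    return weighted_sum / weight_total if weight_total > 0 else None


def leave_k_out(hsnps, half_width, k=100, seed=0):
    """Hold out k hSNPs and return (observed, estimated) pairs for them."""
    rng = random.Random(seed)
    held_out = set(rng.sample(range(len(hsnps)), k))
    training = [h for i, h in enumerate(hsnps) if i not in held_out]
    pairs = []
    for i in sorted(held_out):
        pos, observed = hsnps[i]
        estimate = smoothed_fraction(pos, training, half_width)
        if estimate is not None:
            pairs.append((observed, estimate))
    return pairs
```

Plotting the observed against the estimated fractions returned by `leave_k_out`, for each value of `half_width`, gives a comparison of the kind shown in the panels.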
Supplementary Figure 7 Steps in variant calling by SCcaller.
First, local allelic bias is estimated using a kernel smoother based on hSNPs. Second, SNVs are identified using a likelihood ratio test. Third, SNVs not present in the bulk alignment are designated as true somatic SNVs rather than SNPs.
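The third step is essentially a lookup against the bulk data, which the following sketch illustrates. The input layout (a dictionary of bulk depth and variant-read counts per site) and both thresholds are assumptions made for this example and are not SCcaller's actual parameters.

```python
def somatic_snvs(cell_calls, bulk_pileup, min_bulk_depth=20, max_bulk_alt_reads=0):
    """Keep single-cell SNV calls that are covered, but not supported, in bulk.

    cell_calls: list of (chrom, pos, alt) sites called in the single cell.
    bulk_pileup: dict mapping a site to (bulk depth, bulk variant reads).
    Thresholds are hypothetical defaults chosen for illustration.
    """
    somatic = []
    for site in cell_calls:
        depth, alt_reads = bulk_pileup.get(site, (0, 0))
        if depth < min_bulk_depth:
            continue  # bulk coverage too low to rule out a germline SNP
        if alt_reads <= max_bulk_alt_reads:
            somatic.append(site)
    return somatic


cell_calls = [("chr1", 1000, "T"), ("chr1", 2000, "G"), ("chr1", 3000, "A")]
bulk_pileup = {
    ("chr1", 1000, "T"): (30, 0),   # covered, no variant reads in bulk
    ("chr1", 2000, "G"): (28, 13),  # variant present in bulk: germline SNP
    ("chr1", 3000, "A"): (4, 0),    # too little bulk coverage to decide
}
print(somatic_snvs(cell_calls, bulk_pileup))  # [('chr1', 1000, 'T')]
```

Requiring a minimum bulk depth is a design choice of this sketch: without it, a site with no bulk coverage would be labelled somatic by default.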
Supplementary Figure 8 Distribution of log likelihood ratios of variant-calling models.
The ratio between the likelihoods of the null model (L0) and of the heterozygous SNV model (L1) was plotted. The black dashed line indicates the cutoff for the log likelihood ratio, which corresponds to an alpha level of 0.01 in a likelihood ratio test. These results indicate that, with this test and cutoff, we are able to separate real mutations from artifacts. Real mutations and amplification artifacts were determined by comparing data from single cells to their kindred clone (Fig. 1b). The “densities” on the y axis are kernel density estimates computed with the “density” function in R with default parameters.
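To make the plotted quantity concrete, the sketch below computes log(L0/L1) for one candidate site under a simplified binomial model. It is not SCcaller's likelihood: the expected variant-read fractions under the two models are passed in as plain parameters (hypothetical values 0.125 for an artifact and 0.5 for an unbiased heterozygous SNV are used in the example), and the cutoff corresponding to an alpha level of 0.01 is not derived here.

```python
import math


def log_likelihood_ratio(alt_reads, depth, het_fraction, artifact_fraction):
    """log(L0 / L1) with binomial likelihoods for variant-read counts.

    L0: null (artifact) model, variant reads expected at artifact_fraction.
    L1: heterozygous SNV model, variant reads expected at het_fraction.
    The binomial coefficient is identical in both models and cancels.
    """
    if not 0 <= alt_reads <= depth:
        raise ValueError("alt_reads must lie between 0 and depth")
    for fraction in (het_fraction, artifact_fraction):
        if not 0.0 < fraction < 1.0:
            raise ValueError("expected fractions must lie strictly between 0 and 1")
    ref_reads = depth - alt_reads
    log_l0 = alt_reads * math.log(artifact_fraction) + ref_reads * math.log(
        1.0 - artifact_fraction
    )
    log_l1 = alt_reads * math.log(het_fraction) + ref_reads * math.log(
        1.0 - het_fraction
    )
    return log_l0 - log_l1


# 10 of 20 reads carry the variant: the heterozygous model fits better (ratio < 0).
print(log_likelihood_ratio(10, 20, het_fraction=0.5, artifact_fraction=0.125))
# 2 of 20 reads carry the variant: the artifact model fits better (ratio > 0).
print(log_likelihood_ratio(2, 20, het_fraction=0.5, artifact_fraction=0.125))
```

In a locus with allelic bias, `het_fraction` would be replaced by the locally estimated allele fraction rather than 0.5, which is what allows a skewed but real variant to be distinguished from an artifact.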
Supplementary Figure 9 Spectra and numbers of somatic SNVs identified with different variant callers.
Mutation spectra of candidate somatic SNVs called by (a) Monovar; (b) MuTect; and (c) VarScan. The results from SCcaller are shown in Fig. 3b. The error bars indicate standard deviations. (d) The number of candidate somatic SNVs per cell called by each variant caller. The error bars indicate standard deviations. Sample size n=4, 6 and 2 for the clones, SCMDA and HighTemp MDA, respectively.
Supplementary Figure 10 Somatic SNVs in the functional genome.
(a) Enrichment and depletion of somatic SNVs in genomic features. The asterisks indicate a significant depletion (P < 0.01, two-tailed pair-wise t test) compared to the genome average. Data on germline polymorphisms were obtained from the 1000 Genomes Project. The error bars indicate standard deviations. Sample size n=10 for the somatic SNVs, comprising 4 clones and 6 single cells. (b) Mutant genes are less highly expressed than wild-type genes. The average FPKM (fragments per kilobase of exon per million fragments mapped) value (red dashed line) of genes affected by somatic SNVs in their exon regions was compared with the average FPKM values (black line) of 2,000 random gene sets, each with the same number of genes as the mutant gene set. The P value was calculated from the permutation test (one-tailed). The RNA sequencing data were downloaded from the ENCODE project (ID: ENCFF640FPG and ENCFF704TVE). The “densities” on the y axis are kernel density estimates computed with the “density” function in R with default parameters.
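The permutation comparison in panel (b) can be sketched as follows. The sketch assumes a flat list of FPKM values for all genes and tests in the lower tail (mutant genes expressed less than random sets); the add-one correction in the P value, the function name and the `seed` parameter are choices made for this example and are not taken from the paper.

```python
import random


def permutation_p_lower(observed_mean, all_fpkm, set_size, n_sets=2000, seed=0):
    """One-tailed P value: how often a random gene set has a mean FPKM
    at or below the observed mean of the mutant gene set."""
    rng = random.Random(seed)
    at_or_below = 0
    for _ in range(n_sets):
        sample = rng.sample(all_fpkm, set_size)
        if sum(sample) / set_size <= observed_mean:
            at_or_below += 1
    # Add-one correction (assumption) so that the estimate is never exactly 0.
    return (at_or_below + 1) / (n_sets + 1)
```

For example, `permutation_p_lower(mean_fpkm_of_mutant_genes, fpkm_of_all_genes, number_of_mutant_genes)` would return a small value when the genes carrying somatic SNVs are expressed at lower levels than typical random gene sets of the same size; the three argument names here are placeholders for the user's own data.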
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–10, Supplementary Tables 1–5 and Supplementary Note (PDF 1473 kb)
Supplementary Software
SCcaller (version 1.0) (ZIP 22 kb)
Supplementary Protocol
Single-Cell Multiple Displacement Amplification (SCMDA) Protocol (PDF 143 kb)
Rights and permissions
About this article
Cite this article
Dong, X., Zhang, L., Milholland, B. et al. Accurate identification of single-nucleotide variants in whole-genome-amplified single cells. Nat Methods 14, 491–493 (2017). https://doi.org/10.1038/nmeth.4227
DOI: https://doi.org/10.1038/nmeth.4227