Abstract
Identification and quantification of low-frequency mutations remain challenging despite improvements in the baseline error rate of next-generation sequencing technologies. Here, we describe a method, termed SaferSeqS, that addresses these challenges by (1) efficiently introducing identical molecular barcodes in the Watson and Crick strands of template molecules and (2) enriching target sequences with strand-specific PCR. The method achieves high sensitivity and specificity and detects variants at frequencies below 1 in 100,000 DNA template molecules with a background mutation rate of <5 × 10⁻⁷ mutants per base pair (bp). We demonstrate that it can evaluate mutations in a single amplicon or simultaneously in multiple amplicons, assess limited quantities of cell-free DNA with high recovery of both strands and reduce the error rate of existing PCR-based molecular barcoding approaches by >100-fold.
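The idea of tagging both strands of a template with the same molecular barcode can be illustrated with a small Python sketch. It is not the SaferSeqS pipeline: the `Read` record, the function names and the thresholds (`min_reads`, `min_fraction`) are hypothetical, and the sketch assumes each read has already been assigned a barcode, a strand of origin and a base at the position of interest, reported in reference orientation. A base is accepted for a template only when the Watson and Crick read families carrying that barcode each reach a consensus and the two consensuses agree.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, NamedTuple, Optional


class Read(NamedTuple):
    """Hypothetical, simplified read record (not the pipeline's data model)."""
    barcode: str  # molecular barcode shared by both strands of one template
    strand: str   # "W" (Watson) or "C" (Crick)
    base: str     # base at the position of interest, in reference orientation


def family_consensus(bases: List[str], min_reads: int = 2,
                     min_fraction: float = 0.9) -> Optional[str]:
    """Return the consensus base of one strand family, or None if unclear.

    The thresholds are illustrative assumptions, not published parameters.
    """
    if len(bases) < min_reads:
        return None
    base, count = Counter(bases).most_common(1)[0]
    return base if count / len(bases) >= min_fraction else None


def duplex_calls(reads: Iterable[Read]) -> Dict[str, str]:
    """Map barcode -> base, keeping only templates where both strands agree."""
    # families[barcode][strand] collects the bases seen for that strand family.
    families: Dict[str, Dict[str, List[str]]] = defaultdict(
        lambda: defaultdict(list)
    )
    for read in reads:
        families[read.barcode][read.strand].append(read.base)

    calls: Dict[str, str] = {}
    for barcode, strands in families.items():
        watson = family_consensus(strands.get("W", []))
        crick = family_consensus(strands.get("C", []))
        # Require a consensus on both strands and agreement between them.
        if watson is not None and watson == crick:
            calls[barcode] = watson
    return calls


example_reads = [
    # Template AAAC: both strands support T, so the call is kept.
    Read("AAAC", "W", "T"), Read("AAAC", "W", "T"),
    Read("AAAC", "C", "T"), Read("AAAC", "C", "T"),
    # Template GGTA: T appears on one strand only, so the call is discarded.
    Read("GGTA", "W", "T"), Read("GGTA", "W", "T"),
    Read("GGTA", "C", "C"), Read("GGTA", "C", "C"),
    # Template TTTT: the Crick strand was not recovered, so no call is made.
    Read("TTTT", "W", "T"), Read("TTTT", "W", "T"),
]
example_calls = duplex_calls(example_reads)
```

In this toy input only the template with barcode `AAAC` yields a call. A change seen on a single strand, as in `GGTA`, is the pattern expected from a PCR or damage artifact, which is why requiring agreement between strands can lower the background error rate.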
Data availability
The sequencing data generated in this study can be obtained from the European Genome–phenome Archive (accession number EGAS00001005048).
Code availability
The SaferSeqS bioinformatics pipeline is implemented in Python. The source code is available in a Zenodo repository (https://doi.org/10.5281/zenodo.4588264).
Acknowledgements
We thank the individuals who participated in this study for their courage and generosity. We also thank M. Hoang, S. Sur, A. Mattox, A. Pearlman and members of the Ludwig Center at Johns Hopkins for insightful and helpful scientific discussions. We are grateful to C. Blair and K. Judge for expert technical and administrative assistance and to E. Cook for illustrative assistance. This work was supported by The Lustgarten Foundation for Pancreatic Cancer Research, The Marcus Foundation, The Virginia and D.K. Ludwig Fund for Cancer Research, The Conrad N. Hilton Foundation, The John Templeton Foundation, Medical Research Future Fund Investigator Grant (APP1194970) and National Institutes of Health grants (T32 GM007309, U01 CA230691-01, P50 CA228991, U01 CA200469, R37 CA230400-01, and U01 CA152753).
Author information
Contributions
J.D.C., N.P., K.W.K. and B.V. conceptualized the SaferSeqS method. J.D.C., C.D., J.C.D., B.J.M., N.P., K.W.K. and B.V. contributed to the study design. J.D.C., M.P., J.P., L.D., N.S. and J.S. performed the experiments. J.T. and P.G. recruited participants and acquired samples. J.D.C. developed the SaferSeqS bioinformatic pipeline and analyzed the data. Mathematical and statistical analyses were conducted by J.D.C. and C.T. N.P., K.W.K. and B.V. supervised the study. J.D.C. and B.V. wrote the manuscript, which was edited and approved by all authors.
Ethics declarations
Competing interests
B.V., K.W.K. and N.P. are founders of Thrive and Personal Genome Diagnostics and own equity in Exact Sciences and Personal Genome Diagnostics. K.W.K. and N.P. are consultants to Thrive. K.W.K. and B.V. are consultants to Sysmex and Eisai, and K.W.K., B.V. and N.P. are advisors to CAGE Pharma. B.V. is also a consultant to Catalio, and K.W.K., B.V. and N.P. are consultants to Neophore. C.D. is a consultant to Thrive and is compensated with income and equity. The companies named above, as well as other companies, have licensed previously described technologies related to the work described in this paper from Johns Hopkins University. J.D.C., C.D., B.V., K.W.K., C.T. and N.P. are inventors on some of these technologies. Licenses to these technologies are or will be associated with equity or royalty payments to the inventors as well as to Johns Hopkins University. Additional patent applications on the work described in this paper are being filed by Johns Hopkins University. The terms of all these arrangements are being managed by Johns Hopkins University in accordance with its conflict of interest policies. The remaining authors declare no competing interests.
Additional information
Peer review information Nature Biotechnology thanks Paul Spellman and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Figs. 1–9 and Note.
About this article
Cite this article
Cohen, J.D., Douville, C., Dudley, J.C. et al. Detection of low-frequency DNA variants by targeted sequencing of the Watson and Crick strands. Nat Biotechnol 39, 1220–1227 (2021). https://doi.org/10.1038/s41587-021-00900-z