Quantifying ultra-rare pre-leukemic clones via targeted error-corrected sequencing

Young, A L; Wong, T N; Hughes, A E O; Heath, S E; Ley, T J; Link, D C; Druley, T E

doi:10.1038/leu.2015.17

Letter to the Editor
Open access
Published: 03 February 2015

A L Young^1,2, T N Wong³, A E O Hughes^1,2, S E Heath³, T J Ley³, D C Link³ & T E Druley^1,2

Leukemia volume 29, pages 1608–1611 (2015)

The quantification of rare clonal and subclonal populations from a heterogeneous DNA sample has multiple clinical and research applications for the study and treatment of leukemia. Specifically, in the hematopoietic compartment, recent reports demonstrate the presence of subclonal variation in normal and malignant hematopoiesis,^1,2 and leukemia is now recognized as an oligoclonal disease.³ Currently, clonal heterogeneity in leukemia is studied using next-generation sequencing (NGS) targeting subclone-specific mutations. With this method, detecting mutations at 2–5% variant allele fraction (VAF) requires costly and time-intensive deep resequencing, and identifying lower-frequency variants is impractical regardless of sequencing depth. Recently, various methods have been developed to circumvent the error rate of NGS.^4,5 These methods tag individual DNA molecules with unique oligonucleotide indexes, which enable error correction after sequencing.

Here we present a direct application of error-corrected sequencing (ECS) to study clonal heterogeneity during leukemogenesis and validate the accuracy of this method with a series of benchmarking experiments. Specifically, we demonstrate the ability of ECS to identify leukemia-associated mutations in banked pre-leukemic blood and bone marrow from patients with either therapy-related acute myeloid leukemia (t-AML) or therapy-related myelodysplastic syndrome (t-MDS). T-AML/t-MDS occurs in 1–10% of individuals who receive alkylator- or epipodophyllotoxin-based chemotherapy or radiation to treat a primary malignancy.⁶ For the seven individuals surveyed in this study, matched leukemia/normal whole-genome sequencing identified the t-AML/t-MDS-specific somatic mutations present at diagnosis. We applied our method for ECS to identify leukemia-specific mutations in four individuals from DNA extracted from blood and bone marrow samples collected years before diagnosis. In a separate study into the role of TP53 mutations in t-AML/t-MDS leukemogenesis, this method was used to identify leukemia-associated mutations at low frequency in samples banked years before diagnosis.⁷ In two cases, subclones were identified below the 1% threshold of detection governed by conventional NGS. These results highlight the ability of targeted ECS to identify clinically silent single-nucleotide variations (SNVs).

We employed ECS by tagging individual DNA molecules with adapters containing 16 bp random oligonucleotide molecular indexes in a manner similar to other reports.^4,5,8 Our implementation of ECS easily targets loci of interest by single or multiplex PCR and inserts seamlessly into the standard NGS library preparation (Supplementary Figure 1, Supplementary Methods). Our only deviations from the standard protocol are ligation of customized adapters containing random indexes instead of the manufacturer’s supplied adapters and a quantitative PCR (qPCR) quantification step before sequencing (Supplementary Table 1). Following sequencing, sequence reads containing the same index and originating from the same molecule are grouped into read families. Sequencing errors are identified by comparing reads within a read family and removed to create an error-corrected consensus sequence (ECCS). We performed a dilution series experiment to assess bias during library preparation and determine the limit of detection for ECS. For this experiment, we spiked DNA from a t-AML sample into control human DNA, which was serially diluted over five orders of magnitude. The experiment comprised two technical replicates targeting two separate mutations (20 independent libraries in total). The results demonstrate that ECS is quantitative to a VAF of 1:10 000 molecules and provides a highly reproducible digital readout of tumor DNA prevalence in a heterogeneous DNA sample (r² of 0.9999 and 0.9991, Figures 1a and b). We next characterized the error profile based on the wild-type nucleotides included in the dilution series experiment. Variant identification using the ECCSs was 99% specific at a VAF of 0.0016 versus 0.0140 for deep sequencing alone (Figure 1c). We noticed that ECCS errors were heavily biased towards G to T transversions and, to a lesser degree, C to T transitions (Figure 1d, Supplementary Figure 2), as previously observed.^4,9 When separated by substitution type, variants identified from the ECCSs were 99% specific at a VAF of 0.0034 for G to T (C to A) mutations, 0.00020 for C to T (G to A) mutations and 0.000079 for the other eight possible substitutions. Although excess G to T mutations are a known consequence of DNA oxidation leading to 8-oxo-guanine conversion,⁴ the pre-treatment of samples with formamidopyrimidine-DNA glycosylase before PCR amplification did not appreciably improve the error profile of G to T mutations (Supplementary Figure 3).
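The read-family step described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study (that is described in the Supplementary Methods); the function name `build_eccs`, the minimum family size and the agreement cutoff are assumptions chosen for illustration only. Reads are assumed to arrive as (index, sequence) pairs that have already been trimmed to a common length.

```python
from collections import Counter, defaultdict


def build_eccs(reads, min_family_size=3, min_agreement=0.9):
    """Collapse (index, sequence) pairs into one consensus per read family.

    min_family_size and min_agreement are illustrative assumptions,
    not parameters reported in the text.
    """
    # Group reads that carry the same molecular index into a read family.
    families = defaultdict(list)
    for index, seq in reads:
        families[index].append(seq)

    consensus = {}
    for index, seqs in families.items():
        # Skip small families: too few reads to out-vote a sequencing error.
        if len(seqs) < min_family_size:
            continue
        # This sketch only handles families whose reads share one length.
        if len({len(s) for s in seqs}) != 1:
            continue
        bases = []
        for column in zip(*seqs):
            base, count = Counter(column).most_common(1)[0]
            # Mask positions without clear agreement instead of guessing.
            bases.append(base if count / len(seqs) >= min_agreement else "N")
        consensus[index] = "".join(bases)
    return consensus


reads = [
    ("AAAC", "ACGT"), ("AAAC", "ACGT"), ("AAAC", "ACTT"),  # one read carries an error
    ("GGTA", "ACGT"),                                       # family too small to correct
]
print(build_eccs(reads, min_family_size=3, min_agreement=0.6))
```

With the relaxed agreement cutoff used in the example call, the single discordant read is out-voted and the family yields one consensus; with the stricter default, the disputed position would be masked as `N` rather than called.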

As proof of principle, we applied ECS to study rare pre-leukemic clonal hematopoiesis in seven individuals who later developed t-AML/t-MDS. Leukemia/normal whole-genome sequencing at diagnosis was used to identify the leukemia-specific somatic mutations in each patient’s malignancy (Supplementary Table 2). We applied targeted ECS to query these 18 different loci in 10 cryopreserved or formalin-fixed paraffin-embedded blood and bone marrow samples that were 9–22 years old and banked up to 12 years before diagnosis (Supplementary Table 3).

We generated ~25 Gb of 150 bp paired-end reads from six Illumina (San Diego, CA, USA) MiSeq runs. We targeted 1–7 somatic mutations per individual (25 mutations spanning 5.5 kb from 15 genes in total) and identified leukemia-specific subclonal populations in four individuals up to 12 years before diagnosis (Table 1). For each sequencing library, we tagged ~2.5 million locus-specific amplicons generated from genomic DNA using high-fidelity PCR with randomly indexed custom adapters. Sequencing errors were removed to create ECCSs as described above. Each ECCS was then aligned to the reference genome for variant calling (Supplementary Figure 1).
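As a hedged illustration of how the substitution-specific error profile could inform variant calling from ECCS counts, the sketch below compares an observed VAF against the 99%-specificity values reported above for each substitution class. Treating those values as hard cutoffs is an assumption of this example, as are the names `substitution_class` and `call_variant`; the actual calling procedure is given in the Supplementary Methods.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# VAF at which calls were 99% specific, per substitution class (from the text).
THRESHOLDS = {"G>T": 0.0034, "C>T": 0.00020, "other": 0.000079}


def substitution_class(ref, alt):
    """Fold a substitution and its reverse complement into one class."""
    pairs = {(ref, alt), (COMPLEMENT[ref], COMPLEMENT[alt])}
    if ("G", "T") in pairs:  # covers G>T and C>A
        return "G>T"
    if ("C", "T") in pairs:  # covers C>T and G>A
        return "C>T"
    return "other"           # the remaining eight substitutions


def call_variant(ref, alt, alt_eccs, total_eccs):
    """Return (vaf, passes) for one candidate substitution.

    alt_eccs and total_eccs are counts of consensus sequences, not raw reads.
    """
    vaf = alt_eccs / total_eccs
    return vaf, vaf > THRESHOLDS[substitution_class(ref, alt)]


print(call_variant("G", "T", 40, 10_000))  # above the G>T class value
print(call_variant("C", "A", 20, 10_000))  # same class, below the value
```

Folding each substitution with its reverse complement reflects that a G to T change on one strand is read as C to A on the other, which is why the text reports the two together.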

Table 1 Patient-specific leukemia-associated somatic mutations identified by ECS

Using conventional deep sequencing, we detected t-AML/t-MDS-specific mutations in previously banked samples at variant allele fractions between 0.03 and 0.87 (data not shown). In one individual (UPN 684949), deep sequencing alone was insufficient to distinguish known ASXL1 and U2AF1 mutations from the sequencing errors in samples banked 5 and 3 years before t-MDS diagnosis, respectively (Figures 1e and f). However, ECS identified the L866* nonsense mutation in ASXL1 at a VAF of 0.004 (Figure 1g) and the S34Y missense mutation in U2AF1 at a VAF of 0.009 (Figure 1h). In addition, ECS was able to temporally quantify these mutations from three pre-t-MDS samples banked yearly from 3 to 5 years before diagnosis (Supplementary Figures 4 and 5). In two cases (UPN643006 and UPN942008), only a subset of the variants identified at diagnosis was present in the previously banked sample (Table 1). Specifically, in the UPN643006 sample, banked 12 years before diagnosis, a single-nucleotide deletion in ASXL1 was present at a VAF of 0.03. However, the G to T substitution in ASXL1, the CTT deletion in GATA2 and the G to T substitution in U2AF1 were not detectable in this previously banked sample.

Here we present a practical and clinically oriented application for targeted error-corrected NGS utilizing single molecule indexing. This method easily integrates into existing NGS library preparation protocols and enables the quantification of previously undetectable mutations in heterogeneous DNA samples. The only modification to the standard NGS library preparation is the replacement of the stock adapters with our randomly indexed adapters and the addition of a qPCR step before sequencing. The qPCR step limits the number of molecules sequenced, ensuring adequate coverage for each read family. With these two modifications, we achieve highly specific detection for rare mutations. The bioinformatics analysis is straightforward and does not require proprietary algorithms or tools (Supplementary Methods). Our results highlight the ability of this method to identify rare subclonal populations in a heterogeneous biological sample. As applied to t-AML/t-MDS, we show these previously undetectable mutations are present years before diagnosis and fluctuate in prevalence over time.
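The rationale for limiting the number of tagged molecules can be made concrete with some rough arithmetic. The sketch below is an illustration under stated assumptions, not a calculation from this study: it assumes that reads are spread evenly over tagged molecules and that each molecule carries one index drawn uniformly from all 4^16 possible 16 bp sequences. The figure of 15 million reads per library is hypothetical; the ~2.5 million tagged amplicons per library is taken from the text.

```python
def mean_family_size(total_reads, tagged_molecules):
    """Average reads per read family if reads are spread evenly (an idealization)."""
    return total_reads / tagged_molecules


def expected_index_collisions(tagged_molecules, index_length=16):
    """Expected number of molecule pairs sharing an index by chance.

    Assumes one uniformly random index per molecule (birthday-problem estimate).
    """
    index_space = 4 ** index_length
    return tagged_molecules * (tagged_molecules - 1) / (2 * index_space)


tagged = 2_500_000              # ~2.5 million tagged amplicons per library (from the text)
hypothetical_reads = 15_000_000  # assumed read count for one library, for illustration

print(mean_family_size(hypothetical_reads, tagged))
print(round(expected_index_collisions(tagged), 1))
```

Under these assumptions, tagging fewer molecules for a fixed read budget raises the mean family size, which is the effect the qPCR step is intended to secure, and chance index sharing remains rare relative to the number of molecules tagged.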

A clinical application of ECS is to quantify minimal residual disease (MRD). As the genomic characterization of leukemia becomes more readily available, identifying causative genetic lesions and rare therapy-resistant subclones will become increasingly useful for risk stratification, therapeutic selection and disease monitoring. Already, whole-genome sequencing of AML has demonstrated that nearly every case of AML harbors one or more somatic SNVs.¹⁰ These SNVs are more reliable clonal markers of malignancy than cell surface markers, which can change over time. Leveraging this information, conventional NGS was implemented retrospectively to detect MRD harboring leukemia-specific insertions/deletions (indels) as rare as 0.00001 VAF in NPM1¹¹ and 0.0001 VAF in RUNX1.¹² This was possible because indels are only rarely generated erroneously by NGS. Unfortunately, measuring rare leukemia-associated substitutions is limited owing to the relatively high error profile of conventional NGS.¹³ However, ECS can achieve the 1:10 000 limit of detection featured by conventional MRD platforms.¹⁴ For patients whose leukemia lacks suitable markers for conventional MRD, ECS could offer an alternative with comparable sensitivity and specificity that is easy to implement in a clinical sequencing lab. Furthermore, the ability to multiplex targets for ECS enables the surveillance of known mutations and the simultaneous discovery of new somatic mutations. Ongoing work will directly compare gold-standard MRD methods with targeted ECS in patients with and without relapsed leukemia.
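A limit of detection of 1:10 000 also implies a minimum number of independent molecules that must be sampled, whatever the error rate. The following sketch estimates that number under a simple Poisson sampling model. The model, the function names and the 95% confidence level are assumptions made for illustration and are not taken from this study.

```python
import math


def prob_detect(vaf, n_molecules, min_mutant=1):
    """P(at least min_mutant mutant molecules among n sampled), Poisson approximation."""
    lam = vaf * n_molecules  # expected number of mutant molecules sampled
    p_fewer = sum(
        math.exp(-lam) * lam ** k / math.factorial(k) for k in range(min_mutant)
    )
    return 1.0 - p_fewer


def molecules_needed(vaf, confidence=0.95):
    """Molecules to sample so that at least one mutant is seen with the given probability."""
    return math.ceil(-math.log(1.0 - confidence) / vaf)


print(molecules_needed(1e-4))              # molecules for a 1:10 000 variant
print(round(prob_detect(1e-4, 10_000), 3))  # chance of seeing it in 10 000 molecules
```

In this model, sampling exactly 10 000 molecules gives only about a 63% chance of observing a single mutant molecule at a VAF of 1:10 000, so reliable quantification at that level requires several-fold more error-corrected molecules per locus.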

References

1. Holstege H, Pfeiffer W, Sie D, Hulsman M, Nicholas TJ, Lee CC et al. Somatic mutations found in the healthy blood compartment of a 115-yr-old woman demonstrate oligoclonal hematopoiesis. Genome Res 2014; 24: 733–742.
2. Walter MJ, Shen D, Ding L, Shao J, Koboldt DC, Chen K et al. Clonal architecture of secondary acute myeloid leukemia. N Engl J Med 2012; 366: 1090–1098.
3. Welch JS, Ley TJ, Link DC, Miller CA, Larson DE, Koboldt DC et al. The origin and evolution of mutations in acute myeloid leukemia. Cell 2012; 150: 264–278.
4. Schmitt MW, Kennedy SR, Salk JJ, Fox EJ, Hiatt JB, Loeb LA. Detection of ultra-rare mutations by next-generation sequencing. Proc Natl Acad Sci USA 2012; 109: 14508–14513.
5. Kinde I, Wu J, Papadopoulos N, Kinzler KW, Vogelstein B. Detection and quantification of rare mutations with massively parallel sequencing. Proc Natl Acad Sci USA 2011; 108: 9530–9535.
6. Godley LA, Larson RA. Therapy-related myeloid leukemia. Semin Oncol 2008; 35: 418–429.
7. Wong T, Ramsingh G, Young AL, Miller CA, Touma W, Welch JS et al. The role of TP53 mutations in the origin and evolution of therapy-related AML. Nature 2015; 518: 552–555.
8. Fu GK, Xu W, Wilhelmy J, Mindrinos MN, Davis RW, Xiao W et al. Molecular indexing enables quantitative targeted RNA sequencing and reveals poor efficiencies in standard library preparations. Proc Natl Acad Sci USA 2014; 111: 1891–1896.
9. Lou DI, Hussmann JA, McBee RM, Acevedo A, Andino R, Press WH et al. High-throughput DNA sequencing errors are reduced by orders of magnitude using circle sequencing. Proc Natl Acad Sci USA 2013; 110: 19872–19877.
10. Cancer Genome Atlas Research Network. Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia. N Engl J Med 2013; 368: 2059–2074.
11. Salipante SJ, Fromm JR, Shendure J, Wood BL, Wu D. Detection of minimal residual disease in NPM1-mutated acute myeloid leukemia by next-generation sequencing. Mod Pathol 2014; 27: 1438–1446.
12. Kohlmann A, Nadarajah N, Alpermann T, Grossmann V, Schindela S, Dicker F et al. Monitoring of residual disease by next-generation deep-sequencing of RUNX1 mutations can identify acute myeloid leukemia patients with resistant disease. Leukemia 2014; 28: 129–137.
13. Loman NJ, Misra RV, Dallman TJ, Constantinidou C, Gharbia SE, Wain J et al. Performance comparison of benchtop high-throughput sequencing platforms. Nat Biotechnol 2012; 30: 434–439.
14. Hourigan CS, Karp JE. Minimal residual disease in acute myeloid leukaemia. Nat Rev Clin Oncol 2013; 10: 460–471.

Author information

Authors and Affiliations

Department of Pediatrics, Division of Hematology and Oncology, Washington University School of Medicine, Saint Louis, MO, USA
A L Young, A E O Hughes & T E Druley
Center for Genome Sciences and Systems Biology, Washington University School of Medicine, Saint Louis, MO, USA
A L Young, A E O Hughes & T E Druley
Department of Medicine, Division of Oncology, Washington University School of Medicine, Saint Louis, MO, USA
T N Wong, S E Heath, T J Ley & D C Link

Corresponding author

Correspondence to T E Druley.

Ethics declarations

Competing interests

The authors declare no conflict of interest.

Additional information

Supplementary Information accompanies this paper on the Leukemia website

Supplementary information

Supplementary Methods (PDF 44 kb)

Supplementary Tables and Figures (PDF 1046 kb)

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

About this article

Cite this article

Young, A., Wong, T., Hughes, A. et al. Quantifying ultra-rare pre-leukemic clones via targeted error-corrected sequencing. Leukemia 29, 1608–1611 (2015). https://doi.org/10.1038/leu.2015.17

Published: 03 February 2015
Issue Date: July 2015
DOI: https://doi.org/10.1038/leu.2015.17
