Abstract
The evaluation of the somatic hypermutation of the clonotypic immunoglobulin heavy variable gene has become essential in the therapeutic management in chronic lymphocytic leukemia patients. European Research Initiative on Chronic Lymphocytic Leukemia promotes good practices and standardized approaches to this assay but often they are labor-intensive, technically complex, with limited in scalability. The use of next-generation sequencing in this analysis has been widely tested, showing comparable accuracy and distinct advantages. However, the adoption of the next generation sequencing requires a high sample number (run batching) to be economically convenient, which could lead to a longer turnaround time. Here we present data from nanopore sequencing for the somatic hypermutation evaluation compared to the standard method. Our results show that nanopore sequencing is suitable for immunoglobulin heavy variable gene mutational analysis in terms of sensitivity, accuracy, simplicity of analysis and is less time-consuming. Moreover, our work showed that the development of an appropriate data analysis pipeline could lower the nanopore sequencing error rate attitude.
Similar content being viewed by others
Introduction
The evaluation of the somatic hypermutation (SHM) of the clonotypic immunoglobulin heavy variable (IGHV) gene has become essential in the therapeutic management of chronic lymphocytic leukemia (CLL) patients. In fact, it has been shown to be a robust prognostic marker, stable over time and independent of other clinical and biological parameters, including the disease progression1.
The gold standard method for determining the SHM status is performed in two steps: (a) clonality detection by PCR and capillary electrophoresis (CE); (b) Sanger sequencing (SS) of the clonotypic IGHV gene. The sequencing result is then evaluated for its deviation compared with the closest matched germline IGH gene, using the 2% threshold to discriminate unmutated from mutated status1,2.
Despite the European Research Initiative's efforts in CLL (ERIC) to promote good practices and standardized approaches, this assay is still not uniformly performed in many clinical laboratories due to its limitations, that include labor-intensiveness, technical complexity, and limited scalability.
The recent availability of next-generation sequencing (NGS) technologies offers the chance to develop more standardized methods to unambiguously determine the individual clonal sequences and their relative proportions3,4,5. The use of NGS for SHM analysis has been widely tested, showing comparable accuracy, but distinct advantages of NGS included: the possibility to operate in batch mode, low costs (only running in batch mode), direct clone determination, and a greater sensitivity allowing more than one dominant clonal IGH rearrangement, that has been reported in almost 25% of CLL patients, to be identified6. On the other hand, the adoption of NGS requires a high sample number (batch running) to be economically convenient, leading to a longer turnaround time. Moreover, data analysis and interpretation are more complicated and not yet standardized6.
MinION is the smallest and cheapest NGS third-generation platform currently available, based on nanopore sequencing (NS)7. It can produce two different throughput ranges (1–2 Gb up to 40 Gb) depending on the flowcell used, and library preparation is easy and rapid compared with the currently adopted technologies8,9,10,11. These features make NS technology an excellent candidate for adoption in small labs, even if this approach still suffers from a high error rate, incompatible with clinical demands. On the other hand, this weakness has been considerably reduced thanks to the improvements in sequencing chemistry, the introduction of new base-calling algorithms, and the use of post-sequencing correction tools8,9,10,12,13. Therefore, in the CLL SHM analysis context, NS could offer a convenient alternative in terms of costs, throughput scalability, and ease of use.
Our report presents data on NS analysis for the SHM evaluation in CLL patients, as compared with the SS method. We developed a bioinformatic pipeline for sequence assembly, correction, clonality assessment, and mutational status analysis.
Results
SHM analysis by SS
All samples were evaluated by SS with the leader primers, as indicated by the ERIC recommendations. A second-round evaluation was performed with FR1 primers to ensure a more accurate comparison with the data generated by NS analysis. Final SHM analysis of the 36 samples produced the following results: 27 single (12 unmutated, 15 mutated), and 9 double VDJ recombination (7 productive/unproductive with concordant status, 2 double productive with concordant status). Sample #29 was borderline (V-REGION identity % = 97.80). For all cases it was possible to determine the mutational status, and no difference was detected between the two types of primers (Supplementary Table 2).
SHM analysis by MinION sequencing
A total of three runs was performed, employing three libraries of 12 patients, and three different flongle flowcells, with 74, 65, and 61 active pores, respectively. Each run lasts 24 h and produced respectively: 855.897, 860.106, and 1.251.284 reads. After base-calling, each run generated more than 0.4Gbases. Plotting the distribution of read lengths, we observed a prevalent peak around 0.5 Kb, in agreement with our amplicon size range. Basecalled and demultiplexed data were then analyzed on our NanoIg Pipeline (Fig. 1 and “Material and methods”). Each analysis lasts around 12 h on Intel® CORE™ i7-7700 K CPU @420 GHz 16.0 GB RAM depending on the total of reads.
Comparative evaluation
The SHM analysis results from both methods were compared. Comparative results are summarized in Fig. 2 (details are reported in Supplementary Table 3).
Comparing the final mutational status between SS and NS analysis for each sample (first row Fig. 2), among the 36 samples, 28 (78%) were concordant.
Eight samples (22%) showed a different final mutational status between the two methods. NS analysis on cases #4 and #18 did not produce a VDJ consensus even though clonal VH genes were correctly reported in the clonality plot (respectively V4-39/D2-2/J6 and V3-30/D5-18/J6). In addition, for NS analysis on case #36 the productive rearrangement was not detected (V1-69/D3-16/J3) in the clonality plot or generate a productive consensus.
On the other hand, NS analysis for samples #15, #19, #24, #33, and #34 showed additional VDJ recombinations that were not detected by SS methodology.
To establish the correct mutational state of this samples the additional VDJs rearrangements found, were validated by specific VH-PCR (Fig. 3 and “Discordant data validation” in Material and methods). For samples #15, #19, #24, and #34 the specific VH-PCR showed the actual existence of the additional recombinations not previously detected by SS method, whereas in case #33 the validation attempt was unsuccessful.
Moreover, two patients (#3 and #13) with concordant status showed differences in the number of clonal recombinations detected. In detail, sample #3 showed IGHV4-34 recombination, detected only by NS, whereas case #13 showed an additional IGHV4-34 recombinant that NS did not detect. In both cases, this did not affect the mutational status evaluation and therefore are not reported as discordant in Fig. 2.
In summary, considering the results of validations and the relative changes in the final mutational status, the percentage of successful analysis for both the SS and NS methods was 88.8% (Fig. 2 last two rows).
A side-by-side comparison of the percentage of VH genes similarity found in both methods was further performed, as reported in Fig. 4. Excluding IGHV3-23 of case #5, the difference for the other cases was always within 0.5%.
Discussion
NGS has brought in a new era in medical diagnostics, and IGHV analysis is following this trend. Nowadays, there is a growing demand for shifting the standard IGHV gene SHM analysis in favor of NGS. However, this change must take into account the financial status of laboratories with limited resources and the absence of technical aspects and data interpretation standardization. Regarding the latter point, the ERIC IG network and EuroClonality-NGS Working Group are currently working on this. Therefore, NGS will probably become the IGHV gene SHM assessment method of choice, but traditional SS remains the standard method today. The NS technology has already proven to be a valuable tool in analyzing often mutated genes in CLL9,10. In the IGHV analysis context, our report results demonstrate the non-inferiority of the NS approach to the standard, as in our series, the NS analysis provides equal or better information than SS. In 5 (13.8%) cases, NS revealed multiple productive IGHV subclones. The FR1 and leader primers consensus nature does not allow the best annealing, and this event can further be compromised when the SHM falls in the annealing region. To improve the PCR efficiency, we designed primer pairs annealing in intergenic regions, upstream of the J6 gene and downstream of the specific VH, that would not be involved in the somatic hypermutations events. In this way, we validated all five additional IGH rearrangements (Fig. 3).
Notably, some shows inconsistency between clonality and the consensus IMGT/V-Quest analysis result (Supplementary Figs. 1–3). The reasons for this inconsistency stem essentially from the different references used for IMGT/V-QUEST and our clonality analysis. The IMGT/V-QUEST reference directory is constituted by sets of sequences that contain the V-REGION, D-REGION, and J-REGION alleles, isolated from the functional and ORF allele IMGT reference sequences. Sequences are obtained from clones, subclones or PCR, and do not apply to sequences obtained directly from genomic DNA (http://www.imgt.org/vquest/refseqh.html#VQUEST). In our analysis, instead, we mapped reads on GRCh38/hg38 human genome reference. Moreover, the error-prone nature of reads from nanopore sequencing generates reads dispersion during alignment due to the similarity within the same VH family. Furthermore, consensus generation from NanoIG pipeline includes error correction and allow the correct VDJ rearrangement detection from IMGT/V-Quest analysis.
The circumstance of multiple productive subclones has already been reported in IGHV analysis by NGS6, and it could contribute to better define the CLL biological and prognostic heterogeneity.
NS analysis reported for #04, #18, #13 and #36 was missing an IGHV recombination compared to SS analysis. This loss for cases #04 and #18 occurred during consensus building because the clonality plot reported the same clonotypic variable genes found in the standard method (Supplementary Figs. 1 and 2). The principal reason for consensus-building failure was the various data filtering steps in the pipeline. Whereas in cases #13 and #36, lacking recombination was due to a failure in the amplification step (Supplementary Figs. 2 and 3).
. The use of FR1 primers to build the library for NS analysis represents a major critical point of our work. ERIC guidelines recommend using leader primers for the VDJ amplification and subsequent sequencing, but this primer set is known to have low amplification efficiencies14,15. Moreover, it is well-documented that the advantage of using leader primers is limited to the fact that this strategy could change the mutational status of a small fraction (estimated as 0.4%) of the CLL patients15. In fact, in our series of 36 CLL patients, the IGHV mutational status resulted the same when using leader and FR1 primer sets. In a previous report, FR1 primers were employed for NGS analysis because the leader would have generated a mean length of amplified products close to exceeding the expected read length range16. In the context of NS, the long-read sequencing analysis could have benefited from using the leader primers to increase amplicon length. Unfortunately, the leader primers were unsuccessful when used in multiplex to prepare the library for the NS analysis. Even if it is unlikely that the use of FR1 primers in the NS approach could have jeopardized the results in our set, a future clinical application of this method needs to be improved to include the Leader region in the sequencing.
Comparative analysis of the VH genes percentage of similarity revealed that the sequencing results were almost overlapping between the SS and NS, except in one case (case #5). Nevertheless, in the latter case, the IGHV mutational status was in agreement between the two methods. Therefore, VH genes comparative analysis further confirms the accuracy of NS data generation.
In conclusion, here we show that in IGHV molecular analysis the NS approach is reliable, is not inferior to SS and, enables the detection of small subclones. Our NanoIG pipeline proved to be a helpful tool for improving the sequencing data quality generated by NS. Due to its practicality and low cost (twelve samples were analyzed in 4–5 days and the estimated cost per sample was around 20 Euros), NS technology is potentially an ideal candidate to lead the evaluation of the IGHV mutational status towards a new standard of analysis.
Materials and methods
Patients
Thirty-six newly diagnosed CLL patients were included in this study. Genomic DNA was extracted from peripheral blood using the QIAamp DNA Blood Mini Kit (Qiagen); the DNA concentration and purity were checked using the Qubit 2.0 fluorometer (Life Technologies) and Nanodrop UV–Vis spectrometer (Thermo Fisher Scientific). The local ethics committee of “Azienda Ospedaliero Universitaria Policlinico Bari” (University of Bari) approved the study. Informed consent was obtained from all patients before study inclusion, in accordance with the Declaration of Helsinki. Patients' records/information were anonymized and de-identified before analysis.
IGHV analysis
According to ERIC recommendations2, the IGHV region was amplified using leader primers for clonality and SHM assessment. To better compare data and improve the polymerase-chain-reaction (PCR) performance in the subsequent NS library preparation, we decided to reevaluate all samples with FR1 consensus primers from the European BIOMED-2 collaborative study17.
SS analysis
For each case, two multiplex PCRs were performed using Platinum Taq DNA Polymerase (Invitrogen), with 50 ng of gDNA, labeled V-gene primers, and the J-gene primer17, in a final volume of 25 uL. The amplification results (0.5 μl) were prepared for fragment analysis by CE according to the Seqstudio genetic analyzer user guide. For the SHM assessment, the clonal rearrangement was then reamplified in a new PCR (under the same conditions), loaded on a 2% agarose gel, purified with the QIAquick PCR Purification Kit (Qiagen), and prepared for SS. The IMGT/V-QUEST18,19 and ARResT/AssignSubsets20 tools were used for the IGHV identity assessment and stereotypic subset identification.
NS analysis
Library preparation and sequencing
For each case, a multiplex PCR17 was performed. Because of a high amplification failure rate of leader primers in multiplex strategy (about 30% in our hands), we used the FR1 primers instead.
PCR products were purified with the QIAquick PCR Purification Kit (Qiagen), and the DNA concentration and purity were measured with a Qubit 2.0 fluorometer (Life Technologies) and Nanodrop UV–Vis spectrometer (Thermo Fisher Scientific). In accordance with the ONT Native barcoding genomic DNA protocol (EXP-NBD104), amplicons (350 ng) were end-prepared and barcoded with the ligation of ONT Native Barcodes (NB01–NB12). Equimolar amounts of each barcoded amplicon were then pulled (12 samples each pool). Following the Ligation Sequencing Kit (SQK-LSK109) protocol, 25 ng of the pooled amplicons were prepared for sequencing. After the Platform QC run and the priming of the Flongle flowcell, the sequencing mix was loaded, and a 24-h sequencing run protocol was started (MinIONflowcell: FLO-FLG001).
NanoIG pipeline
Fast5 files produced from sequencing were basecalled, filtered by quality, demultiplexed by MinKnow software, and the resulting fastq pass reads were analyzed on our NanoIG pipeline (see “Results”). The pipeline is available at https://github.com/eziominervini/NanoIg.
The NanoIG pipeline performs both clonality and SHM analysis (Fig. 1) on nanopore data. Reads are first mapped on chromosome 14, and the depth of sequence on immunoglobulin variable region genes is calculated. The clonotypic variable gene is determined from coverage values after filtering. In this study, we set the coverage threshold at 500X. Genomic coordinates of clonal VH genes were then used to select the reads for subsequent VDJ consensus building.
The VDJ consensus is built using canu assembler21 and medaka for correction (https://nanoporetech.github.io/medaka/index.html). Due to the MinION error rate, the first assembly performed by canu is set at a relatively low initial value for acceptable error rate, that is increased until the maximum value or until the assembly achievement. Before medaka correction, assemblies are filtered by supporting reads number.
Final consensus sequences are then used to interrogate IMGT/V-Quest, and results are downloaded. The report is produced by plotting clonality results and tabulating IMGT/V-Quest results of all samples.
Discordant data validation
Alternative PCRs were designed using the same J-gene primer as before17 and specific V-gene primers mapping in the intragenic region downstream of the V-gene to clarify discordant cases (Supplementary Table 1). The same PCR conditions were applied as previously described.
Data availability
This study sequencing data have been submitted to the National Center for Biotechnology Information (NCBI) Short Read Archive (https://www.ncbi.nlm.nih.gov/sra/)under PRJNA731004.
References
Agathangelidis, A. et al. Immunoglobulin gene sequence analysis in chronic lymphocytic leukemia: From patient material to sequence interpretation. J. Vis. Exp. 2018, 1–12. https://doi.org/10.3791/57787 (2018).
Rosenquist, R. et al. Immunoglobulin gene sequence analysis in chronic lymphocytic leukemia: Updated ERIC recommendations. Leukemia 31, 1477–1481. https://doi.org/10.1038/leu.2017.125 (2017).
Gupta, S. K., Viswanatha, D. S. & Patel, K. P. Evaluation of somatic hypermutation status in chronic lymphocytic leukemia (CLL) in the era of next generation sequencing. Front. Cell Dev. Biol. https://doi.org/10.3389/fcell.2020.00357 (2020).
Davi, F. et al. Immunoglobulin gene analysis in chronic lymphocytic leukemia in the era of next generation sequencing. Leukemia 34, 2545–2551. https://doi.org/10.1038/s41375-020-0923-9 (2020).
Scheijen, B. et al. Next-generation sequencing of immunoglobulin gene rearrangements for clonality assessment: A technical feasibility study by EuroClonality-NGS. Leukemia 33, 2227–2240. https://doi.org/10.1038/s41375-019-0508-7 (2019).
Stamatopoulos, B. et al. Targeted deep sequencing reveals clinically relevant subclonal IgHV rearrangements in chronic lymphocytic leukemia. Leukemia 31, 837–845. https://doi.org/10.1038/leu.2016.307 (2017).
Minervini, C. F. et al. Nanopore sequencing in blood diseases: A wide range of opportunities. Front. Genet. https://doi.org/10.3389/fgene.2020.00076 (2020).
Cumbo, C. et al. Nanopore targeted sequencing for rapid gene mutations detection in acute myeloid leukemia. Genes (Basel) 10, 1026 (2019).
Orsini, P. et al. Design and MinION testing of a nanopore targeted gene sequencing panel for chronic lymphocytic leukemia. Sci. Rep. 8, 11798. https://doi.org/10.1038/s41598-018-30330-y (2018).
Minervini, C. F. et al. TP53 gene mutation analysis in chronic lymphocytic leukemia by nanopore MinION sequencing. Diagn. Pathol. 11, 96. https://doi.org/10.1186/s13000-016-0550-y (2016).
Cumbo, C. et al. Nanopore sequencing sheds a light on the FLT3 gene mutations complexity in acute promyelocytic leukemia. Leuk. Lymphoma https://doi.org/10.1080/10428194.2020.1856838 (2020).
Cumbo, C. et al. Genomic BCR-ABL1 breakpoint characterization by a multi-strategy approach for “personalized monitoring” of residual disease in chronic myeloid leukemia patients. Oncotarget 9, 10978–10986. https://doi.org/10.18632/oncotarget.23971 (2018).
Minervini, C. F. et al. Mutational analysis in BCR-ABL1 positive leukemia by deep sequencing based on nanopore MinION technology. Exp. Mol. Pathol. 103, 33–37. https://doi.org/10.1016/j.yexmp.2017.06.007 (2017).
Ghia, P. et al. ERIC recommendations on IGHV gene mutational status analysis in chronic lymphocytic leukemia. Leukemia 21, 1–3. https://doi.org/10.1038/sj.leu.2404457 (2007).
Huet, S. et al. Impact of using leader primers for IGHV mutational status assessment in chronic lymphocytic leukemia. Leukemia 34, 2257–2259. https://doi.org/10.1038/s41375-020-0716-1 (2020).
McClure, R., Mai, M. & McClure, S. High-throughput sequencing using the ion torrent personal genome machine for clinical evaluation of somatic hypermutation status in chronic lymphocytic leukemia. J. Mol. Diagn. 17, 145–154. https://doi.org/10.1016/j.jmoldx.2014.11.006 (2015).
van Dongen, J. J. M. et al. Design and standardization of PCR primers and protocols for detection of clonal immunoglobulin and T-cell receptor gene recombinations in suspect lymphoproliferations: Report of the BIOMED-2 concerted action BMH4-CT98-3936. Leukemia 17, 2257–2317. https://doi.org/10.1038/sj.leu.2403202 (2003).
Brochet, X., Lefranc, M. P. & Giudicelli, V. IMGT/V-QUEST: The highly customized and integrated system for IG and TR standardized V-J and V-D-J sequence analysis. Nucleic Acids Res. https://doi.org/10.1093/nar/gkn316 (2008).
Giudicelli, V., Brochet, X. & Lefranc, M. P. IMGT/V-QUEST: IMGT standardized analysis of the immunoglobulin (IG) and T cell receptor (TR) nucleotide sequences. Cold Spring Harb. Protoc. 6, 695–715. https://doi.org/10.1101/pdb.prot5633 (2011).
Bystry, V. et al. ARResT/AssignSubsets: A novel application for robust subclassification of chronic lymphocytic leukemia based on B cell receptor IG stereotypy. Bioinformatics 31, btv456. https://doi.org/10.1093/bioinformatics/btv456 (2015).
Koren, S. et al. Canu: Scalable and accurate long-read assembly via adaptive κ-mer weighting and repeat separation. Genome Res. 27, 722–736. https://doi.org/10.1101/gr.215087.116 (2017).
Acknowledgements
This work was supported by “Associazione Italiana contro le Leucemie (AIL)-BARI”.
Author information
Authors and Affiliations
Contributions
C.F.M and F.A. conceived and designed the study and wrote the manuscript. C.C. performed the PCRs, Barcoding and nanopore experiments. C.F.M. and P.O. performed all bioinformatics analysis. G.T. performed conventional cytogenetic analysis for patients diagnosis. L.A., A.Z., N.C. and M.R.C. conducted diagnostic FISH. A.M., L.I., I.R. and E.P. performed diagnostic molecular analysis. A.G., G.S., P.M., and F.T. provided clinical data. F.A. supervised the manuscript preparation. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Minervini, C.F., Cumbo, C., Redavid, I. et al. Nanopore sequencing approach for immunoglobulin gene analysis in chronic lymphocytic leukemia. Sci Rep 11, 17668 (2021). https://doi.org/10.1038/s41598-021-97198-3
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41598-021-97198-3
Comments
By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.