Comparative assessment of NOIR-SS and ddPCR for ctDNA detection of EGFR L858R mutations in advanced L858R-positive lung adenocarcinomas

Genotyping the epidermal growth factor receptor (EGFR) gene is essential for selecting the most appropriate treatment for patients with lung adenocarcinoma. Liquid biopsy using circulating tumor DNA (ctDNA) potentially complements tumor tissue biopsy for identifying genotype-specific mutations in cancer cells. We assessed the performance of a high-fidelity sequencing method that uses molecular barcodes, the nonoverlapping integrated read sequencing system (NOIR-SS), for detecting EGFR L858R-mutated alleles in 33 patients with advanced or recurrent lung adenocarcinoma that was L858R mutation-positive on matched tissue biopsy. We compared NOIR-SS with site-specific droplet digital PCR (ddPCR), which was taken as the reference, in terms of sensitivity and ability to quantify L858R variant allele fractions (VAFs). NOIR-SS and ddPCR had sensitivities of 87.9% (29/33) and 78.8% (26/33) for detecting L858R alleles, respectively. The VAFs measured by the two assays were strongly correlated. Notably, one specimen was positive with a VAF of 30.12% by NOIR-SS but only marginally positive with a VAF of 0.05% by ddPCR because of a previously poorly recognized mechanism: two-base substitution-induced L858R (c.2573_2574delinsGA). These results indicate that NOIR-SS is a useful method for detecting ctDNA, potentially overcoming a limitation of ddPCR, which depends heavily on the ability of primers and probes to bind to specific target sequences.

EGFR L858R mutation testing in ctDNA using NOIR-SS.
cfDNA was successfully isolated from all plasma samples obtained before EGFR-TKI treatment initiation, with a median yield of 75 (range, 22.2-822.0) ng (Supplementary Table S1 online). After we confirmed a specificity of 100% using plasma samples from 12 healthy controls, as well as the ability to precisely quantify the positive control standards (Supplementary Fig. S1 online), NOIR-SS was applied for the detection of L858R-positive ctDNA alleles in patient samples. The average depth of coverage was 594,210 reads, and the average uniformity of coverage was 80.1%, indicating sufficient read depth and completeness of coverage. L858R mutations were detected in 29 [87.9%; 95% confidence interval (CI), 71. .6%] of the 33 cases. Table 2 shows the L858R variant allele fractions (VAFs) in the 33 cases. We next evaluated the association between L858R VAFs assessed by NOIR-SS and clinical parameters. Patients with bone metastasis (P = 0.043; Fig. 1a) or liver metastasis (P = 0.026; Fig. 1b) had significantly higher VAF levels than those without such metastases. Regarding the total number of metastatic organs, patients who had at least three metastatic organs tended to have higher VAFs than those with fewer than three metastatic organs (P = 0.059; Fig. 1c). There were no significant differences in the VAF levels according to the presence of metastasis at other sites. Moreover, there were no significant differences in the VAF levels based on clinical factors such as performance status, T factor of the TNM classification, or smoking status. Of the four patients in whom L858R was not detected by NOIR-SS, two had postoperative recurrent disease, all had a low T factor (T1 or T2), and only one patient had distant metastasis (in the brain), suggesting low tumor burden.
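The confidence intervals attached to the sensitivity estimates can be illustrated with a short calculation. The following is a minimal sketch that assumes an exact (Clopper-Pearson) binomial interval; the text does not name the interval method, and the function names `binom_cdf` and `clopper_pearson` are hypothetical helpers introduced only for this illustration.

```python
from math import comb


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_decreasing(f, iters=100):
    """Bisection on [0, 1] for a function that decreases from >0 to <0."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for k successes in n trials.

    Assumption: the exact method is used here for illustration only.
    """
    if k == 0:
        lower = 0.0
    else:
        # Lower bound solves P(X >= k | p) = alpha / 2.
        lower = _root_decreasing(
            lambda p: alpha / 2 - (1 - binom_cdf(k - 1, n, p))
        )
    if k == n:
        upper = 1.0
    else:
        # Upper bound solves P(X <= k | p) = alpha / 2.
        upper = _root_decreasing(lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper


if __name__ == "__main__":
    # Detected / total cases as reported for each assay.
    for label, detected in (("NOIR-SS", 29), ("ddPCR", 26)):
        lo, hi = clopper_pearson(detected, 33)
        print(f"{label}: {detected}/33, 95% CI {100 * lo:.1f}-{100 * hi:.1f}%")
```

Under this assumption, the interval computed for 26/33 agrees with the 61.1-91.0% reported for ddPCR, which suggests (but does not prove) that an exact interval was used.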

Comparison between NOIR-SS and ddPCR for L858R detection in ctDNA.
Along with NOIR-SS, ddPCR was applied to detect L858R mutations in all samples, because ddPCR is an established technique for sensitively genotyping ctDNA with absolute quantification 32. L858R-mutant alleles were detected in 26 (78.8%; 95% CI, 61.1-91.0%) of the 33 cases. There was no statistically significant difference in sensitivity between the two assays (P = 0.73). Table 2 shows the L858R VAFs identified by ddPCR. As shown in Fig. 2a, the VAFs measured by the two assays were strongly positively correlated (ρ = 0.90; 95% CI, 0.81-0.95; P < 0.0001). The Bland-Altman analysis revealed that the bias was -0.37 (standard deviation of bias, 7.4) and that the 95% limits of agreement were -14.9 to 14.2 (Fig. 2b). Notably, in patient No. 27 (Table 2), the L858R mutant allele was detected at a relatively high allele fraction (30.12%) by NOIR-SS but was only marginally positive (0.05%) by ddPCR (Fig. 3a [left]). In this case, NOIR-SS revealed that the tumor had an L858R mutation caused by a two-base substitution (Fig. 3b). Presumably, ddPCR detected the L858R allele poorly in this case because the mutant allele-specific MGB probe bound specifically to the c.2573T > G target sequence but could not bind efficiently to the two-base substitution-positive (c.2573_2574delinsGA) target DNA sequence (Fig. 3c); this interpretation was supported by the unique four droplet clusters below the threshold for the FAM (L858R) channel (Fig. 3a [left]). When this case was removed from the correlation analysis as an outlier (indicated by arrows in Fig. 2a and b), the correlation coefficient increased to 0.97 (95% CI, 0.94-0.99; P < 0.0001).
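The Bland-Altman quantities reported above follow from the paired differences between the two assays. The sketch below is illustrative only: it assumes the conventional limits of agreement (bias ± 1.96 × SD of the differences), the direction of the difference (first assay minus second) is an assumption, and the paired VAF values in the example are hypothetical rather than taken from Table 2.

```python
from statistics import mean, stdev


def limits_of_agreement(bias, sd, z=1.96):
    """Return the (lower, upper) 95% limits of agreement."""
    return bias - z * sd, bias + z * sd


def bland_altman(x, y):
    """Return bias, SD of differences, and limits of agreement for paired data."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, sd, limits_of_agreement(bias, sd)


if __name__ == "__main__":
    # Limits implied by the published (rounded) bias and SD.
    low, high = limits_of_agreement(-0.37, 7.4)
    print(f"from summary statistics: {low:.2f} to {high:.2f}")

    # Hypothetical paired %VAF values, for demonstration only.
    noir_ss = [0.5, 2.0, 10.0, 30.0]
    ddpcr = [0.4, 2.5, 9.0, 28.0]
    print(bland_altman(noir_ss, ddpcr))
```

With the rounded summary values, the limits come out at about -14.9 and 14.1, close to the reported -14.9 to 14.2; the small gap in the upper limit presumably reflects rounding of the published bias and SD.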

Discussion
The present study assessed the detection sensitivity of L858R ctDNA using NOIR-SS in patients with advanced lung adenocarcinoma whose tumors had been genotyped as L858R-positive while using orthogonal ddPCR as a reference. Using matched tissue biopsies as the reference standard, we showed that the sensitivity of NOIR-SS was comparable with that of site-specific ddPCR and that VAFs measured by both assays were highly correlated. Furthermore, we reported a unique case that showed discordant results between the two assays. This case was positive for NOIR-SS with relatively high VAF but only marginally positive for ddPCR because of a mechanism that was previously not well characterized: two-base substitution-induced L858R. NOIR-SS enables the accurate molecular numbering and detection of rare cancer variants via a novel target sequencing method that uses adaptor ligation to add barcode sequences using linear amplification; this process eliminates errors introduced during the early cycles of PCR, and monitors and removes erroneous barcode tags using a bioinformatic variant filter called CV78 filtering 30,31 . The main innovation of the NOIR-SS assay is a procedure to remove artifactual errors on the barcode itself. Incorporation of an error on the barcode sequence generates another erroneous molecular identifier and leads to the overestimation of molecular count of mutant DNA or artifactual substitution by DNA damage. Removal of such erroneous molecular barcode contributes to the precise estimation of variant DNA molecules and to a highly specific mutation detection by strict noise reduction as shown in Supplementary Fig. S1. Additionally, the NOIR-SS EGFR targeting panel designed in this study was optimized to cover the entire region of EGFR tyrosine kinase domain, which resulted in the high coverage uniformity and deep sequencing with sufficient depth. 
Through the combined effect of erroneous molecular barcode removal and a focused, optimized EGFR panel, the NOIR-SS assay successfully detected L858R ctDNA with a good signal/noise ratio, at a minimum VAF of 0.06%, in 29 (87.9%) of the 33 cases in an EGFR-TKI-naïve setting. By comparison, ddPCR detected L858R with a minimum VAF of 0.04% in 26 (78.8%) of the 33 cases. These tissue-liquid concordance results were in line with those previously reported both for NGS (76.5-100%) 22,33,34 and ddPCR (80-94%) 35-37 in patients with advanced EGFR-mutated lung adenocarcinoma. To our knowledge, our study is the first to assess the detection performance of ctDNA using both NOIR-SS and ddPCR in patients with cancer, with our results indicating a strong correlation of VAFs between the two assays.
In the present study, the high tumor burden represented by the high number of metastatic organs tended to be associated with a higher VAF than that resulting from a low tumor burden. Specifically, bone and liver metastases were associated with a higher VAF. These findings are supported by previous reports showing the association of the positivity of ctDNA with a high tumor burden 38 , liver metastasis 39 , and bone metastasis 40 . The relatively low tumor burden in the four patients in our study in whom both NOIR-SS and ddPCR failed to detect ctDNA also highlights the importance of the tumor burden and metastatic sites for ctDNA detection. However, the precise mechanisms underlying the presence and amount of ctDNA remain largely unclear. Future studies should investigate the associations of ctDNA levels with factors including cancer types, histology, genotypes, neovascularization, phenotypes such as epithelial-mesenchymal transitions, and anatomy and blood flow at tumor sites.
Of interest, in one case in our study, the L858R mutant allele was detected at a relatively high allele fraction (30.12%) by NOIR-SS but at a considerably lower fraction (0.05%) by ddPCR. NOIR-SS revealed that the tumor had an L858R mutation caused by a two-base substitution (c.2573_2574delinsGA). In the COSMIC database (v92; http://cancer.sanger.ac.uk), the two-base substitutions c.2573_2574delinsGA and c.2573_2574delinsGT were reported in one and eight samples, respectively. In this study, NOIR-SS could detect the c.2573_2574delinsGA mutation because the inclusion criterion for mutation calling was the presence of at least one catalogue entry for EGFR in the COSMIC database. Nonetheless, we cannot exclude the possibility that rare mutations not catalogued in the COSMIC database were still missed from the mutation call. A similar L858R allele dropout in SNaPshot genotyping, caused by the allele substitutions c.2571G > A and c.2573T > G in cis, which disrupted primer binding and the single-base extension reaction, has been reported 41. These pitfalls of PCR-based genotyping highlight the advantage of NGS-based methodologies even for detecting single missense mutations.
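The probe-mismatch explanation can be made concrete by comparing the ddPCR probe sequences given in the Methods with the targets they would encounter. The sketch below is a simplified illustration under stated assumptions: the probes are taken to be written 5′ to 3′ on the antisense strand (inferred from the fact that their reverse complements contain the codons CTG and CGG), the position of c.2573 is taken to be the single base at which the two probes differ, and a mismatch count is used only as a crude proxy for binding, not as a model of MGB probe hybridization. All function and variable names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Partial codon table covering only the codons relevant to this example.
CODON_TO_AA = {"CTG": "Leu", "CGG": "Arg", "CGA": "Arg", "CGT": "Arg"}

# Probe sequences as reported in the Methods (spaces removed).
WT_PROBE = "AGTTTGGCCAGCCCAA"
MUT_PROBE = "AGTTTGGCCCGCCCAA"


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# Sense-strand targets perfectly matched by each probe (assumed orientation).
wt_target = revcomp(WT_PROBE)
mut_target = revcomp(MUT_PROBE)

# c.2573 is assumed to be the one position where the two targets differ.
pos_2573 = next(i for i, (w, m) in enumerate(zip(wt_target, mut_target)) if w != m)

# c.2573_2574delinsGA: replace the two bases starting at c.2573 with "GA".
delins_target = wt_target[:pos_2573] + "GA" + wt_target[pos_2573 + 2:]


def codon_858(target):
    """Return codon 858 (c.2572_2574) from a sense-strand target."""
    return target[pos_2573 - 1:pos_2573 + 2]


if __name__ == "__main__":
    for name, target in (
        ("wild type", wt_target),
        ("c.2573T>G", mut_target),
        ("c.2573_2574delinsGA", delins_target),
    ):
        codon = codon_858(target)
        print(
            f"{name}: codon {codon} ({CODON_TO_AA[codon]}), "
            f"mismatches vs mutant probe = {mismatches(target, mut_target)}, "
            f"vs wild-type probe = {mismatches(target, wt_target)}"
        )
```

In this simplified view, the two-base substitution still encodes arginine but leaves one mismatch against the mutant probe and two against the wild-type probe, which is consistent with the weak FAM signal described for this case.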
In conclusion, our study demonstrated the benefit of NOIR-SS in the detection of L858R ctDNA in patients with advanced lung adenocarcinoma in terms of its high sensitivity, which was comparable with that of site-specific ddPCR, and its robust sequencing even of two-base substitution-induced L858R, which is difficult to detect by ddPCR. Our findings provide further evidence indicating that the highly accurate and quantitative NOIR-SS is useful in clinical settings, in addition to the established advantages of NGS methodologies such as the concurrent interrogation of the entire EGFR tyrosine kinase domain and a number of other cancer-relevant genes. Although we are currently continuing to evaluate the clinical significance of L858R ctDNA kinetics measured

Sample preparation, quality assessment, and NOIR-SS assay design.
Preparation of plasma was performed as described previously 31. Plasma was separated from 10 mL of blood, and cfDNA was extracted from 4 mL of plasma using a QIAamp circulating nucleic acid kit (QIAGEN, Valencia, CA, USA). cfDNA was concentrated using Amicon ultra-0.5 centrifugal filters. Double-stranded DNA was quantified using the Qubit dsDNA HS Assay (Thermo Fisher Scientific, Waltham, MA, USA) on the Qubit 2.0 Fluorometer (Thermo Fisher Scientific). Concentrated cfDNA corresponding to 2 mL of plasma was used for each of the NOIR-SS and ddPCR assays. A molecular barcoded next-generation sequencing library was constructed by the NOIR-SS method as described previously 30

Library construction for ctDNA mutation analysis using NOIR-SS molecular barcoding method.
Library construction for NOIR-SS was performed according to a previously described procedure 31.
For each plasma sample, we prepared two separate reaction mixtures with two discrete gene-specific primer cocktails (forward_nest/reverse_nest cocktails in Supplementary Table S3 online), since our PCR system did not allow the use of primer pairs, to avoid undesirable amplification between forward and reverse gene-specific primers. The sequencing library was amplified by anchored PCR using a primer pair consisting of a single anchored gene-specific primer and a universal primer on the sequencing adapter. cfDNA obtained from plasma was end-repaired in a 15 μL reaction containing 50 mM Tris-HCl, pH 8.

Sequencing and data analysis.
The constructed library was quantified using the Qubit dsDNA HS Assay Kit or the Quant-iT PicoGreen dsDNA Assay Kit (Thermo Fisher Scientific) and was loaded onto an Ion 540 chip using the Ion Chef System (Thermo Fisher Scientific). Sequencing was performed on the Ion Torrent S5 XL platform. Data analysis was performed according to the procedure described previously 30.

Variant call by molecular barcoding analysis.
To detect the variants, we applied anomaly detection based on a Poisson distribution model for the sequencing error, as previously described 30. When the number of base alterations in a target region is significantly higher than the average expected from the sequencing error, we may attribute the changes to variant(s). We can use a statistical model to calculate the probability that a specific number of sequencing errors will occur. The average number of base changes due to sequencing errors, λ, is as follows:

$$\lambda = l \times m \times ER$$

where l, m and ER are the number of base pairs in a target region, the number of sequenced molecules and the sequencing error rate, respectively. With the application of a Poisson distribution model for the sequencing error, the probability of n or more sequencing errors is estimated as

$$P(X \ge n) = 1 - \sum_{k=0}^{n-1} \frac{\lambda^{k} e^{-\lambda}}{k!}$$
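The anomaly-detection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that λ is the product of l, m and ER, as implied by the definitions in the text, and that a variant is called when the Poisson upper-tail probability falls below the detection threshold. The function names and the example molecule count are hypothetical; the error rate of 10^-5 and the threshold of P = 0.01 are the values stated in the text.

```python
from math import exp


def expected_errors(l, m, er):
    """Mean number of error-derived base changes (assumed form: l * m * ER)."""
    return l * m * er


def prob_at_least(n, lam):
    """Return P(X >= n) for X ~ Poisson(lam)."""
    if n <= 0:
        return 1.0
    term = exp(-lam)  # P(X = 0)
    cdf = term
    for k in range(1, n):
        term *= lam / k  # P(X = k) from P(X = k - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def min_variant_count(lam, threshold=0.01):
    """Smallest count n whose tail probability P(X >= n) is below threshold."""
    n = 1
    while prob_at_least(n, lam) >= threshold:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical example: one hotspot base, 10,000 sequenced molecules.
    lam = expected_errors(l=1, m=10_000, er=1e-5)
    n = min_variant_count(lam, threshold=0.01)
    print(f"lambda = {lam:.3f}; call a variant at >= {n} altered molecules")
    print(f"P(X >= {n}) = {prob_at_least(n, lam):.5f}")
```

In this hypothetical setting, a single altered molecule cannot be distinguished from sequencing error at the 0.01 threshold, whereas two or more can.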
The sequencing error rate was set to $10^{-5}$, corresponding to an expected one alteration at a single nucleotide site caused by sequencing error per 100,000 analyzed DNA molecules. In this study, we evaluated each target region for the presence of variants, setting P = 0.01 as the threshold of detection. For hotspot mutations with variant-positive molecules, we evaluated each base position in the hotspot mutation at the specified threshold value estimated from anomaly detection 42.

For quality control, the fragmented (170 bp) EGFR L858R Reference Standard genomic DNA (HD254; Horizon, Cambridge, UK) was mixed with cfDNA from a healthy control to produce 0%, 0.1%, and 1.0% control standards. As shown in Supplementary Fig. S1 online, the NOIR-SS assay was performed using these standards as well as negative controls from 12 healthy individuals.

ddPCR for detecting and quantifying ctDNA.
ddPCR was performed using the QX200 ddPCR system (Bio-Rad Laboratories, Hercules, CA, USA) according to the manufacturer's instructions. QuantaSoft software (Bio-Rad Laboratories) was used for data analysis. This software calculates the number of DNA molecules in the starting sample for the assay by modeling droplet occupancy as a Poisson distribution. ddPCR assays were performed to detect the EGFR L858R mutation as described by Zhu et al. 43. The allele-specific minor groove binder (MGB) probes were labeled with either VIC or FAM at the 5′ end and a nonfluorescent quencher (NFQ) at the 3′ end. The following probe sequences were used for the EGFR L858R assay: 5′-VIC-AGT TTG GCC AGC CCA A-MGB-NFQ-3′ for wild-type DNA and 5′-FAM-AGT TTG GCC CGC CCA A-MGB-NFQ-3′ for mutant DNA detection. Variant detection was classified as negative when only one positive droplet was identified in the FAM-positive area, because false-positive droplets with intermediate or strong intensity are frequently observed in negative control samples.
Given that even negative control samples showed noise signals at an amplitude of around 1000, the signal intensity threshold for judging a droplet as mutation-positive was set at an amplitude of 2000. In the QX200 ddPCR system, a maximum of 20,000 droplets were analyzed in one assay. The cutoff of two or more mutation-positive droplets corresponds to a theoretical limit of detection of 0.01% VAF (2 positives in 20,000 total droplets). The ddPCR assays in this study included DNA-negative vacant droplets, and such vacant droplets were not used for the %VAF estimation; therefore, the actual limit of detection of the ddPCR assay is higher than the theoretical estimate.
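The arithmetic behind the theoretical limit of detection, and the reason vacant droplets raise it, can be sketched briefly. The code below is an illustration under stated assumptions: the Poisson correction (mean copies per droplet = -ln(1 - positive fraction)) is the standard approach for droplet digital PCR, but the exact calculation performed by the analysis software is not described in the text; the function names and the example count of DNA-containing droplets are hypothetical.

```python
from math import log


def copies_per_droplet(positive, total):
    """Poisson-corrected mean number of target copies per droplet."""
    return -log(1 - positive / total)


def vaf_percent(mut_positive, wt_positive, total):
    """%VAF from mutant- and wild-type-positive droplet counts (assumed model)."""
    lam_mut = copies_per_droplet(mut_positive, total)
    lam_wt = copies_per_droplet(wt_positive, total)
    return 100 * lam_mut / (lam_mut + lam_wt)


def lod_percent(min_positive, informative_droplets):
    """Limit of detection as a percentage of the droplets that count."""
    return 100 * min_positive / informative_droplets


if __name__ == "__main__":
    # Theoretical limit stated in the text: 2 positives in 20,000 droplets.
    print(f"theoretical LOD: {lod_percent(2, 20_000):.3f}%")

    # Hypothetical case: only 5,000 of the droplets contain DNA.
    print(f"LOD with 5,000 DNA-containing droplets: {lod_percent(2, 5_000):.3f}%")

    # Hypothetical droplet counts for a low-VAF sample.
    print(f"example %VAF: {vaf_percent(3, 4_000, 20_000):.3f}%")
```

Because the two-droplet cutoff is fixed, the achievable limit of detection scales inversely with the number of DNA-containing droplets, which is why the practical limit exceeds the 0.01% theoretical value.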
The control standards (0%, 0.1%, and 1.0%) that were prepared for the NOIR-SS quality control were also used for ddPCR quality control (Supplementary Fig. S2 online).

Statistical analysis.
Discrete variables were expressed as totals (percentages), and continuous variables were expressed as median (range). The Mann-Whitney U test was used to compare continuous variables. The Fisher exact test was used for categorical variables. Correlations were analyzed using Spearman's rank correlation coefficient. %VAFs measured by NOIR-SS and ddPCR were compared by Bland-Altman analysis. Statistical analyses were performed using GraphPad Prism Version 8.4.3 (GraphPad Software, San Diego, CA, USA). All analyses were two-tailed, and P values of < 0.05 were considered statistically significant.