Lung cancer is the most common malignancy and the leading cause of cancer related deaths worldwide (18.4% of total cancer deaths), with non-small cell lung cancer (NSCLC) being the most common subtype, accounting for approximately 85% of all diagnosed cases1,2. The majority of NSCLC patients display advanced disease when diagnosed and thus have poor prognosis2,3. It is well established that acquired genetic alterations in certain driver genes result in tumour growth and invasiveness, and that patients harboring certain mutations may benefit from targeted therapies4,5. Indeed, a randomized clinical trial reported that advanced NSCLC patients harboring activating mutations in EGFR, one of the major driver genes of NSCLC, exhibited longer progression-free period when treated with a tyrosine kinase inhibitor (TKI), gefitinib, compared to those treated with standard platinum based chemotherapy6. However, those who were treated with TKI drugs can acquire secondary resistant mutations, in which case a new treatment regimen is needed to maintain therapeutic effect7,8. In addition to EGFR, NSCLC patients carrying ALK or ROS1 rearrangement were shown to respond well to a different TKI drug, crizotinib, while BRAF mutated NSCLC patients can be treated with a combination of BRAF inhibitors, dabrafenib and trametinib9,10,11,12. These findings suggest that the identification of mutation profiles of NSCLC is critical in order to prescribe suitable TKI therapy as well as elucidate the molecular basis of drug resistance to provide timely treatment adjustment.

Since 2018, the American Society of Clinical Oncology (ASCO) has recommended routine mutation testing for driver genes including EGFR, ALK, ROS1 and BRAF in clinical practice for NSCLC patients13. Although there are currently no targeted drugs for KRAS or NRAS mutated NSCLCs, mutation testing for these genes has also been recommended due to their proven impact on clinical outcomes of NSCLC patients14,15. Hence, simultaneous mutation profiling of these six driver genes has been recommended in current clinical practice13,16,17. Currently, information in publicly available database such as The Cancer Genome Atlas (TCGA) was obtained mostly from prospective studies in Caucasians and European cohorts. Therefore, the impact of heterogeneous genetic constitution of NSCLC patients across different populations might be underestimated17.

Vietnam displays higher incidence rate of lung cancer than the average worldwide incidence (14.5% versus 11% of all new cases) despite having the same level of mortality rate according to the latest Globocan data18,19. Also, in Vietnam lung cancer is the most aggressive type of cancer in men and ranks as the second leading cause of cancer deaths in women18,19. Therefore, it is important to assess the mutation landscape as well as their association with unique clinicopathological features of Vietnamese NSCLC patients. In the present study, we employed massively parallel sequencing to detect genetic changes in six major driver genes including EGFR, KRAS, NRAS, BRAF, ALK and ROS1 in tumour tissues from 350 NSCLC patients in Vietnam. Interestingly, we found that the Vietnamese cohort had a significantly higher frequency of KRAS mutations as compared to Caucasians and East Asian cohorts. We further identified significant associations between the prevalence of these mutations with patients’ clinical features in the Vietnamese cohort. Our data provide valuable information for guiding targeted therapy and drug development for Vietnamese NSCLC patients.


Clinical characteristics of Vietnamese NSCLC patients

The cohort in this study comprised of 350 patients diagnosed with NSCLC by clinical histology from 4 hospitals in Vietnam, with higher percentage of male compared to female (60.6% versus 37.4%, p < 0.05) and the median age of 61 years, ranging from 24 to 89 years (Table 1). The majority of patients (289 cases, 82.6%) were classified into advanced stages (III-IV), while 12 patients (3.4%) were in early stages (I-II) and 49 cases (14%) missing information on clinical stages (Table 1). Adenocarcinoma (AC) was the most common NSCLC subtype (241 cases, 68.9%) while squamous carcinoma (SCC) was confirmed in 25 patients (7.1%). Additionally, 84 cases (24%) were of either unknown or uncharacterized subtypes (Table 1). Among 350 patients, 185 cases (52.8%) in this cohort had smoking status recorded including 54 smokers (15.4%) and 131 non-smokers (37.4%) while 165 cases (47.1%) had missing smoking status (Table 1). Treatment details were obtained for 306 patients (87.4%), including 285 patients (81.4%) who had never taken any treatment at the time of diagnosis and 21 patients (6%) experienced either tumour resection (5 cases, 1.4%), chemotherapy combined with radiotherapy (10 cases, 2.9%) or TKI therapy (6 cases, 1.7% for each treatment) (Table 1).

Table 1 Major clinicopathological features of 350 Vietnamese NSCLC patients N: number of cases; AC: adenocarcinoma; SCC: squamous cell carcinoma.

Mutation profiles of driver genes in Vietnamese NSCLC patients

In the present study, we developed a targeted capture sequencing assay to analyse genetic alterations in formalin-fixed paraffin-embedded (FFPE) tissue biopsy specimens of NSCLC patients. We first validated our assay by comparing its performance with a commercial droplet digital PCR (ddPCR) assay (Bio-rad) for detecting three major EGFR mutations (L858R, del19 and T790M) in 40 tissue samples randomly selected from our cohort. When ddPCR results were used as reference standard, our targeted capture sequencing assay exhibited sensitivity of 81.8% (11/13), specificity of 100% (27/27) and concordance rate of 95% (38/40) (Table 2, Table S2). The two cases (LBL021 and L10021) that were positive for del19 mutation by ddPCR but missed by our assay had relatively low variant allele frequency (VAF) of 0.5% and 3.9%, respectively, below the limit of detection of our assay (4%) (Table S2). Hence, these results confirmed that our targeted capture sequencing assay achieved precise identification of mutations with VAF >4% in FFPE tissue samples. Therefore, this assay was subsequently used to identify genetic alterations in six major driver genes of NSCLC including KRAS, EGFR, NRAS, BRAF, ALK and ROS1 for the cohort of 350 NSCLC patients.

Table 2 Comparison of EGFR mutation detection between our targeted capture sequencing assay and a commercial ddPCR assay.

Among 350 patients successfully sequenced, 232 (66.3%) cases were found to carry at least one clinically relevant genetic alteration (according to ClinVar) in the tested driver genes while the remaining 118 cases (33.7%) were negative for these mutations (Fig. 1A). EGFR (32.3%) and KRAS (20%) were the most frequently mutated driver genes, followed by ALK (5.4%), ROS1 (2.9%), BRAF (1.1%) and NRAS (0.6%) (Fig. 1A). Although mutations in driver genes such as EGFR, KRAS and ALK were reported to be mutually exclusive in majority of NSCLC patients20, we detected 14 cases (4%) carrying mutations in more than one driver genes (Fig. 1B). Of those, the co-occurrence of mutations in EGFR and KRAS was the most common (6 cases), including one case carrying concurrent mutations in 3 driver genes (EGFR, KRAS and BRAF). EGFR mutation was also found in 3 cases with ALK rearrangement and 2 cases with BRAF mutation. In addition, KRAS mutations were also detected in patients carrying BRAF (2 cases), ALK (1 case) and ROS1 (1 case) mutations (Fig. 1B).

Figure 1
figure 1

The mutation composition in six major driver genes in 350 Vietnamese NSCLC patients. (A) Prevalence of mutations in 6 major driver genes determined by targeted capture sequencing. Cases with mutations occurring in more than one driver gene were counted as “co-mutation”. (B) Mutation frequency in cases harbouring co-mutations.

We next explored the distribution of mutation sites and subtypes across the six driver genes. EGFR mutations were predominantly detected in exon 19 and exon 21, with activating mutations del19 (55 cases) and L858R (48 cases) accounting for the majority of EGFR mutations (83.1% of all EGFR mutation events) (Fig. 2A). In contrast, mutations reported to be associated with resistance to TKI therapy, including T790M and ins20 in exon 20, were detected in 16 patients (12.9% of total patients, 8 cases for each mutation type) (Fig. 2A). All ins20 mutations were later confirmed by a real time-PCR based commercial assay Cobas® EGFR Mutation Test (Roche) (Data not shown). Additionally, other rare mutations with prevalence of less than 1% were also detected in exon 3 (R108K) and exon 18 (G719A, G719C, G719D and E709K) (Fig. 2A). The most common KRAS mutations occurred in exon 2 at G12 and G13 residues, accounting for 62/79 cases (78.4% of total KRAS mutation cases, Fig. 2B). Mutations were also detected in exon 3, residues Q61 (11 cases, 13.9%) and exon 4, residue K117 (11 cases, 13.9%, Fig. 2B). There were 8 cases carrying more than one KRAS mutation subtype and such concurrent mutations tend to arise at the same residue, with 6 out of 8 cases carrying co-mutations at the same residue (1 case at G12, 4 cases at G13 and 1 case at Q61) (Fig. 2B). Mutations in NRAS were detected in exon 3, residue Q61 (Fig. 2C); and mutations in BRAF were detected in exon 4, 11 and 12 (2 cases each) and exon 15, residue V600 (3 cases) (Fig. 2D).

Figure 2
figure 2

Distribution of mutation subtypes across 6 driver genes in Vietnamese NSCLC. (A–F) The mutation frequencies in particular subtypes of EGFR (A), KRAS (B), BRAF (C), NRAS (D), ALK (E) and ROS1 (F) genes were calculated as percentage of mutant cases in the total number of cases carrying mutations in the corresponding driver gene.

Our assay was able to detect genetic rearrangements in ALK and ROS1. ALK-EML4 translocation was identified as the most common mutation subtype (22/23 cases, 95.6%, Fig. 2E). All ROS1 mutations (11 cases) identified in this cohort were the results of fusion with the following genes: LIRG3 (1 case), SLC34A2 (3 cases), SDC4 (2 cases), CD74 (4 cases) and ERZ (1 case) (Fig. 2F).

Collectively, our data indicated that mutations in EGFR, KRAS are the two most common events occurring in more than half of all tested NSCLC patients in Vietnam, followed by less frequent mutations in other tested genes.

Comparison of mutation profiles among three NSCLC cohorts

To put the mutation profiles found in the Vietnamese cohort into global context, we compared the prevalence of mutations in Vietnamese NSCLC patients with those found in two other cohorts including the Caucasian MSK-IMPACT cohort established by the Memorial Sloan Kettering (MSK) Cancer Center21,22 and East Asian (China) cohort retrieved from a study by Wang et al.23 (Table 3). Since the majority of patients in the Vietnamese cohort were diagnosed with AC subtype in advanced stages (III-IV), we selectively retrieved data from patients with comparable histology and tumour stage from the other two cohorts. Given the comparable histology and stage, the Vietnamese cohort had fewer female patients than the other two cohorts. Additionally, patients in this cohort were slightly younger than the MSK-IMPACT cohort (median age: 61 versus 64, p < 0.001, Table 3) but older than the East Asian cohort (median age: 61 versus 58, p < 0.001, Table 3).

Table 3 Comparison of mutation frequency among the three cohorts.

The prevalence of mutations in EGFR was significantly higher in the Vietnamese cohort compared to the MSK-IMPACT cohort (37.7% versus 29.1%, p < 0.05, Table 3 and Fig. 3) but markedly lower than the East Asian cohort (37.7% versus 73.4%, p < 0.00001, Table 3 and Fig. 3). Interestingly, the prevalence of KRAS mutations of the Vietnamese cohort was comparable to the MSK-IMPACT cohort while it was significantly higher than that of East Asians (21.4% versus 9.1%, p < 0.0001, Table 3, Fig. 3). Apart from KRAS and EGFR mutations, mutation frequencies of the remaining tested genes showed no significant differences between the Vietnamese cohort and the other two cohorts (Table 3, Fig. 3)

Figure 3
figure 3

Comparison of driver gene mutation frequencies between Vietnamese NSCLC cohort with Caucasians and East Asians. Mutation frequency of each driver gene in the Vietnamese cohort was calculated among 220 patients with adenocarcinoma (AC) in late stages (III-IV) taking into account cases with co-mutation. For the Caucasian cohort, data were obtained from the MSK-IMPACT cohort (764 lung cancer cases with AC subtypes in metastatic stages (III-IV), Asian patients were excluded). For East Asia cohort, data we retrieved from a recent report profiling a similar panel of driver mutations in a Chinese cohort of 361 patients with AC in late stages (III-IV). NT: mutations were not tested.

In summary, the cohort of Vietnamese NSCLC patients showed specific characteristics that set it apart from the two other cohorts. It had higher prevalence of EGFR mutations than the Caucasian MSK-IMPACT cohort but lower than the East Asian cohort while its KRAS mutation prevalence is higher than the East Asian cohort.

Correlation between mutation prevalence and clinicopathological features of Vietnamese NSCLC patients

Previous studies have reported significant association between prevalence of driver mutations and patients’ clinicopathological features24,25,26,27,28,29. However, the results are often inconsistent across different studies. Here, we examined such associations in the Vietnamese NSCLC cohort.


Gender status was available in 343 patients including 131 female and 212 male patients (1:1.6 ratio); the 7 patients with unknown sex were excluded from gender association analysis. We performed Chi-squared (χ²) test to investigate the association between patients’ gender and mutation prevalence. Consistent with previous studies, we found that EGFR mutations were more commonly detected in Vietnamese female patients than in male patients (48.1% versus 26.9%, p < 0.00001, Table 4). Conversely, KRAS mutation frequency was significantly higher in male than that in female patients (30.7% versus 9.2%, p < 0.0001, Table 4). Other driver genes including NRAS, BRAF, ALK and ROS1 did not show any significant correlation

Table 4 Association between clinical factors and mutation frequencies of NSCLC driver genes.


Patient age was available in 344 patients, ranging from to 24 to 89 years old. When the median age (61 years) was used as a cutoff value, KRAS mutations were more frequently detected in elderly patients aged over 61 years than younger individuals (25.3% versus 20.1%, p < 0.05, Table 4). In contrast, the younger group showed higher prevalence of ALK (9.8% versus 3.5%, p < 0.05) and ROS1 (5.2% versus 1.2%, p < 0.05) mutations (Table 4). There was no significant correlation between patient age and mutation prevalence of other driver genes including EGFR, NRAS and BRAF (Table 4).

Smoking status

Of the 185 cases with smoking status, we could detect a statistically significant correlation between EGFR and KRAS mutation prevalence and smoking status (Table 4). Our data indicated that non-smokers showed significantly higher frequency of EGFR mutation (51.9% versus 29.6%, p < 0.01) but lower frequency of KRAS mutation (6.9% versus 29.6%, p < 0.001) than smokers.


Adenocarcinoma (AC) were diagnosed in 241 patients, accounting for the most common histological type (81%), while squamous cell carcinoma (SCC) were identified in 25 cases (8.4%). We detected a significant association between histology type and EGFR mutation with higher prevalence in AC group than SCC group (35.7% versus 16%, p < 0.05) (Table 4).


NSCLC is the most common type of lung cancer with high rates of acquired somatic mutations2. Comprehensive profiling of clinically relevant mutations is of great importance in clinical practice for designing optimal targeted therapy as well as understanding drug resistance mechanisms30,31. Given the diversity of mutation constitution in different populations, the primary objective of our study is to examine the mutation profiles of major druggable driver genes in Vietnamese NSCLC patients. To this end, we performed targeted capture sequencing on 350 tumour tissue samples from Vietnamese patients with NSCLC and analyzed their genomic alterations in the six most common driver genes recommended by the ASCO and National Comprehensive Cancer Network for mutation testing in NSCLC patients17.

Our data showed that 66.3% of patients in the Vietnamese cohort harbour at least one alteration in the six tested driver genes. Our findings were consistent with our previous study using the same panel of genes and reporting a similar mutation rate of 63.6% in a smaller cohort of 59 Vietnamese NSCLC patients32. The mutation profiles of Vietnamese NSCLC patients also exhibited certain common features of NSCLC patients previously reported33,34. Firstly, mutations in EGFR and KRAS are the most common, accounted for more than 50% of total cases, while mutations in ALK, BRAF, NRAS and ROS1 were rarer. This trend was also reported in a Chinese cohort by Zhuang et al.33 or in Caucasian populations by Campbell et al.34. Secondly, the mutation sites within the driver genes identified in this cohort were similar to those reported in other populations, including these most common mutation sites: EGFR exon 19 deletion (del19) and exon 21 (L858R)35,36, KRAS exon 2 (G12C)37, BRAF V600E38 and ALK-EML4 fusion39. The frequencies of ALK (5.4%) and ROS1 (2.9%) mutations determined in our study (Fig. 1) are comparable to previously published studies reporting the frequencies of 5.05 and 1%-2% for ALK and ROS1, respectively40,41. EGFR del19, EGFR L858R, BRAF V600E, ALK-EML4 and ROS1 fusion mutations in combined (139 cases) accounted for 39.7% of cases in the Vietnamese cohort. They were known as activating mutations and clinically proven to be sensitive to treatments with available TKI drugs5,6,11,39. Thus, our findings suggested that approximately 40% of Vietnamese patients would carry such mutations and therefore would benefit from available targeted drugs. Among 6 patients currently known to be on TKI therapy (Table 1), 5 cases carrying activating EGFR mutations (EGFR del19, EGFR L858R) and one case positive for both EGFR L858R and ALK-EML4 fusion.

In contrast, some patients with activating EGFR mutations were found to develop an acquired resistance mutation, T790M42,43. Consistent with previous studies, we found that 7 out of 8 T790M cases were also positive for either L858R or del19 mutations and we suspected that these patients were under treatment with TKI drugs although we could not obtain treatment data for these cases. In addition to T790M, ins20 mutations were also known as resistant mutations44,45 and were detected in 8 cases in our cohort and most of them (6/8 cases) did not co-exist with any activating EGFR mutations L858R and del19, suggesting that ins20 mutations are likely primary inactivating mutations rather than acquired resistant mutations. Although a significant proportion of Vietnamese NSCLC patients were identified to carry KRAS mutations, drugs directly targeting KRAS mutated NSCLC are still under clinical evaluation46.

Although concomitant driver gene mutations in EGFR, KRAS, BRAF and ALK were initially reported to be mutually exclusive events in NSCLC patients20,47, we detected 14 cases (4%) harbouring concurrent alterations among the tested driver genes. These mutations might either coexist in the same tumour cell or belong to different tumour cell lines. The proportion of cases with co-mutations varied among studies. A recent study by Zhuang et al.33 involving a cohort of 3774 Chinese NSCLC patients reported a lower co-mutation rate of 1.67% in 5 tested driver genes (EGFR, KRAS, ALK, ROS1 and BRAF) while 5% of patients in another cohort of 1,000 NSCLC patients at The NCI’s Lung Cancer Mutation Consortium were reported to harbour concomitant driver gene mutations48. However, these studies together with our study consistently reported the EGRF/KRAS (6/14 cases in our study) as the most common co-mutation event in NSCLC patients33,48. The identification of patients with such co-mutations is of clinical importance since these concurrent mutations represent a distinct subset of patients and may have significant impact on treatment outcomes. In this regard, previous studies showed that patients carrying EGFR/ALK co-mutations varied in their sensitivity to and that the choice between these two classes of TKI drugs as first-line treatment for these patients is still being debated49. Hence, further studies are required to investigate clinical activity and drug sensitivity of different co-mutation subsets in order to develop suitable treatment approaches.

To identify Vietnamese-specific mutation profiles in NSCLC patients, we selected patients with comparable histology (AC) and tumour stage (stage III-IV) from the East Asian cohort (China)23 and the MSK-IMPACT cohort mainly consisting of Caucasia patients21,22. The prevalence of EGFR mutations among Vietnamese NSCLC patients was markedly lower than East Asia cohort (37.7% versus 73.4%, p < 0.00001, Table 3) but significantly higher than MSK-IMPACT cohorts (37.7% versus 29.1%, p < 0.05, Table 3), confirming previous reports that EGFR mutations are more prevalent in Asian patients than in Caucasian patients50. Of note, the percentage of Vietnamese patients with KRAS mutation including those with concurrent mutations was comparable to the MSK-IMPACT cohort but significantly higher than the published data in East Asian (21.4% versus 9.1%, p < 0.0001). Hence, our results demonstrated that NSCLC patients from Vietnamese population exhibit a unique mutation constitution, suggesting that ethnic composition might contribute to the observed variation in mutation profiles. Furthermore, Nguyen et al.32 reported a remarkably higher frequency of KRAS mutations in Vietnamese patients living in Vietnam than in those living in the USA (24.4% versus 4.5%), suggesting that geographic and socioeconomic disparities might also contribute to the variation in mutation frequencies in different cohorts. Although the clinical significance and mechanisms driving these variations are unclear, the unique mutation profiles should be taken into consideration for prioritizing research programs aiming to develop new treatment strategies for Vietnamese NSCLC patients.

KRAS mutations were detected in 30% of NSCLC patients who were non-responsive to TKI treatment51. Hence, the high prevalence of KRAS mutations in the Vietnamese NSCLC cohort might have negative impacts on clinical outcomes. However, the use of KRAS mutation status as a negatively predictive marker of TKI therapy remains controversial due to inconsistent results obtained from subsequent meta-analysis studies52,53,54,55. Interestingly, KRAS mutations, particularly the most prevalent subtype G12C, when co-existing with PD-L1 expression in patients’ tumour were shown to have poor prognosis56, supporting the potential benefit of KRAS mutation testing in selecting patients for immunotherapy using check-point inhibitor.

We further investigated correlations between mutation prevalence and major patients’ clinical characteristics. Consistent with previous studies25,50, we observed that EGFR mutations were more prevalent in female patients, non-smoker and those with histological subtype of AC. Unlike EGFR mutations, KRAS mutations were more commonly detected in male patients and showed significant correlations with patients’ age, with higher prevalence in elder patients. It is possible that the high prevalence of KRAS mutations in Vietnamese male patients may be responsible for their higher mortality rate. Previous studies reported that KRAS mutations more frequently arise in smokers than in non-smokers25,57. Consistently, we observed such correlation in our study, indicating that the high frequency of KRAS mutation in our cohort could be attributable to high prevalence of smoking in Vietnam population. In addition, we found that ALK and ROS1 rearrangement mutations were more common in younger patients as compared to elderly patients, which is consistent with previous studies28,29. Take together, our data revealed several significant correlations between driver gene mutation prevalence and patients’ clinical characteristics.

There are a few limitations in our study. Firstly, although the panel of driver genes used in this study was chosen based on ASCO guidelines5, we did not take into account mutations in other driver genes such as PIK3CA58, AKT159 and ERBB260, previously reported to co-exist with those detected mutations and possibly having significant clinical impacts. Secondly, when comparing the mutation profiles of the Vietnamese cohorts with MSK-IMPACT and East Asian cohort, we selected patients with AC in late stages (stage III-IV) to exclude the confounding effect of histological subtypes and tumour stage that are varied among three cohorts. However, there were still differences in patients’ age at diagnosis and gender ratio between the Vietnamese and the other two cohorts, which were identified be significant factors associated with EGFR and KRAS mutation prevalence, thus might have impact on the analysis of the mutation profiles. Future large-scale studies are required to assess whether these confounders contribute to the variations in mutation profile between the of Vietnamese population and other races.

In conclusions, our study revealed the mutation profiles of multiple driver genes in the largest cohort of NSCLC patients in Vietnam to date. Our data highlighted several subsets of Vietnamese NSCLC patients carrying specific mutations that would benefit from future studies to provide more suitable treatment options.

Material and Methods

Tumour tissues

We studied 350 formalin fixed, paraffin-embedded tumour specimens from NSCLC patients treated at Pham Ngoc Thach hospital, Cho Ray hospital, Ha Noi Oncology hospital and Vietnam National cancer hospital. The tumour-rich areas of the tissues that contain at least 20% of tumour cells identified by a hematoxylin and eosin staining were micro-dissected. Written informed consents were obtained from all patients. Clinical characteristics of all patients were summarised in Table S1. This study was approved by The Ethic Committee of University of Medicine and Pharmacy at Ho Chi Minh City, Vietnam (Ethic number: 027/DHYD-HD) and The Medical Genetics Institute. All methods were performed in accordance with the relevant guidelines and regulations.

DNA isolation

DNA was extracted from FFPE samples using QIAamp DNA FFPE Tissue Kit (Qiagen, USA) following the manufacturer’s instructions and then quantified using the QuantiFlour dsDNA system (Promega, USA).

Massively parallel sequencing

DNA fragmentation and library preparation were performed using the NEBNext Ultra II FS DNA library prep kit (New England Biolabs, USA) following the manufacturer’s instructions. DNA library concentrations were quantified with a QuantiFlour dsDNA system (Promega, USA). Equal amounts of libraries (150 ng per sample) were pooled together and hybridized with xGen Lockdown probes for six targeted genes EGFR, KRAS, NRAS, BRAF, ALK and ROS1 (IDT DNA, USA). For ALK and ROS1, customized probes (Table S3) for intron regions were designed and mixed with probes for exon regions at equal concentration. Sequencing was run using NextSeq. 500/550 High output kits v2 (150 cycles) on Illumina NextSeq. 550 system (Illumina, USA) with minimum target coverage of 100×. In cases where the mean coverage in the targeted regions is lower than 100×, extra sequencing was performed to increase the mean coverage to the expected range. The mean coverage in the target regions for all samples is approximately 129×.

Variant calling using Mutect2 and Factera

Each FFPE sample was barcoded with dual indexes in the P7 and P5 primer. The PE reads were generated by bcl2fastq package (Illumina) and aligned to human genome (hg38) using BWA package61. Duplicate reads were marked using MarkDuplicates from Picard tools ( Somatic variants were called using Mutect2 package62. A custom pipeline with call to BWA, Picard, and Samtools packages were built to perform the above-mentioned analysis steps63. For detection of ALK and ROS1 rearrangement, fusion variant calling was analyzed using Factera v1.4.4 with default parameters64.

ddPCR method

A four-step ddPCR procedure was performed using reagents and equipment from Bio-Rad (unless otherwise stated) following the manufacturer’s instruction65. Briefly, the PCR mix was first prepared by mixing 1 × ddPCR Supermix for Probes, primers and probes (IDTDNA) and DNA template (0.8 or 1.6 ng). Next, 20 µl of the PCR mix was transferred into the Droplet Generator DG8TM Cartridge followed by 70 μl of the Droplet Generation Oil before placing in a QX100TM Droplet Generator to generate droplets. Subsequently, the droplets were transferred to a 96-well plate before placing in a thermal cycler (C1000 Touch, Bio-Rad) for PCR amplification. The PCR thermal program was performed as follows: 95 °C for 10 min, then 40 successive cycles of amplification (94 °C for 30 sec; 55 °C for 60 sec) and 98 °C for 10 min. Lastly, the droplet reading was acquired by the QX 200 Droplet reader and analyzed using the QuantaSoft Software. Positive and negative droplets are assigned based on the fluorescence threshold that was set as previously described by Deprez et al.66.

To detect T790M and L858R mutations in exon 20 and 21 of the EGFR gene, one reaction of ddPCR was used with two sets of primers and probes as follows: T790M primer F-GCCTGCTGGGCATCTG; T790M primer R-TCTTTGTGTTCCCGGACATAGTC; T790M mutation probe FAM- ATGAGCTGCATGATGAG-ZEN/3’IBFQ; L858R primer F-GCAGCATGTCAAGATCACAGATT; L858R primer R-CCTCCTTCTGCATGGTATTCTTTCT; L858R mutation probe HEX-AGTTTGGCCCGCCCAA- ZEN/3’IBFQ. For detection of 15 deletion sites in exon 19 (del19) of the EGFR gene, a commercially available ddPCR reaction (Bio-rad) was used (ddPCR™ EGFR Exon 19 Deletions Screening Kit #12002392).

Statistical analysis

Pearson’s chi-squared (χ²) test (sample size >5) or Fisher’s exact test (sample size< = 5) was performed on the web page ‘Social Science Statistics’ ( to assess the association between two categorical variables (Tables 3 and 4). Bonferroni correction was applied when multiple comparisons were performed (Table 3).