Application of large-scale targeted sequencing to distinguish multiple lung primary tumors from intrapulmonary metastases

The effective differentiation between multiple primary lung tumors (MPs) and intrapulmonary metastases (IMs) is imperative to determine the exact disease stage and to select the most appropriate treatment. In this study, we aimed to evaluate the efficacy and validity of large-scale targeted sequencing (LSTS) as a supplementary tool for determining whether multifocal lung cancers (MLCs) are primary or metastatic. Targeted sequencing of 520 cancer-related genes was performed on 36 distinct tumors from 16 patients with MPs. Pairing analysis was performed to evaluate the somatic mutation pattern of MLCs in each patient. A total of 25 tumor pairs from 16 patients were sequenced, 88% (n = 22) of which were classified as MPs by LSTS, consistent with clinical diagnosis. One tumor pair from a patient with lymph node metastases had highly consistent somatic mutation profiles and was thus predicted to be a primary-metastatic pair. In addition, some matched mutations were observed in the remaining two pairs of ground-glass nodules (GGNs), which were classified as high-probability IMs by LSTS. Our study revealed that LSTS can potentially facilitate the distinction of MPs from IMs. In addition, our results provide new genomic evidence of the presence of cancer invasion in GGNs, even pure GGNs.

[...] similarity in the mutation spectrum tend to be IMs 7,17,18. To date, somatic mutations in several major driver genes 12, targeted sequencing of dozens of genes 13-15, loss of heterozygosity 19, and chromosomal rearrangements 20 have been applied to discriminate MPs from IMs. Although these approaches provide valuable information about tumor clonal relationships, the number of genes covered by the test may affect the differential diagnosis of MLCs. A larger gene panel could enhance the efficiency of lineage calling and the sensitivity of differential diagnosis 20,21. Most recently, whole-exome sequencing (WES) and whole-genome sequencing (WGS) have also been proposed to delineate the clonal relationships among different tumors 7,16. There is no doubt that WES and WGS can provide pivotal information to accurately distinguish MPs from IMs, but it is unrealistic to apply them to every single tumor in clinical practice 22,23. Hence, the clinical value of large-scale targeted sequencing (LSTS) should be further explored and confirmed. Therefore, in this study, an LSTS assay covering up to 520 cancer-related genes was performed on 36 tumor samples from 16 patients diagnosed with multiple primary lung cancers. Subsequently, mutational profiles were combined with clinical, radiological, and histopathological analysis to classify the paired tumors as MPs or IMs.

Results
In this study, 36 lesions from 16 patients diagnosed with MPs were investigated, including three patients who had three or more lesions (up to four). The characteristics of patients and tumors are summarized in Table S1. The tumors comprised sarcomatous carcinoma (SC, 2.8%) and adenocarcinomas (ADC, 97.2%), the latter including adenocarcinoma in situ (AIS, 2.8%), minimally invasive adenocarcinoma (MIA, 11.1%) and invasive adenocarcinoma (IA, 83.3%) (Table 1).
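The 25 tumor pairs analyzed in this study follow from these lesion counts if every pair of lesions within a patient is compared. The sketch below reproduces that arithmetic. The split of lesions among the three multi-lesion patients (3, 3, and 4) is inferred from the totals reported here, not taken from Table S1, and the function name `count_pairs` is illustrative.

```python
from itertools import combinations

# Inferred lesion counts: 13 patients with two lesions each, and three
# patients with 3, 3 and 4 lesions (the only split of the remaining
# 10 lesions that respects "three or more, up to four").
lesions_per_patient = [2] * 13 + [3, 3, 4]


def count_pairs(n_lesions: int) -> int:
    """Number of unordered within-patient tumor pairs."""
    return sum(1 for _ in combinations(range(n_lesions), 2))


total_patients = len(lesions_per_patient)
total_lesions = sum(lesions_per_patient)
total_pairs = sum(count_pairs(n) for n in lesions_per_patient)

print(total_patients, total_lesions, total_pairs)
```

Under these assumptions the script gives 16 patients, 36 lesions, and 25 pairwise comparisons, matching the figures reported in the text.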
Targeted sequencing statistics. To validate the clinical application of LSTS in distinguishing MPs from IMs, the comprehensive mutational profiles of 25 tumor pairs were analyzed using a panel of 520 cancer-related genes. Collectively, our analysis revealed 331 gene variations (286 mutations and 45 CNVs) in 36 tumor samples, with a median of 9 somatic alterations per tumor (range 0-41). No gene variation was detected in one lesion of P13. A list of all variations detected by targeted sequencing is provided in Table S3, from which the Oncoprint heatmap (Fig. 1) and molecular profiles (Fig. 2) of the tumors from all patients were generated. EGFR alterations were the most frequently detected (77%) and were observed in 93.75% of patients (15/16), followed by TP53 alterations (32%), consistent with a previous study on multiple lung cancers reported in 2018 24. Notably, RBM10 exhibited a high mutation frequency (eight tumors, 25%) in stage I-II adenocarcinomas (32 tumors) (Fig. 1). This result is similar to previous reports in which RBM10 was found to have a high mutation rate in radiological subsolid nodules (16%) 25 and in preinvasive and early-stage lung adenocarcinomas (21%) 26. In addition, several genes (such as EGFR, SPTA, and RB1) had different mutations in the same tumor or in different tumors of the same patient (Fig. 2). This phenomenon indicates both inter-tumor and intra-tumor heterogeneity in lung cancer.

Mutational evaluation and tumor classification.
No shared gene variations were detected between the tumors of 20 tumor pairs (80%), and these pairs were thus classified as definite MPs (Fig. 1, Table S4). In addition, an EGFR variation was the only alteration shared in two further pairs (CNV in P2, p.L858R in P15). Each tumor in these pairs also harbored unique mutations, ranging from 2 to 40 per tumor (Fig. 2). Given the prevalence of EGFR alterations, the tumors in these two pairs were more likely to have arisen independently, leading to a clear classification as MPs with a coincidental EGFR hotspot variation 27.
Conversely, one tumor pair shared missense mutations in three genes (EMSY, SMAD4, and POM121L12) and copy number amplification of three genes (SDHA, TERT, and EGFR) (Fig. 2). Because most of these shared variations occurred in genes in which alterations are rarely reported, this pair was classified as a definite IM.
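The pairing logic described above can be sketched as a simple set comparison. This is an illustration only, not the authors' pipeline: the text does not state a numeric decision rule, so the cut-off `definite_cutoff = 3`, the set `COMMON_DRIVER_GENES`, the function name `classify_pair`, and the placeholder alterations `GENE_X`, `GENE_Y`, and `GENE_Z` are all assumptions made for the example. Only the shared alterations of the definite IM pair and the EGFR p.L858R case are taken from the text.

```python
from typing import FrozenSet, Set, Tuple

# An alteration is represented as (gene, change), e.g. ("EGFR", "p.L858R").
Alteration = Tuple[str, str]

# Assumption: genes whose recurrent hotspot alterations may be shared by chance.
COMMON_DRIVER_GENES: FrozenSet[str] = frozenset({"EGFR"})


def classify_pair(
    tumor_a: Set[Alteration],
    tumor_b: Set[Alteration],
    definite_cutoff: int = 3,  # hypothetical threshold, not from the paper
) -> str:
    """Classify a tumor pair from the alterations the two tumors share."""
    shared = tumor_a & tumor_b
    # Shared alterations outside common driver genes carry more weight.
    informative = {alt for alt in shared if alt[0] not in COMMON_DRIVER_GENES}
    if not shared:
        return "definite MP"
    if not informative:
        return "MP (coincidental hotspot)"
    if len(informative) >= definite_cutoff:
        return "definite IM"
    return "high-probability IM"


# Shared alterations of the definite IM pair, as listed in the text.
shared_im = {
    ("EMSY", "missense"), ("SMAD4", "missense"), ("POM121L12", "missense"),
    ("SDHA", "amplification"), ("TERT", "amplification"),
    ("EGFR", "amplification"),
}
# Private alterations below are placeholders for illustration.
im_a = shared_im | {("GENE_X", "missense")}
im_b = shared_im | {("GENE_Y", "frameshift")}

egfr_a = {("EGFR", "p.L858R"), ("GENE_X", "missense")}
egfr_b = {("EGFR", "p.L858R"), ("GENE_Y", "missense")}

print(classify_pair(im_a, im_b))
print(classify_pair(egfr_a, egfr_b))
print(classify_pair({("GENE_X", "missense")}, {("GENE_Y", "missense")}))
```

In practice the call would also weigh clinical, radiological, and histopathological findings, as the study does; the sketch covers only the mutation-sharing step.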
Comparison of clinical diagnosis to LSTS classification. Overall, the LSTS classification of 22 of the 25 tumor pairs (88%) matched the clinical diagnosis of MPs. However, three tumor pairs were discordant with the clinical diagnosis: two high-probability IMs (each sharing two mutations) and one definite IM (sharing six alterations).
Tumors in the definite IM case (P5) were identified as adenocarcinoma by postoperative pathological examination. In one lesion, the acinar pattern (90%) was the predominant histological subtype, followed by papillary (5%) and micro-papillary (5%) patterns, whereas the other lesion showed papillary (75%), acinar (10%), adherent (10%), and micro-papillary (5%) patterns (Table 2). In addition, a 0.6 mm-thick CT slice revealed two nodules: one pure GGN (pGGN) and one solid nodule (Fig. 3A). On the basis of the combined pathological and imaging data, these two lesions were diagnosed as separate primaries despite the presence of lymph node metastases.
For the two high-probability IM cases, histopathological analysis revealed that the major histological subtype of all tumor samples was an acinar pattern (Table 2). One pair was a pGGN in the right upper lobe (RUL) and a mixed GGN (mGGN) in the right middle lobe (RML) of P10 (Fig. 3B) [...] of P16: one in the RUL and one in the RML (Fig. 3C). However, the conclusions about the nature of the nodules in these two cases were drawn from images scanned on 5 mm-thick CT sections, making it difficult to determine whether each 'GGN' was a true GGN, since such lesions might be evaluated as solid nodules on 1 mm thin CT sections 30. In addition, postoperative progression-free survival (PFS) and overall survival (OS) of these two patients have not been reached. Thus, these two cases were diagnosed as MPs.

Discussion
Due to the prevalence of MLCs, the etiology, diagnosis, staging, treatment, and prognosis of lung cancer have attracted increasing attention in clinical practice, especially the distinction between MPs and IMs 4,12-14,19,31,32. MLCs refer to multiple lung lesions on one or both sides in the same patient, within the context of an identical genetic background and exposure history. To improve the accuracy of distinguishing the origins of multiple lesions in patients with MLCs, three major lung cancer research institutes have proposed and revised diagnostic criteria, but unified standards are still lacking 6.
Recent studies have pointed out that large-panel next-generation sequencing assays can be utilized not only to guide targeted therapies, but also to determine the clonal relationships among MLCs 18,29. Unlike conventional [...] 18. The analysis involved 40 tumors in 16 patients, but in addition to lung tissues, tumor samples from breast, liver, thyroid, and mouth were also included. Therefore, it is necessary to conduct research from different perspectives and collect more samples from diverse populations to support the application of LSTS in the clinical setting.
In this study, we performed 520-gene LSTS on 25 tumor pairs from 16 patients. Despite shared germline mutations and environmental exposure, paired tumors from 15 patients had distinct genomic profiles, including three tumors from P10 and two tumors from P16. In agreement with the clinical diagnosis, they were all classified as MPs, although two patients (P2, P8) had lymph node metastases (Table 1). Tumor pairs classified as IMs shared between two and six variations. Among them, one tumor pair from P5 with different histological subtypes was highly consistent in its somatic mutational profile. Lung cancer usually displays a range of histological subtypes, and different lesions often share overlapping histological features, which suggests that the morphology of MLCs might not always be completely different. Therefore, the histological similarity between different tumors might be suggestive rather than conclusive 22,24. In addition, the lymph node metastases indicated that the two lesions were derived from the same clone. Together, these results suggest that this tumor pair originated from a common ancestor and that the clinical diagnosis should be revised. Because pathologic, clinical, and radiologic inferences depend on a doctor's experience and may therefore be subjective, LSTS can be an effective and objective complementary tool in clinical practice.
Another issue that deserves special attention is multiple GGNs, which mainly include pGGNs and mGGNs. It is generally accepted that multiple GGNs in a patient arise from different lineages, meaning that hematogenous metastasis does not occur in GGNs 5,30,31,33. However, the possibility that a small number of GGNs derive from the same ancestor cannot be excluded, which can be explained by the air space theory 25,34-36. In our cohort, the definite IM case (a pGGN and a solid nodule) shared multiple mutations, providing evidence that a small portion of GGNs might be the result of early metastasis.
Furthermore, the recommendations of the Fleischner Society state that lesions could be wrongly diagnosed as mGGNs on thick CT sections (typically 5 mm) when they are actually solid 30. The two high-probability IM cases (P10 and P16) in this study had CT data only from 5 mm-thick sections, which could not accurately determine whether the lesions were GGNs. Considering this issue and the LSTS results of these two tumor pairs, the possibility that the two nodules in P10, located in the RUL and RML, descended from the same ancestor cannot be excluded. The same applies to P16. This emphasizes the importance of using contiguous thin CT sections (1 mm) to verify that lesions are true GGNs. Moreover, no other mutations were [...] metastatic cancers might exist in those patients, which is the most overlooked aspect of clinical work. However, since neither patient reached PFS and OS, follow-up is needed to confirm this conclusion. The guidelines of the Fleischner Society also point out that GGNs progress slowly to various degrees, but the oncogenetic molecular mechanisms remain elusive 30. Notably, recent research suggested that pre-cancerous unstable CNV with potential genetic susceptibility may promote the development of driver mutations and independent synchronous multiple GGNs 34. However, it remains unclear how the genetic landscape changes during the progression from a GGN to a solid nodule, what factors influence this process, and how to analyze it. Compared with LSTS, comprehensive genomic profiling at the whole-exome or whole-genome level may be much more helpful in addressing this problem.
Additionally, some other challenges remain for a wider use of LSTS, including strict technological requirements, high cost, and a long turnaround time. Moreover, because of negative results and the sharing of common mutations, its utility in distinguishing MPs from IMs is limited 20. In this study, no mutations were detected in one lesion, which means that the 520-gene panel was uninformative for 2.8% of the tumors (1/36). However, our work demonstrated that this problem can be addressed by integrating clinical, radiologic, and histological data.
There were two limitations to our study. First, the limited number of cases could introduce a slight bias into the comparative genomic analysis. We will further verify our conclusions by integrating public data and collecting more clinical cases. Second, we would ideally have provided a detailed histologic assessment of the study cases, particularly of GGO or presumable AIS/lepidic cases. Some previous studies have provided more complete histological information for more cases. The results of those studies and ours are comparable in that they show that standard histopathological methods are sufficient in most cases, but have obvious limitations in the recognition of MLCs. To further probe the developmental mechanism of MLCs, we intend to analyze the immune repertoire of MLCs, integrate multi-omics data, and devise a more systematic and holistic approach to avoid the above problems.
Our findings not only demonstrated the effectiveness of LSTS in distinguishing MPs from IMs but also provided evidence for the air space theory and the early metastasis of GGNs. In this study, LSTS increased the diagnostic accuracy for patients with MLCs, and it can be used to guide clinical treatment and to enable surveillance throughout the course of therapy. To offer the best clinical management for patients with MLCs, larger targeted next-generation sequencing panels should be introduced into clinical testing.

[...] Guangzhou, China), including the entire exon regions of 312 genes and the hotspot mutation regions (exons, introns, and promoter regions) of 208 genes. In addition, 16 fusion genes were detected. The 520 cancer-related genes used in the panel are listed in Table S2. A wide spectrum of mutation types was found, including large genomic rearrangement, copy number variation (CNV), insertion, deletion, stop-gain, frameshift, splice variant, and missense mutations.

Statistical analysis. NGS-based analyses were performed at Burning Rock Biotech, a College of American Pathologists (CAP)-accredited/Clinical Laboratory Improvement Amendments (CLIA)-certified clinical laboratory, using optimized protocols as previously described 21. The FASTQ-format sequencing data were mapped to the human genome (hg19) using the BWA aligner 0.7.10 37. Local alignment optimization, variant calling, and annotation were performed using GATK 3.2, MuTect, and VarScan, respectively. Copy number cut-offs of 1.5 and 2.64 were used to call copy number deletions and amplifications, respectively 21. Variants were filtered using the VarScan fpfilter pipeline, and loci with a depth of less than 100 were filtered out 38. DNA translocation analysis was performed using TopHat2 and Factera 1.4.3. Read depth in each region was normalized by the total read count and region size and corrected for GC bias using the LOESS algorithm, after which gene-level CNV was evaluated using the t statistic. Pairing analysis was used to assess the patterns of somatic mutations and CNV in the same individual, and a total of 25 comparisons were performed.
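The depth filter and copy number cut-offs stated above can be expressed as a short sketch. It is not the laboratory's pipeline: the names `Variant`, `filter_by_depth`, and `call_cnv` are hypothetical, the example records are invented, and the handling of values exactly at a cut-off (treated here as a call) is an assumption, since the text does not specify it.

```python
from dataclasses import dataclass
from typing import List

# Thresholds quoted in the Methods text.
MIN_DEPTH = 100                 # loci with depth below 100 are removed
CN_DELETION_CUTOFF = 1.5        # copy number deletion
CN_AMPLIFICATION_CUTOFF = 2.64  # copy number amplification


@dataclass(frozen=True)
class Variant:
    gene: str
    change: str
    depth: int


def filter_by_depth(variants: List[Variant], min_depth: int = MIN_DEPTH) -> List[Variant]:
    """Keep variants at loci with sequencing depth of at least min_depth."""
    return [v for v in variants if v.depth >= min_depth]


def call_cnv(copy_number: float) -> str:
    """Label a gene-level copy number estimate using the two cut-offs.

    Assumption: values equal to a cut-off are counted as a call.
    """
    if copy_number <= CN_DELETION_CUTOFF:
        return "deletion"
    if copy_number >= CN_AMPLIFICATION_CUTOFF:
        return "amplification"
    return "neutral"


# Invented records for illustration only.
variants = [
    Variant("EGFR", "p.L858R", 850),
    Variant("TP53", "missense", 99),
    Variant("RBM10", "frameshift", 100),
]
kept = filter_by_depth(variants)
print([v.gene for v in kept])
print(call_cnv(1.2), call_cnv(2.0), call_cnv(3.1))
```

A real workflow would apply these thresholds after the VarScan fpfilter step and the GC-corrected depth normalization described in the text; neither step is reproduced here.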
Ethical statement.
1. This study was performed with the approval of the Institutional Review Board of the First Affiliated Hospital of Chongqing Medical University (No. 2020-124). 2. All patients signed an informed consent document for the publication of this manuscript and any accompanying images. 3. The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. 4. All experiments were performed in accordance with relevant guidelines and regulations.