Introduction

Spread through air spaces (STAS) is defined in the current World Health Organization (WHO) Classification of Lung Tumors as “spread of micropapillary clusters, solid nests, and/or single cancer cells into air spaces in the lung parenchyma beyond the edge of the main tumor” [1]. Although the definition of STAS varies across different studies, STAS has been shown to be associated with decreased recurrence-free survival (RFS) and overall survival in multivariate analyses [2,3,4,5,6,7,8,9,10,11,12,13,14,15]. It has also been shown to be associated with other aggressive pathologic features such as larger tumor size [6, 12, 16], pleural invasion [6, 11, 17], lymphatic invasion [3, 6, 10, 11, 16,17,18], vascular invasion [6, 11], nodal metastases [10, 12], higher pathologic stage [10], and high-grade histological patterns [3, 5, 6, 8,9,10,11,12, 14, 16, 19, 20].

Increased screening for lung cancer in smokers has led to increased detection of early-stage peripheral lung cancers. Surgical resection is the standard of care for these patients. Sublobar resection (pulmonary parenchyma sparing surgery) for smaller peripheral nodules, instead of more extensive lobar resection, has become more common recently [21,22,23]. Some investigators have advocated for STAS to serve as an indication for escalation to lobar resection because STAS appears to be an independent predictor of disease recurrence in sublobar resections of early-stage lung adenocarcinoma but not in lobectomies [3, 4, 7, 11, 24]. Therefore, in STAS-positive cases, lobectomy may be associated with better clinical outcomes than sublobar resection [4, 7]. However, other studies have not confirmed the latter observations [5, 6, 18].

If STAS is indeed clinically significant and would warrant lobectomy over sublobar resection, then intraoperative evaluation would be an important step in the management of patients with early-stage lung adenocarcinoma, as it could prompt an immediate decision of lobectomy over limited resection. So far, only a few studies have evaluated the diagnostic performance characteristics of frozen sections for the detection of STAS [7, 16, 25].

Recently, a new grading system for pulmonary adenocarcinoma has been proposed [20]. The grading system is based on a combination of histological patterns with emphasis on high-grade patterns, and it offers better prognostic correlation than the adenocarcinoma classification system that is based on the predominant pattern only. Therefore, we also assessed the performance of this grading system on frozen sections, in order to compare the feasibility of detecting STAS with that of detecting tumor grade.

Methods

Case selection

Patients with stage I lung adenocarcinoma who underwent frozen sections were identified retrospectively at two institutions, New York University Langone Health Tisch Hospital (N = 66) and Massachusetts General Hospital (N = 97). All surgeries were performed by video-assisted thoracoscopic surgery. Adequate lymph node dissection at the time of surgery was performed in all patients, except in 18 cases in which the patients’ comorbidities precluded more extensive surgery, so only wedge resections of the tumors were done. In these cases, lymph node staging was performed by imaging studies and endobronchial ultrasound-guided bronchoscopic biopsies. All resections had clear margins. The study was approved by the Institutional Review Board of each institution and performed according to HIPAA regulations.

Histological evaluation and study design

All intraoperative consultations were performed for confirmation of malignancy as part of routine clinical care. For each case, one section was taken from fresh tumor with adjacent benign lung parenchyma. Each sample was embedded and frozen in OCT compound and cut into 4- to 5-micron thick sections on positively charged glass slides for hematoxylin and eosin (H&E) staining. The remnant of each frozen tissue sample was processed into a formalin fixed paraffin embedded (FFPE) block and prepared into one 4-micron thick H&E-stained slide as the frozen section control. The remainder of each specimen that was not frozen was inflated and fixed using 10% neutral buffered formalin for a minimum of 6 h before routine grossing and processing into FFPE blocks and 4-micron thick permanent H&E slides. Inflation was performed using a syringe at the bronchial remnant. When a bronchial remnant was not present or when the bronchial tree was significantly disrupted by the frozen section, formalin was injected directly into the lung parenchyma using a syringe with an attached needle. At grossing, each tumor was submitted entirely with adjacent benign lung parenchyma.

For each tumor, all frozen section slides and permanent section slides were evaluated. The presence or absence of STAS on each type of slide was recorded. STAS was defined as described by the 2015 WHO Classification of Tumours of the Lung [1, 3]: briefly, spread of micropapillary clusters, solid nests, and/or single cancer cells into air spaces in the lung parenchyma beyond the edge of the main tumor. Artifacts were also defined by previously described criteria [3]: briefly, linear strips of cells that appeared stripped from alveolar walls, and clusters of tumor cells distributed randomly and discontinuously in the tissue, were excluded from being classified as STAS. Figure 1 illustrates one example of an artifact.

Fig. 1: Example of artifact.

An example of artifact that was not classified as STAS, consisting of linear strips of tumor cells that are lifted off the alveolar walls.

In addition, the percentage of each histological growth pattern (lepidic, acinar, papillary, micropapillary, solid, and complex glandular patterns) was recorded in increments of 5% for each tumor, on both the frozen sections and the permanent sections. Tumor grade was also recorded separately for each type of slide. Tumor grade was defined by architectural pattern as recently proposed by the International Association for the Study of Lung Cancer pathology committee [20]. Grade 1 (well differentiated) was defined as lepidic predominant adenocarcinoma with <20% high-grade pattern. Grade 2 (moderately differentiated) was defined as acinar or papillary predominant with <20% high-grade pattern. Grade 3 (poorly differentiated) was defined as any tumor with ≥20% high-grade pattern. High-grade pattern was defined as solid, micropapillary, and/or complex glands.
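The grading rule above can be written as a short decision procedure. The following Python sketch is illustrative only and is not software used in the study; the function name `assign_grade`, the dictionary keys, and the tie-breaking behavior (a tie between lepidic and another pattern resolves to lepidic) are assumptions made for the example. The three test inputs are the pattern percentages given in the legends of Figs. 2–4.

```python
# Illustrative sketch of the architectural grading rule described in the text.
# Names and tie-breaking are assumptions, not part of the published method.

HIGH_GRADE = ("solid", "micropapillary", "complex_glandular")
LOW_GRADE = ("lepidic", "acinar", "papillary")


def assign_grade(patterns):
    """Return tumor grade (1, 2, or 3) from a dict of pattern -> percentage."""
    if sum(patterns.values()) != 100:
        raise ValueError("pattern percentages must sum to 100")

    # Grade 3: any tumor with >= 20% high-grade pattern.
    high_grade_total = sum(patterns.get(name, 0) for name in HIGH_GRADE)
    if high_grade_total >= 20:
        return 3

    # Below the threshold, the predominant pattern is necessarily lepidic,
    # acinar, or papillary. Ties resolve to the first name in LOW_GRADE
    # (an arbitrary choice made for this sketch).
    predominant = max(LOW_GRADE, key=lambda name: patterns.get(name, 0))
    return 1 if predominant == "lepidic" else 2


# Pattern percentages taken from the legends of Figs. 2, 3, and 4.
fig2 = {"micropapillary": 60, "acinar": 20, "papillary": 10, "lepidic": 10}
fig3 = {"acinar": 50, "papillary": 20, "lepidic": 20, "micropapillary": 10}
fig4 = {"solid": 40, "acinar": 30, "papillary": 20, "micropapillary": 10}

grades = [assign_grade(p) for p in (fig2, fig3, fig4)]
```

Under this rule, the tumors in Figs. 2 and 4 are grade 3 (60% and 50% high-grade pattern, respectively), while the tumor in Fig. 3 is grade 2 (acinar predominant with 10% high-grade pattern).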

The same study protocols, including histological evaluation and definition of STAS, were followed by both institutions.

Statistical analysis

Cohen’s kappa tests (K) were performed to measure the strength of agreement between frozen and permanent sections, for both STAS and tumor grade. Cohen’s kappa agreement values were interpreted as follows, based on Landis and Koch’s classification [26]: <0: poor; 0.01–0.20: slight; 0.21–0.40: fair; 0.41–0.60: moderate; 0.61–0.80: substantial; 0.81–1.00: almost perfect. The 95% confidence intervals (95% CI) for Cohen’s kappa were calculated using the GraphPad QuickCalcs website: https://www.graphpad.com/quickcalcs/kappa1/ (accessed 15 Oct 2020).
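For readers who wish to reproduce the agreement statistic, the following Python sketch computes Cohen’s kappa for a 2 × 2 table and maps it to the Landis and Koch categories listed above. It is an illustration, not the SPSS procedure used in the study; the function names are hypothetical, and a kappa of exactly 0 (which falls between the published bands) is treated as "slight" by assumption. The counts are the frozen versus permanent STAS counts reported in the Results.

```python
# Illustrative sketch: Cohen's kappa for two binary ratings of the same cases.
# Function names are hypothetical; this is not the software used in the study.

def cohen_kappa_2x2(both_pos, frozen_only, permanent_only, both_neg):
    """Kappa from the four cells of a frozen-vs-permanent 2 x 2 table."""
    n = both_pos + frozen_only + permanent_only + both_neg
    observed = (both_pos + both_neg) / n

    frozen_pos = both_pos + frozen_only
    permanent_pos = both_pos + permanent_only
    # Agreement expected by chance from the marginal totals.
    expected = (
        frozen_pos * permanent_pos
        + (n - frozen_pos) * (n - permanent_pos)
    ) / n ** 2

    return (observed - expected) / (1 - expected)


def landis_koch(kappa):
    """Interpretation bands as listed in the text (0 treated as 'slight')."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# STAS counts from the Results: 22 positive on both, 24 positive on frozen
# only, 18 positive on permanent only, 99 negative on both (n = 163).
kappa_stas = cohen_kappa_2x2(22, 24, 18, 99)
label_stas = landis_koch(kappa_stas)
```

With these counts the sketch yields a kappa of approximately 0.34, which falls in the "fair" band.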

Cross-tabulations were created and diagnostic test performance characteristics (sensitivity, specificity, positive predictive value, negative predictive value, and accuracy) were calculated to evaluate the performance of frozen sections in the detection of STAS and tumor grade. The reference group was defined as the permanent sections, since all studies describing the prognostic association of STAS were performed on permanent sections.
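The five performance characteristics follow directly from the four cells of the cross-tabulation. The Python sketch below is a minimal illustration with a hypothetical function name, using permanent sections as the reference and the STAS counts reported in the Results (22 true positives, 24 false positives, 18 false negatives, 99 true negatives).

```python
# Illustrative sketch: diagnostic test characteristics from a 2 x 2 table,
# with permanent sections as the reference. The function name is hypothetical.

def diagnostic_performance(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV, NPV, and accuracy as fractions."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # positives on reference also positive on test
        "specificity": tn / (tn + fp),   # negatives on reference also negative on test
        "ppv": tp / (tp + fp),           # test positives confirmed by reference
        "npv": tn / (tn + fn),           # test negatives confirmed by reference
        "accuracy": (tp + tn) / total,
    }


# STAS on frozen sections versus permanent sections (counts from the Results).
stas = diagnostic_performance(tp=22, fp=24, fn=18, tn=99)
stas_percent = {name: round(value * 100) for name, value in stas.items()}
```

Rounded to whole percentages, these counts give a sensitivity of 55%, specificity of 80%, PPV of 48%, NPV of 85%, and accuracy of 74%.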

Pearson’s χ² and post-test Cramer’s V coefficient (V) were calculated to examine the presence and strength of association between the categorical variables “STAS” and “tumor grade.” Strengths of association according to Cramer’s V coefficients were interpreted as follows: 0 to <0.1: negligible; 0.1–<0.2: weak; 0.2–<0.4: moderate; 0.4–<0.6: relatively strong; 0.6–<0.8: strong; 0.8–1: very strong [27].
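As an illustration of this calculation (not the SPSS procedure used in the study), the Python sketch below computes Pearson’s chi-squared statistic and Cramer’s V for a contingency table. The function name is hypothetical. The 2 × 3 tables use the STAS-positive counts per grade reported in the Results; the STAS-negative counts for permanent sections are derived here by subtracting those counts from the grade totals (48, 46, and 69).

```python
# Illustrative sketch: Pearson chi-squared and Cramer's V for an r x c table.
# The function name is hypothetical; this is not the software used in the study.
from math import sqrt


def cramers_v(table):
    """Return (chi2, V) for a list-of-rows contingency table."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # V = sqrt(chi2 / (n * (min(rows, columns) - 1)))
    k = min(len(table), len(table[0])) - 1
    return chi2, sqrt(chi2 / (n * k))


# Columns: well, moderately, poorly differentiated (grade on permanent sections).
# Rows: STAS present, STAS absent.
stas_permanent = [[1, 7, 32], [47, 39, 37]]   # second row derived by subtraction
stas_frozen = [[4, 14, 28], [44, 32, 41]]

_, v_permanent = cramers_v(stas_permanent)
_, v_frozen = cramers_v(stas_frozen)
```

With these tables the sketch gives V of approximately 0.45 for STAS on permanent sections and approximately 0.30 for STAS on frozen sections, each against grade on permanent sections.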

RFS for patients with sublobar resections was defined as time (days) from surgery until date of recurrence. Loco-regional recurrences were defined as recurrences in the lymph nodes of the chest, intrapulmonary metastases, or pleural metastases. Distant metastases were defined as those occurring outside of the chest. Patients with no recurrence were censored on the day of last follow-up. RFS was estimated by the Kaplan–Meier method, and the groups were compared using the Log-rank (Mantel–Cox) test (GraphPad Prism 8.2.0, San Diego, CA).
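The Kaplan–Meier estimate with censoring, as defined above, can be sketched in a few lines of Python. This is an illustration only and is not the GraphPad Prism procedure used in the study; the function name is hypothetical, the follow-up times are invented toy values rather than study data, and the log-rank comparison is not shown.

```python
# Illustrative sketch of the Kaplan-Meier product-limit estimator.
# The function name is hypothetical and the data below are invented toy values.

def kaplan_meier(times, events):
    """times: follow-up in days; events: 1 = recurrence, 0 = censored.

    Returns a list of (time, estimated recurrence-free probability) pairs,
    one for each time at which at least one recurrence occurred.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        recurrences = sum(
            1 for ti, ei in zip(times, events) if ti == t and ei == 1
        )
        leaving = sum(1 for ti in times if ti == t)  # recurrences plus censored
        if recurrences:
            survival *= 1 - recurrences / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy cohort of five patients (hypothetical): recurrences at 300, 500, and
# 900 days; censored at last follow-up at 500 and 1200 days.
toy_times = [300, 500, 500, 900, 1200]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
```

In the toy cohort the estimated recurrence-free probability falls to 0.8, 0.6, and 0.3 at 300, 500, and 900 days. Censored patients contribute to the number at risk until their last follow-up but never lower the curve themselves.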

All statistical analyses were performed using IBM SPSS Statistics version 25 (Armonk, NY), unless otherwise specified above. All statistical significance was set at p < 0.05.

Results

Patient demographics

A total of 163 stage I patients were included in the study. Our cohort was composed of 104 women and 59 men, aged between 33 and 97 years old (mean 68 ± 9.7). There were 94 sublobar resections and 69 lobectomies (Table 1). All specimens had clear margins. The prevalence of STAS was 24% (40/163) on permanent sections. STAS was seen in 28.2% (46/163) of frozen sections and 29.4% (48/163) of frozen section control sections.

Table 1 Summary of clinicopathologic data.

Diagnostic performance of frozen sections to detect STAS

Compared to permanent sections, the overall diagnostic accuracy of STAS on frozen sections was 74%, and agreement was fair (K = 0.34) (Table 2). Of the 40 cases that had STAS on permanent sections, STAS was also present on frozen sections in 22 cases (sensitivity 55%). Of the 123 cases that had no STAS on permanent sections, STAS was also absent on the frozen sections in 99 cases (specificity 80%). Of the 46 cases that had STAS on frozen sections, STAS was present on permanent sections in 22 cases (PPV 48%). Of the 117 cases that did not have STAS on frozen sections, STAS was also absent on permanent sections in 99 cases (NPV 85%). Figures 2–4 demonstrate examples of concordant and discordant cases.

Table 2 Diagnostic performance of STAS on frozen sections compared to permanent sections.
Fig. 2: Concordant case.

An example of a tumor that displayed STAS on both frozen section (A) and permanent section (B). It was composed of the following growth patterns: micropapillary (60%), acinar (20%), papillary (10%), and lepidic (10%).

Fig. 3: Discordant case.

An example of a tumor that displayed no STAS on frozen section (A) but did display STAS on permanent section (B). It was composed of the following growth patterns: acinar (50%), papillary (20%), lepidic (20%), and micropapillary (10%).

Fig. 4: Discordant case.

An example of a tumor that displayed STAS on frozen section (A) but did not display STAS on permanent section (B). It was composed of the following growth patterns: solid (40%), acinar (30%), papillary (20%), and micropapillary (10%). STAS was present only focally on the frozen section slide.

There was fair agreement between the presence of STAS on frozen sections and frozen section controls (K = 0.39) with an accuracy of 74.5%. However, the inclusion of frozen section controls in the analysis (as permanent sections) resulted in reduction of sensitivity to 52%, and accuracy to 73%, mostly due to an increase in false-positive STAS seen on the frozen section control slides, as previously reported [19, 28]. Therefore, in order to avoid the introduction of additional frozen section artifacts, frozen section control slides were excluded from further analysis.

Diagnostic performance of frozen sections to assess tumor grade

Table 3 shows the distribution and correlation between tumor grade on frozen sections and tumor grade on permanent sections. Agreement was moderate (K = 0.54), which was higher than that of STAS agreement between these two types of sections (K = 0.34). Table 4 shows the diagnostic performance characteristics of frozen sections for the detection of grade 3 (versus grades 1–2) tumors compared to permanent sections. There was substantial agreement (K = 0.72). Sensitivity was 77%, specificity was 94%, PPV was 90%, NPV was 85%, and accuracy was 87%. The overall diagnostic performance of frozen sections was higher for the detection of grade 3 tumors than for the detection of STAS (Fig. 5).

Table 3 Diagnostic performance of tumor grade on frozen sections compared to permanent sections.
Table 4 Diagnostic performance of frozen sections for detection of poorly differentiated (grade 3) adenocarcinoma compared to permanent sections.
Fig. 5: Comparison of diagnostic performance.

Diagnostic performance characteristics of frozen sections for the detection of STAS vs poorly differentiated (grade 3) tumors.

Correlation of STAS with tumor grade

Based on the permanent sections, there were 48 well, 46 moderately, and 69 poorly differentiated adenocarcinomas. The presence of STAS was associated with higher tumor grade (χ² tests, p < 0.0001). The strength of association between STAS on frozen sections and grade on permanent sections was moderate (V = 0.30); the strength of association between STAS on permanent sections and grade on permanent sections was relatively strong (V = 0.45) (Fig. 6). A similar strength of association was also seen between STAS on frozen sections and grade on frozen sections (V = 0.37, moderate).

Fig. 6: Grade versus STAS.

There was statistically significant, moderate to relatively strong association between tumor grade (on permanent sections) and STAS on frozen sections (A), permanent sections (B), and both types of sections combined (C).

Of the 46 tumors that had STAS on frozen sections, 4 (9%) were well differentiated, 14 (30%) were moderately differentiated, and 28 (61%) were poorly differentiated on permanent sections. Of the 28 poorly differentiated tumors that showed STAS on frozen sections, STAS remained positive on permanent sections only in 17 cases (PPV = 61%). Of the 14 moderately differentiated tumors that showed STAS on frozen, only 5 exhibited STAS on permanent (PPV = 36%). Of the 4 well-differentiated tumors that were STAS positive on frozen, STAS was not seen on permanent in any of them (PPV = 0%).

Of the 117 tumors without STAS on frozen sections, 44 (38%) were well, 32 (27%) were moderately, and 41 (35%) were poorly differentiated on permanent sections. Of the 41 poorly differentiated tumors without STAS on frozen sections, STAS remained negative on permanent sections in 26 cases (NPV = 63%). Of the 32 moderately differentiated tumors without STAS on frozen sections, STAS remained negative on permanent sections in 30 cases (NPV = 94%). Of the 44 well-differentiated tumors without STAS on frozen sections, STAS remained negative on permanent sections in 43 cases (NPV = 98%).

Of the 40 cases that had STAS on permanent sections, 1 (2.5%) was well differentiated (with 0% high-grade pattern), 7 (18%) were moderately differentiated, and 32 (80%) were poorly differentiated. The well-differentiated tumor with STAS on permanent contained only focal STAS, and did not show STAS on frozen section. Of the 32 poorly differentiated tumors that had STAS on permanent, 17 had STAS on frozen sections (sensitivity = 53%). Of the seven moderately differentiated tumors that had STAS on permanent, five had STAS on frozen sections (sensitivity = 71%), four (57%) of which had at least 10% high-grade component.
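The grade-stratified counts in the three preceding paragraphs can be assembled into one 2 × 2 table per grade and checked against the overall table. The Python sketch below is illustrative; the variable and function names are hypothetical. True positives and true negatives are taken directly from the text, while false positives and false negatives are derived by subtraction (for example, 28 − 17 = 11 false positives among poorly differentiated tumors).

```python
# Illustrative sketch: per-grade 2 x 2 tables (frozen STAS vs permanent STAS),
# stratified by grade on permanent sections. Names are hypothetical.

by_grade = {
    "well":     {"tp": 0,  "fp": 4,  "fn": 1,  "tn": 43},
    "moderate": {"tp": 5,  "fp": 9,  "fn": 2,  "tn": 30},
    "poor":     {"tp": 17, "fp": 11, "fn": 15, "tn": 26},
}


def percent(numerator, denominator):
    """Whole-number percentage, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return round(100 * numerator / denominator)


stratified = {
    grade: {
        "ppv": percent(c["tp"], c["tp"] + c["fp"]),
        "npv": percent(c["tn"], c["tn"] + c["fn"]),
        "sensitivity": percent(c["tp"], c["tp"] + c["fn"]),
    }
    for grade, c in by_grade.items()
}

# Summing the strata should reproduce the overall table reported earlier
# (22 true positives, 24 false positives, 18 false negatives, 99 true negatives).
overall = {
    cell: sum(c[cell] for c in by_grade.values())
    for cell in ("tp", "fp", "fn", "tn")
}
```

The strata sum to the overall table, and the per-grade values match those reported: PPV of 0%, 36%, and 61% and NPV of 98%, 94%, and 63% for well, moderately, and poorly differentiated tumors, respectively.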

Recurrence-free survival in sublobar resections

RFS was estimated separately in 94 patients who underwent sublobar resection (segmentectomy or wedge resection), since it has previously been described that the presence of STAS is associated with recurrence only in sublobar resections [3, 4, 7, 11, 24]. In the sublobar resection subgroup, STAS was present in 26% (24/94) of cases. The distribution of tumor grade in this group was the following: grade 1 (n = 36), grade 2 (n = 25), and grade 3 (n = 33). In the lobectomy group, STAS was present in 33% (23/69) of cases with the following tumor grade distribution: grade 1 (n = 13), grade 2 (n = 18), and grade 3 (n = 38).

There were 19 recurrences. Recurrences were observed in 13 patients treated with sublobar resection (10 loco-regional and 3 distant), whereas there were 6 recurrences in patients treated with lobectomy (3 loco-regional and 3 distant). The average time to recurrence was 1030 ± 973 days (range 482–3097 days) and 1324 ± 822 days (range 811–3286 days), respectively. There was no statistically significant difference between the two subgroups (p = 0.5).

STAS was observed on frozen section slides of 4 (31%) of the 13 patients with recurrence in the sublobar resection group, but there was no STAS on the permanent sections in this subgroup. The distribution of grade in patients with recurrences after sublobar resection was: grade 1 (n = 2), grade 2 (n = 4), and grade 3 (n = 7). Similarly, in the lobectomy subgroup, STAS was present on frozen section slides from three (50%) of the six patients with recurrence. STAS was present on both frozen section and permanent slides in two cases. The distribution of grade for patients with recurrences after lobectomy was: grade 1 (n = 0), grade 2 (n = 2), and grade 3 (n = 4).

In the sublobar resection group, there was no significant difference in RFS between cases with and without STAS identified on frozen sections (p = 0.469) (Fig. 7A). There was, however, a significant difference in RFS according to STAS detected on permanent sections (p = 0.034) (Fig. 7B). The determination of tumor grade on frozen sections showed significant difference in RFS (p = 0.038) (Fig. 7C), and the significance increased when grades 1 and 2 were combined and compared to grade 3 (p = 0.018) (Fig. 7D), thus suggesting that information about high-grade component at the time of surgery is a better predictor of tumor recurrence, compared to STAS. The determination of tumor grade on permanent sections also showed significant differences in RFS in this group of patients (p = 0.007 and 0.008) (Fig. 7E, F).

Fig. 7: Recurrence-free survival (RFS, days) in patients who underwent sublobar resection.

Detection of STAS on frozen sections (FS) was not associated with RFS (A). Detection of STAS on permanent sections was associated with RFS (B). Tumor grade on frozen sections was associated with RFS (C, D). Grades 1 and 2 were combined into one category in D. Tumor grade on permanent sections was also associated with RFS (E, F). Grades 1 and 2 were combined into one category in F.

When analyzing all cases (both sublobar and lobar resections), STAS did not predict RFS (p = 0.420) on frozen sections, while it did predict RFS on permanent sections (p < 0.0001). Grade predicted RFS on both frozen (p = 0.012) and permanent (p = 0.004) sections (Supplementary Fig. 1).

Discussion

Although there are multiple accounts in the literature that STAS is associated with RFS in sublobar resections, the investigators did not use frozen section slides [3, 4, 7, 11, 24]. As seen in our findings, while STAS on permanent sections is indeed associated with RFS, this association does not hold true on frozen sections. Our study provides the first evidence that in sublobar resections of stage I pulmonary adenocarcinomas, STAS on frozen sections is not associated with RFS. Intraoperative detection of STAS has low sensitivity (55%) and PPV (48%), with higher specificity (80%) and NPV (85%). The detection of STAS on frozen sections presents with many false-positive results. This, in combination with the lack of association with RFS, is an important drawback to an intraoperative test that would determine the extent of resection.

Low sensitivity of detecting STAS on frozen sections has also been reported by other investigators [7, 16, 25], in one study reaching as low as 35% [16]. This can be partly attributed to sampling limitations and other technical issues such as cutting a fresh, unfixed, uninflated specimen [19, 28] that confound interpretation on frozen sections [7, 16, 18, 25]. Walts et al. [25] surmised that their low sensitivity (50%) may be due to lack of lung inflation and the limitations of partial sampling during frozen sections. They thus recommended against assessing STAS on frozen sections. A similar statement was made by Morimoto et al. [18].

Of note, the prevalence of STAS in Walts et al.’s study [25] was 95.8%, which was significantly higher than in our cohort (24%). The prevalence of STAS in other reports ranges from 14.8 to 51.4% [2]. This variance may be due to differences in factors such as study population, different stages of disease, interobserver variability, and/or study methodology. Interobserver variability was examined in two recent single-center studies. Villalba et al. [16] showed that the sensitivity of STAS detection on frozen sections ranged from 35 to 77% among five pathologists, and that there was moderate agreement for both interobserver and intraobserver reliability. Upon further analysis, they concluded that most discordant diagnoses were due to the presence of artifacts that mimic STAS. Eguchi et al. [7] demonstrated substantial interobserver reliability (Gwet’s AC1 = 0.67) among five pathologists, and concluded that frozen section diagnosis of STAS was feasible based on an overall sensitivity of 71% and specificity of 92% among all five pathologists. However, similar to the study by Villalba et al., there were also wide variations in sensitivity (52–86%) and specificity (74–100%) across the five individual observers in their study. An interobserver reliability of 0.67, though categorized as “substantial,” may be undesirable for a frozen section test, and the overall combined performance of five pulmonary-focused pathologists may not reflect general pathology practice. During time-sensitive frozen sections, most pathologists are able to confer with only one or two colleagues who may have variable experience in lung pathology. In addition, even in subspecialized centers, most frozen section laboratories do not have a dedicated pulmonary pathology subspecialist reading frozen section slides.

In our results, agreement between frozen and permanent sections was higher for tumor grade than for STAS. This may be due to the relative ease of detecting the presence of high-grade histological patterns compared to STAS, which can be seen only focally (and thus is more influenced by sampling limitations) or may be difficult to distinguish from artifact. Tumor grade is one of the known covariates of STAS [3, 5, 6, 8,9,10,11,12,13,14, 16, 19, 20], and is associated with increased recurrence risk [4, 7, 9] as well as decreased survival [7, 10]. A prognostic model for tumor grade in pulmonary adenocarcinoma was established in a recent, multi-institutional study, which simultaneously showed that STAS did not influence the predictive value of the model [20]. Applying the same definitions of tumor grade that were used in this model, we confirmed that STAS is associated with higher tumor grade, and that on frozen sections, tumor grade is associated with RFS while STAS is not.

We had 24 cases of false-positive STAS on frozen sections, which contributed to the low PPV of the test. Of these false positives, 13 cases were well or moderately differentiated adenocarcinomas, and 11 cases were poorly differentiated. If STAS on frozen sections were used as an indication for lobectomy, then the 13 patients with well or moderately differentiated tumors may have had an unnecessarily aggressive procedure. Regarding the patients with poorly differentiated tumors, it is yet unknown whether they would have benefitted from lobectomy over sublobar resection. A clinical trial is needed to answer this question.

One methodological question to consider is which types of slides should be included in the “gold standard” or reference group. The decision depends on whether or not STAS that is seen on frozen but not permanent sections is regarded as true STAS. Studies assessing the feasibility of STAS detection on frozen sections have taken various approaches to this matter [7, 16, 25]. However, the more important question is whether there is an association between STAS on frozen sections and patient outcomes. Our results show that there is no association between STAS on frozen sections and RFS, and that STAS is a covariate of tumor grade, whereas there is a statistically significant association between tumor grade on frozen sections and RFS. In all, our observations question the feasibility of using intraoperative detection of STAS as an indication for more extensive surgery.

There is currently a trend toward performing less-extensive surgery for smaller and more peripheral lung cancers. When evaluating the feasibility of detecting any high-risk histological parameter on a frozen section as an indication to escalate surgical management, care must be taken to avoid false-positive results that could lead to irreversible, unnecessarily aggressive surgery, especially in a population with low prevalence (pretest probability) of that histological parameter. In our study of stage I lung adenocarcinomas, the detection of STAS on frozen sections was suboptimal, with 74% accuracy, 55% sensitivity, 80% specificity, and 48% PPV. In principle, a diagnostic test with lower than 90% accuracy is not desirable for clinical implementation [29]. In addition, our results confirm that STAS is a covariate of high-grade histology, and we reveal that in sublobar resections, high histological grade detected on frozen sections is associated with shorter RFS, whereas STAS detection on frozen sections has no bearing on RFS. Further study is needed to explore the utility of assessing tumor grade on frozen sections.
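The dependence of PPV on prevalence noted above follows from Bayes’ rule and can be illustrated with a short Python sketch. The function name is hypothetical, and the 10% prevalence is an invented value chosen only to show the direction of the effect; it is not a study result. The sensitivity, specificity, and observed prevalence are the unrounded fractions from our cohort (22/40, 99/123, and 40/163).

```python
# Illustrative sketch: PPV as a function of prevalence (Bayes' rule).
# The function name is hypothetical; the 10% prevalence is an invented example.

def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """Probability that a positive test is a true positive."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


sensitivity = 22 / 40     # STAS on frozen sections, from the Results
specificity = 99 / 123

# At the prevalence observed on permanent sections (40/163), this reproduces
# the reported PPV of 22/46, about 48%.
ppv_observed = ppv_from_prevalence(sensitivity, specificity, 40 / 163)

# At a hypothetical lower prevalence of 10%, the same test would have a
# lower PPV, that is, a larger share of false positives among positive calls.
ppv_low_prevalence = ppv_from_prevalence(sensitivity, specificity, 0.10)
```

Holding sensitivity and specificity fixed, the PPV falls as prevalence falls, which is why false-positive frozen section calls become a greater concern in populations where STAS is uncommon.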