Main

Immunotherapy for the treatment of lung cancer has taken a major step forward with the recent development of antibodies that block inhibitory immune checkpoints, which are activated in many cancers as a mechanism for evading immune surveillance.1, 2, 3 The T-cell checkpoint receptor that has been most promising as a target for lung cancer immunotherapy is programmed death-1 (PD-1/CD279), which is activated by at least two ligands, programmed death-ligand 1 (PD-L1) and programmed death-ligand 2 (PD-L2). Expression of PD-1 ligands by cancer cells is generally recognized to be an important mechanism for preventing T-cell-mediated antitumor cytotoxicity. In clinical trials targeting this pathway, anti-PD-1 and anti-PD-L1 antibodies have shown promising results, with response rates of ∼20% in unselected patients with advanced treatment-refractory pulmonary squamous cell carcinoma or adenocarcinoma (including patients with sustained responses) and tumor response rates of up to 50% in selected patients in a first-line setting.4, 5, 6, 7, 8, 9, 10

Efforts to predict response to checkpoint blockade therapy for lung cancer have largely focused on measuring PD-L1 expression on tumor cells, although significant challenges in this biomarker approach are now recognized.11, 12 Some investigators have found a significant correlation between PD-L1 expression and response to checkpoint blockade therapy,7, 8, 9, 10 but others have found modest or insignificant differences in response rates of lung cancers to checkpoint blockade related to PD-L1 expression in tumor.4, 5 Although responses have been observed in some tumors lacking PD-L1 or in tumors that cannot be adequately evaluated for PD-L1, clinical trials that led to approval by the United States Food and Drug Administration for pembrolizumab in the treatment of advanced squamous cell carcinoma or adenocarcinoma of the lung were conducted in patients selected on the basis of a complementary immunohistochemistry diagnostic test showing complete or partial membranous staining in ≥50% of viable tumor cells. This cutoff was established after the Keynote 001 trial found a response rate of <10% with a PD-L1 proportion score of <1%, although intermediate response rates were observed with cutoffs of 5 or 10%.10 Using a relatively high cutoff score for PD-L1 staining likely increased the probability of detecting a significant effect of therapy in the clinical trial, but findings of responses in many patients with scores below the established cutoff raise the question of whether additional patients who could potentially benefit from this treatment are disqualified based on the current complementary diagnostic algorithm.

Although applying such cutoffs in PD-L1 staining can be critical for determining whether a particular patient is eligible for checkpoint blockade therapy, there is remarkably little recognition that an assessment of PD-L1 made on a small tumor biopsy might not represent the expression of this ligand in the tumor as a whole. One recent study reported significant heterogeneity of PD-L1 expression within tumors in a series of 49 lung cancer tissue samples,13 but two other studies have reported somewhat conflicting results on concordance of PD-L1 expression between biopsy samples and corresponding resection specimens.14, 15 Considering the importance of understanding how reliably biopsy samples reflect overall PD-L1 expression in a lung cancer, we examined the extent of PD-L1 heterogeneity in lung cancers and also addressed the question of how this heterogeneity might affect the accuracy of small tissue samples in classifying tumors according to PD-L1 status.

Materials and methods

Control Human Tissues and Xenograft Tumors

Samples of human term placenta and pharyngeal tonsils were obtained from tissues submitted to surgical pathology and processed through routine formalin fixation and paraffin embedding. These samples were not linked to patient identifiers, in accordance with approval from the Institutional Review Board for Human Subjects Research. CHO-K1 cells were purchased from the American Type Culture Collection and transfected with human PD-L1 as described previously.16 Xenografts were established in 6–8-week-old female NSG (non-obese diabetic.Cg-Prkdcscid Il2rgtm1Wjl/SzJ) mice obtained from the Johns Hopkins Cancer Center Animal Core. Using protocols approved by the Johns Hopkins University Animal Care and Use Committee, 10 × 10^6 CHO-PDL1 or CHO-WT cells were implanted subcutaneously in the upper flanks.16 When tumors reached a size of ∼1 cm, animals were killed and tumors were explanted, fixed in formalin, and embedded in paraffin using standard procedures.

Lung Cancer Tissues and Tissue Microarrays

To represent small tissue samples that might be typical for lung cancer biopsy samples, we examined cancer tissue microarray cores from a series of 79 squamous cell lung cancers and 71 pulmonary adenocarcinomas. Lung cancer tissue microarrays were constructed using punched core samples (0.6 mm in diameter, 4 cores per case) selected from various areas of tumors in paraffin tissue blocks of surgically resected primary cancers.17 The use of human tumor tissue was approved by the Johns Hopkins Institutional Review Board.

Immunohistochemistry and Scoring

Formalin-fixed, paraffin-embedded tissue sections were deparaffinized in xylene (3 × 5 min) and then rehydrated through 100% ethanol (2 × 5 min), 95% ethanol (2 × 5 min), and H2O (1 × 5 min). Antigen retrieval was performed by heating the sections in pH 8.5 ethylenediaminetetraacetic acid (EDTA) buffer (Sigma; E1161) in a decloaking chamber (Biocare, Tempe, AZ, USA) for 10 min followed by 30 min cooling. Endogenous peroxidase activity was blocked with 3% hydrogen peroxide for 10 min followed by a protein block (from kits) for 10 min. SP142 antibody (Spring Bioscience, Pleasanton, CA, USA) was diluted 1:100 in Dako antibody diluent with background reducing components (S3022) and applied for 10 min. The staining was completed using the Abcam Mouse- and Rabbit-Specific HRP/DAB Detection Kit (AB64264), and sections were then counterstained with hematoxylin, washed, dehydrated, and coverslipped.

In keeping with current convention for scoring PD-L1 expression, we scored only percentages of tumor cells with some staining for PD-L1 and did not score intensity of staining or consider complete circumferential staining as different from partial membranous staining. Scores were classified according to thresholds that have been considered for use in companion diagnostic testing (0%, 1–5%, 5–10%, 10–20%, 20–50%, or >50%), and when classification of any core was not concordant among all three pathologists (TG, QKL, and EG) performing a visual estimate, cells were manually counted.

Estimating the Sensitivity and Specificity of a Small Biopsy

Different thresholds for defining positive PD-L1 staining, based on the percentage of positive tumor cells by immunohistochemical staining (>1%, >10%, or >50%), were used for estimating the sensitivity and specificity of a small biopsy specimen as represented by tissue microarray cores. The ‘true’ positive or negative status of the tumor was defined by the tissue microarray scores (if all cores agreed) or by the percentage of positive tumor cells scored on a standard full-face histology section of tumor in cases where the tissue microarray scoring differed among the multiple cores from the same tumor. Sensitivity at each threshold level was calculated as the number of cores positive at the particular threshold value divided by the total number of cores from those tumors that were determined to be true positive for PD-L1 at that threshold level. Similarly, specificity at each threshold level was calculated as 1 minus the false-positive rate, based on cores scored as positive at a particular threshold in tumors that were not true positive at that threshold.
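The per-core calculation described above can be sketched in a few lines of Python. This is only an illustration of the stated definitions, not the analysis code used in the study: the function names, the data layout, and the three example cases are hypothetical, and "all cores agreed" is assumed here to mean that all cores fall on the same side of the threshold being tested.

```python
from typing import Dict, List, Optional, Tuple


def tumor_is_positive(cores: List[float], threshold: float,
                      whole_section: Optional[float] = None) -> bool:
    """Return the 'true' tumor status at a threshold (percent positive cells).

    Assumption: if every core gives the same call, that call is the truth;
    otherwise the whole-section score decides, as described in the text.
    """
    calls = {score > threshold for score in cores}
    if len(calls) == 1:
        return calls.pop()
    if whole_section is None:
        raise ValueError("discordant cores require a whole-section score")
    return whole_section > threshold


def core_sensitivity_specificity(
    cases: Dict[str, Tuple[List[float], Optional[float]]],
    threshold: float,
) -> Tuple[float, float]:
    """Per-core sensitivity and specificity at one threshold."""
    tp = fn = fp = tn = 0
    for cores, whole_section in cases.values():
        truth = tumor_is_positive(cores, threshold, whole_section)
        for score in cores:
            call = score > threshold
            if truth and call:
                tp += 1
            elif truth:
                fn += 1
            elif call:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity = 1 - false-positive rate = TN / (TN + FP).
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical example data: (core scores in %, whole-section score in %).
example_cases = {
    "case_A": ([60, 80, 40, 90], 70),   # mostly high, one core below 50%
    "case_B": ([0, 0, 2, 0], 0.5),      # a few scattered positive cells
    "case_C": ([0, 0, 0, 0], None),     # uniformly negative
}

for cutoff in (1, 10, 50):
    sens, spec = core_sensitivity_specificity(example_cases, cutoff)
    print(f">{cutoff}%: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy data set, the single low-scoring core in case_A lowers sensitivity at the >50% threshold, and the one core with scattered positive cells in case_B lowers specificity at the >1% threshold.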

Results

Validation of SP142 Antibody for Immunohistochemical Detection of PD-L1

A number of different antibodies to PD-L1 are available, and efforts to compare the performance of various antibodies and staining protocols are ongoing. We tested staining characteristics of several commercially available anti-PD-L1 antibodies in our laboratory and chose the SP142 rabbit monoclonal antibody based on consistent sensitivity and specificity in our assays. First, we confirmed that this antibody reliably detects PD-L1 expression in placental trophoblasts and tonsillar dendritic cells, both previously shown to express PD-L1.18 These tissues are commonly used as controls for PD-L1 staining, and as a more definitive demonstration of specificity, we also tested the antibody by staining xenograft tissues of CHO cells that had been transfected with human PD-L1.16 As shown in Figure 1, the SP142 antibody showed strong staining of normal human trophoblasts and tonsillar dendritic cells (panels a and b), as well as the xenograft tumors grown from cells that expressed human PD-L1 (panel d). The SP142 antibody is directed against the cytoplasmic domain of PD-L1, and as expected based on a previous report,19 the staining of xenografts of CHO cells transfected with PD-L1 showed a sharp membranous localization. Importantly, the antibody did not react with xenografts from control vector-transfected cells (panel c), confirming a high specificity of the staining. Several other antibodies tested showed less specific staining patterns, including cytoplasmic as well as membranous localization, nonspecific staining of vector-transfected xenograft tumor cells, or staining that was somewhat restricted to the periphery of the xenograft tumors of cells transfected with PD-L1.

Figure 1

Validation of SP142 antibody. Immunohistochemistry was performed as described in the text and tested on standard control tissues, placenta (a) and tonsillar crypts (b). In addition, antibody was tested on xenograft tumors of CHO cells transfected with control vector (c) or a human programmed death-ligand 1 (PD-L1)-expressing vector (d).

Heterogeneity of PD-L1 Expression as Seen in Multiple Tissue Microarray Core Samples of Individual Tumors

To examine the expression of PD-L1 in lung cancer tissues, we stained tissue microarrays that represented 79 cases of primary pulmonary squamous cell carcinoma and 71 cases of primary pulmonary adenocarcinoma (150 total). Each tumor was sampled with four 0.6 mm diameter tissue cores that were randomly punched from a single paraffin block of each case, and at least three cores were adequate for scoring in each case reported. As reported previously, we observed PD-L1 expression on tumor cells as well as on inflammatory cells, most notably macrophages, in some, but not all, cases. Notably, attempts to standardize PD-L1 scoring to determine eligibility for checkpoint blockade therapy have not considered intensity of staining, as is done for evaluating HER2 expression in breast cancer in routine pathology practice.20, 21 However, as observed previously, we did note substantial case-to-case variation in intensity of staining and extent of staining around the circumference of tumor cells, as well as case-to-case variability in percentages of cells staining positive (Figure 2).

Figure 2

Variability of programmed death-ligand 1 (PD-L1) staining in lung cancer tissue samples. (a–c) All represent samples of adenocarcinoma that were scored as >50% cells positive for PD-L1. (a) A tumor with strong and uniform circumferential staining for PD-L1, and (b and c) tumors with relatively faint, often non-circumferential staining that did involve >50% of cells. (d) A sample of adenocarcinoma that stained positive in 40% of cells, with strong staining that was heterogeneous. (e–g) Samples of squamous cell carcinoma that were all scored as >50% cells positive for PD-L1. (e) A tumor with strong and uniform circumferential staining for PD-L1, and similar to cases above, (f and g) tumors with relatively faint, often non-circumferential staining that did involve >50% of cells. (h) A sample of squamous cell carcinoma that stained positive in 35% of cells overall, with geographic heterogeneity of staining.

As these tissue microarray cores are similar in size to typical transbronchial or transthoracic needle biopsy samples, we reasoned that comparing the extent of staining across different tissue microarray cores for given cases could provide an indication of how accurately individual biopsy samples of a tumor represent the overall expression of PD-L1 in the tumor. As shown in Table 1, 28 of the 71 cases (39%) of adenocarcinoma stained positive for PD-L1 in at least one tissue microarray core, but only 8 cases (11%) stained positive in >50% of tumor cells for all tissue microarray cores examined. Staining for PD-L1 was seen somewhat more frequently in cases of squamous cell carcinoma (Table 2), where 43 of 79 cases (54%) stained positive in at least one tissue microarray core and 17 of the 79 cases (22%) stained positive in >50% of tumor cells for all tissue microarray cores examined. Notably, however, substantial inconsistencies in percentages of cells staining positive for PD-L1 among different tissue microarray cores were observed for many cases of both adenocarcinoma and squamous cell carcinoma. These inconsistencies were seen among different tissue microarray cores of tumors with high levels of PD-L1 expression (i.e., some cores showing >50% tumor cells positive), as well as among different tissue microarray cores from tumors with relatively low levels of PD-L1 expression (i.e., highest expression in any one core <5% tumor cells positive). For example, among cases that could be classified as positive in >50% of cells on the basis of at least one tissue microarray core, at least one other core showed <50% of tumor cells positive for PD-L1 in 5 of 13 adenocarcinoma cases and 6 of 23 squamous cell carcinoma cases. Similarly, 21 of the 100 cases (adenocarcinoma and squamous cell carcinoma combined) that showed no PD-L1 staining in one or more tissue microarray cores also had some tumor cells positive for PD-L1 in at least one other core. These results suggest that in substantial percentages of lung cancers, the classification of PD-L1 staining with small biopsy samples could be highly inconsistent, depending on the particular area of tumor sampled.
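As a quick consistency check, the case counts quoted in this section can be recombined to reproduce the stated percentages and discordant-case numbers. The short Python sketch below uses only figures given in the text (Tables 1 and 2 are not reproduced here); the variable names are illustrative, and it assumes that the 100 cases with at least one unstained core consist of the uniformly negative cases plus the mixed cases.

```python
# Case counts as quoted in the text.
adeno = {"total": 71, "any_core_positive": 28,
         "any_core_over_50": 13, "all_cores_over_50": 8}
squamous = {"total": 79, "any_core_positive": 43,
            "any_core_over_50": 23, "all_cores_over_50": 17}


def percent(n: int, total: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * n / total)


for name, c in (("adenocarcinoma", adeno), ("squamous cell carcinoma", squamous)):
    # Cases with >50% staining in some cores but not in all of them.
    discordant_over_50 = c["any_core_over_50"] - c["all_cores_over_50"]
    print(name,
          f"any positive: {percent(c['any_core_positive'], c['total'])}%,",
          f"all cores >50%: {percent(c['all_cores_over_50'], c['total'])}%,",
          f"discordant at >50%: {discordant_over_50}")

# Cases with no staining in any core, both histologies combined.
all_negative = sum(c["total"] - c["any_core_positive"] for c in (adeno, squamous))

# Reported: 100 cases had at least one core with no staining.
cases_with_negative_core = 100
mixed_cases = cases_with_negative_core - all_negative
print("uniformly negative:", all_negative, "mixed:", mixed_cases)
```

Under that assumption, the counts are internally consistent: 79 cases are negative in every core, leaving the 21 mixed cases reported above.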

Table 1 Scores for PD-L1 staining in adenocarcinoma
Table 2 Scores for PD-L1 staining in squamous cell carcinoma

Geographic Heterogeneity of PD-L1 Expression in Lung Cancers

Noting remarkable variability among different tissue microarray cores of some tumors in the percentages of cells staining positive for PD-L1, we then examined the expression of PD-L1 in whole sections of all tumors with core-to-core variability. Consistent with previous findings,13 we noted highly heterogeneous staining across different areas of these tumors. A striking example of heterogeneous staining for a case of squamous cell carcinoma is shown in Figure 3, with areas of tumor that show robust expression lying immediately adjacent to areas with low or even absent expression. (The scores for PD-L1-positive cells in the four tissue microarray cores evaluated in this case were 12%, 55%, 80%, and 95%.) As seen in the images shown in this figure, geographic variability in PD-L1 expression was seen across different areas of tumor. Accordingly, we conclude that the variable staining for PD-L1 seen among different tissue microarray cores of some tumors is due to heterogeneous expression of PD-L1 in these tumors.
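To make the effect of this heterogeneity concrete, the four core scores reported for the Figure 3 case can be classified against the three thresholds used in this study. The Python sketch below is illustrative only; the helper name is hypothetical, and it treats each core as an equally likely single biopsy.

```python
# Core scores (% PD-L1-positive tumor cells) reported for the Figure 3 case.
core_scores = [12, 55, 80, 95]
thresholds = [1, 10, 50]


def fraction_positive(scores, threshold):
    """Fraction of cores that would be called positive at a threshold."""
    return sum(score > threshold for score in scores) / len(scores)


calls = {t: fraction_positive(core_scores, t) for t in thresholds}
for t, frac in calls.items():
    print(f">{t}%: {frac:.0%} of single cores call this tumor positive")
```

All four cores classify the tumor as positive at the >1% and >10% thresholds, but one of the four (the 12% core) would classify it as negative at the >50% threshold.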

Figure 3

Geographic heterogeneity of programmed death-ligand 1 (PD-L1) expression. A case of lung cancer with geographic heterogeneity that was reflected in discordant scores across different tissue microarray (TMA) core tissue samples. Three different areas of tumor are shown in (a–c), selected from the field shown in the top left image.

Estimating Effect of Geographic Heterogeneity on Sensitivity and Specificity of Single Small Tissue Sample to Assess PD-L1 Expression

Finally, we estimated the sensitivity and specificity of single small tissue samples for assessing PD-L1 expression at commonly used threshold levels for scoring percentages of positive-staining tumor cells (Table 3). As described above, sensitivity at each threshold level was defined as the number of positive-staining cores divided by the total number of cores taken from tumors determined to be positive at the threshold, and specificity was defined as 1 minus the false-positive rate at that threshold level. Remarkably, the sensitivity of a single tissue microarray core for scoring PD-L1 expression at a threshold of >50% is <100% (∼85% for adenocarcinoma and ∼95% for squamous cell carcinoma) as a result of cases such as that shown in Figure 3, where some areas of low PD-L1 expression are seen in a tumor that generally has a high percentage of tumor cells staining for PD-L1. The sensitivity of a single tissue microarray core for scoring PD-L1 expression at a threshold of >1% is also <100% (∼87% for adenocarcinoma and ∼90% for squamous cell carcinoma) because of cases where only a few scattered cells stain positive. These results suggest that in considerable percentages of lung cancers, the classification of PD-L1 staining with small biopsy samples might not represent the overall expression of PD-L1 in that tumor.

Table 3 Sensitivity and specificity of single TMA cores for predicting PD-L1 score of overall tumor at three clinically relevant thresholds

Discussion

There is increasing interest in defining biomarkers for predicting response to PD-1/PD-L1 checkpoint blockade therapy as this treatment becomes a major therapeutic option for lung cancer. PD-L1 expression on tumor cells would intuitively seem to be a logical biomarker for anti-PD-1 therapy, and early studies suggested that PD-L1 expression on tumor cells of various types does correlate with response to checkpoint blockade therapy.7, 8, 9, 10 Several other clinical trials, however, found that PD-L1 staining in lung cancer biopsy tissues correlated only modestly with response to checkpoint blockade therapy, or did not correlate with response to any significant degree.4, 5 Nevertheless, measuring PD-L1 expression in lung cancer tissues is now used to determine eligibility for pembrolizumab therapy and is commonly used for predicting response to other checkpoint blockade therapies.

The question addressed by the present study was whether small tumor biopsy samples can adequately assess PD-L1 expression in cases of pulmonary squamous cell carcinoma or adenocarcinoma, particularly when thresholds for positive-staining tumor cells are used. We found that the extent of heterogeneity of PD-L1 expression in lung cancers affects how tumors will be scored for PD-L1 in small biopsy specimens, and that many cases of lung cancer could be inaccurately or variably scored with respect to a threshold when the score is based on a single biopsy sample.

In addition to a general demonstration of heterogeneous PD-L1 expression in lung cancers, two previous studies compared results of PD-L1 staining in diagnostic biopsy specimens and corresponding resected tumors from patients diagnosed with pulmonary squamous cell carcinoma or adenocarcinoma. In the first,14 PD-L1 expression results in biopsy specimens reportedly correlated well with those of the corresponding resected tumors in 92% of cases. However, a second study found relatively poor correlation between diagnostic biopsies and corresponding resected tumors for PD-L1 expression, with more than one-half of the cases showing lower percentages of PD-L1-positive cells in the diagnostic biopsy samples than in the resection specimens.15 Taken together with our results, these findings suggest that evaluation of PD-L1 expression in diagnostic biopsies can be misleading for estimating the general expression of PD-L1 in a lung cancer.

Notably, standards for scoring PD-L1 staining in lung cancer have not, to date, considered what pattern or intensity of PD-L1 staining on tumor cells correlates with response to immune checkpoint blockade therapy. Current assays quantify the percentage of tumor cells that have any membranous staining, which is an approach that might have an advantage of not being affected by technical variability in staining methods.11 Furthermore, PD-L1 expression can concentrate at contact points between tumor cells and immune cells, sites of cell–cell interaction known as ‘immune synapses’.22 Recognizing these technical and biological justifications, we also used this scoring standard in our study.

Accurate quantitative assessment of tumor cell PD-L1 expression in biopsy samples is likely not the only obstacle for biomarker development in lung cancer checkpoint blockade therapy, however. In fact, some data suggest that PD-L1 expression on immune cells rather than on tumor cells might more consistently correlate with clinical response to PD-1/PD-L1 checkpoint blockade therapy. For example, a recent study showing clinical benefit of immune checkpoint blockade in colorectal cancers with mismatch-repair defects found prominent membranous PD-L1 expression on tumor-infiltrating lymphocytes and tumor-associated macrophages at the invasive fronts of these tumors, but not on cancer cells themselves.23 In an exploratory analysis of biomarkers for the POPLAR trial of atezolizumab in previously treated pulmonary squamous cell carcinoma or adenocarcinoma, improved overall survival was most closely associated with pre-existing immunity as defined by high T-effector–interferon-γ-associated gene expression.24 Thus, PD-L1 expression on tumor cells, the variable measured in our study, might not be the best biomarker for predicting response to checkpoint blockade in lung cancer or other types of cancers.

One possible alternative to consider as a biomarker for response to PD-1 checkpoint blockade is PD-L2, which is a second ligand for the PD-1 receptor. A number of studies have shown a significant role for PD-L2 in modulating immune responses, including downregulating CD8+ T-cell-mediated immune responses to endothelial cells.25 Remarkably, PD-L2 has received relatively little attention for its role in modulating tumor immunology,26 although in lung cancer, PD-L2 expression has been reported to be as common as PD-L1 expression.27, 28 Even as attempts are made to standardize staining and scoring for PD-L1 in lung cancer, alternative biomarkers should be considered for predicting response to checkpoint inhibitor therapy.