Main

Early detection of lung cancer is mandatory to reduce its extremely high mortality rate.1 Numerous trials, worldwide, have compared the roles of sputum cytology, conventional chest X-ray, and helical computed tomography (CT) scanning in detecting early lung cancer.2, 3, 4 Until now, however, the results of such screening tests have been controversial because of either low accuracy or the uncertain significance of their findings.2, 3

Although sputum abnormalities may reflect neoplastic events in the central airways,3 cytologic screening of sputum lacks sensitivity and has generally been abandoned in favor of tests such as fine needle aspiration or bronchoscopically procured brush, wash, or lavage specimens. However, because sputum collection is noninvasive, it would be advantageous to develop robust biomarkers, testable on cells in sputum, that indicate the presence or increased risk of lung cancer.

We previously demonstrated, through extensive fluorescence in situ hybridization (FISH) mapping studies of NSCLC and adjacent normal bronchial epithelial cells, consistently high levels of deletions in malignant cells and adjacent bronchial epithelial cells in genes located within 3p22.1 (containing the GC20 gene, RPL 14, CD39A, and PMGB) and 10q22.3 (containing the surfactant protein A gene, SP-A).5, 6, 7 These genetic deletions have also been observed in brush specimens of bronchial epithelial cells on the same side as the tumor (TBB), normal lung tissue (NTP) distal to the tumor, and in the main bronchus on the opposite side of the tumor (NBB).6, 7

This ‘so-called’ molecular field cancerization effect likely results from multiple clonal abnormalities arising within respiratory epithelial cells exposed to carcinogenic substances from tobacco smoke and other pollutants and reflects genetic predisposition for reduced DNA repair capacity in such patients.8, 9, 10, 11 The presence of concurrent cytologic atypia, especially moderate and severe dysplasia, in sputum cells is also believed to reflect this field effect and has been shown to be substantially associated with an increased risk of developing lung cancer.12 This observed field effect led us to hypothesize that sputum from a patient with lung cancer or at high risk for it would contain upper airway cells showing these molecular abnormalities, even if these cells showed minimal morphologic atypia.

Thus, the goals of our study were to evaluate the role of cytologic atypia and genetic abnormalities detected by FISH to predict the presence of lung cancer. We also wished to develop a quantitative model that could be used to predict an individual patient's risk for lung cancer.

Materials and methods

Study Design and Patient Population

In this prospective clinical trial, we evaluated sputum samples collected between April 1, 2004 and November 30, 2006, from consecutive patients with lung cancer and from age-matched healthy (nonsmokers) and high-risk (history of heavy smoking) control subjects. None of the participants had received prior radiotherapy or chemotherapy.

The University of Texas MD Anderson Cancer Center Investigational Review Board approved this study, and all study participants signed a consent form detailing the research methods.

Trained staff interviewers from MD Anderson Cancer Center administered an epidemiologic questionnaire to all study participants. Data collected included demographic characteristics and history of tobacco use.9 In the cancer patients who underwent resection, sputum samples were collected before surgery. All healthy and high-risk smoker control subjects had helical CT scans negative for the detection of lung cancer at the time of study entry and for the following 2 years.

All participants (except for the healthy control subjects, who underwent induced sputum production after saline inhalation with a nebulizer) were instructed to cough into a container filled with Saccomanno's fixative (90% alcohol, 5% acetic acid, and 5% polyethylene glycol) upon arising on 3 consecutive days. Mailed-in sputa were cytocentrifuged and filtered through gauze. The sediment was used for at least 10 preparations. Eight cytospin preparations were generated on positively charged glass slides using a Shandon Cytospin 2 Cytocentrifuge (Thermo Fisher Scientific Inc., Waltham, MA) and fixed in a 4:1 mixture of methanol and acetone. The remaining two preparations were fixed in 95% alcohol for Papanicolaou's staining.

Scoring of the two Papanicolaou-stained preparations was performed by a cytotechnologist and two senior cytopathologists, none of whom had knowledge of the patients' clinical history. Discrepant diagnoses were resolved by consensus over a multiheaded microscope by all three observers (CA, NPC, and RLK). Slides were screened and classified according to a seven-tiered scoring system as follows: negative, squamous metaplasia, mild dysplasia, moderate dysplasia, severe dysplasia, carcinoma, or insufficient for diagnosis. Cytologic classification was based on histologic criteria for squamous metaplasia, dysplasia, and carcinoma in situ according to the WHO classification.13 Slides were considered insufficient for diagnosis if they had zero to three histiocytes; excessive cellular degeneration; obscuring bacterial, fungal, or neutrophilic contamination; or a limited number of epithelial cells.

FISH Analysis

A two-color FISH assay using bacterial artificial chromosome probes for 3p22.1 and 10q22.3 (SP-A), combined with commercial centromeric probes for chromosomes 3 (cep 3) and 10 (cep 10; Vysis Inc., Downers Grove, IL) was performed on two separate slides. A bacterial artificial chromosome clone of 180 000 bp, located at 10q22.3 and confirmed to contain genomic sequences of SP-A comprising both SFTPA1 and SFTPA2 (RP11-506MI3; Invitrogen Corp., Carlsbad, CA), was isolated and labeled with Spectrum Green dUTP (Vysis).5 The chromosomal location of the clone was confirmed on a normal metaphase spread in combination with a centromeric 10 probe, directly fluorescence labeled with Spectrum Orange (Vysis).

The BAC clone for 3p22.1, containing almost 200 kb of genomic sequences and consisting of four known genes—RPL 14, CD39A, GC20 translation factor sui1 homolog, and PMGB—was obtained commercially from Invitrogen and grown in Escherichia coli. It was subsequently isolated, linearized, and labeled with Spectrum Green dUTP according to the manufacturer's directions. Localization of the bacterial artificial chromosome clone on chromosome 3 was confirmed by using normal metaphase FISH. One hundred nanograms of each labeled probe was mixed with an equal quantity of human Cot-1 DNA (Life Technologies, Rockville, MD) in 10 μl of LSI hybridization buffer (Vysis) and mounted on a slide together with 1 μl of either cep 3 (for 3p22.1) or cep 10 (for the SP-A gene on 10q22.3). Hybridization and post-washing were carried out as described previously.5 Counterstaining of nuclei was performed with 4′, 6-diamidino-2-phenylindole and evaluated under a fluorescence microscope equipped with the appropriate filter combinations. If the hybridization signals were deemed satisfactory, the slides were sent for automated FISH scanning.

Automated FISH Scanning

Slides were automatically scanned by the Duet™ system (BioView Ltd, Rehovot, Israel). The Duet is based on a fully automated microscope (Olympus BX61, Japan), a motorized 8-slide stage (Marzhauser, Wetzlar, Germany), and a 3CCD progressive scan color camera (JVC KYF75U, Japan). FISH scanning was performed using a × 60 oil objective under fluorescent illumination, with appropriate filters and a software program specifically designed to capture and analyze the orange and green fluorescent signals generated from our probes. While scanning, the system produces images of all captured cells that can be further reviewed by the operator.

Two slides were scanned for each sputum sample: one slide was hybridized to the 3p22.1 probe and its control centromere, and the second slide was hybridized to the 10q22.3 probe and centromere 10. At least 100 epithelial cells per slide were scanned and evaluated for FISH abnormalities. For each cell, the presence or absence of centromeric probes (orange signals) or locus-specific probes (green signals) was recorded. Cells were classified into four major subgroups: (a) Normal: displaying two centromeric and two locus-specific signals; (b) Deletion: displaying more centromeric than locus-specific signals (ie, two centromeric and one locus-specific signal, three and two, etc); (c) Polysomy: displaying multiple gains of both centromeric and locus-specific signals; (d) Monosomy: displaying only one centromeric signal.

Cells that could not be scored were discarded into an unclassified category.
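The classification rules above can be sketched in code. This is a minimal illustration only, not the logic of the scanning software: the function names and the order in which overlapping rules are checked (for example, treating any cell with a single centromeric signal as monosomy before testing for deletion) are assumptions made for this sketch.

```python
from collections import Counter


def classify_cell(cen: int, locus: int) -> str:
    """Assign one cell to a category from its signal counts.

    cen   = number of centromeric (orange) signals
    locus = number of locus-specific (green) signals
    The precedence of the rules is an assumption of this sketch.
    """
    if cen == 2 and locus == 2:
        return "normal"
    if cen == 1:
        return "monosomy"        # only one centromeric signal
    if cen >= 2 and cen > locus:
        return "deletion"        # e.g. 2 cen / 1 locus, or 3 cen / 2 locus
    if cen > 2 and locus > 2:
        return "polysomy"        # gains of both signal types
    return "unclassified"        # cannot be scored under these rules


def percent_by_category(cells):
    """Percentage of scored cells per category; unclassified cells are excluded."""
    counts = Counter(classify_cell(cen, locus) for cen, locus in cells)
    counts.pop("unclassified", None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in counts.items()}
```

For instance, `percent_by_category([(2, 2)] * 8 + [(2, 1)] * 2)` reports 80% normal cells and 20% deletions for a hypothetical ten-cell scan.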

Only cells that were clearly nonoverlapping and complete, with well-preserved nuclei, with high-quality fluorescence signals, and without background fluorescence were scored. At the end of each scan, the images of all cells that were scanned and analyzed were displayed and a pie chart summarizing the chromosomal abnormalities found in each scan was generated. Two experienced observers, who were trained on the system and blinded to the patients' clinical status, confirmed the automatic classification and reclassified cells that were misinterpreted by the system.

Normal controls consisted of pooled human lymphocytes that were hybridized and quantitated automatically in the same batches as the sputa for 3p22.1 and 10q22.3. Control values were expressed as the mean number of cells with summed abnormalities of 3p22.1, cep 3, 10q22.3, and cep 10 ± 1 s.d.

Statistical Analyses

Continuous variables were summarized by their mean, s.d., median, and range, and Wilcoxon's rank-sum test was used to assess differences in the distribution of genetic abnormalities between cancer patients and control subjects. For categorical variables such as sex, smoking history, cytologic diagnosis, and disease stage, Fisher's exact test was used to assess the association between the different variables and cancer status.
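As an illustration of the categorical test, a two-sided Fisher's exact test on a 2×2 table can be computed directly from the hypergeometric distribution. The sketch below uses only the Python standard library; the function name is hypothetical, and the counts are illustrative, taken at face value from the fractions quoted in the Results (27 of 35 cancer patients with squamous metaplasia or worse; 19 of 31 controls negative by cytology, leaving 12 with atypia). The result need not match the reported P-value, which may have been computed on the full multi-category table.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, total)


# Rows: cancer, control; columns: atypia present, cytology negative.
p_value = fisher_exact_two_sided(27, 8, 12, 19)
```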

Univariate and multivariate logistic regression models for estimating cancer status were also fitted. From the multivariate models, each individual's predicted probability of having cancer was estimated, and receiver operating characteristic (ROC) curves were produced.
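The area under an ROC curve has a simple interpretation: it is the probability that a randomly chosen cancer patient receives a higher predicted probability than a randomly chosen control subject, with ties counted as one half. A minimal sketch of that calculation follows; the function name and the scores in the usage note are hypothetical and are not study data.

```python
def roc_auc(case_scores, control_scores):
    """Area under the ROC curve by pairwise comparison.

    Each (case, control) pair scores 1 if the case has the higher
    predicted probability, 0.5 for a tie, and 0 otherwise.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))
```

With hypothetical scores `roc_auc([0.9, 0.4], [0.6, 0.2])`, three of the four case-control pairs are ordered correctly, giving an area of 0.75. This pairwise count is the same quantity that underlies the Wilcoxon rank-sum statistic.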

Results

Patient Population

A total of 71 patients were enrolled in the trial, but five were excluded because they had poor-quality sputum specimens that did not produce evaluable cells for FISH analysis. Of the 66 patients whose sputa could be evaluated, 35 had predominantly early-stage lung cancer and 31 were control patients, of whom 6 were healthy and 25 were at high risk for lung cancer because of their history of heavy smoking (Table 1). The patients with cancer had NSCLC, classified predominantly as adenocarcinoma and squamous cell carcinoma. Most of the patients had peripherally based tumors. Cytologic diagnoses of squamous metaplasia and mild, moderate, and severe dysplasia were significantly associated with cancer status (Table 2). Of 31 patients without cancer, 19/31 (70.37%) were negative by cytology; conversely, 27/35 (77%) patients with cancer had a cytologic diagnosis of squamous metaplasia or worse levels of atypia (P=0.009). A diagnosis of moderate or severe dysplasia was associated with cancer in 42% of cases, whereas only 4/31 (13%) cancer-negative high-risk smokers had moderate or severe dysplasia.

Table 1 Characteristics of the subject populations
Table 2 Association between cytologic diagnosis on sputum and cancer†

Hybridization Efficiency and Cutoff Values for FISH

Each gene-specific probe was confirmed on metaphases from normal lymphocytes to hybridize to the appropriate centromeric and locus-specific regions for cen 3, 3p22.1, cen 10, and 10q22.3. Cancer patients had an average of 183 cells (median, 131; range, 49–589) for 3p22.1 and 158 cells (median, 139; range, 35–455) for 10q22-23 evaluated. Control subjects had an average of 204 cells for 3p22.1 (median, 183; range, 40–673) and 170 for 10q22-23 (median, 189; range, 13–474) evaluated.

Similarly, diploid signals were noted for each probe in interphase nuclei from five batches of normal lymphocytes. The mean (±s.d.) deletion and polysomy value for 10q22.3 was 1.14±0.59, and that for deletions and polysomies of cen 3 and 3p22.1 was 3.02±1.73. For all chromosomal abnormalities of 3 and 10, the mean and s.d. were 4.91±2.50.

Comparison of Patient Characteristics, Genetic Changes in Sputum and Cancer Status

Neither age nor smoking history in pack years differed significantly by cancer status (Table 1). There were, however, significant differences in the percentages of chromosomal abnormalities in epithelial cells in relation to the patients' cancer status (Table 2). Significantly more abnormalities in epithelial cells of 3p, deletions of 10q, all abnormalities of 3, deletions and polysomies of 3p, all abnormalities of 10, and all 3 and 10 abnormalities (P-values <0.018, <0.013, <0.033, <0.026, <0.018, and <0.008, respectively) were present in the cancer patients than in the control subjects (Figures 1, 2, 3 and 4).

Figure 1

Sputum sample from a healthy control patient, a 52-year-old never smoker, showing (a and b) squamous epithelial cells with diploid signals for 3p22.1 (green) and centromeric 3 (orange) and (c and d) SP-A gene (green) and centromeric 10 (red). (e and f) Normal squamous cells in sputum with unremarkable nuclear features and thin homogenous cytoplasm (Papanicolaou's stain; original magnification × 400). Total FISH abnormalities of 3p22.1, 3, 10q22-23, and centromeric 10 were 6.7. Probability of cancer risk=0.064.

Figure 2

Sputum specimen from a high-risk 61-year-old man with 100 pack years of smoking history and a CT scan negative for lung cancer. FISH assay shows (a) deletion of 3p22.1 green signal (one green signal) relative to centromeric 3 (two red signals). Similarly, (b) depicts loss of green signal at 10q22.3 (SP-A) relative to two red signals of centromeric 10 (oil immersion objective, original magnification × 630). The total number of chromosomal abnormalities of chromosomes 3p22.1, centromeric 3 and 10, and 10q22.3 was 15.5. (c and d) Cytologic examination shows cells of (c) moderate to severe dysplasia with hyperchromatic irregular nuclei and keratinized thickened eosinophilic cytoplasm and (d) atypical squamous metaplasia (Papanicolaou's stain; original magnification × 400). Probability of cancer risk=0.844.

Figure 3

Sputum specimen from a 67-year-old woman nonsmoker with a family history of lung cancer, who had stage III adenocarcinoma (peripheral lesion). FISH analysis showed (a and b) deletions of 3p22.1 as indicated by the loss of one green signal relative to two red centromeric 3 signals. Total deletion of 10q22.3 was 5.69%, centromeric 10 monosomy was 2.43%, and deletion of 3p22.1 was 6.79%. The total number of FISH abnormalities involving chromosomes 3 and 10 was 15.89. Cytologic examination showed (c–e) extensive moderately and severely dysplastic cells (arrows; Papanicolaou's stain; original magnification × 400). This patient died with metastatic disease 24 months later. Probability of cancer risk=0.817.

Figure 4

Sputum sample from a 63-year-old man with a history of 50 pack years of smoking and no evidence of cancer shows (a and b) cells of moderate and severe dysplasia (arrows) on cytologic examination (Papanicolaou's stain; original magnification × 400). FISH showed deletions of 3p22.1 (c) and 10q22.3 (d) as indicated by the loss of one green signal relative to two orange signals. 3p22.1 deletions were 5.63% and 10q22.3 deletions were 3.78%. Total FISH abnormalities of chromosomes 3 and 10 were 9.4. Probability of cancer risk=0.830.

In a logistic regression model estimating cancer status (Table 3), the most significant parameters were age; deletions of 3p22.1 and 10q22.3; a variety of abnormalities of 3p22.1, 10q22.3, and chromosomes 3 and 10; and both squamous metaplasia and moderate dysplasia vs negative cytologic results. Neither sex nor smoking history (including pack years, current smoker, former smoker, or never smoker) correlated with cancer status.

Table 3 Univariate logistic regression model for diagnosis of cancer

The multivariate logistic regression model to estimate cancer status (Table 4) selected six variables (two genetic and four cytologic) as the most predictive parameters, with the P-value cutoff for variable selection set at 0.10. The variable with the highest odds ratio (OR) was moderate dysplasia (OR 17.96), followed among the cytologic variables by squamous metaplasia (OR 14.84), severe dysplasia (OR 5.39), and mild dysplasia (OR 3.63); the two genetic variables were deletion and polysomy of 10q22.3 greater or less than 2 (OR 4.38) and deletions and polysomies of 3p22.1 and cen 3 greater or less than 5 (OR 3.01). The ROC curve (Figure 5), using a cutoff point of estimated P=0.004, showed the area under the curve to be 0.822 when using both the cytologic and FISH parameters. Using only the selected FISH variables of deletions of 10q and 3 and 3p abnormalities resulted in an area under the ROC curve of 0.682 (P=0.065), whereas using only the cytologic diagnosis resulted in an area under the ROC curve of 0.742 (P=0.040) (Figure 6).
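To show how odds ratios from a logistic model translate into a probability, the sketch below multiplies a baseline odds by the OR of each factor that is present, which is how a logistic model without interaction terms combines predictors. The baseline probability used in the example is hypothetical, because the model intercept is not given in the text; the function names are likewise illustrative, and only the OR of 17.96 comes from the paragraph above.

```python
import math


def apply_odds_ratios(baseline_prob: float, odds_ratios) -> float:
    """Probability after multiplying the baseline odds by each odds ratio."""
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must lie strictly between 0 and 1")
    odds = baseline_prob / (1.0 - baseline_prob)
    for ratio in odds_ratios:
        odds *= ratio
    return odds / (1.0 + odds)


def coefficient(odds_ratio: float) -> float:
    """Logistic regression coefficient corresponding to an odds ratio."""
    return math.log(odds_ratio)


# Hypothetical baseline probability of 0.10 (not a study value),
# combined with the reported OR for moderate dysplasia.
example = apply_odds_ratios(0.10, [17.96])  # about 0.67
```

Because the baseline is assumed, the resulting figure illustrates the arithmetic only and is not an estimate for any study participant.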

Table 4 Multivariate logistic regression model for diagnosis of cancer
Figure 5

ROC curve based on FISH variables and cytology diagnosis.

Figure 6

ROC curves based on different models.

If just moderate and severe dysplasia were considered to be the predictors of the presence of lung cancer, then the sensitivity and specificity of the sputum cytology test were 32% and 87%, respectively. If only severe dysplasia was considered, the sensitivity of sputum cytology was 11% and the specificity, 99%.
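Sensitivity and specificity follow directly from the counts of true and false positives and negatives. In the sketch below, the control counts come from the text (4 of 31 subjects without cancer had moderate or severe dysplasia, so 27 were test negative); the split of the 35 cancer patients into 11 positive and 24 negative is a hypothetical illustration, since the exact count is not stated. The function name is also an assumption.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of cancer patients testing positive
    specificity = TN / (TN + FP): fraction of cancer-free subjects testing negative
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("each group must contain at least one subject")
    return tp / (tp + fn), tn / (tn + fp)


# tp and fn are hypothetical; tn and fp are taken from the text.
sens, spec = sensitivity_specificity(tp=11, fn=24, tn=27, fp=4)
```

With these inputs the specificity is 27/31, which rounds to the 87% quoted above.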

Table 5a shows the risk score (low, intermediate, or high) based on both cytology and FISH variables.

Table 5a Risk score of cancer by FISH and cytology score5

Table 5b (Supplementary Information) shows the actual probability for the presence of cancer in the sputum for each study participant using a combination of genetic variables and cytologic diagnosis. Assuming a cutoff of P>0.600 to indicate high risk for cancer, six of the high-risk control subjects appear to be at high risk for developing cancer compared to 21 of the cancer patients. Similarly, there were two patients with cancer who had low probability scores of 2 or <2.
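Applying the cutoff is a one-line comparison. The sketch below flags a predicted probability as high risk when it exceeds 0.600, and applies that rule to the four probabilities quoted in the legends of Figures 1 to 4. The function and variable names are illustrative; no threshold for the low or intermediate categories is assumed, because the text does not state one.

```python
HIGH_RISK_CUTOFF = 0.600  # cutoff stated in the text


def is_high_risk(probability: float, cutoff: float = HIGH_RISK_CUTOFF) -> bool:
    """True when the predicted probability of cancer exceeds the cutoff."""
    return probability > cutoff


# Probabilities quoted in the figure legends.
figure_probabilities = {
    "Figure 1": 0.064,  # healthy never smoker
    "Figure 2": 0.844,  # high-risk smoker, CT negative
    "Figure 3": 0.817,  # stage III adenocarcinoma
    "Figure 4": 0.830,  # high-risk smoker, no evidence of cancer
}
high_risk_flags = {
    name: is_high_risk(p) for name, p in figure_probabilities.items()
}
```

Under this rule the healthy control in Figure 1 is not flagged, whereas the subjects in Figures 2, 3, and 4 are.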

Table 5b Estimates of probability of cancer for each patient by sputum evaluation of genetic and cytologic variables

Discussion

Because of CT's high sensitivity and lack of specificity, it would be desirable to develop a minimally invasive test for genetic susceptibility that may assist in identifying those individuals at the highest risk for developing lung cancer. However, conventional cytologic sputum screening lacks sensitivity for various reasons, including its inability to detect small atypical squamous cells, the fact that abnormal cells may not be shed from peripherally based lesions, the patient's inability to produce an adequate cough specimen, and contamination of the specimen by neutrophils and bacteria.

The uninduced sputum from healthy subjects contains for the most part squamous cells from oral contamination and upper airways, a few alveolar macrophages, and a few bronchial epithelial cells.14 In patients who have smoked and are at risk of developing bronchogenic carcinoma, the precursor lesions are squamous metaplasia, squamous dysplasia, and carcinoma in situ that occur in the bronchi and large airways. Sputum cytology is the only noninvasive way to assess these lesions. The sputum classification schemes consist of gradations of microscopic abnormalities similar to those observed in histologic sections from lower airways of smokers. The cytologic manifestations of dysplasia occur as increasingly severe cellular changes involving cell size, nuclear size, chromatin density, and nuclear membrane irregularity.13

Molecular changes, including LOH at one or more 3p regions and at 9p21, occur early in neoplastic development and may be found in histologically normal epithelium.15 Like the molecular abnormalities of 3p22.1 and SP-A, they are manifestations of a field cancerization effect.6, 7

Recently, Wang et al16 detailed four types of bronchial epithelial dysplasia (basal cell dysplasia, columnar cell dysplasia, bronchial epithelial dysplasia with transitional differentiation, and squamous dysplasia) that occurred throughout the bronchial tree in patients who had their lungs resected for bronchogenic cancer. For both squamous and adenocarcinomas, there existed large numbers of various types of bronchial epithelial dysplasias regardless of the histological subtype of the bronchogenic carcinoma. Hence, it is not surprising that sputum from patients with adenocarcinoma should manifest cellular atypias in the form of squamous metaplasia, mild, moderate, and severe dysplasia, representative of the field effect of atypias secondary to tobacco usage, passive smoking, or industrial or asbestos exposure.15

In a recent prospective study of smokers with COPD, risk for incident lung cancer was increased among those with sputum cytologic atypia graded as moderate or worse.17 Because of the weaker association between adenocarcinoma and sputum atypia, it was suggested that other biomarkers in sputum such as FISH assays of chromosomal changes may be complementary to diagnosis.17

The findings in this study showing squamous metaplasia and all degrees of dysplasia to be present in patients with lung cancer regardless of histologic subtype are consistent with the findings of recent investigators.12, 16, 17

If only the abnormalities detected on FISH analysis were used, the sensitivity was higher but the specificity was lower. However, a combination of FISH and sputum cytology resulted in a considerably higher sensitivity rate than obtained with morphology alone. We concluded that subjects in a high-risk group, with high probability scores derived from cytologic and FISH analyses according to our model, should undergo CT scanning. If the CT findings were negative, these subjects would be ideal candidates to undergo fluorescence bronchoscopy to exclude the presence of central airway pre-invasive malignant lesions.

In this study, there were six high-risk patients without CT evidence of lung cancer who might qualify for bronchoscopy based on their probability scores. The finding of a low probability score in several of the lung cancer patients no doubt reflected an inadequate sputum sample. In the future, use of induced sputa and stricter criteria for adequacy, such as the presence of bronchial epithelial cells and a greater number of histiocytes, may improve the accuracy of the test.

The results of this study validated our approach to measuring and quantitating molecular abnormalities in consecutive fields of epithelial cells that were not necessarily cytologically abnormal. The presence of chromosomal abnormalities in both normal and atypical epithelial cells exfoliated in sputum confirmed our previous findings that cellular genetic abnormalities of 3p22.1 and 10q22.3 reflect a field cancerization effect within the bronchial cells of individuals at high risk for developing cancer.6, 7

Other investigators tested epithelial cells in sputum by FISH using a commercial probe set for four different chromosomal regions (5p15, 6p11-q11, 7p12 (including epidermal growth factor receptor), and 8q24 (including C-myc)) and required a positive sputum diagnosis to be based on DNA copy number gains for at least two probes in a minimum of two or three cells.18, 19 When using this approach, the sensitivity (50%) and specificity (81%) of FISH did not exceed the sensitivity of sputum cytology to detect lung cancer.18 Furthermore, heavy tobacco smokers and asbestos-exposed workers had FISH results similar to those seen in never smokers, suggesting that the composition of these probes was not optimal for detecting early lung cancer in high-risk populations.18 In contrast, others reported that FISH combined with cytology led to an improved diagnosis of malignancy.20 In a recent study, combined genetic aberrations for genes HYAL2 and FHIT were found with FISH in 76% of sputa from patients with cancer, but in only 47% of cases that were considered positive on cytology, demonstrating that with an appropriate choice of probes, FISH can detect abnormal cells that may be undetectable by cytology.

Our sputum probe set of 3p22.1 and SP-A was selected on the basis of results of high-resolution comparative genomic hybridization analysis of cDNA microarrays in adenocarcinomas and squamous cell carcinomas that showed significantly higher levels of these deletions, relative to those in normal human bronchial epithelial cells, in almost all tumors tested.21 These probes were subsequently tested by FISH in adenocarcinomas and squamous cell carcinomas and found to correlate significantly with the results of the comparative genomic hybridization.21

Deletions of SP-A are frequent in lung cancer cells as well as in adjacent bronchi, normal lung, and bronchial cells from main stem bronchi on the normal and tumor sides.5, 6, 7 Deletions of SP-A in lung cancer cells are associated with polysomy of the genes for centromeric 10/10q22.3, EGFR, 5p15, 6p11-q11, and C-MYC, which are genes that drive cell cycle and proliferation.22 SP-A deletions also correlated with lymph node metastases, higher stage, and poor prognosis.5, 22

Chromosome 3p deletion is currently the most common finding in lung cancer. Allelic losses at one or more chromosome 3p21.3 loci are the most frequent chromosomal abnormalities detected in the bronchial epithelia of smokers and are detected even in normal bronchial mucosae of smokers.15

The use of FISH for diagnostic purposes has increased considerably in the last few years, primarily because FISH permits visualization and examination of genetic aberrations as rare events in a large number of cells that may have normal genetic composition. FISH is ideally suited for cytologic specimens such as sputum, which may be obtained spontaneously by coughing or induced by inhalation of nebulized saline. The major value of sputum biomarkers is to identify patients at high risk for cancer-related events, such as the development of premalignant lesions or early cancers, so that these patients may be subject to intense surveillance either by fluorescent bronchoscopic examination with removal of neoplastic lesions23, 24 or by regular helical CT scanning of lungs to detect peripheral carcinomas. Additionally, this is an ideal population to benefit from the use of chemopreventive agents and smoking-cessation counseling.

In summary, we used an automated quantitative system to score FISH abnormalities in epithelial cells from noninduced sputum specimens, which resulted in a gallery of cells that could then be interactively classified in conjunction with morphologic findings. Software programs specific to the sputum application and the size of the probes and specific filter sets were used to maximize the accuracy of the testing. Correlating our sputum findings with disease state per individual, we discovered that epithelial cells in sputum from patients with NSCLC were cytologically and genetically abnormal relative to those from a high-risk control group with no CT evidence of lung cancer as well as healthy controls. On the basis of the results of our model for risk of lung cancer, which should be validated in a larger study, we conclude that the best predictive sputum assay for lung cancer will be a combination of morphologic characteristics and molecular abnormalities in both atypical cells and morphologically normal cells.