Main

Bile duct brushings are often used as the initial investigative pathology test for pancreatobiliary tract lesions as they widely sample the bile duct and have a low complication rate. Unfortunately, bile duct brushings are some of the most challenging cytologic specimens to evaluate due in part to poor specimen quality and quantity, as well as frequent ulceration, inflammation, and stent-related atypia that may make the distinction of benign or reactive biliary epithelium from neoplasms particularly challenging.1 Additionally, the high frequency of deceptively benign-appearing carcinomas in this site accounts for a significant number of false-negative results.2 As a result of these factors cytological diagnosis of malignancy on bile duct brushing is notoriously limited with low sensitivity rates (6–64%; mean 42%) and negative predictive value,3, 4, 5, 6, 7, 8, 9, 10, 11, 12 despite high test specificity (98–100%).2, 13, 14

Because of newly improved imaging modalities, the use of bile duct brushings is steadily increasing, and cytologic abnormalities are being identified in these samples even before a mass is visible. Early diagnosis of pancreatobiliary tract carcinoma ensures early treatment, while for unresectable tumors accurate diagnosis preempts unnecessary surgery and ensures appropriate chemo-radiation and/or palliative therapy.

Several studies have attempted to identify definite cytologic criteria that can better predict malignancy in bile duct brushings.1, 5, 14, 15, 16, 17 In 1995, Cohen et al were among the first to use cytologic criteria to improve diagnostic accuracy.13 Their analysis showed that nuclear molding, chromatin clumping and increased nuclear-cytoplasmic ratio were frequently associated with malignancy (deemed ‘primary Iowa criteria’). If all primary criteria were not present, secondary criteria (anisonucleosis, nuclear irregularity/grooves, and nuclear enlargement) were used. Presence of 2/3 primary criteria resulted in 83% sensitivity and 98% specificity for carcinoma detection. In 1998 Renshaw et al similarly showed that nuclear molding, chromatin clumping and loss of polarity were associated with malignancy (deemed the ‘Boston criteria’).17 Henke et al were later able to successfully apply the Iowa criteria to liquid-based specimens.18 Others have since examined different cytologic criteria to diagnose cholangiocarcinoma on bile duct brushings, with variable success.1, 2, 3, 4, 5, 6, 7, 8, 9 Barr Fritcher et al recently examined 16 cytologic criteria associated with malignancy in pancreatobiliary brushings with corresponding positive fluorescence in situ hybridization (FISH), and found that abnormal single cells, nuclear membrane irregularity and enlargement were independently associated with malignancy.14

Most major previous studies typically involved 2–4 reviewers,14, 17, 19, 20 some of them cytopathology experts, thus their observations may not be translatable to daily practice where bile duct brushings are frequently evaluated by general surgical pathologists with limited cytopathology or gastrointestinal pathology experience. Additionally, in some studies, patients being alive after 6 months was considered evidence of benignity of the lesion,21 although findings in more recent observations in bile duct carcinoma suggest that this is too short a time frame in which to exclude malignancy.22, 23

In this study, we investigated the specific cytologic criteria that lead pathologists from various backgrounds (but who are involved in the sign out of these specimens) to the accurate diagnosis of malignancy in 60 BDB samples with >18 months follow-up or definitive tissue diagnosis of malignancy.

Materials and methods

Case Selection

After approval by our institution’s Review Board, a computer-based search of the Emory University Pathology Department’s archives identified 444 bile duct brushings collected over a 16-year period (2000–2015). Of these 444 bile duct brushings, 253 fulfilled the follow-up criteria (see below), and from these 30 malignant and 30 benign samples were randomly selected by a non-reviewing author (EH) for inclusion in the study. After consultation with our statisticians, a sample size of 60 was determined based on the ability to detect a proportion (sensitivity or specificity) of 80% with a 95% confidence interval from 70–90%.24 All 30 malignant cases had a definitive histologic diagnosis of adenocarcinoma, either by biopsy (n=6) or resection (n=24), which was verified by the authors. All 30 benign cases had 18-plus months of uneventful follow-up, 15 of them with prior stent placement and 15 without. Patient demographics, imaging details, cytologic and histologic diagnoses, and clinical follow-up were collected. Follow-up histologic biopsies, resections and other FNAs of primary and/or metastatic tumors were reviewed in all cases.
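As a rough check of the sample-size statement above, the sketch below computes a normal-approximation (Wald) 95% confidence interval for a proportion of 80% estimated from 60 samples. The choice of the Wald interval and the function name `wald_ci` are assumptions made for illustration; the text does not state which method the statisticians used.

```python
import math


def wald_ci(p, n, z=1.959964):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Inputs taken from the text: expected proportion 0.80, 60 samples.
# The interval method itself is an assumption.
low, high = wald_ci(0.80, 60)
print(f"95% CI: {low:.2f} to {high:.2f}")
```

Under these assumptions the interval is approximately 70% to 90%, which agrees with the stated target width.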

All bile duct brushings had been obtained during endoscopic retrograde cholangiopancreatography (ERCP). These were directly smeared and stained with Papanicolaou stain or collected in ethanol or liquid-based cytology preservative and ThinPrep smears were prepared and stained with Papanicolaou stain. Where possible, cell blocks were also prepared from paraffin embedded material.

Cytopathologic Criteria

Based on the results of prior studies and the authors’ personal pathology experience,1, 2, 3, 4, 5, 9 the presence or absence of 14 malignant characteristics was evaluated. These included (1) hypercellularity, (2) 3-dimensional clusters, (3) cellular discohesion characterized by single cells with high (>50%) nuclear to cytoplasmic ratio, (4) high nuclear to cytoplasmic ratio (>50%), (5) nuclear molding or hugging, (6) cytoplasmic mucin vacuoles, (7) 2-cell population, (8) nuclear chromatin changes (hypochromasia or hyperchromasia), (9) nuclear membrane irregularity, (10) large prominent nucleoli, (11) nuclear pleomorphism, (12) necrosis (single cell or background), (13) abnormal mitoses, and (14) infiltrating inflammation within epithelial cells (Figures 1, 2, 3, 4, 5, 6, and 7). High nuclear to cytoplasmic ratio (>50%) in malignant cells was defined as a nuclear to cytoplasmic ratio that was greater than 2:1 or in which the nucleus constituted more than 50% of the entire cell volume.

Figure 1

(a,b) Three-dimensional clusters of malignant cells with high nuclear to cytoplasmic ratio, hyperchromasia, nuclear contour irregularity and prominent nucleoli. Note the clinging tumor diathesis in (b) (Papanicolaou stain, magnification × 200).

Figure 2

(a,b) Clusters of malignant cells with marked nuclear pleomorphism, 4-fold anisonucleosis, high nuclear to cytoplasmic ratio and hyperchromasia (Papanicolaou stain, magnification × 200 (a) and × 400 (b)).

Figure 3

(a,b) These malignant cells range from hyperchromatic (a) to hypochromatic (b) and have large circumscribed intracytoplasmic mucin vacuoles that focally displace their nuclei eccentrically (Papanicolaou stain, magnification × 200).

Figure 4

(a,b) These malignant cells are crowded and overlapping with high nuclear to cytoplasmic ratio, marked nuclear contour irregularity and change in chromatin pattern, with hyperchromasia (in a) and hypochromasia (in b) as well as focal cellular dissociation and intracytoplasmic mucin vacuoles (in a). Prominent red nucleoli are also present in (b) (Papanicolaou stains, magnification × 400).

Figure 5

(a,b) This example of a 2-cell population shows a flat honeycomb sheet of evenly distributed benign columnar cells with low nuclear to cytoplasmic ratio and even chromatin distribution next to a crowded cluster of malignant cells with marked nuclear crowding and overlapping, high nuclear to cytoplasmic ratios and hyperchromasia (Papanicolaou stains, magnification × 200 (a) and magnification × 400 (b)).

Figure 6

(a) Cellular dissociation of single intact malignant cells (Papanicolaou stain, magnification × 200). (b,c) Single malignant cells with high nuclear to cytoplasmic ratio (>50%) are shown in these examples (Papanicolaou stains, magnification × 400).

Figure 7

(a) Drunken honeycomb sheet of crowded malignant cells with hypochromasia, irregular nuclei, prominent nucleoli and infiltrating neutrophils (Papanicolaou stain, magnification × 200). (b) This loose cluster of malignant cells shows marked nuclear irregularity, hyperchromasia, prominent nucleoli and single cell necrosis (Papanicolaou stain, magnification × 200). (c,d) These 3-dimensional cell clusters show prominent nuclear molding/hugging (Papanicolaou stain, magnification × 200 (c) and magnification × 400 (d)).

Review

Cytology slides from the 60 bile duct brushings were de-identified and given unique identification numbers known only to one study participant (EH) who was not a reviewer in the study. A log form with 14 established malignant cytologic criteria was designed to facilitate the 7 blinded reviewers’ entry of their cytologic diagnoses. The reviewers were instructed to render a diagnosis of benign vs malignant for each case, and to document the presence or absence of the 14 criteria in each case. The 7 reviewers included 3 fellowship-trained cytopathologists (MDR, KH, UK; reviewers 1, 2, and 6), 2 surgical pathologists with pancreatobiliary pathology expertise (one gastrointestinal pathology fellowship-trained with limited cytology experience (AK; reviewer 3) and the other an oncology fellowship-trained pathologist with expertise in pancreatobiliary pathology who routinely signed out pancreatobiliary cytology (VoA; reviewer 4)), 1 general surgical and genitourinary pathology-trained pathologist with sign out responsibility in cytopathology (AOO; reviewer 5) and 1 cytopathology fellow with 3 months of training (VaA; reviewer 7). Reviewers were blinded to all clinical and radiologic information, histologic diagnoses, and patient outcome, as well as to other reviewers’ diagnoses. All 60 BDB samples were then blindly reviewed, and the corresponding log forms were completed and ultimately tabulated.

Statistical Analysis

The utilization of the 14 characteristics in the accurate cytologic diagnosis by the 7 reviewers was analyzed through different approaches. To be able to perform the statistical analysis, our expert statisticians (MG, LD, and AF) determined that a ‘gold standard’ would need to be developed to determine the presence or absence of the 14 characteristics. In this manuscript, we present results using three different reviewer groups as ‘gold standard’. In the primary analysis, agreement by two of three cytopathologists (MDR, KH, UK) was used as the gold standard for the parameter validation, as all three cytopathologists were board certified and had comparable years of experience (10, 7, and 11 years, respectively). There were two samples where one of the cytopathologists reported the sample as non-diagnostic and did not review the criteria. In these samples, agreement was based on 2/2 reviewers; if the reviewers did not agree, the characteristic was considered inconclusive and the sample was excluded from analysis of that particular characteristic. Separately, ‘gold standards’ based on agreement between any 4 of 7 reviewers (independent of cytopathology training and experience) and on agreement between 7 of 7 reviewers were also determined. Agreement between 7 of 7 reviewers reflects the most stringent analysis because a characteristic is only considered present if all 7 reviewers identified it and absent if <7 reviewers identified it.
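The three agreement rules described above can be sketched as follows. This is a minimal illustration under the stated rules, not the authors' SAS code; the function names `cytopathologist_consensus` and `reviewer_consensus` are hypothetical.

```python
def cytopathologist_consensus(votes):
    """Primary gold standard: agreement by two of three cytopathologists.

    votes holds one boolean per cytopathologist who assessed the sample.
    Returns True/False, or None when only two assessed it and they disagree
    (the characteristic is then inconclusive for that sample).
    """
    if len(votes) == 3:
        return sum(votes) >= 2
    if len(votes) == 2:
        return votes[0] if votes[0] == votes[1] else None
    raise ValueError("expected two or three cytopathologist assessments")


def reviewer_consensus(votes, required):
    """Alternative gold standards: present if at least `required` reviewers agree."""
    return sum(votes) >= required


print(cytopathologist_consensus([True, False, True]))            # present
print(cytopathologist_consensus([True, False]))                  # inconclusive
print(reviewer_consensus([True] * 4 + [False] * 3, required=4))  # 4 of 7 rule
print(reviewer_consensus([True] * 6 + [False], required=7))      # 7 of 7 rule
```

With `required=7` a single dissenting reviewer marks the characteristic as absent, which is why the 7 of 7 rule is the most stringent of the three.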

A score based on the number of malignant characteristics present was calculated for each BDB sample and analyzed as a continuous variable. Sensitivity, specificity, and accuracy were evaluated for each cut point of this continuous variable to determine the cut point that maximized these proportions. Once the parameters with strong association were established, the potential value of combinations of parameters was also tested using a backwards elimination approach, which reduced the model containing all characteristics to the number of characteristics indicated by the best performing cut point, to determine whether a specific set of fewer variables could be used to determine a diagnosis. P-values for differences between the malignant and benign groups were calculated using chi-square tests. Logistic regression was used to calculate odds ratios for individual characteristics and for malignancy score. All statistical tests used a P-value of <0.05 to determine significance. Analyses were conducted using SAS, version 9.4 (SAS Institute, Cary, NC, USA).
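The cut point evaluation described above can be illustrated with a short sketch. The scores and labels below are hypothetical toy data, not study data, and `metrics_at_cutoff` is an illustrative name; the analysis itself was run in SAS.

```python
def metrics_at_cutoff(scores, is_malignant, cutoff):
    """Sensitivity, specificity and accuracy when score >= cutoff is called malignant."""
    tp = fp = tn = fn = 0
    for score, malignant in zip(scores, is_malignant):
        called_malignant = score >= cutoff
        if malignant and called_malignant:
            tp += 1
        elif malignant:
            fn += 1
        elif called_malignant:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Hypothetical toy data (5 malignant, 5 benign); not taken from the study.
toy_scores = [0, 1, 3, 5, 8, 0, 0, 1, 2, 4]
toy_labels = [True] * 5 + [False] * 5

for cutoff in range(1, 6):
    sens, spec, acc = metrics_at_cutoff(toy_scores, toy_labels, cutoff)
    print(f"cut point >= {cutoff}: sensitivity {sens:.2f}, "
          f"specificity {spec:.2f}, accuracy {acc:.2f}")
```

Scanning the cut points in this way shows the expected trade-off: raising the threshold tends to increase specificity while lowering sensitivity.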

Results

Cytologic Findings Using Agreement by 2/3 Cytopathologists as the Gold Standard

Of the 14 malignant characteristics, only 11 were statistically significantly associated with malignancy while two were not (inflammation (P=0.4237) and necrosis (P=0.7188)). Mitoses could not be evaluated because there was no agreement between 2 or more cytopathologists on the presence of this characteristic in any of the 60 bile duct brushings (Table 1). Among all malignant cases the most frequent characteristics (from greatest to least) were change in chromatin pattern (hypo/hyperchromasia) (70%), nuclear irregularity (67%), pleomorphism (62%), 2-cell population (57%), 3-dimensional clusters (52%), high nuclear to cytoplasmic (>50%) ratio (48%), cytoplasmic mucin vacuoles (43%), inflammation (43%), cellular discohesion of cells with high N/C ratio (38%), hypercellularity (23%), prominent nucleoli (21%), and necrosis (17%).

Table 1 Prevalence of malignant characteristics in benign and malignant bile duct brushings

Among benign cases, the frequency of malignant characteristics (from greatest to least) was infiltrating inflammation (33%), nuclear irregularity (13%), necrosis (13%), cytoplasmic mucin vacuoles (13%), 3-dimensional clusters (3%), cellular discohesion (3%), high N/C (3%), and pleomorphism (3%). There was no agreement between 2 or more cytopathologists on the presence of hypercellularity, nuclear molding, 2-cell population and prominent nucleoli in any of the benign cases.

Interestingly, among all 60 bile duct brushings, ‘necrosis’ was recorded with similar frequency in malignant and benign cases (17% vs 13%, P=0.7188). Inflammation was present in 33% of benign bile duct brushings and was more frequent in those without stents (40% vs 27%). The prevalence of malignant characteristics in all malignant and all benign bile duct brushings as analyzed by the cytopathologists is summarized in Figure 8.

Figure 8

Using agreement by 2 of 3 cytopathologists as the initial gold standard 11 of 14 cytologic characteristics emerged as being statistically significantly more prevalent in malignant vs benign bile duct brushings, except for necrosis, abnormal mitoses and inflammation (red arrows) which showed no significant difference between benign and malignant cases.

Cytologic Findings Using Agreement by 4/7 Reviewers as the Gold Standard

When agreement between 4/7 reviewers was used as the gold standard, 11/14 malignant characteristics were statistically significantly associated with a malignant diagnosis while two characteristics (inflammation (P=0.4051) and nuclear molding/hugging (P=0.1503)) were not. Similar to the previous analysis, mitoses could not be evaluated. The prevalence of malignant characteristics in malignant and benign (stented and non-stented) bile duct brushings is summarized in Table 1.

Cytologic Findings Using Agreement by all 7 Reviewers as the Gold Standard

When agreement between all 7 reviewers was considered the gold standard, 4 malignant characteristics were statistically significantly associated with a malignant diagnosis (hypercellularity (P=0.0384), change in chromatin pattern (P=0.0384), nuclear irregularity (P=0.0384), pleomorphism (P=0.0384)) (Table 1). Nuclear molding or hugging, necrosis, and mitoses could not be evaluated because there were no samples where all 7 reviewers agreed on the presence of these characteristics.

Odds of Malignancy Based on Presence of Each Cytologic Characteristic

Using agreement by 2/3 cytopathologists, odds ratios could not be calculated for 5 of the 14 characteristics because they were not present in both benign and malignant cases. When odds ratios were calculated for the remaining 9 malignant characteristics, all were >1.00 (range 1.30–47.46), indicating increased odds of malignancy if that specific characteristic was identified in a sample (Table 2). Nuclear pleomorphism had the highest odds ratio (OR): 47.46; 95% confidence interval (95% CI): 5.64, 399.29. The odds ratios for inflammation and necrosis were not statistically significant.
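To make the odds ratio for nuclear pleomorphism concrete, the sketch below computes an odds ratio and a Wald 95% confidence interval from a 2×2 table. The counts (18 of 29 evaluable malignant samples and 1 of 30 benign samples showing pleomorphism) are an assumption chosen to be consistent with the reported prevalences of 62% and 3%; the 2×2 table itself is not given in this section. For a single binary predictor, this calculation gives the same estimate as a univariate logistic regression.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.959964):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: malignant with feature, b: malignant without feature,
    c: benign with feature,    d: benign without feature.
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(estimate)
    return estimate, math.exp(log_or - z * se_log), math.exp(log_or + z * se_log)


# Assumed counts (not stated in the text): 18/29 malignant, 1/30 benign.
estimate, low, high = odds_ratio_wald(18, 11, 1, 29)
print(f"OR {estimate:.2f} (95% CI {low:.2f}, {high:.2f})")
```

Under these assumed counts the result is an odds ratio of about 47.5 with an interval of roughly 5.6 to 399, in line with the reported values. The very wide interval reflects the single benign sample in which pleomorphism was recorded.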

Table 2 Most helpful characteristics based on greatest odds of malignancy

Prevalence Score of Malignant Characteristics in Malignant and Benign Bile Duct Brushings

When each malignant characteristic that was present was given a numerical weight of ‘1’ point, based on agreement between 2/3 cytopathologists, the optimum number of characteristics needed to achieve an accurate diagnosis was investigated and was found to be ‘≥3’ (Table 3). In this analysis, characteristics that were not significantly associated with a malignant diagnosis (inflammation, necrosis and mitoses) were excluded so the score had a possible range of 0–11. Additionally, when various score cut points were evaluated, the ‘≤2 compared with ≥3’ malignant characteristics cut point resulted in the best combination of sensitivity (70%), specificity (97%), and accuracy (83%); above that cut point (as the number of characteristics increased) overall specificity increased but accuracy and sensitivity decreased (Table 4). Of the 22 bile duct brushings with ≥3 malignant characteristics, only 1 patient’s sample turned out to be benign (5%) on long-term follow-up. This patient was a 70-year-old female in whom 2 of 3 cytopathologists and one GI pathologist identified 7 malignant characteristics (Figure 9). However, despite prolonged follow-up of 55 months she was alive with no clinical or radiologic evidence of pancreatobiliary tract malignancy. In retrospect, we conclude that this case represents one of the mimickers of carcinoma that we now refer to as detachment atypia. The features of this type of atypia elucidated by this study include clustered or crowded epithelial cells, without true 3-dimensionality and with increased nuclear to cytoplasmic ratio sometimes approaching, but not exceeding, 50%. Additionally, in these examples single intact cells with increased nuclear to cytoplasmic ratio, smooth nuclear contours, hyperchromasia and small but distinct nucleoli are also seen (Figure 9).
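The sensitivity, specificity, and accuracy reported for the ‘≥3’ cut point follow directly from the counts given above (30 malignant and 30 benign samples; 22 samples with ≥3 characteristics, 1 of them benign). The short calculation below is only an arithmetic check of those figures; the variable names are illustrative.

```python
# Counts taken from the text: 30 malignant and 30 benign samples,
# 22 samples with >= 3 characteristics, 1 of which was benign.
n_malignant, n_benign = 30, 30
positives, false_positives = 22, 1

true_positives = positives - false_positives  # 21 malignant samples scored >= 3
true_negatives = n_benign - false_positives   # 29 benign samples scored <= 2

sensitivity = true_positives / n_malignant
specificity = true_negatives / n_benign
accuracy = (true_positives + true_negatives) / (n_malignant + n_benign)

print(f"sensitivity {sensitivity:.0%}, specificity {specificity:.0%}, "
      f"accuracy {accuracy:.0%}")
```

This reproduces 21/30 (70%), 29/30 (97%), and 50/60 (83%).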

Table 3 Prevalence scores of malignant characteristics in malignant and benign bile duct brushings
Table 4 Sensitivity, specificity, accuracy of malignant characteristics scores using various cut points
Figure 9

Benign bile duct brushing with detachment atypia. This example was called ‘malignant’ by 2/3 cytopathologists, based on their identification of 7 malignant characteristics including 3-dimensionality, discohesive single cells with high nuclear to cytoplasmic ratio and change in chromatin. Although these atypical cells appear clustered they lack true 3-dimensionality and have increased nuclear to cytoplasmic ratio, approaching, but not exceeding 50% (Papanicolaou stain, magnification × 200). Rare single intact atypical cells with increased nuclear to cytoplasmic ratio, smooth nuclear contours, small but distinct nucleoli and hyperchromasia are present in the top right (Papanicolaou stain, magnification × 400).

Twenty-three of 30 (77%) benign samples had none of the 11 statistically significant malignant characteristics. Interestingly, 7 (23%) malignant bile duct brushings lacked all 11 characteristics when agreement by 2/3 cytopathologists was used as the gold standard (Table 3). When these 7 cases were later blindly re-reviewed by all 7 reviewers, malignant characteristics were identifiable to a variable degree by several reviewers. Indeed, 3 or more malignant characteristics were identified by at least 2 reviewers in all 7 cases (range 3–8 characteristics/case). Re-review of the two cases that initially had only 1 malignant characteristic (per agreement between 2 of 3 cytopathologists) revealed 4–9 characteristics (mean of 6 characteristics) in one case and only 1 characteristic (mucin vacuoles) in the other. It is likely that, although the latter case had a malignant biopsy, there were no malignant cells on the bile duct brushing.

Using different gold standards (agreement by 2/3 cytopathologists vs agreement by ≥4/7 reviewers) the ‘≥3’ cut point remained the best one, with identical sensitivity (70%), specificity (97%) and accuracy (83%) rates for both gold standards (Table 4). When agreement between 2/3 cytopathologists was used as the gold standard, the odds of malignancy increased 1.82-fold (95% CI: 1.29, 2.26) with each additional parameter. This figure increased to 2.14-fold (95% CI: 1.36, 3.36) when agreement by 4 or more reviewers was analyzed as the gold standard. Based on the identified cut point of 3 characteristics, backwards elimination identified chromatin pattern, nuclear irregularity, and pleomorphism as the three characteristics that may be the most useful in determining a diagnosis (all three P-values <0.001). Based on the gold standard of agreement between 4/7 reviewers, the sensitivity for the presence of all three characteristics was 57%, the specificity was 100%, and the accuracy was 78%. When agreement between 2/3 cytopathologists was used as the gold standard, the sensitivity was 43%, the specificity was 97%, and the accuracy was 70%. These numbers proved to be lower than when any 3 of the 11 parameters were used.
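Because the per-characteristic odds ratio comes from a logistic regression on the score, each additional characteristic multiplies the odds of malignancy by the same factor. The sketch below shows how the reported per-point odds ratios compound over three characteristics; it assumes a constant odds ratio across the score range, and `odds_multiplier` is an illustrative name.

```python
def odds_multiplier(per_characteristic_or, k):
    """Multiplicative change in odds for k additional characteristics,
    assuming a constant odds ratio per characteristic (logistic model)."""
    return per_characteristic_or ** k


# Per-point odds ratios reported in the text for the two gold standards.
for label, or_per_point in [("2/3 cytopathologists", 1.82), ("4/7 reviewers", 2.14)]:
    print(label, round(odds_multiplier(or_per_point, 3), 2))
```

Under this assumption, three additional characteristics correspond to roughly a 6-fold increase in odds with the 2/3 cytopathologist standard and roughly a 10-fold increase with the 4/7 reviewer standard.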

Reviewer Performance Using the Presence of ≥3 Malignant Characteristics

We also sought to determine if there was a difference in performance between experienced (n=4) and inexperienced (n=3) reviewers when utilizing malignant characteristics, based on the cut point of 3 or more malignant characteristics. The most experienced reviewers were the board-certified cytopathologists (reviewers 1, 2 and 6) and the oncologic pathologist who routinely signed out pancreatobiliary cytology samples (reviewer 4, VoA). The least experienced reviewers were defined as the reviewers who rarely examined pancreatobiliary cytology specimens, including one GI pathologist, the genitourinary pathologist and the cytopathology fellow with 3 months of training (reviewers 3, 5, and 7). Using a cut point of identification of ≥3 malignant characteristics, more experienced reviewers had a mean sensitivity of 69% (range 60–77%), specificity of 85% (range 77–93%) and accuracy of 77% (range 74–80%). Less experienced reviewers had a higher mean sensitivity of 73% (range 67–83%), lower mean specificity of 77% (range 66–90%) and slightly lower accuracy of 75% (range 73–78%) compared with more experienced reviewers. Individual reviewer performance is summarized in Table 5.

Table 5 Reviewer performance using (≥3) score based on presence of malignant characteristics

Discussion

While the identification of malignant characteristics in bile duct brushings would theoretically appear to be a straightforward process, it is clearly a challenge for pathologists, and for general surgical pathologists in particular. While ancillary studies such as UroVysion FISH and immunohistochemistry are promising, they are not foolproof and are not widely available. Therefore, the identification of specific and reproducible morphologic characteristics should improve pathologist performance in identifying more malignant tumors.

We examined the strength of several well-established and some less well appreciated cytologic characteristics in identifying pancreatobiliary tract malignancy in bile duct brushings. Not surprisingly we found that, in order of (descending) proportion, change in chromatin pattern, nuclear irregularity, pleomorphism, 2-cell population, and 3-dimensional clusters were extremely helpful in accurately identifying malignancy in these specimens and were present in over 50% of cases, based on agreement between 2/3 cytopathologists. Other studies have also found nuclear irregularity and chromatin changes to be significantly helpful in identifying malignancy.13, 16, 19 High nuclear to cytoplasmic ratio and cellular discohesion (present in 38–48% of our cases) were also very helpful in our study. While prominent nucleoli are supportive of malignancy, they were not commonly identified in our malignant cases (20–28%). Layfield,1 Nakajima15 and Waugh et al19 found that while prominent nucleoli were an important malignant characteristic on both conventional smears and ThinPrep, they were not always a very sensitive marker (20%), a finding that is similar to our own study. While malignant characteristics are important, if they are only rarely present in malignant samples then their helpfulness in accurate diagnosis becomes questionable. Hence, the prevalence of helpful ‘malignant’ characteristics is even more critical. This is exemplified by ‘abnormal mitoses’ which are a known malignant characteristic but were surprisingly only rarely seen (0–2%) in our malignant bile duct brushings. Another unexpected finding in our study was the similar frequency of what is interpreted by practitioners as ‘necrosis’ in benign and malignant cases, which suggests that it is not as specific a marker of malignancy as previously thought by some.7 Others have also reported similar results.13, 14, 16, 19

Inflammation also had a similar frequency in benign and malignant bile duct brushings, making it less useful in the accurate identification of malignant brushings. However, it should be noted that acute inflammation (a finding typically assumed to represent a benign, reactive inflammatory process) should not be ignored in all cases, particularly when associated with other concerning malignant characteristics. We have previously reported on the fact that pancreatic ductal adenocarcinomas, particularly its micropapillary and undifferentiated subtypes, may demonstrate marked intra-epithelial infiltration by so-called ‘tumor-infiltrating’ neutrophils.25, 26 Pancreatic ductal adenocarcinoma may involve the bile ducts resulting in ‘positive’ bile duct brushings. Thus, the presence of infiltrating neutrophils should not be ignored in such samples until other malignant characteristics have been clearly excluded. Among the benign bile duct brushings we also noted that non-stented benign brushings had a higher prevalence of inflammation than stented ones, suggesting that stent placement and associated subsequent management protocols suppress loco-regional inflammation, an expected treatment-related phenomenon.

The current study shows that certain typically ‘malignant’ characteristics were seen in some of our benign bile duct brushings (albeit infrequently), which suggests that some features, particularly when used alone, are unreliable criteria for distinguishing benign from malignant brushings, similar to some previously published observations.13, 19 Nonetheless, we were able to define 11 cytomorphologic characteristics that are statistically significant in separating malignant from benign bile duct brushings among both experienced and inexperienced pathologists. In fact, for every 1-point increase in the number of malignant characteristics identified per case there was an incremental increase in the odds of malignancy, and brushings with 8 or more characteristics all turned out to be malignant on resection or follow-up. However, when so many characteristics are expected to be present in a sample, test sensitivity drops significantly, with an increase in false-negatives. In fact, the optimum number of criteria that struck the critical balance in this study was 3. When 3 or more malignant characteristics were identified, the sensitivity, specificity and accuracy were 70%, 97% and 83%, respectively. Additionally, the minimal gain in specificity with use of 4 or more characteristics came at significant cost to sensitivity and accuracy. Unfortunately, there was no ideal numerical combination of any specific cytologic criteria that brought additional strength to the diagnostic algorithm. This may not be surprising considering that adenocarcinomas manifest a variety of morphologic patterns in different patients. Additionally, when one considers the failed attempts at discovery of a ‘magical numerical combination of criteria’ for other extensively studied diseases including rheumatic fever, Kawasaki disease or papillary thyroid carcinoma, it is no surprise that bile duct brushings are not unique in this arena and failed to yield a specific combined set of criteria for optimal accuracy.

Tumor differentiation also has a critical role in the identification of malignancy. Poorly differentiated carcinoma is typically not a diagnostic challenge but well-differentiated tumors can be cytologically bland and extremely difficult to distinguish from benign and/or reactive changes morphologically, particularly in patients with primary sclerosing cholangitis, inflammation, prior stenting or even chronic pancreatitis, which is partly responsible for the test’s low sensitivity. The Papanicolaou Society of Cytopathology recently published standardized reporting terminology for the sign out of pancreatobiliary cytology.27 However, no specific cytologic criteria were defined for bile duct brushings in their study. Robins et al evaluated 19 cytologic criteria in pancreatic fine needle aspiration and suggested 3 major (overlapping nuclei, nuclear irregularity and chromatin clearing and/or clumping) and 4 minor (single epithelial cells, necrosis, mitoses and nuclear enlargement) criteria for improved sensitivity and specificity,28 similar to the criteria our study elucidated. Their study was based on conventional smears, while most current bile duct brushings are made using liquid-based preparations. Waugh et al evaluated 100 ThinPrep bile duct brushings in search of useful malignant characteristics, and found that only nuclear features and patient age were statistically significant in separating benign from malignant cases.19 Additionally, 38% of their benign cases were misdiagnosed as malignant by at least one of four reviewers. Unlike Waugh et al,19 we found that cellular discohesion, 3-dimensional architecture and mucin vacuoles were additionally useful features and we included both ThinPrep and conventional smears in our study. We found that type of preparation had no significant effect on diagnosis.

While many studies used a small number of reviewers, our study based the presence or absence of characteristics on agreement not only by three fellowship-trained cytopathologists but also by non-cytopathologists who perform bile duct brushing evaluations in their daily practice, and found that the same criteria could be successfully applied by all seven pathologists, despite lack of formal cytopathology training. On the other hand, not surprisingly, experience played a role in reviewers’ performance, with more experienced reviewers having slightly higher specificity and accuracy rates than inexperienced ones. This implies that general surgical pathologists and others who less frequently sign out bile duct brushings could still be trained to correctly identify characteristics, thus maximizing sensitivity, specificity and accuracy of diagnosis.

The search for improvements in diagnostic accuracy of bile duct brushings has led some to suggest triple testing (brush cytology, fluorescence in situ hybridization (FISH) and forceps biopsy), which has better accuracy (82% sensitivity, 100% specificity, 100% positive predictive value, and 87% negative predictive value) than brushing alone.29, 30 However, these studies had sufficient specimen cellularity for additional testing, which is often not the case in clinical practice.1, 31, 32 UroVysion FISH has proven helpful in evaluating pancreatobiliary cytology specimens.14, 29, 33 Its sensitivity (42.9%) was significantly higher than that of routine cytology (20.1%) (P<0.001), with identical specificity (99.6%).34 A newer pancreatobiliary tract specific FISH probe that targets 1q21, 7p12, 8q24, and 9p21 detects significantly more cancers than UroVysion FISH (65% vs 46%; P<0.001) with similar specificity (93% vs 91%).35 However, FISH poses technical and financial challenges. Immunohistochemistry is another useful adjuvant study in bile duct brushing assessment.36 Over 50% of biliary cancers showed a maspin+/IMP3+/S100P+/pVHL− staining profile, and 20% showed a maspin+/IMP3−/S100P+/pVHL− profile in one study.37 However, immunohistochemistry lacks specificity; hence, identifying key morphologic characteristics as we did in this study will go a long way in improving test sensitivity without compromising specificity. Along those lines, in this cost-containment era, the assessment of ‘risk of malignancy’ by conventional cytology will likely be favored over more expensive ancillary tests such as immunohistochemistry and FISH, hence the additional importance of more useful cytomorphologic criteria.

In conclusion, we evaluated specific cytologic criteria of malignancy in the diagnosis of carcinoma on bile duct brushings and found 11 malignant characteristics (hypo/hyperchromasia, nuclear irregularity, nuclear pleomorphism, presence of 2-cell population, 3-dimensional architecture, high nuclear cytoplasmic ratio (>50%), hypercellularity, cytoplasmic mucin, nuclear molding, discohesion, and presence of prominent nucleoli) to be very helpful in identifying malignancy. In fact, for every 1-point increase in number of characteristics, the odds of malignancy increased 2-fold. These characteristics were recognizable by both users with and without experience and training in cytopathology. Moreover, in evaluating the number of malignant characteristics, we determined that if ≥3 malignant characteristics were identified, sensitivity, specificity, and accuracy of a malignant diagnosis were maximized. However, we were unable to identify a set of three specific criteria that could be used in combination to improve diagnostic accuracy. The criteria we evaluated are equally applicable by both very experienced and less experienced pathologists. Hence pathologists who infrequently see bile duct brushings can potentially be trained to correctly identify these characteristics so as to maximize the identification of malignancy in these samples.