INTRODUCTION

The effective treatment of cancer patients depends largely on the primary anatomical site and histologic type of the tumor. The classification of primary cancers into groups based on their tissue of origin and histopathological appearance is important for the optimal management of patients (1). The use of antibodies to detect specific tumor antigens by immunohistochemistry has led to significant improvements in the diagnosis of cancer. Several reports have suggested that the cytokeratin (CK) profiles of metastatic cancers correspond to those of the primary sites (2, 3, 4, 5, 6, 7). Thus, determining the expression patterns of primary tumors with several antibodies may help to clarify the origin of tumors.

Mucins are large, abundant, filamentous glycoproteins that are present at the interface between various epithelia and their extracellular environments (8, 9). In many cases, this interface is the lumen of a hollow organ within the body, such as the gastrointestinal tract, lungs, and urogenital tract. Several of these mucins are known to form mucus layers, whereas others form the glycocalyx on the intestinal enterocytes (9). Currently, 14 mucin-type glycoproteins (MUC1, 2, 3A, 3B, 4, 5AC, 5B, 6, 7, 8, 11, 12, 13, 15) have been assigned to the MUC gene family, as approved by the Human Genome Organization Gene Nomenclature Committee, and some of them are expressed in a cell- and tissue-specific manner (9). Among them, MUC1, MUC2, MUC5AC, and MUC6 have been investigated in various organs of the gastrointestinal (GI) system (8, 10, 11). Despite several studies, reports on the overall expression of mucins in each type of cancer remain conflicting and inconclusive (12).

MUC1, located at 1q21, is a membrane-bound mucin that is expressed in the mammary glands, pancreatic ducts, and superficial foveolar epithelium (8, 9). It is also expressed in gallbladder and colorectal adenocarcinomas (13). MUC2, MUC5AC, and MUC6, which are located within the 11p15 locus, share considerable overall homology and probably evolved through duplication of one ancestral gene (9, 11). These 11p15 mucins form a closely related family and are responsible for the formation of the mucus layers in the body (9). MUC2 is a large secretory mucin expressed in intestinal-type tumors and mucinous carcinomas (11). It is also expressed in goblet cells and early colorectal cancer. MUC5AC expression is observed in colorectal carcinoma and mucinous cystic neoplasms of the pancreas (12). Many studies have reported that this mucin is expressed in gastric foveolar epithelium, where it is a major gastric mucin (8, 10, 11). MUC6 is also a major gastric mucin (8, 10). So far, MUC5AC and MUC6 have not been well investigated in digestive carcinomas other than gastric carcinomas.

Cytokeratins are the principal intermediate filament proteins of normal epithelia and epithelial tumors. Until now, 20 different subtypes of cytokeratins have been classified and numbered, based on their molecular weight and isoelectric pH (14, 15, 16). The expression pattern of cytokeratins is dependent on the embryonic development and the degree of cellular differentiation (17, 18). Because the expression of the cytokeratins is usually preserved in neoplastic cells, the use of specific antibodies is of value for determining the origin of metastatic carcinomas (2). Cytokeratin 7 (CK7) is found in many ductal and glandular epithelia, including in the lung, breast, ovary, and endometrium (15). CK20 is expressed in gastrointestinal epithelium, urothelium, and Merkel cells (16). Recently, the combined expression patterns of CK7 and CK20 have been extensively studied in various primary and metastatic carcinomas (2). It was also reported that CK8, CK18, and CK19 are expressed in some GI cancers (4). The objective of this study was to classify carcinomas of the GI organs to provide a means of identifying the primary sites according to the expression pattern of mucins and CKs.

MATERIALS AND METHODS

Tumor Samples

A total of 486 surgically resected specimens of primary carcinomas in the digestive system were selected from the files of the Department of Pathology, Seoul National University Hospital. The primary cancers comprised 60 hepatocellular carcinomas, 55 gastric adenocarcinomas, 50 adenocarcinomas of the intrahepatic bile duct, 59 adenocarcinomas of the gallbladder, 40 adenocarcinomas of the extrahepatic bile duct, 54 adenocarcinomas of the pancreas, 39 adenocarcinomas of the ampulla of Vater, 23 adenocarcinomas of the small intestine, 63 adenocarcinomas of the colon, 18 adenocarcinomas of the appendix, and 15 adenocarcinomas of the anus. Metastatic adenocarcinomas were excluded from this study, and all tumor material came from primary cancers.

Tissue Array and Immunohistochemistry

Core tissue biopsies (2 mm in diameter) were taken from individual paraffin-embedded digestive carcinomas (donor blocks) and arranged in a new recipient paraffin block (tissue-array block) using a trephine apparatus (Superbiochips Laboratories, Seoul, Korea). Each tissue-array block contained up to 60 cases, and a total of 12 tissue-array blocks were prepared.

Four-micrometer-thick sections were cut from each tissue-array block. They were then deparaffinized and dehydrated. Immunohistochemical staining for 4 mucins and 7 CKs was performed using an avidin-biotin peroxidase complex (ABC) method after microwave antigen retrieval (Table 1). A tumor was regarded as positive if 10% or more of the tumor cells were stained.
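The 10% cutoff can be written as a simple rule. The following is a minimal sketch and not the authors' scoring procedure; the function name and the use of stained and total tumor cell counts as inputs are assumptions made for illustration.

```python
def is_positive(stained_cells: int, total_cells: int, cutoff_percent: int = 10) -> bool:
    """Return True when at least cutoff_percent of the tumor cells are stained.

    Hypothetical helper: the study reports only the 10% threshold, not how
    cells were counted.
    """
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= stained_cells <= total_cells:
        raise ValueError("stained_cells must lie between 0 and total_cells")
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return stained_cells * 100 >= cutoff_percent * total_cells
```

With this rule, a core with exactly 10% stained cells counts as positive, which matches the wording "10% or more".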

TABLE 1 The List of Antibodies Used in This Study

Statistical Analyses

The chi-square test or Fisher exact test (2-sided) was performed. The results were considered to be statistically significant when the P value was <.05. Hierarchical clustering was performed for 486 patients. All statistical analyses were conducted using the SPSS 10.0 statistical software program (SPSS, Chicago, IL). We also calculated the sensitivity, specificity, positive predictive value and diagnostic efficacy of antibodies.
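The four summary measures can be illustrated from a 2 × 2 table of marker result against true site. This is a sketch under stated assumptions: the text does not define diagnostic efficacy, so it is taken here as the overall fraction of correct calls, and the counts in the example are hypothetical rather than study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts for one marker against one reference grouping (hypothetical)."""
    true_pos: int
    false_pos: int
    false_neg: int
    true_neg: int


def diagnostic_metrics(t: TwoByTwo) -> dict:
    """Compute the four measures named in the text from a 2 x 2 table."""
    total = t.true_pos + t.false_pos + t.false_neg + t.true_neg
    return {
        "sensitivity": t.true_pos / (t.true_pos + t.false_neg),
        "specificity": t.true_neg / (t.true_neg + t.false_pos),
        "ppv": t.true_pos / (t.true_pos + t.false_pos),
        # Assumed definition: fraction of all cases classified correctly.
        "efficacy": (t.true_pos + t.true_neg) / total,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = diagnostic_metrics(
    TwoByTwo(true_pos=80, false_pos=10, false_neg=20, true_neg=90)
)
```

In this example the sensitivity is 80/100, the specificity 90/100, the positive predictive value 80/90, and the efficacy 170/200.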

RESULTS

Expression of Mucins and Cytokeratins

Of the 4 mucins and 7 CKs, some showed distinctive characteristics at each site (Fig. 1). The positive rates at each site for the 4 mucins and 7 CKs are summarized in Table 2. In the vast majority of cases, cancers of all GI organs were positive for CK8 and CK18 and negative for MUC6. MUC1 was frequently expressed in cancers of the pancreas (87%), anus (73%), and gallbladder (70%). MUC2 positivity was characteristic of appendiceal cancer (100%). MUC5AC positivity was high in the pancreas (70%) and appendix (67%). CK7 positivity was 75% in the stomach, 80% in the intrahepatic bile duct, 81% in the gallbladder, 100% in the extrahepatic bile duct, 94% in the pancreas, and 97% in the ampulla of Vater cancers. On the other hand, CK20 was highly expressed in cancers of the colon (77%), appendix (100%), and anus (80%). In particular, we found that both CK13 and CK19 can be used as practical markers for discriminating between hepatocellular carcinomas and cholangiocarcinomas. The positivity of CK13 and CK19 was very low in hepatocellular carcinomas, and the low positivity of CK13 in hepatocellular carcinomas compared with cholangiocarcinomas has not been previously reported.

FIGURE 1

Cytological expression patterns of mucins and CKs. MUC1 was expressed in the membranous portion of epithelial cells (A, 200×). MUC2 expression was cytoplasmic and strongly positive in mucinous carcinoma (B, 200×). MUC5AC stained the cytoplasm of epithelial cells, mainly ductal cells (C, 200×). MUC6 expression was almost entirely negative except in the lower glandular zone of the stomach (D, 100×). CK7 was diffusely expressed in the cytoplasm of ductal cells (E, 200×). CK20 was strongly and diffusely expressed in the membranous portion of epithelium and in well-differentiated carcinoma (F, 200×).

TABLE 2 The Positive Rates of 4 Mucins and 7 Cytokeratins in Digestive Cancers

Classification of Cancers According to the Expression Profile

We classified the cancers based on the expression profile using a dichotomous outcome. When comparing positivity across cancers, organs whose positive rate for a specific antibody was higher than the median value for that antibody were regarded as positive, whereas those whose positive rate was lower were regarded as negative. CK7 divides the cancers into 2 categories (Fig. 2). The first category includes the cancers of the stomach, intrahepatic bile duct, pancreas, extrahepatic bile duct, gallbladder, and ampulla of Vater. The second category includes the cancers of the liver, small intestine, colon, appendix, and anus. The first category can be further divided into Group 1 and Group 2 by MUC5AC. Group 1 comprises the cancers of the pancreas and the extrahepatic bile duct, characterized as MUC5AC+. Group 2 consists of the cancers of the gallbladder and the ampulla of Vater, characterized as MUC5AC−. The second category can also be subdivided into Group 3 and Group 4 by CK20. Group 3 is characterized by CK20+ and includes the cancers of the colon, anus, and appendix. This group is further separated into 2 parts by MUC5AC: appendiceal cancer is MUC5AC+, whereas the cancers of the colon and the anus are MUC5AC−. MUC1 then distinguishes the colon cancers (34%) from the anus cancers (73%). Group 4 includes the cancers of the liver and is characterized by CK20−. Hepatocellular carcinoma is characterized by CK13− and CK19−. With this dichotomous tree, digestive cancers can be classified by their mucin and CK profiles.
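The branching order described above can be sketched as a small lookup function. This is an illustration only: the tree in the study was built from organ-level positive rates, so applying it to a single tumor's positive/negative calls is an assumption, and the function name and dictionary keys are hypothetical. The stomach, intrahepatic bile duct, and small intestine do not appear because they were not assigned to a terminal group.

```python
from typing import Dict, List


def candidate_sites(profile: Dict[str, bool]) -> List[str]:
    """Follow the CK7 -> MUC5AC / CK20 -> MUC5AC -> MUC1 branching order.

    `profile` maps a marker name to a positive (True) or negative (False)
    call. Returns the organ(s) at the terminal branch that is reached.
    """
    if profile["CK7"]:
        # First category, split by MUC5AC.
        if profile["MUC5AC"]:
            return ["pancreas", "extrahepatic bile duct"]  # Group 1
        return ["gallbladder", "ampulla of Vater"]  # Group 2
    if profile["CK20"]:
        # Group 3, split by MUC5AC and then by MUC1.
        if profile["MUC5AC"]:
            return ["appendix"]
        return ["anus"] if profile["MUC1"] else ["colon"]
    # Group 4: CK20-negative; hepatocellular carcinoma is also CK13-/CK19-.
    return ["liver"]
```

For example, a CK7−, CK20+, MUC5AC−, MUC1− profile ends at the colon branch, whereas a CK7+, MUC5AC+ profile ends at the pancreas and extrahepatic bile duct branch.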

FIGURE 2

Dichotomous tree of digestive cancers according to the expression profile. We performed hierarchical clustering for individual antibodies to select the optimal antibodies that divided GI organs distinctly into two or three groups. For the divided organs, the positivity of the optimal antibody in each organ was compared with the median value for that antibody. Then the significance of difference between two organs, randomly selected in each group, was calculated. We selected the significant pairs (P < .05), and excluded the rest of the data. (a) The positive rate is expressed as a percentage.

During the above process, we investigated whether the use of mucins and CKs as benchmarks was statistically significant. For every combination produced by hierarchical clustering, organs in one group were compared with organs in the other group. In all of these comparisons of the positive rates with the median value, the P value was <.05. Thus, the classification of the organs or groups at each dichotomous branch point was statistically significant.
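A pairwise comparison of positive rates between two organs can be illustrated with a two-sided Fisher exact test on a 2 × 2 table. The sketch below uses only the Python standard library and is not the SPSS procedure used in the study; the function name and the counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the table [[a, b], [c, d]].

    Rows are two organs; columns are positive and negative cases.
    Sums the probabilities of all tables (with the same margins) that are
    no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x positives in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-7)  # small tolerance for float ties
    )
    return min(1.0, total)


# Hypothetical counts: 12 of 20 positive in one organ, 3 of 20 in another.
p_value = fisher_exact_two_sided(12, 8, 3, 17)
significant = p_value < 0.05
```

A pair of organs would be retained at a branch point only when `significant` is true, mirroring the P < .05 criterion in the text.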

Different Expression in Sublocations

Two organs, the small intestine and the colon, demonstrated a different expression pattern depending on which sublocation was analyzed. Consequently, we subdivided the above two organs and compared the expression patterns (Table 3).

TABLE 3 Positive Rates according to Sublocations within Two Organs

The expression patterns of MUC1, MUC2, and CK20 differed significantly between duodenal cancers and cancers of the remaining small intestine. The duodenal cancers were MUC1+, MUC2−, and CK20−, whereas jejunal and ileal cancers showed the opposite pattern, that is, MUC1−, MUC2+, and CK20+. MUC1 and CK20 provided a significant distinction between right colon cancers (0%) and left colon cancers (53%; P = .02). The cancers of the right colon comprise those of the cecum, ascending colon, and two thirds of the transverse colon. The cancers of the left colon comprise those of the remaining one third of the transverse colon, the descending colon, and the sigmoid colon. Furthermore, CK20 significantly distinguished cancers of the right colon (0%) from those of the rectum (92%; P < .05) and cancers of the left colon from those of the rectum (P < .05). In addition, MUC2 separated right colon cancers from left colon cancers when the sigmoid colon was excluded from the left colon (data not shown). Thus, in terms of the classification of colon cancer, cancers occurring in the left colon, the right colon, and the rectum can be distinguished.

Clustering of Cancers by Site

According to this hierarchical clustering, the cancers of 11 GI organs fall into three categories (Fig. 3). The first category includes cancers occurring at the colon, anus, small intestine, and appendix. That is, adenocarcinomas in these organs show similar characteristics with respect to the expression profiles of 4 mucins and 7 CKs. Adenocarcinomas originating at the small intestine have relative-distance values that are twice those of the colon and anus. Adenocarcinomas originating at the appendix have relative-distance values that are 6 times those of the colon and anus. The second category comprises cancers occurring at the stomach, intrahepatic and extrahepatic bile ducts, pancreas, ampulla of Vater, and gallbladder. In particular, adenocarcinomas of the extrahepatic bile duct, pancreas, and ampulla of Vater show greater similarity than do those of the other organs. Cancers belonging to the first and second categories have relative-distance values that differ approximately by a factor of 8 in comparison with the most closely related organs. The third category, consisting of liver cancers, is distant from the other categories. When we performed hierarchical clustering using the expression results of 11 antibodies for the individual cancers of 486 patients, the tendency of the clustering dendrogram was similar to that shown in Figure 3 (data not shown).
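The idea of clustering organs by their marker positive rates can be sketched with a small agglomerative procedure. The text does not state which linkage or distance measure was used in SPSS, so average linkage on Euclidean distance is an assumption here, and the site names and rates are invented for illustration rather than taken from Table 2.

```python
from itertools import combinations
from math import dist
from typing import Dict, List, Tuple


def average_linkage(rates: Dict[str, List[float]]) -> List[Tuple[List[str], List[str], float]]:
    """Agglomerative clustering of sites by their positive-rate vectors.

    Returns the merge history as (members_a, members_b, distance) tuples,
    in the order the merges happen (closest clusters first).
    """
    clusters = [frozenset([name]) for name in rates]
    merges = []

    def cluster_distance(a: frozenset, b: frozenset) -> float:
        # Average Euclidean distance over all cross-cluster pairs.
        pairs = [(x, y) for x in a for y in b]
        return sum(dist(rates[x], rates[y]) for x, y in pairs) / len(pairs)

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda p: cluster_distance(*p))
        merges.append((sorted(a), sorted(b), cluster_distance(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


# Invented positive rates (%) for three hypothetical markers per site.
toy_rates = {
    "site_A": [90.0, 10.0, 5.0],
    "site_B": [85.0, 15.0, 10.0],
    "site_C": [10.0, 90.0, 80.0],
    "site_D": [5.0, 5.0, 5.0],
}
history = average_linkage(toy_rates)
```

The merge distances in `history` play the role of the relative-distance values read off a dendrogram: sites that merge early have similar profiles, and a site that joins last is the most distant from the rest.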

FIGURE 3

Clustering of cancers in 11 digestive organs. The dendrogram shows the relative distances between the cancers of the 11 digestive organs.

DISCUSSION

It is often important to identify the site of origin of a metastatic carcinoma, particularly because this may have a bearing on the selection of an appropriate treatment (1). Histologic assessment is often very helpful; however, it may not differentiate adequately between various primary tumors. On other occasions, a large tumor mass may involve several organs, thus obscuring the site of origin. To resolve this difficulty, we classified primary carcinomas in the digestive system according to their anatomical sites by means of a study involving 486 cases using 4 mucins and 7 CKs. Our results may be helpful in predicting the primary sites of digestive cancers.

MUC1 tends to be highly expressed in the cancers of the pancreas and gallbladder. Cancers occurring in these organs are known to have a poor prognosis. CK13 is known to be a marker for squamous cell carcinomas (20). However, its expression was also observed in ductal adenocarcinomas and differed in intensity according to the organ involved. Because previous studies dealt with only a limited number of cases or a limited number of organ systems (2, 3, 4, 5, 6, 7, 19, 22), it was claimed that CK8, CK18, and CK19 reactivity was specific to the organs being investigated. For example, pancreas cancers of the ductal type, intrahepatic bile duct cancers, and colon cancers were reported to be specific for the above CKs (4, 19). However, we found these CKs to be expressed in all GI cancers, except for CK19 in the case of liver cancers.

In the current study, we classified the cancers of 11 GI organs into a dichotomous tree (Fig. 2). At each branch point, we first performed hierarchical clustering of the 11 GI organs for each antibody using SPSS 10.0. After comparing the 11 resulting dendrograms, we selected the antibody that gave the best separation into two groups. Second, we tested the significance of the difference in positivity between pairs of organs, one from each group, for that antibody; a pairing was retained when its P value was <.05. Third, we compared the positivity of each organ with the median value of the specific antibody for the whole GI system. CK7, MUC5AC, CK20, MUC1, CK13, and CK19 were found to function as the main discriminators. The stomach and the intrahepatic bile duct were excluded at the second division by MUC5AC because of a lack of significance. For the same reason, the small intestine was excluded at the second division by CK20.
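The comparison with the median can be sketched as follows. The rates are invented and the site names are placeholders, not values from Table 2. The text says only that rates above the median count as positive and rates below it as negative, so treating a rate equal to the median as negative is an assumption of this sketch.

```python
from statistics import median
from typing import Dict


def dichotomize_by_median(rates: Dict[str, float]) -> Dict[str, bool]:
    """Label each site positive (True) if its rate exceeds the median rate.

    Assumption: a rate exactly equal to the median is labeled negative.
    """
    cut = median(rates.values())
    return {site: rate > cut for site, rate in rates.items()}


# Invented positive rates (%) for a single hypothetical antibody.
toy_marker_rates = {
    "site_A": 90.0,
    "site_B": 20.0,
    "site_C": 60.0,
    "site_D": 5.0,
    "site_E": 75.0,
}
calls = dichotomize_by_median(toy_marker_rates)
```

Here the median is 60, so only `site_A` and `site_E` are labeled positive.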

The sensitivity, specificity, positive predictive value, and diagnostic efficacy of the significant antibodies were calculated. At the first branching point of Figure 2, CK7 had a sensitivity of 88%, a specificity of 82%, a positive predictive value of 90%, and a diagnostic efficacy of 85%. At the second branching point of the first category, MUC5AC had a lower sensitivity (70%) and specificity (75%) than CK7, with a positive predictive value of 72% and a diagnostic efficacy of 73%. In the second category, CK20 showed a sensitivity of 82% and a specificity of 100%, with a positive predictive value of 100% and a diagnostic efficacy of 89%. At the second division of the second category, MUC5AC showed a sensitivity of 67% and a specificity of 92%, with a positive predictive value of 89% and a diagnostic efficacy of 88%. Finally, MUC1 at the last branching point showed a sensitivity of 73% and a specificity of 66%, with a positive predictive value of 68% and a diagnostic efficacy of 70%, values that were generally lower than those of the other antibodies.

Expression patterns of CK7 and CK20 have been well investigated in stomach, pancreas, colon, lung, and breast cancers (2, 21, 22). However, these results were limited to one or two organs only. Furthermore, the majority of the previous studies involved a limited number of cases, and some of them showed conflicting results (2, 3, 4, 5, 6, 7, 15, 16, 21, 22). In our study, the expression pattern of CK7 and CK20 allowed the cancers to be classified into three categories. MUC5AC and MUC1 then allowed them to be assigned to specific organs. CK19, in particular, has been used routinely to discriminate between cholangiocarcinomas and hepatocellular carcinomas (4). We found that an additional marker, CK13, differentiated cholangiocarcinomas from hepatocellular carcinomas better than CK19 did. CK13 and CK19 not only distinguish hepatocellular carcinomas from cholangiocarcinomas but may also help to identify the origin of metastatic GI cancers.

It was noteworthy that two organs, the small intestine and colon, demonstrated different expression patterns according to their sublocation (Table 3). Small intestinal cancers show two distinctive expression patterns. The expression patterns of MUC1, MUC2 and CK20 in the duodenum cancers were opposite to those in the jejunum and ileum cancers (Table 3). This might be related to the fact that the duodenum originates from the foregut, whereas the jejunum and ileum originate from the midgut.

In previous studies, the positive rate of CK20 in the colon varied from 84 to 100%, and the CK7−/CK20+ rate in the colon ranged from 75 to 94% (6, 7, 21). Loy et al. (6) divided colorectal cancers into primary and metastatic cancers. Their results showed that the positive rate of CK7 in metastatic colon cancers was 31%, higher than the 16% rate observed in primary colon cancers. We used only primary cancers and subdivided them by location within the whole colorectum. Our data suggest that the expression pattern of MUC1 and CK20 varies according to the site within the colorectum. Right colon cancers included the cancers of the cecum, ascending colon, and two thirds of the transverse colon. Left colon cancers included the cancers of the remaining one third of the transverse colon, the descending colon, and the sigmoid colon. They were divided according to their embryological origin, and colon cancers could be classified into left and right colon cancers by means of the expression pattern of MUC1. Subsequently, the expression pattern of CK20 allowed right colon cancers to be separated from rectal cancers, and left colon cancers to be separated from rectal cancers. In addition, MUC2 separated right colon cancers from left colon cancers when the sigmoid colon was excluded (data not shown). In other words, colon cancers can be classified into those of the left colon, the right colon, and the rectum. The fact that, among these 11 antibodies, 2 or 3 specific antibodies were associated with organ location is presumed to reflect the timing of their expression during fetal organ development (17, 18, 23, 24).

Finally, we performed hierarchical clustering according to the positivity of each antibody (Fig. 3). There were 3 categories. The first category included the colon, anus, small intestine, and appendix cancers. Except for the duodenal cancers of the small intestine, the cancers of these organs originated from the midgut or hindgut. The appendiceal cancers were the furthest away from the other organ cancers in this first category. When we divided the cancers of the small intestine into the duodenum and the other portions and performed the clustering again, the jejunal and ileal cancers were broadly linked to the category of cancers originating from the midgut or hindgut, whereas the duodenal cancers were more closely related to the category of cancers originating from the foregut (data not shown).

The adenocarcinomas of the stomach, intrahepatic bile duct, extrahepatic bile duct, pancreas, ampulla of Vater, and gallbladder belonged to the second category. The organs clustered in this category originated from the foregut. Particularly, the organs originating from the primitive duodenum bud of the foregut, such as the extrahepatic bile duct, pancreas, and ampulla of Vater showed an almost identical pattern. In this cluster analysis, the liver cancers were the last category with the highest relative-distance values. We assumed that different histologic types might be the cause of this distance.

The present comprehensive study was made possible through the use of the tissue array method, which enabled the analysis of protein expression in a large number of digestive carcinomas from defined tumor regions (25, 26, 27). The advantage of this method was its efficiency in terms of the time and cost of immunohistochemistry studies. The potential limitations of this method are associated with the fact that information is acquired from only a tiny area in each tumor.

In conclusion, it is suggested that the primary sites of the digestive cancers can be identified by means of the pattern of mucin and CK expression. Among the different sites, the small intestine and the colon cancers showed characteristic expression patterns according to embryonic origin. Hierarchical clustering analysis allows the GI cancers to be divided into three categories, an outcome that can be understood in terms of their histologic types and embryological origin.