Main

‘Those who cannot learn from history are doomed to repeat it.’

George Santayana

The pathology literature contains innumerable reports of immunohistologic tumor markers claimed to predict clinical behavior. Such reports are accumulating at an even more rapid pace as a result of the explosive growth in availability of commercial antibodies raised against cloned gene products or synthetic peptides. Unfortunately, this activity has not had great impact in practical terms: pathologists continue to make diagnoses as in the past and prognostic markers are rarely used to modify treatment strategies. While important exceptions, such as the crucial relevance of analyzing estrogen receptor and HER2/neu expression in breast carcinomas, validate the approach, the overall utility of immunohistochemistry in guiding clinical treatment decisions remains limited. Why, then, is there such a great disparity between the number of papers reporting novel prognostic markers and their impact on clinical practice?

One problem may be the general approach used: most immunohistologic studies of prognostic markers aim to find, in a more or less random fashion, molecular features within the tumor that correlate with clinical behavior. This is reminiscent of the epidemiologist's quest, again frequently on a random basis, for factors outside the individual that are associated with the risk of developing disease. Epidemiological investigations have also been surprisingly unproductive in terms of advancing our understanding of the ‘causes’ of diseases and devising better strategies for their prevention and management. It is therefore instructive to see if lessons already laboriously learned from epidemiology are relevant to current attempts to find immunohistologic predictors of tumor behavior.

Two problems that bedevil epidemiological studies are associations between putative risk factors and disease frequency that are weak (eg increases in risk well under two-fold) and/or difficult to reproduce (often because of technical difficulties in gathering valid data). These two features interact: the weaker the association the greater the possibility that it has been observed by chance and that it will not be found in a second study. A notorious example of the confusion that can ensue from weak associations is found in the claims that residential proximity to high-tension power lines carries an increased risk of childhood cancer, particularly acute leukemia. Extensive conflicting studies for over 25 years, including a pooled meta-analysis of 5393 leukemia cases and 10 704 controls, seemed finally to have confirmed the conclusion, and logical instinct, of many scientists that there is no causal association.1, 2 But just as it seemed, after all, safe to live near electricity pylons, a new report has reignited the controversy.3
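The interaction between weak associations and poor reproducibility can be made concrete with a rough power calculation. The Python sketch below is illustrative only: the baseline risk of 10%, the 500 subjects per group, and the normal approximation for a two-sided test at the 5% level are assumptions chosen for the example, not figures from any of the studies cited here. It estimates how often a single study would detect a true association of a given strength, and how often two independent studies would both detect it.

```python
import math


def normal_cdf(x: float) -> float:
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def approx_power(p_control: float, relative_risk: float,
                 n_per_group: int, z_crit: float = 1.96) -> float:
    """Approximate power of a two-sided comparison of two proportions.

    Uses the normal approximation and ignores the negligible chance of a
    significant result in the wrong direction.
    """
    p_exposed = p_control * relative_risk
    se = math.sqrt(p_control * (1 - p_control) / n_per_group
                   + p_exposed * (1 - p_exposed) / n_per_group)
    z_effect = abs(p_exposed - p_control) / se
    return normal_cdf(z_effect - z_crit)


# Hypothetical scenario: 10% baseline risk, 500 subjects per group.
print("  RR   one study   two studies")
for rr in (1.2, 1.5, 2.0, 3.0):
    power = approx_power(0.10, rr, 500)
    # Two independent studies both reach significance with probability power**2.
    print(f"{rr:>4.1f}   {power:9.2f}   {power ** 2:11.2f}")
```

Under these assumptions, a study of this size would usually miss a true relative risk of 1.2, and two independent studies would rarely both detect it, whereas a three-fold increase in risk would be found almost every time.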

It would be difficult to find anything comparable in terms of expenditure of effort and funds for so little benefit in studies of prognostic immunohistologic markers. However, there are many examples of correlations that are relatively weak, although apparently statistically significant, and reported in one study but not confirmed in another. For example, at least 40 molecular markers have been reported as predictors of prognosis in diffuse large B-cell lymphoma (DLBCL), but many have not been subjected to the litmus test of the epidemiologist: confirmation in a second independent study. When further studies are performed, the results often lack reproducibility. For example, expression of BCL6, a germinal center-associated marker, correlates with better overall survival in DLBCL in some studies4, 5, 6 but not in others.7, 8 Likewise, several investigations have found that increased expression of BCL2 protein or the proliferation marker Ki-67 (MIB1) correlates with inferior survival in DLBCL,4, 7, 9, 10, 11, 12, 13, 14, 15 whereas others have found no impact on outcome.6, 8, 9, 16, 17, 18, 19, 20, 21, 22 A more recent example is the transcription factor FOXP1, whose expression as detected by immunohistology was found to be associated with adverse outcome in DLBCL patients in two studies23, 24 but not in a third.25

When discrepancies of this sort are encountered, it is often argued that the immunostains were not performed satisfactorily or were not interpreted correctly. These arguments are reminiscent of epidemiologists' disagreements over deficiencies in the collection or statistical handling of data. At best, these conflicting conclusions suggest that the immunohistologic marker is unsuitable for routine use and, at worst, that the reported correlation may be an artifact.

An even more troublesome problem in epidemiological studies, which also has a clear parallel in immunohistologic studies, is posed by associations between external factors and the risk of disease that are reproducible but non-causal. For example, it is beyond dispute that smoking causes damage to lung epithelium and increases the risk of carcinoma. However, there is also a clear statistical association between smoking and increased risk of suicide, but nobody would argue that this represents an underlying causal link (Figure 1a). The epidemiological literature contains many examples of such associations that are difficult, or impossible, to rationalize. As an example, Table 1 lists a number of risk factors reportedly associated with heart disease (selected from among a total of 246 identified in a review 25 years ago),26 most of which are highly unlikely to reflect causal links. The same risk of detecting non-causal associations exists when attempting to correlate immunohistologic tumor markers with clinical outcome. For example, Figure 1b shows how a molecule that tends to be expressed in proliferating cells might appear to be responsible for poor prognosis. However, both features are probably secondary consequences: it is unlikely that expression of the molecule itself directly drives tumor growth.
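The situation sketched in Figure 1b can be mimicked with a small simulation. In the Python sketch below every probability is invented for illustration: proliferation is assumed to drive both marker expression and poor outcome, and the marker itself is given no effect on outcome at all. The names `simulate_tumors` and `poor_outcome_rate` are hypothetical helpers, not part of any published method.

```python
import random


def simulate_tumors(n: int, seed: int = 0):
    """Return a list of (proliferating, marker_positive, poor_outcome) tuples."""
    rng = random.Random(seed)
    tumors = []
    for _ in range(n):
        proliferating = rng.random() < 0.5
        # Assumed: the marker is mostly expressed in proliferating tumors.
        marker = rng.random() < (0.8 if proliferating else 0.2)
        # Assumed: outcome depends on proliferation only, never on the marker.
        poor = rng.random() < (0.6 if proliferating else 0.2)
        tumors.append((proliferating, marker, poor))
    return tumors


def poor_outcome_rate(tumors, marker, proliferating=None):
    """Fraction with poor outcome among tumors matching the given filters."""
    selected = [poor for (prolif, mark, poor) in tumors
                if mark == marker
                and (proliferating is None or prolif == proliferating)]
    return sum(selected) / len(selected)


tumors = simulate_tumors(40_000)

# Crude comparison: marker-positive versus marker-negative tumors.
crude_gap = poor_outcome_rate(tumors, True) - poor_outcome_rate(tumors, False)

# The same comparison within each proliferation stratum.
gap_in_proliferating = (poor_outcome_rate(tumors, True, True)
                        - poor_outcome_rate(tumors, False, True))
gap_in_quiescent = (poor_outcome_rate(tumors, True, False)
                    - poor_outcome_rate(tumors, False, False))

print(f"crude difference in poor-outcome rate: {crude_gap:.2f}")
print(f"within proliferating tumors:           {gap_in_proliferating:.2f}")
print(f"within non-proliferating tumors:       {gap_in_quiescent:.2f}")
```

In this toy model the marker appears strongly prognostic in the crude comparison, yet the difference largely disappears once tumors are compared within the same proliferation stratum, which is the pattern expected of a non-causal association.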

Figure 1

This scheme illustrates how non-causal associations between risk factors and tumor development can be confused with causal links (a). This is paralleled by studies of prognostic tumor markers (b). The distinction between causal and non-causal associations becomes evident in the example shown in (a) if the consequences of stopping smoking are considered: the risk of carcinoma will diminish, but the risk of suicide will stay the same (or might well rise!).

Table 1 Risk factors for heart disease

The ‘success stories’ of epidemiology can also be instructive. For example, many of the associations that have stood the test of time involve a plausible causal scenario beginning with a specific factor, for example, a virus or a mutagenic chemical agent, that triggers a cascade of intermediate events, such as DNA damage and dysplasia, leading to neoplastic transformation. These causal ‘narratives’, tracing a pathway from an identifiable agent to increased tumor risk, are not always immediately evident: for example, the risk of scrotal cancer in child chimney-sweeps was identified in the 18th century, many years before causal mechanisms could be proposed. Nevertheless, the guiding principle is clear: one should always seek convincing explanatory models; schemes with poorly defined starting points, such as the vague description ‘living near an electricity pylon’, rarely open fruitful new avenues for understanding the origins of neoplasia.

The implication for immunohistologic studies is that tumor markers of genuine prognostic value often reflect acquired genetic abnormalities in the neoplastic cell that unleash downstream tumorigenic events. Epidemiological studies have uncovered a number of oncogenic microorganisms that increase the risk of tumor development, as illustrated by the link between human papillomavirus and cervical carcinoma. This finds a clear parallel in the acquired genetic abnormalities that dictate tumor behavior (eg deregulation of the cellular equivalent of viral oncogenes such as Myc or Ras). In effect, these acquired oncogenic alterations can be considered the endogenous equivalent of external tumor-inducing factors. For this reason many immunohistologic markers that genuinely correlate with prognosis reflect acquired genetic lesions. Examples are listed in Table 2. It may be added that the study of prognostic markers related to genetic lesions can raise complex questions concerning the classification of tumors, and, in particular, whether a genetic lesion should be the diagnostic criterion for a disease. As a result, it can happen on occasion that a marker is promoted, as data accumulate, from the status of prognostic or diagnostic marker to become the defining feature of a neoplasm. For example, it is now uncommon to make the diagnosis of a gastrointestinal stromal tumor if it lacks c-KIT (CD117) expression.

Table 2 Examples of prognostically significant immunocytochemical markers that are indicative of an underlying genetic abnormality

Immunocytochemical abnormalities that reflect the presence of acquired genetic alterations are not the only category of valid prognostic markers, and here data from microarray-based profiling of tumors provide some useful clues. These studies have defined abnormal expression patterns secondary to genetic alterations that correlate with clinical behavior, but they have also drawn attention to three other prognostically relevant categories of markers.

Firstly, several gene expression studies have identified patterns that are of prognostic significance because they reveal the cell of origin of the tumor. For example, a poor-prognosis subtype of breast carcinoma arising from basal cells can be defined on the basis of gene expression.27 Similarly, molecular profiling in DLBCL has provided evidence for prognostically diverse subtypes that appear to derive from different stages of B-cell differentiation.28 In fact, this is no surprise: the entire process of categorizing lymphomas is largely based on immunohistologic markers of maturation and lineage, and the very different clinical behavior of individual lymphoid neoplasms is believed to reflect their derivation from distinct lymphoid cell subpopulations. This also raises (as for genetic lesions) questions of disease definition: for example, the expression of terminal deoxynucleotidyl transferase (TdT) could be considered a marker of prognosis within the spectrum of lymphoid neoplasms, but in practice it is recognized as defining the clinically distinctive lymphomas that arise from precursor B- and T cells. It is possible that a variety of prognostically valid markers reflect a tumor's cellular origin but are not at present recognized as such for want of evidence.

Secondly, gene expression profiling studies have provided evidence for the impact of stromal cell reactions on tumor behavior. For example, prognosis in breast cancer correlates with the expression of genes comprising a ‘wound-response signature’,29 and with the expression of extracellular matrix genes and growth factors.30 Tumor-infiltrating cellular elements involved in innate and adaptive immunity (eg macrophages and cytotoxic lymphocytes) have also been implicated in the modulation of tumorigenesis.31, 32, 33 These findings raise the possibility that immunohistologic analysis of background cellular infiltrate/stromal reactions might be used to predict prognosis. For example, increased numbers of non-neoplastic macrophages and a combination of low numbers of regulatory T cells and increased cytotoxic T cells appear to predict inferior survival in follicular and Hodgkin lymphomas.33, 34 It may seem paradoxical that cytotoxic T cells, which should combat the tumor, correlate with poor prognosis, but this is consistent with studies of diffuse large B-cell and Hodgkin lymphomas,32, 33 in which tumors that contain higher numbers of cytotoxic T cells fare worse. It therefore seems likely that increasing attention will focus in the future on the prognostic implications of the background non-neoplastic component in tumors, which clearly lends itself well to immunohistologic analysis. It must be added, however, that robust guidelines for reproducibly evaluating and quantifying these non-neoplastic cell populations will need to be established.

Thirdly, microarray-based gene expression studies can occasionally identify genes whose overexpression, even if it is not clearly associated with either a genetic alteration or with cellular origin, is empirically shown to correlate with survival. Immunohistologic detection of the corresponding protein might then allow these findings to be introduced in the diagnostic laboratory. ZAP70 in chronic lymphocytic leukemia is one example, albeit a somewhat controversial one, of a marker that has emerged in this way. However, such examples are rare and the chances of finding one by random screening for markers whose expression correlates with survival are low. It can be argued that immunohistologic studies should follow the lead of gene expression studies and screen large numbers of markers, but this is to ignore the practical reality that even the most ambitious immunohistologic project can only cope with tens of antibodies. Thus, the study of numerous randomly chosen immunohistologic markers increases the workload without a commensurate increase in the chance of success.
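A further cost of random screening, in keeping with the earlier point about associations observed by chance, is that each additional marker tested adds another opportunity for a spurious correlation. The short Python sketch below makes deliberately simple assumptions that are not taken from any study: every marker is tested independently at the conventional 5% significance level, and none of them is truly associated with survival. The function name `chance_of_spurious_hit` is hypothetical.

```python
def chance_of_spurious_hit(n_markers: int, alpha: float = 0.05) -> float:
    """Probability that at least one of n_markers independent null tests
    reaches significance at level alpha purely by chance."""
    return 1.0 - (1.0 - alpha) ** n_markers


# Screening panels of increasing size, all markers assumed uninformative.
for n_markers in (1, 10, 40):
    probability = chance_of_spurious_hit(n_markers)
    print(f"{n_markers:>2} markers: {probability:.0%} chance of at least one "
          f"'significant' correlation")
```

On these assumptions a panel of a few tens of antibodies is more likely than not to yield at least one apparently significant marker even when none has any real prognostic value, which is one reason independent confirmation matters.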

Finally, even the most powerful markers of prognosis can fall at one of two final hurdles, both relating to the practical realities of treatment. Firstly, advances in management can obviate the need for stratifying patients. For example, Hodgkin lymphoma currently has such a good outcome that it could be difficult to define prognostic markers that would have wide utility. Similarly, it has been reported that survival differences associated with BCL2 and BCL6 protein expression in patients with DLBCL are no longer seen when the anti-CD20 antibody rituximab is added to anthracycline-based chemotherapy.6, 35 Secondly, the practical relevance of a marker of probable tumor behavior is related to the strength of its predictive power and the available treatment. A marker associated with only a 25% difference in long-term survival in a disease for which there is essentially only a single treatment option will probably not find any practical application. Conversely, the clinical relevance is obvious for a marker that defines a subgroup of patients with a very good prognosis in a disease for which the conventional treatment carries a high risk of mortality/morbidity. The same would also be true if a very poor-prognosis patient subgroup, for which more heroic treatment might be justifiable, can be identified on the basis of marker expression.

These factors explain further why so many reported prognostic markers have failed to appear in the clinic. One positive aspect is that the number of valuable markers can only increase, as it becomes possible to detect more cellular molecules in tissue biopsies. However, in designing strategies for identifying prognostic markers it may be beneficial to learn from the lessons of the many epidemiological studies of the past. Ultimately the search for prognostic tumor markers should be no more immune to cost-benefit analysis than any other field of research.
