Automatic quantification of microvessel density in urinary bladder carcinoma

Seventy-three TUR-T biopsies from bladder carcinoma were evaluated regarding microvessel density, defined as microvessel number (nMVD) and cross-section endothelial cell area (aMVD). A semi-automatic and a newly developed, automatic image analysis technique were applied to immunostainings performed according to an optimized staining protocol. In 12 cases, biopsy material was compared with the corresponding cystectomy specimen, showing a good correlation in 11 of 12 cases (92%). The techniques proved reproducible for both nMVD and aMVD quantifications related to total tumour area. However, the automatic method was dependent on high immunostaining quality. Simultaneous, semi-automatic quantification of microvessels, stroma and epithelial fraction resulted in a decreased reproducibility. Quantification in ten images, selected in descending order of MVD by subjective visual judgement, showed a poor observer capacity to estimate and rank MVD. Based on our results we propose quantification of MVD related to one tissue compartment. When staining quality is of a high standard, automatic quantification is applicable, which facilitates quantification of multiple areas and thus should minimize selection variability. © 1999 Cancer Research Campaign

1. The choice of antibody and staining protocol: several antibodies have been tried, including factor VIII-related antigen, CD31, CD34, vimentin, Ulex europaeus agglutinin I, collagen IV, cathepsin B and histological stainings, such as haematoxylin-eosin and Masson's trichrome. Studies performed to evaluate which antibody is best suited for use in microvessel quantifications have resulted in contradictory conclusions. Besides the choice of antibody, there are also differences between the staining protocols used, such as pretreatment methods and chromogens. Regarding quantification strategies, several approaches have been taken.
2. Definition of MVD: number of vessels, stained endothelial cell area, total vessel area including lumen or perimeter.
3. Choice of area for MVD quantification: the entire tumour area, hot spot, random, in areas of tumour invasion or in areas representative of the overall tumour grade.
4. Total magnification: ×40, ×50, ×100, ×160, ×200, ×400 and ×500 (the number of microscopic fields and corresponding size of the tumour area analysed vary greatly).
5. Quantification techniques: manual counting in the microscope and manual counting in the microscope using an ocular raster/grid, using Chalkley point eyepiece graticule, using a projection microscope with a grid square on the table, stereological quantification, image analysis equipment or by subjective grading of the vessel density.
These issues have recently been reviewed by Vermeulen et al (1996) and others (Barbareschi et al, 1995a; Weidner, 1995a, 1995b; Fox, 1997). In the present study, a new, automatic quantification method was used which automatically classifies an image into vessels and background, without observer interactivity. The technique has recently been described (Ranefall et al, 1999), has proven stable regarding variations in light and focus settings, and showed a high concordance with manual counting of microvessels.
For comparison, a semi-automatic quantification method was also used. This method, in which reference points are subjectively chosen by the observer, has also been described previously (Ranefall et al, 1997), although not tested for microvessel quantification.
The aims of the present study were:
1. to develop an optimized IHC staining protocol suitable for automatic quantification of microvessels;
2. to compare MVD in cystectomy specimens and preceding biopsies to assess the representativity of biopsy material;
3. to evaluate both semi-automatic and automatic quantifications of MVD with respect to intra-observer reproducibility;
4. to evaluate the observer capability in discriminating different MVD contents in neighbouring images.

Patients
Two separate materials were analysed: 1. Twelve cases of bladder carcinoma, where both biopsy (transurethral resection material) and the corresponding cystectomy specimens were available, were obtained from the Department of Pathology, University Hospital Uppsala and used to test sampling representativity. All cases were cystectomized without preoperative treatment. 2. Biopsy specimens from 73 patients with invasive bladder carcinoma, stage T1-T4a (Malmstrom et al, 1996) were collected. The patients were randomized, regarding preoperative treatment, into a multicentre study (Nordic Cystectomy Study I) and were all subsequently cystectomized. The patients were recruited from ten hospitals in Sweden and the material was formalin-fixed and paraffin-embedded, according to standard procedures, at the respective local pathology department. The study was approved by an ethical committee.

Immunohistochemistry
Parallel sets of slides were IHC stained for endothelial cells (single IHC) and for epithelial cells, as well as endothelial cells in the same section (multiple IHC) (Figure 1).

Single IHC
Paraffin sections were cut at 4-µm thickness and placed onto Superfrost/plus® slides (Mentzel, Germany). Two mouse monoclonal antibodies, CD31 (clone JC70, Dako, Glostrup, Denmark) and CD34 (clone QBEND 10, Oxoid, Basingstoke, England) were used as a mixture, diluted 1:80 and 1:100 respectively and incubated for 1 h. Prior to IHC, heat-mediated antigen retrieval (AR) was performed by boiling the slides in 0.01 M citrate buffer, pH 6.0, for 16 min at 750 W (Malmstrom et al, 1992) in a microwave oven (Whirlpool VIP34, Sweden). Blocking of endogenous peroxidase in 0.3% hydrogen peroxide and preincubation in 10% normal rabbit serum (Dako) were each performed for 20 min, with both reagents diluted in phosphate-buffered saline (PBS). As link antibody, a biotinylated rabbit anti-mouse (Dako) was applied, followed by a peroxidase-labelled streptavidin-biotin complex (Dako), both diluted 1:200 and incubated for 30 min. The slides were developed in nickel-sulphate-enhanced DAB (Merck, Darmstadt, Germany; Sigma, St Louis, MO, USA respectively) for 6 min (Green et al, 1989) and counterstained in light green (Merck). Finally, the slides were dehydrated through graded alcohols to xylene and mounted in organic mounting medium. Unless otherwise stated, reagents were diluted in 0.5% BSA-C (Aurion, Wageningen, The Netherlands) in PBS and incubations were performed at room temperature. Washings, for 3 × 10 min, between incubation steps were done in 0.05 M Tris, pH 7.6, containing 0.3 M sodium chloride and 0.1% Tween-20®.

Double IHC
The labelling of vessels was performed as above, except for the AR treatment, where 1 mM EDTA (Morgan et al, 1994), pH 8.0, was used instead of citrate buffer. After developing the first antibody (CD31/CD34) the slides were rinsed in tap water for 10 min before incubation in 1% normal mouse serum (Dako) in PBS for 20 min. To visualize the epithelial structures, an anti-cytokeratin antibody (clone AE1/AE3, Boehringer Mannheim, Mannheim, Germany) was applied for 16 h at 4°C. The link antibody and the streptavidin complex were diluted and incubated as above, except that alkaline phosphatase was used as labelling enzyme. Slides were developed in Vector Red® (Vector, Burlingame, CA, USA) for 25 min, rinsed, counterstained in light green and mounted as above.

Image selection
In all 73 cases an area of 3-4 mm² was outlined. The criterion used for selecting these areas was the area subjectively judged as containing the highest number of microvessels at the deepest level of the bladder wall engaged by tumour, i.e. the tumour invasion front (Figure 2). The selection was done by three observers (KW, CB and PUM) by parallel examination of both Weigert-van Gieson and microvessel-stained slides. The selection process was performed before the quantifications started and without knowledge of patient outcome. Semi-automatic quantifications were also performed once without the requirement of tumour front localization, but in invasive tumour. Quantifications were always performed in 2 microscopic fields, except for one occasion where 10 microscopic fields, subjectively ranked in descending order of MVD, were quantified. The size of the recorded images corresponded to 0.22 mm² per image.
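The figures above imply how much of the outlined area is actually sampled. The following sketch is illustrative arithmetic only; it assumes the recorded fields do not overlap, which the text does not state, and the function name `coverage` is hypothetical.

```python
# Illustrative arithmetic; the numbers are taken from the Image selection text.
FIELD_AREA_MM2 = 0.22  # area covered by one recorded image


def coverage(n_fields, outlined_area_mm2, field_area_mm2=FIELD_AREA_MM2):
    """Fraction of the outlined area covered by n non-overlapping fields.

    Non-overlap is an assumption of this sketch, not a statement from the study.
    """
    return n_fields * field_area_mm2 / outlined_area_mm2


for outlined in (3.0, 4.0):      # outlined area was 3-4 mm2
    for n in (2, 10):            # 2 fields routinely, 10 fields on one occasion
        print(f"{n} fields in {outlined} mm2: {coverage(n, outlined):.0%}")
```

Under these assumptions, two fields sample roughly 11-15% of the outlined area and ten fields roughly 55-73%.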

Image acquisition
The 756 × 572 pixel colour images with 3 × 256 grey levels were grabbed by a Sony DXC-151 colour video camera attached to a standard Olympus BH-10 microscope, using 20× objective. This gave a final magnification of ×50 and a pixel size of about 0.8 µm for a wavelength of 550 nm. For all images, Köhler illumination was maintained and the aperture iris diaphragm ring was fixed to 0.5.

Classification of video images and definition of variables
The semi-automatic counting of vessel number was an algorithm allowing the operator to label each structure, subjectively judged as a microvessel, in an image and the result was subsequently presented as the total number of labelled objects per image. Any stained isolated endothelial cell or cohesive endothelial cell cluster was considered to represent a single countable microvessel. Vessel lumen or red blood cells were not necessary for a structure to be defined as a microvessel. All muscular arteries were excluded. The semi-automatic vessel area quantifications were based on manually selected reference points, representing the respective colour subclass in an image. Microvessel area, aMVD, was defined as stained endothelial cell surface area, excluding vessel lumen.
The automatic quantification was performed without influence from the operator. No interactive steps were involved besides the selection of the respective image. In contrast to the semi-automatic method, the area quantification here included information describing both stained endothelial cell area and total vessel area. From here on, stained endothelial cell area is referred to as aMVD.
The software was developed at the Centre for Image Analysis, Uppsala University, Sweden and is presented in detail elsewhere (Ranefall et al, 1997, 1998).
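To make the two variables concrete, the sketch below derives an object count (the basis of nMVD) and a stained-pixel area (the basis of aMVD) from an image that has already been classified into vessel and background pixels. It is not the classification algorithm of Ranefall et al; the function name `quantify_mask`, the toy mask and the use of 8-connectivity to define a cohesive cluster are assumptions made for illustration.

```python
from collections import deque


def quantify_mask(mask):
    """Return (n_objects, stained_pixels) for a binary mask.

    mask: list of rows; 1 = pixel classified as stained endothelium, 0 = background.
    Each 8-connected cluster of stained pixels counts as one microvessel
    (the connectivity rule is an assumption of this sketch).
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    n_objects = 0
    stained = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # New, unvisited stained pixel: start a breadth-first flood fill.
            n_objects += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                stained += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return n_objects, stained


# Hypothetical 4 x 5 pixel mask with three separate stained clusters.
toy = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0],
]
n_vessels, stained_pixels = quantify_mask(toy)
area_fraction = stained_pixels / (len(toy) * len(toy[0]))
```

Because only stained pixels are counted, an unstained lumen inside a vessel does not contribute to the area, which matches the definition of aMVD given above.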

IHC protocol evaluation
Four different antibodies to endothelial cells and seven to cytokeratin were tested. Staining of endothelial cells was also tested using cocktails, composed of two or three antibodies each, in varying combinations and concentrations. Different pretreatment protocols, chromogens and counterstains were also tested. Details are shown in Tables 1-3. All stainings were evaluated by two observers, regarding specificity, sensitivity, intensity, resolution and contrast in the respective staining alternative. One single IHC protocol (1 × IHC) was chosen because of its excellent outlining of vessels, with a sharp contrast towards the surrounding tissue. A mixture of CD31 and CD34 was judged as the best alternative for visualizing the vessels. The double IHC protocol (2 × IHC) chosen showed an almost equally high contrast and was also the superior alternative for the visualization of the epithelial compartment. The latter was mainly an effect of the antigen retrieval in EDTA buffer, which further enhanced the cytokeratin immunostaining. The vessel staining (CD31/CD34 mixture) was only marginally affected by the exchange of citrate buffer for EDTA. The two protocols are described in detail in Materials and Methods and the staining outcome is illustrated in Figure 1.
Area selection in biopsies compared to cystectomy specimens
In this study it is stated that MVD is quantified in microvessel 'hot spots' in the tumour front (Figure 2). The question is, however: is it possible, in fragmented biopsy specimens, to select a well-defined area that corresponds to the 'reality' in cystectomy specimens? Therefore, nMVD and aMVD were quantified in parallel biopsy and cystectomy specimens. As visualized in Figure 3A, an excellent agreement for nMVD was found in 11 of 12 cases (92%), whereas aMVD showed a good concordance in eight of 12 cases (67%) (Figure 3B).

Material 2
Evaluation of automatic and semi-automatic quantification
Semi-automatic quantification of nMVD was highly reproducible (r = 0.88) when related to total tumour area and applied to 1 × IHC stainings. The corresponding figure for aMVD was 0.72. When these quantifications were performed on 2 × IHC stainings, a decreased reproducibility was observed for aMVD (r = 0.56) whereas nMVD reproducibility was only marginally affected (r = 0.85). In this analysis, new images were recorded for each quantification round. If exactly the same images were quantified on two occasions, the reproducibility was increased for nMVD and aMVD (r = 0.97 and 0.88). Simultaneous quantification of three compartments, vessels, epithelium/tumour and stroma, outlined by the double IHC protocol and expressing MVD related to stromal fraction, was less reproducible, r = 0.51 for nMVD and 0.43 for aMVD.
The automatic quantifications of nMVD and aMVD, applied to single IHC stainings, showed a lower reproducibility compared to the semi-automatic quantification, r = 0.53 and 0.65 respectively. However, when 19 cases judged as of poor IHC staining quality were excluded, the corresponding figures were 0.73 and 0.87 (Figure 4 A, B). The criteria for exclusion were based on a subjective judgement of the same staining qualities as described in the 'IHC protocol evaluation' section. This was performed by one observer on two occasions resulting in a concordance in 71 of 73 cases. There was a good agreement between semi-automatic and automatic quantifications for both nMVD and aMVD respectively, r = 0.72 and 0.75 (Figure 5 A, B). If the 19 cases of poor IHC staining quality were excluded, the correlation improved (r = 0.80 for nMVD and 0.92 for aMVD).
It was not possible to perform three-compartment quantification automatically.
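The reproducibility figures in this section are correlation coefficients between two quantification rounds of the same cases. The sketch below shows how such a figure could be computed, with and without cases flagged as having poor staining quality. It assumes that r denotes the Pearson coefficient, which the text does not specify; the function names and all numbers in the example are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def reproducibility(round1, round2, keep=None):
    """r between two quantification rounds.

    keep: optional list of booleans, True for cases judged to have adequate
    staining quality; cases marked False are excluded before computing r.
    """
    if keep is None:
        keep = [True] * len(round1)
    pairs = [(a, b) for a, b, k in zip(round1, round2, keep) if k]
    return pearson_r([a for a, _ in pairs], [b for _, b in pairs])


# Hypothetical vessel counts for five cases; the last case has poor staining.
first = [10, 20, 30, 40, 25]
second = [12, 19, 33, 38, 60]
good_quality = [True, True, True, True, False]

r_all = reproducibility(first, second)
r_good = reproducibility(first, second, keep=good_quality)
```

In this made-up example the single poorly stained case lowers r considerably, which mirrors the direction of the effect reported for the automatic method.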
Assessment of the intra-observer reliability regarding selection of the highest MVD areas
Based on the subjective impression of MVD, ten images from each case were selected and ranked in descending order (1 to 10). Subsequently, the subjective ranking of the images was compared to the results from the corresponding semi-automatic quantification. Regarding nMVD, the subjective impression was in agreement with the semi-automatic quantification in 20 of 73 cases (27%). The corresponding fraction for aMVD was 16 of 73 (22%). In the discrepant cases, one or both of the images with the highest MVD, as quantified semi-automatically, were found among the images ranked from 3 to 10 by subjective impression. The correlation between results from quantification of 2 images (subjectively ranked as 1 and 2) and the total 10 was good, r = 0.67 and 0.77 for nMVD and aMVD respectively.
Quantifications in hotspot areas, chosen without further specification within invasive tumour, were performed once. The results were poorly correlated to, and generally higher than, quantifications performed in tumour front areas (results not shown). Hotspot quantifications were performed by semi-automatic techniques and no attempts were made to evaluate the reproducibility.
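The agreement criterion used here can be sketched as follows: a case counts as concordant when the two images ranked 1 and 2 by eye are also the two with the highest measured MVD. The function names, the tie-handling rule and the example values are assumptions for illustration, not part of the original analysis.

```python
def top2_agreement(values_in_subjective_order):
    """True if the images ranked 1 and 2 by eye have the two highest measured values.

    values_in_subjective_order: measured MVD per image, listed in the order of
    the subjective ranking (index 0 = image ranked 1). Ties are resolved in
    favour of the subjectively higher-ranked image (an assumption of this sketch).
    """
    v = values_in_subjective_order
    by_value = sorted(range(len(v)), key=lambda i: v[i], reverse=True)
    return set(by_value[:2]) == {0, 1}


def agreement_fraction(cases):
    """Fraction of cases in which subjective and measured top-two images coincide."""
    return sum(top2_agreement(c) for c in cases) / len(cases)


# Hypothetical measurements for three cases, ten images each.
cases = [
    [48, 45, 30, 28, 27, 20, 18, 15, 12, 10],  # concordant
    [30, 29, 41, 25, 22, 20, 19, 15, 11, 9],   # highest value in image ranked 3
    [25, 40, 22, 21, 20, 18, 17, 39, 12, 8],   # second highest in image ranked 8
]
fraction = agreement_fraction(cases)
```

With this rule, 20 concordant cases out of 73 correspond to the 27% reported for nMVD.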

DISCUSSION
We have previously applied computer-based image analysis techniques for malignancy grading of urinary bladder carcinoma with promising results regarding reproducibility (Jarkrans et al, 1995; Choi et al, 1997). In this study, we used a similar methodological approach to assess MVD.
The validity of quantitative results from IHC stainings is sometimes questioned (Wold et al, 1989; Wagner, 1993; Lambkin et al, 1994). Recently, several investigators have compared sensitivity and specificity for different antibodies against endothelial cells (Hollingsworth et al, 1995; Arakawa et al, 1997; Lee et al, 1997; Martin et al, 1997; Duarte et al, 1998), but additional factors such as pretreatment, choice of chromogen and counterstaining are seldom evaluated. Based on an extensive screening of available antibodies, pretreatments and chromogens, optimized IHC-staining protocols for visualization of the histological compartments of interest were developed. Despite our efforts, using the automatic quantification technique, 19 cases had to be excluded in this study. Using CD31 or CD34 as single antibody, instead of a mixture, did not alter the staining result. It is therefore reasonable to assume that fixation and/or histoprocessing is responsible for the poor immunostaining results. Of the excluded cases, 13 originated from three of the ten hospitals included in the study. The exclusion rates for specimens from these three hospitals were 42, 60 and 72% respectively. This clearly illustrates one of the major problems in standardization of IHC. The rate of poor quality stainings was probably increased because the IHC-staining protocols were developed and optimized on material from one of the contributing hospitals and subsequently applied to material from nine other hospitals. Potential variations in fixation time, histoprocessing regimens and material storing conditions between laboratories are known to affect the IHC-staining quality (Leong and Gilham, 1989; Fisher et al, 1994; McDermott et al, 1997). The necessity of an initial subjective judgement of the staining quality is a limiting factor in the development of an objective automatic quantification method.
We suggest that the prognostic implication of tumour angiogenesis relates to the amount and mode of growth of the tumours. Therefore, to express MVD related to tumour stroma area or epithelial fraction of the tumour might add important prognostic information. To our knowledge no such attempts have been previously made. In this study an IHC-staining protocol was developed for the staining of both the microvessels alone and the tumour cells alone. The three-compartment quantifications performed were associated with a decreased reproducibility. However, these stainings proved helpful in the screening for appropriate areas, here tumour front, for MVD quantification. The double IHC and single IHC stainings were equally suited for semi-automatic quantifications of nMVD related to total tumour area, whereas semi-automatic aMVD quantifications were less suited, regarding reproducibility, when applied to multiple IHC stainings.
Studies of MVD in urinary bladder carcinoma have investigated biopsy (Dickinson et al, 1994; Philp et al, 1996; Nakanishi et al, 1997) and cystectomy specimens (Bochner et al, 1995; Jaeger et al, 1995; Grossfeld et al, 1997). The latter should be more suitable as the relation between tumour and vasculature is more easily assessed, presumably in a more representative and reproducible way. This is further stressed by Bochner et al (1995) who state that most bladder tumours exhibit a large degree of heterogeneity with respect to microvessel density. However, the preceding transurethral resection biopsy is the basis for treatment decision and prognostication. Also, only a minority of newly diagnosed cases are operated with cystectomy. Therefore, for prognostication in clinical practice any marker has to be evaluated in the biopsy specimen.
Comparing biopsy material and corresponding cystectomy specimen, a major discrepancy in MVD was detected in one of 12 cases. This case exhibited a low MVD in the cystectomy specimen (Figure 3 A, B). An explanation may be that the most vascularized tumour fraction was removed at biopsy. There are discrepancies between the cystectomy specimens and the fragmented biopsy specimens regarding evaluation of MVD. Cystectomy specimens provide a better representation of the tumour growth pattern. 'Keyhole' view in biopsy specimens illustrates the importance and problems of representativity. More extensive MVD quantification in biopsies might also be a way to minimize the effects of heterogeneity.
Different quantification strategies have been reported. In most studies different manual counting techniques have been proposed (Weidner et al, 1991; Dickinson et al, 1994; Jaeger et al, 1995) and sometimes also compared with semi-automatic, image-analysis-aided quantification of stained endothelial cell pixel area (Barbareschi et al, 1995b; Fox et al, 1995; Kohlberger et al, 1996). The reproducibility has been reported as moderate to high, at both intra- and inter-observer level (Bochner et al, 1995; de Jong et al, 1995; Penfold et al, 1996; Philp et al, 1996). All methods described hitherto are influenced by subjective decisions made by the observer. To avoid these potential sources of intra- and inter-observer variability, an automatic method was developed and evaluated in comparison to semi-automatic quantification methods. The study was designed to analyse the effects of observer quantification interactivity. By minimizing the area selection variability, this could be done more explicitly. Thus, all quantifications were performed in a strictly defined histological area, previously demarcated on the respective slides. In this way the intra-observer reproducibility was considered as an assessment of the quantification method stability. No attempts were made to evaluate inter-observer or area screening reproducibility.
When semi-automatic quantification of nMVD and aMVD is performed on two occasions on identical images the reproducibility is high. The reproducibility for the automatic quantification here may be regarded as perfect, even if variations in light settings and focus are considered, as we have previously shown (Ranefall et al, 1998). When quantifications were performed within an outlined area, but with a potential variance in the choice of image, they were slightly less reproducible for the semi-automatic method. For the automatic quantifications this was more evident. This was due to the latter method's vulnerability to reduced staining quality and inability to recognize obvious artefacts. Exclusion of cases with visibly poor IHC-staining quality resulted in a reproducibility similar to the semi-automatic quantifications.
As observed, the intra-operator variability is more pronounced regarding estimation of area fractions than object recognition and counting. This is also evident when automatic and semi-automatic quantification of aMVD are compared, where elimination of the intra-operator variability using an automatic method improves the reproducibility (Figure 5B). In contrast, the semi-automatic method was superior in object recognition and counting, especially when specimens with poor IHC staining were not excluded (Figure 5A).
The results from the semi-automatic quantifications indicate that the subjective selection of images, even within a small outlined area, influenced the intra-observer reproducibility. This was also found for the interactive quantification steps, i.e. counting and reference point selection.
In a study performed in breast carcinoma, Martin et al (1997) concluded that finding the area with the highest MVD appeared to be the most subjective step in microvessel quantification methodology. This was based on results that showed an efficiency of only 20% in finding the 'hottest spot' in the first chosen microscopic field out of 10 counted. In this study a similar test resulted in a slightly higher efficiency (27 and 22% for nMVD and aMVD respectively). Still, in most cases an increased MVD was observed when multiple areas were used to identify the hot spot images. The correlation between MVD from 2 images only and the true highest 2 out of 10 was high regarding nMVD, but lower for aMVD. Considering that all images were selected within a limited area, the consequence was that quantification of an extensive tumour fraction led to increased hot spot recognition. Even when an apparently homogeneous fraction within a tumour is quantified, observer subjectivity in the selection of images generates method variability. Using the automatic quantification technique described here, multiple areas can be rapidly evaluated. The automatic technique reduces the quantification time by a factor of up to three to four compared to interactive measurements (data not shown) and simultaneously provides both number and area estimates of the microvessels.
The medical application of computerized image analysis is a progressing field. An increased awareness of the need for objective quantification methods together with improved equipment performance will further promote this progress. The expensive equipment has earlier restricted the accessibility to computerized image analysis techniques to major hospitals and laboratories. Today, high performance systems are available for less than £10 000, thus increasing the number of potential users.

CONCLUSIONS
Based on our results on reproducibility, we propose the following schedule for assessment of MVD in urinary bladder carcinoma biopsies:
• To perform an initial subjective evaluation of the IHC staining quality, excluding specimens of poor quality.
• To evaluate MVD related to total tumour area using a single chromogen IHC staining protocol.
• To use an automatic quantification technique that simultaneously and rapidly provides information of both number and total vessel area as well as stained endothelial cell area.
• To quantitate in a large proportion of the tumour fraction of interest.
This method is now being tested regarding its prognostic value, specifically considering the choice of assessment region in the tumour. Strictly defined regions for assessment of MVD are, in addition, crucial to an improvement of inter-observer reproducibility.

ACKNOWLEDGEMENT
This work was supported by Linnérs-Hagstrands fund and a grant from the Swedish Cancer Society (grant 2323-B97-11XAA).