1 Introduction

In cancer research immunohistochemistry is an important technique to characterize protein expression in tumors, while preserving tissue morphology. By clarifying aberrant overexpression of proteins, prognostic factors can be discovered and highly specific therapies can be developed. A well-known example of such a tumor-specific marker is Her-2/neu in breast cancer [21].

Immunohistochemistry is based on an antibody-antigen interaction, combined with various detection techniques [19]. A frequently used method is based on a peroxidase-catalyzed reaction. The enzyme is conjugated to the primary (direct method) or secondary (indirect method) antibody, and on addition of diaminobenzidine (DAB) the labelled antibody is visualized by brown staining. The staining can be accentuated by counterstaining with haematoxylin, which stains the background tissue blue.

The interpretation of immunostains has developed rapidly from manual counting to fully automated techniques for image capture and analysis [1, 11, 13]. Although visual evaluation cannot be replaced entirely by these methods, automated image analysis has some major advantages. In clinical practice, for instance, it increases throughput and reproducibility [25]. Furthermore, in recent research even localization of protein expression at the subcellular level was achieved [5], illustrating the additional possibilities of automated analysis over manual counting. Finally, manual interpretation is subject to high interobserver variability and is semi-quantitative at best [13], whereas in a research setting quantitative information on a continuous scale can be of great importance.

Several systems for analysis have been described, varying in software, threshold selection, colour format and algorithms [22]. The reports on quantitative automated image analysis focus on various parameters, like vascular density [14], nuclear staining or intensity of staining [11, 15]. Automatic quantification of the fraction of a membrane-bound protein poses some difficulties, although automated filling operations are applied to facilitate this analysis [7, 8]. Intensity-based quantification of a membrane-bound protein has been described using a membrane isolation algorithm [6].

The automated evaluation of multiple markers is a challenging issue. Most of the automated analysis methods that are available concentrate on one marker at a time, whereas, especially in tumor cells, the co-localization of different markers can provide valuable information. The exact co-localization on a cellular level is difficult to assess when the markers have a different intracellular location, e.g. nuclear and membranous. Here, the absence of overlap on pixel level renders a binary image map comparison [10] unsuitable. Parametric mapping can solve this problem by creating an analysis grid defining square regions of multiple pixels. With this technique the whole image is subdivided into small squares. The information of all immunopositive objects or pixels within a square can be translated into numerical data, e.g. cell number, mean staining intensity, number of positively stained pixels etc. Different parameters can be combined in one value, like presence or absence of colocalization, and can be assessed for the whole tissue section. Thus, colocalization can be determined on a near-cellular level instead of a pixel level. This technique has been described previously in a different field of research for the assessment of the immunopositive cell density in the rat brain [24]. An advantage of this method is the retention of intensity values, reflecting the concentration of the stain in the tissue, which would be lost when performing binary analysis.

In this study we applied the parametric mapping method to examine two important features of malignant tumors: proliferation and hypoxia. Proliferation is an important prognostic factor in many types of cancer. Highly proliferating tumors are associated with more aggressive biological behaviour and a worse prognosis [16, 23]. Hypoxia is another adverse prognostic factor, making tumor cells more resistant to therapy as well [4]. A combination of these two features could identify a subpopulation of tumor cells that is highly relevant for treatment responsiveness [7]. Several hypoxia-related markers are available to measure the amount of hypoxia in tumors [12, 18]. For the research presented here, carbonic anhydrase IX (CAIX), a membrane-bound endogenous hypoxia-related marker, was used to assess the feasibility of this method. In future research other biologically connected markers can be investigated, for example epidermal growth factor receptor (EGFR) and Ki-67.

In this study a quantitative computer-assisted analysis of immunohistochemically stained sections using parametric mapping is described and evaluated for CAIX, the membrane-bound protein, and for the colocalization of CAIX with Ki-67, a nuclear proliferation marker, to identify the proliferating hypoxic subpopulation of tumor cells. The effect of varying the resolution of the analysis will be presented and the method will be correlated to manual scoring. Finally, multiple biopsies from the same tumor will be scored to get an indication of the intratumor variability.

2 Materials and methods

2.1 Samples

Biopsy material obtained for routine purposes of 103 patients diagnosed between 2001 and 2008 with advanced laryngeal carcinomas was retrieved from the pathology department of the Radboud University Nijmegen Medical Centre in the Netherlands. From some patients multiple biopsies of the same tumor were available. The biopsies were obtained for diagnostic purposes and had been fixed in formaldehyde and paraffin-embedded. The samples were cut in sections of 5 μm. One section was stained for haematoxylin and eosin and two consecutive sections for Ki-67 and CAIX, both with a haematoxylin counterstain.

2.2 Immunohistochemistry

Sections were deparaffinised in Histosafe (clearing agent, Adamas, the Netherlands), rehydrated through a graded ethanol series and boiled for 30 min in antigen retrieval solution. After cooling for 25 min and rinsing in PBS, endogenous peroxidase was blocked with 3% H2O2 in methanol. Then, sections were incubated with 5% normal donkey serum in PAD (primary antibody diluent, Abcam, UK) to block aspecific binding.

Sections were incubated at 4°C overnight with mouse-anti-CAIX (E. Oosterwijk, department of Urology, Radboud University Nijmegen Medical Centre) diluted 1:25 in PAD or rabbit-anti-human Ki67 (Abcam, UK) diluted 1:50 in PAD. Subsequently, the biotinylated secondary antibodies were applied; biotin-labelled-F(ab’)2-donkey-anti-mouse IgG (Jackson Immuno Research) for CAIX and biotin-labelled-F(ab’)2-donkey-anti-rabbit IgG (Jackson Immuno Research) for Ki67, followed by ABC-reagent (Vector Elite kit, Vector Laboratories) for 30 min. Peroxidase activity was detected with diaminobenzidine (DAB). Finally, sections were counterstained with haematoxylin for 30 s and mounted with KP mounting medium (Klinipath, Duiven, the Netherlands). For negative controls, PAD was added without the primary antibody.

2.3 Image processing

A digital image processing system for fluorescence microscopy [20] was adapted to scan immunohistochemically stained sections using bright field microscopy. This system consisted of a monochrome CCD camera (Retiga SRV, 1392 × 1040 pixels) and an RGB filter (Slider Module; QImaging, Burnaby, BC, Canada) attached to a motorized bright field microscope (DM6000 Leica, Wetzlar, Germany). Whole tumor sections were scanned with a 10× objective at 100× magnification using a Macintosh computer running IPLab for Macintosh (Scanalytics Inc., Fairfax, VA, USA), which controlled this motorized system and generated 24-bit colour composite images (pixel size 1.8 μm). For every scan session a separate background image was recorded from an individual microscopic field in one focus plane and used to construct composite background images corresponding in size with the tumor section scans.

To extract and separate the individual colours from the DAB (brown) and haematoxylin (blue) signals, the RGB linear unmixing module in the TRI2-software (P.R. Barber, R. Locke, R. Edens, S. Ameer-Beg, B. Vojnovic and J. Gilbey; Randall Division and Gray Institute) was applied using the “Absorption” mode with a nonnegative least squares algorithm [2]. This resulted in grayscale images in which the pixel values represent the concentration of a marker (Fig. 1, second panel, pseudo-coloured).
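The unmixing step can be illustrated with a minimal sketch. It is not the TRI2 implementation: it assumes that "Absorption" mode amounts to converting the background-corrected RGB values to optical density and solving a nonnegative least squares problem per pixel. The function name `unmix_absorption` and the stain colour vectors used in any example are hypothetical; in the study the reference colours came from single-stained control sections.

```python
import numpy as np
from scipy.optimize import nnls


def unmix_absorption(rgb, background, stain_od):
    """Unmix a bright-field RGB image into per-stain concentration maps.

    rgb, background: float arrays of shape (H, W, 3) on the same intensity scale.
    stain_od: array of shape (n_stains, 3); one optical-density colour vector
              per stain (assumed to come from single-stained reference sections).
    Returns an array of shape (H, W, n_stains) with nonnegative concentrations.
    """
    eps = 1e-6  # avoids log(0) and division by zero
    # Beer-Lambert: optical density = -log10(transmitted / incident light).
    od = -np.log10(np.clip(rgb, eps, None) / np.clip(background, eps, None))
    h, w, _ = od.shape
    mixing = np.asarray(stain_od, dtype=float).T  # shape (3 channels, n_stains)
    out = np.empty((h, w, mixing.shape[1]))
    # Pixel-by-pixel NNLS; slow but explicit, which is enough for a sketch.
    for i in range(h):
        for j in range(w):
            out[i, j], _ = nnls(mixing, od[i, j])
    return out
```

Each output channel is a grayscale image whose values scale with the amount of one stain, which is the form of input the later thresholding and mapping steps expect.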

Fig. 1

a Overview of the process of image preparation and analysis. Example of an original DAB and hematoxylin stained tumor section for Ki-67 and CAIX (top) with the resulting pseudo-colored grayscale images after linear unmixing (second row). The third and fourth row show the conversion of immunopositive objects into numerical data by parametric mapping using square compartments of 20 × 20 pixels. Note the intensity values in the resulting CAIX image map and the labeling indexes in the Ki-67 image map. b A pseudo-colored image was constructed by merging the Ki-67 (red) and CAIX (green) images to check the match with a magnification of the area indicated by the white square. Scale bars: 100 μm

The colours to which the images were unmixed were obtained from previously saved reference files. In these files colour information was stored from the blue (haematoxylin) and brown (Ki67, CAIX) signals and was selected in images that were acquired from single-stained control sections. During linear unmixing, the RGB colour image was corrected for the microscope illumination by using the ‘background’ image. For further processing of the grayscale images, cut off values for the DAB and haematoxylin signals were selected visually above the background staining for Ki-67. For CAIX, one threshold could be set for each staining series, based on the mean background of each section within the series. The cut off value was defined as the mean background + two times the standard deviation. To be able to quantitatively compare the overlap of the Ki-67 signal and the CAIX stained areas in the consecutive images, CAIX images were rotated and shifted to fit the Ki-67 images as closely as possible using the interactive “register” function in IPLab. A pseudo-coloured image composed of the corresponding grayscale images was used to check the match (Fig. 1, right panel). Large deviations (>2 cells disparity) were corrected by refitting. In case of a fragmented or damaged tissue section, it was required to divide the image in two parts and match and analyse these parts separately.
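The CAIX cut off rule (mean background plus two standard deviations) can be written out as a short sketch. Two points are assumptions rather than statements from the text: that the standard deviation is taken over the per-section mean background values within one staining series, and that the sample standard deviation is used. The function name is hypothetical.

```python
import statistics


def caix_cutoff(section_background_means):
    """Cut off for one staining series: mean background + 2 x standard deviation.

    section_background_means: one mean background value per section in the series
    (assumption: the SD is computed across these per-section means).
    """
    mean_bg = statistics.mean(section_background_means)
    sd_bg = statistics.stdev(section_background_means)  # sample SD (assumption)
    return mean_bg + 2.0 * sd_bg
```

For example, per-section background means of 10, 12 and 14 have a mean of 12 and a sample standard deviation of 2, giving a cut off of 16.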

Haematoxylin/eosin stained sections were used as a guide to manually delineate the tumor area on the Ki-67 image scan in IPLab, excluding normal tissue, necrotic areas and artefacts, creating a mask for image analysis.

2.4 Parametric mapping

As the markers investigated in this study are localized in different subcellular compartments, i.e. nucleus and membrane, a newly developed parametric mapping technique was applied to reduce the spatial information in the images. To this end, all grayscale images were subdivided into small squares of 20 × 20 pixels, corresponding to a size of 36 μm × 36 μm (Fig. 1, third panel). This size was chosen based on cell size and small deviations in the fitting of consecutive tissue sections. From the Ki-67 image and the corresponding haematoxylin image the labelling index of Ki-67 was calculated for every square, defined as the DAB-positive area divided by the total haematoxylin-stained nuclear area, expressed as a percentage (values between 0 and 100). This result was set as a new value in the corresponding square of a new image, creating a so-called parameter image map (Fig. 1, bottom panel). In the corresponding CAIX image, the average intensity of the CAIX staining was measured for each square (values between 0 and 255) and set as a new value in the corresponding square of a second parameter image map. In all cases, only the pixels with a value above the preset cut off value were included. Objects (contiguous groups of pixels above the cut off value) smaller than six pixels were considered non-specific staining or cutting artefacts and were excluded from the analysis. Using these parameter image maps, the CAIX positive area and the Ki-67 labelling index for the whole section were calculated. The CAIX positive area (or CAIX fraction) was defined as the percentage of squares in the parameter image map with values higher than zero. For Ki-67, the labelling index of the whole section was calculated by averaging the values of all squares in the parameter image map.
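The mapping described above can be sketched as follows. This is an illustration under stated assumptions, not the software used in the study: all function names are hypothetical, contiguity is taken as 4-connected (the default of `scipy.ndimage.label`), incomplete squares at the image edge are dropped, the labelling index is clipped at 100, and the tumor mask is left out for brevity.

```python
import numpy as np
from scipy import ndimage


def remove_small_objects(mask, min_pixels=6):
    """Drop contiguous groups of positive pixels smaller than min_pixels."""
    labels, _ = ndimage.label(mask)  # 4-connected by default (assumption)
    sizes = np.bincount(labels.ravel())
    keep = sizes >= min_pixels
    keep[0] = False  # label 0 is the background
    return keep[labels]


def block_sum(image, size):
    """Sum pixel values within non-overlapping size x size squares."""
    h, w = image.shape
    h2, w2 = h - h % size, w - w % size  # drop incomplete edge squares
    blocks = image[:h2, :w2].reshape(h2 // size, size, w2 // size, size)
    return blocks.sum(axis=(1, 3))


def parameter_maps(ki67, haem, caix, cut_ki67, cut_haem, cut_caix, size=20):
    """Return (Ki-67 labelling index map, CAIX mean intensity map)."""
    ki_pos = remove_small_objects(ki67 > cut_ki67)
    nuclei = remove_small_objects(haem > cut_haem)
    cx_pos = remove_small_objects(caix > cut_caix)

    # Labelling index: DAB-positive area / haematoxylin-stained area, in percent.
    ki_area = block_sum(ki_pos.astype(float), size)
    nuc_area = block_sum(nuclei.astype(float), size)
    li_map = np.zeros_like(ki_area)
    np.divide(100.0 * ki_area, nuc_area, out=li_map, where=nuc_area > 0)
    li_map = np.minimum(li_map, 100.0)  # clipping is an assumption

    # CAIX: mean intensity of the above-cut-off pixels in each square.
    cx_sum = block_sum(np.where(cx_pos, caix, 0.0), size)
    cx_count = block_sum(cx_pos.astype(float), size)
    cx_map = np.zeros_like(cx_sum)
    np.divide(cx_sum, cx_count, out=cx_map, where=cx_count > 0)
    return li_map, cx_map


def caix_fraction(cx_map):
    """Percentage of squares with a CAIX value above zero."""
    return 100.0 * np.count_nonzero(cx_map > 0) / cx_map.size


def ki67_index(li_map):
    """Whole-section labelling index: mean over all squares."""
    return float(li_map.mean())
```

Changing `size` to 10 or 40 reproduces the alternative grid resolutions; with a pixel size of 1.8 μm these correspond to squares of 18 μm and 72 μm.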

The relationship between proliferation and CAIX expression was analysed by determining the overlap for each biopsy. By combining the CAIX image map and the Ki-67 image map, every square has a value for CAIX and one for Ki-67. Overlap for the whole tissue section was defined by the percentage of squares having a value >0 for both parameters, leaving the intensity values out of consideration in the first calculations.
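The overlap measure can be sketched in a few lines. The function name is hypothetical, and the sketch assumes that the denominator is the total number of squares in the two aligned parameter image maps.

```python
import numpy as np


def overlap_percentage(caix_map, ki67_map):
    """Percentage of squares with a value > 0 in both parameter image maps."""
    caix = np.asarray(caix_map, dtype=float)
    ki67 = np.asarray(ki67_map, dtype=float)
    both = (caix > 0) & (ki67 > 0)
    # Assumption: all squares of the map count towards the denominator.
    return 100.0 * np.count_nonzero(both) / both.size
```

Because the comparison is made per square rather than per pixel, a membranous signal and a nuclear signal in the same cell count as overlapping even though they share no pixels.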

To further evaluate this colocalization and explore the possibilities of this method, 4 classes were defined. Class 1 includes squares with CAIX intensity < 100 and Ki-67 labelling index < 25, representing colocalization of weak staining patterns for both markers. Class 4 represents strong colocalization with square values of CAIX ≥ 100 and Ki-67 labelling index ≥ 25. Class 2 (CAIX < 100, Ki-67 ≥ 25) and class 3 (CAIX ≥ 100, Ki-67 < 25) include the squares with one parameter high and one low.
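The four classes can be expressed as a small sketch. The function names are hypothetical, and the sketch assumes that only squares with a value above zero for both markers are classified, with all other squares given class 0.

```python
import numpy as np


def class_map(caix_map, ki67_map, caix_cut=100, ki67_cut=25):
    """Assign each colocalized square to class 1-4; other squares get 0."""
    caix = np.asarray(caix_map, dtype=float)
    ki67 = np.asarray(ki67_map, dtype=float)
    colocalized = (caix > 0) & (ki67 > 0)
    # class 1: both low; class 2: Ki-67 high only;
    # class 3: CAIX high only; class 4: both high.
    classes = 1 + 2 * (caix >= caix_cut) + (ki67 >= ki67_cut)
    return np.where(colocalized, classes, 0)


def class_distribution(classes):
    """Relative frequency (percent) of classes 1-4 among colocalized squares."""
    counts = np.bincount(np.asarray(classes).ravel(), minlength=5)[1:5]
    total = counts.sum()
    return 100.0 * counts / total if total else np.zeros(4)
```

The cut off values of 100 and 25 are the ones stated in the text; they are exposed as parameters so that other class boundaries can be explored.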

To evaluate the effect of the size of the squares (or grid size) on the results, the analysis was repeated with a square size of 10 × 10 pixels (approximately one cell) and 40 × 40 pixels (approximately 16 cells) in a subgroup of 18 tumors.

2.5 Manual scoring

To compare manual assessment of both markers with quantitative digital analysis, two investigators (S.R. and W.P.) scored the area positive for CAIX on the microscope in a subgroup of 17 tumors, blinded for the result of the automated analysis. The whole tumor section was scored semiquantitatively per field of view at 100× magnification, the mean of all fields of view representing the final score. The labelling index of Ki-67 was scored in three fields of view in representative parts of the tumor section with a 400× magnification by counting the positive nuclei and the total number of nuclei in that field. Only dark brown stained nuclei were considered Ki-67 positive.

To validate the colocalization, the percentage of “true” colocalization and the percentage of mismatch were determined within the squares. In three sections 200 squares were scored for CAIX-positivity, Ki-67-positivity and the presence of true colocalization.

2.6 Statistics

Statistical analyses were performed on a Macintosh computer using the Prism 4.0 software package (Hearne Scientific software, Dublin, Ireland). To assess the correlation between manual and computer-assisted scores and the inter-observer variation, the Spearman correlation coefficient was calculated and Bland-Altman analyses were performed. Linear regression analysis was used to relate the results of the different grid sizes to the manual score. P-values <0.05 were considered significant.
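The study used Prism for these analyses; the same quantities can be reproduced with SciPy, as in the sketch below. The function names are hypothetical. The Bland-Altman difference is taken as manual minus automated, so that a positive bias means higher manual scores, and the limits of agreement are the mean difference plus and minus 1.96 times the SD of the differences.

```python
import numpy as np
from scipy import stats


def bland_altman(manual, automated):
    """Return (bias, lower limit, upper limit) of agreement."""
    diff = np.asarray(manual, dtype=float) - np.asarray(automated, dtype=float)
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample SD of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def compare_scores(manual, automated):
    """Spearman correlation, regression slope and Bland-Altman bias."""
    rho, p_value = stats.spearmanr(manual, automated)
    # Slope of automated score regressed on manual score; 1 means equal scale.
    slope = stats.linregress(manual, automated).slope
    bias, lower, upper = bland_altman(manual, automated)
    return {"rho": rho, "p": p_value, "slope": slope,
            "bias": bias, "limits": (lower, upper)}
```

A high Spearman coefficient shows that tumors are ranked similarly by both methods, while the slope and the bias indicate whether the absolute values also agree.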

3 Results

Biopsy material was obtained from 103 patients and from 15 of these patients two or more biopsies were available. All sections were evaluated on their suitability for analysis. Fourteen biopsies were excluded, because of the absence of invasive carcinoma or the poor quality of the sample. In total, 104 biopsies of 89 tumors were available for analysis. The tumor area of all sections ranged from 0.1 mm2 to 26.4 mm2 with a median value of 3.7 mm2.

After immunohistochemical processing, Ki-67 and CAIX gave a clear brown staining with intensities varying gradually from light brown to dark brown. Clear membranous CAIX staining was observed in most tissue sections. The frequency distribution of the CAIX fraction and the Ki-67 labelling index for all tumors as obtained by automated analysis is shown in Fig. 2. The CAIX positive area showed a range of 0%–93% with a median value of 27%. The Ki-67 labelling index varied from 0%–42% (median 14%).

Fig. 2

Results of the automated single marker analysis for CAIX and Ki-67. Frequency distribution of the relative area positive for CAIX showed a range of 0%–93% with a median of 27% (a). Examples of tumor sections with a CAIX positive fraction of 3% and 81% are depicted. The labeling index of Ki-67 varied from 0–42% with a median value of 14% (b), with a representative tissue section of 3% and 33%

To validate the automated analysis, the results were compared with visual scoring for both parameters. The area positive for CAIX showed a strong interobserver correlation (r s = 0.96, p < 0.0001, Fig. 3a). Likewise, the correlation between the observers and the automated analysis was strong (r s = 0.97, p < 0.0001 and r s = 0.93, p < 0.0001) and is shown for one of the observers (Fig. 3b). Bland-Altman analysis confirmed this correlation with a bias of only 0.5. For Ki-67 the interobserver correlation was less strong, but also significant (r s = 0.90, p = 0.0001, Fig. 3c). Again, the correlation with the automated analysis was significant for both observers and is shown for one of them. By using Bland-Altman analysis a bias towards higher scores by manual assessment was observed (Fig. 3d).

Fig. 3

Correlation of automated analysis with manual scoring. A strong relationship between CAIX score of two observers (r s = 0.96, p < 0.0001) (a) and between automated measurement and manual score (r s = 0.97, p < 0.0001) (b) was observed, with a small bias of −0.6 as determined with the Bland-Altman analysis. For Ki-67, the interobserver correlation was less strong (r s = 0.90, p = 0.0001) (c) and the correlation with the automated analysis showed a bias of 4.2, with higher manual scores (d). Dashed horizontal lines represent the mean difference and the mean difference plus and minus 1.96 times the SD of the differences

From 15 patients two biopsies from the same tumor were available for analysis. To get an indication of the intratumor variation of CAIX and Ki-67, sections from both biopsies were analyzed. The CAIX-positive area and the Ki-67 labelling index both showed a strong correlation (r s = 0.81, p < 0.0003 and r s = 0.89, p < 0.0001, Fig. 4a and b).

Fig. 4

Correlation of CAIX positive area and labeling index of Ki-67 in multiple biopsies of the same tumor. Both CAIX (a) and Ki-67 (b) showed a strong correlation (r s = 0.81, p < 0.0003 and r s = 0.89, p < 0.0001 respectively)

The effect of the grid (square) size on the results was evaluated by using different square sizes of 10 × 10, 20 × 20 and 40 × 40 pixels. For CAIX, a coarse grid resulted in higher values and a fine grid in lower values, as can be expected. Beforehand, a grid size of 20 pixels was chosen as the preferred size for the analysis, based on average cell size and small inaccuracies in the matching of the Ki-67 and CAIX sections. In addition, the results obtained with this grid size had the best fit with the absolute values found by the visual scoring of the CAIX area (slope 0.99 for 20 × 20, 0.75 for 10 × 10 and 1.15 for 40 × 40 pixels).

For the labelling index of Ki-67, varying the grid size from 20 to 10 or 40 showed little effect on the results (slopes 0.64, 0.59 and 0.62 respectively), suggesting that the outcome is independent of the size of the square compartments.

As a measure for colocalization the percentage of squares with a value greater than zero for both the CAIX parameter map and the Ki-67 labelling index map was calculated for each biopsy by juxtaposition of the image maps. A wide range of colocalization was observed (0%–76%), with a median value of 15% (Fig. 5). The percentage of true colocalization divided by the total colocalization as calculated from the automated analysis was 79%, 82% and 84% for the three sections scored. The percentage of mismatch (defined by expression of both CAIX and Ki-67 within the same square, but not in the same cell, Fig. 5c) was 5%, 10% and 8% respectively. The remaining squares showed a discordant result due to thresholding differences between visual scoring and the preset cut off value for the automated analysis.

Fig. 5

Colocalization of CAIX expression and proliferation (labeling index of Ki-67) in biopsies of 89 laryngeal tumors assessed by parametric mapping. The frequency distribution (a) shows a wide range of colocalization (0%–76%), with an uneven distribution towards lower values. A pseudo-coloured merged image with an overlay grid illustrates our definition of colocalization in this study (b). c Schematic representation of mismatch (left) and “true” colocalization (right) within a square

Two examples with a comparable percentage of overall colocalization (28% and 38%), but an opposing class distribution, taking into account the intensity of CAIX staining and the value of the Ki-67 labelling index, are depicted in Fig. 6. In tumor 66 the colocalization consists mainly of areas with low CAIX intensity and high Ki-67 labelling index (class 2). In tumor 71-II the class distribution is totally different with 70% of the colocalization consisting of areas with high CAIX intensity and low Ki-67 labelling index (class 3).

Fig. 6

Example of two tumors with a different degree of colocalization. The squares of a whole tissue section were subdivided into four classes: class 1 contains squares with CAIX intensity < 100 and Ki-67 labelling index < 25, class 2 CAIX < 100 and Ki-67 ≥ 25, class 3 CAIX ≥ 100 and Ki-67 < 25, class 4 CAIX ≥ 100 and Ki-67 ≥ 25. The relative frequency distribution of tumor 66 (a) shows more class 2 square compartments, tumor 71-II (b) more class 3, representing a different degree of colocalization

4 Discussion

Recognizing specific tumor markers that can predict outcome or response to treatment can be of great value in therapy decisions in many different types of cancer. Various single markers have already shown prognostic value, but as the aggressiveness of a tumor is a complex interaction of different pathways, knowledge about colocalization of multiple markers could be of great importance.

In this report we describe an automated computer-assisted analysis of immunohistochemically stained tumor sections using parametric mapping for simultaneous quantification of Ki-67, a nuclear proliferation marker, and CAIX, a membrane-bound hypoxia-related marker. We assessed the staining percentages of CAIX and Ki-67 individually and the percentage area showing colocalization of the two markers. To our knowledge, this is the first study that uses the parametric mapping technique to compare two biologically connected features located at different subcellular compartments.

This parametric mapping method has two strengths. The first is its potential to determine the relative tumor area positive for a membrane-bound protein in an accurate way, correlating strongly with manual assessment. The second is the measurement of colocalization of markers with different intracellular locations for a whole tumor section, which is almost impossible to assess by visual scoring. Although we focused on CAIX and Ki-67, this method is applicable to many other tumor markers, for example epidermal growth factor receptor (EGFR) and Ki-67 or hypoxia-inducible factor 1α (HIF-1α) and CAIX, and can therefore be of great significance for future research.

As our parametric mapping technique requires grayscale images as input, we first had to separate the individual colours in the sections. In this study, the resulting images obtained by the linear unmixing algorithm were of good quality and, despite a high background staining, suitable for further analysis. CAIX showed a strong membranous staining, of which the cut-off value could easily be set against the background. The correlation of the automated results for CAIX and Ki-67 with manual scoring was high. No systematic errors were shown for CAIX by the Bland-Altman analysis, making the parametric mapping technique an accurate method for assessing the relative CAIX-positive tumor area in whole tissue sections. Although the results with varying grid sizes all showed a strong correlation with visual scores, the linear regression slope of the 20 × 20 pixel square was closest to one. Therefore, in absolute values this was the best estimation compared to manual scoring.

This is a somewhat unexpected finding, as theoretically the ideal square size would contain one cell. However, due to heterogeneity in cell size and tissue structure this ideal size is difficult to determine. Considering the other goal of our research, assessing the overlap of the two markers in consecutive tissue sections, the 10 × 10 pixel square would likely be too small. In conclusion, the 20 × 20 pixel size is suitable for analysis of the relative positive area of a membrane-bound marker.

Although the colocalization of CAIX and Ki-67 cannot be assessed exactly on a cellular level, this is a very close approximation. For the analysis of the colocalization the size of the square compartments of 20 × 20 pixels seemed appropriate. This size includes at least one cell and can correct for small errors in the fitting of consecutive sections, without the loss of information. As the deviations in fitting were very small and because of the regional pattern of CAIX staining [3], this is a decent and viable approach to assess the degree of colocalization. This is supported by the finding that only 5%–10% mismatch is present, when observing the colocalization at the level of the square compartments.

As becomes clear in Fig. 1, information about intensity values of CAIX and labelling index of Ki-67 remains present in the parameter image maps. This information was used to create classes of colocalization, with class four representing the strongest colocalization with intense CAIX staining and a high Ki-67 labelling index. This is illustrated in Fig. 6 with the relative frequency distribution of these classes of two different tumors. Tumor 66 and tumor 71-II have similar overall colocalization rates but show a different distribution of colocalization over the various classes, possibly representing dissimilar biological behaviour. This is just an example of how the information in the image maps can be used. For other parameters alternative calculations and comparisons can be done with numerous possibilities.

Segmentation of the Ki-67 signal was sometimes difficult due to the gradual intensity scale and was subject to interpersonal interpretation. The large interobserver variation in the manual scores of the labelling index of Ki-67 confirmed this. Other authors have encountered this problem as well [9, 17]. This could possibly be improved by using a different staining method, such as immunofluorescence, or altered background stain. Another issue is the higher overall labelling index for Ki-67 obtained with manual scoring compared to the automated analysis. This is probably due to the intensity scale of the DAB staining and the difference of manually counting positive nuclei on the microscope with a 400× magnification and setting a threshold on a tissue image scan. In the first situation the estimated cut off value will be lower, resulting in a higher value. However, most importantly, the relative scoring of the observers and the automated analysis correlated well, indicating that with different observers as well as with the automated analysis tumors are ranked similarly.

As compared to a high throughput method such as tissue microarray, our parametric mapping technique is a labour-intensive method due to the delineation of the tumor areas and the matching of the tissue sections. However, a large disadvantage of tissue microarrays is the use of small tissue cores, leaving the heterogeneity within a tissue section out of account and increasing the risk of sampling error. Analysis of whole tissue sections is a major advantage of the current method. This is supported by the strong correlation seen between CAIX fractions in multiple biopsies from the same tumor. As described before, there can be large intratumor heterogeneity with respect to CAIX fraction [8] and Ki-67 labelling index [26]. The wide range we observed in CAIX staining (0%–93%) can be another explanation for this remarkable finding. A good correlation is more likely to occur in a wide range of values, than with small intertumor variations. For the Ki-67 labelling index a strong correlation between biopsies was shown as well, although the data showed a smaller, but still considerable range (0%–42%). Therefore, it can be concluded that if the range of observed values is sufficiently wide, a random biopsy could give a fairly representative indication of the hypoxic or proliferation status of a tumor. A marker with little variability would be less informative anyway as it has less potential to discriminate between tumors with different biological behaviour.

In conclusion, research on verification of specific tumor markers runs parallel to the development of targeted therapies. This enables a more elaborate prediction of tumor response and the selection of patients for a specific therapy. Automated quantification of multiple markers in immunohistochemically stained tumor sections can be of great use to achieve this goal. By parametric mapping image maps are created for each marker containing numerical data representing a particular biological feature (area, intensity, labelling index) quantitatively. Once these image maps are prepared for analysis and a threshold is set, multiple analyses can be performed, depending on the research question. Moreover, when using multiple markers, the relationship between the corresponding biological features can be studied quantitatively independent of their subcellular localization, even in adjacent tumor sections. This parametric mapping technique can have a wide application in cancer research and patient selection.