Main

Aided by advanced image analysis technologies, digital pathology is revolutionizing histopathology by providing objective assessment of cellular components within tumor samples and assisting tumor grading.1, 2 To date, remarkable progress has been made to obtain clinically relevant quantitative data from pathological samples, including grade differentiating features in prostate cancer,3, 4 mitotic counts5, 6 and features for subtyping breast cancer7 and other cancer types such as oropharyngeal squamous cell carcinoma.8 The majority of these efforts have focused on the changes in tumor morphology related to cancer cells for grading or subtyping purposes. Nonetheless, digital pathology has a unique advantage towards studying the microenvironment in solid tumors because of its capacity to map the spatial context of normal cells interacting with cancer cells.

The spatial context is key to understanding the microenvironment, the intrinsic architecture of which is highly heterogeneous with profound clinical implications. The role of individual microenvironmental components in cancer development and treatment resistance has long been recognized and extensively reviewed elsewhere.8, 9, 10 In this review, we focus on the application of digital pathology for studying the intra-tumor heterogeneity of the microenvironment and discuss its clinical and biological impact in the ‘omics’ era (Figures 1a–c). We note that other methodologies such as those employing non-invasive imaging are particularly useful for offering measurements of temporal sampling, 3D vasculature and heterogeneity of metastatic tumors.11, 12 Whilst the sensitivity and specificity of imaging techniques have advanced dramatically in recent years, pathology still offers the ability to map genetic and phenotypic aberrations at cellular resolution. Therefore, the focus of this paper is on approaches that offer objective identification and consider the spatial context of individual microenvironmental components in pathological samples. Finally, we offer thoughts and potential solutions to future needs and challenges that lie ahead for translating advances in digital pathology into knowledge of the microenvironment.

Figure 1

Synergies between digital pathology offering spatial context of the tumor microenvironment and ‘omics’ high-throughput molecular profiling. (a) A hematoxylin & eosin (H&E) breast tumor section displaying high spatial variability in the tumor microenvironment in different tumor regions. (b) Digital pathology to quantify the spatial heterogeneity of lymphocytic infiltration. (c) Quantitative data at the morphological level can be directly related to ‘omics’ data at the molecular scale to discover new clues for subtyping and integrative biomarkers.

Microenvironmental Heterogeneity and Digital Pathology

From a cancer evolution point of view, differential microenvironmental conditions such as immune infiltration, hypoxia and nutrient and drug diffusion provide selective pressure and shape cancer development. Treatment resistance as a result of selection represents a main cause for treatment failure and a significant obstacle to effective cancer therapeutics.13 Microenvironmental cells orchestrate their influence on cancer heterogeneity with strong regional differences (Figure 1a). Therefore, a sufficient knowledge of the intra-tumor heterogeneity of the microenvironment is critical for understanding environmental selection. Pathological tumor sections provide the spatial context of cancer-microenvironment interactions at single-cell resolution. Such spatial data has already aided our identification of clinically relevant features, potentially yielding predictions more powerful than simple cell counts that ignore the tumor context. For example, a number of studies have shown that the spatial location of immune cells relative to invasive cancer cells is of clinical interest in many different cancer types.14, 15, 16, 17, 18 Abundance of CD8+ cells in distant stromal regions independently predicts breast cancer-specific survival.14 A high density of CD3+ cells in the invasive margin was found to be significantly associated with disease-free survival in colorectal cancer.18

However, technological development to facilitate automated identification of stromal components and rigorous analysis of their spatial heterogeneity is still in its infancy. Thus, despite substantial advances in our knowledge of the functions of individual microenvironmental components, our understanding of intra-tumor heterogeneity of the microenvironment is limited.10 Objective and reproducible methods for automated identification and statistical analysis of the spatial distribution of microenvironmental components such as immune cells, fibroblasts and vessels remain an unmet need. Development in this direction will accelerate the rate at which histology data is processed as well as the translation of our knowledge of the microenvironment into biomarkers.

During technology development, there are many challenges that need consideration, including quality control, robustness and reproducibility. Multiple factors such as tissue handling, section thickness and staining protocols can contribute to the high variability in pathological samples. To enable fair comparison across different samples, methods need to be tested to ensure robustness in accounting for such variability, preferably using samples from independent, large-scale patient cohorts. An array of high quality digital pathology solutions, including those from TissueGnostics (Austria), Definiens (Germany) and PerkinElmer (USA), among many others, have become commercially available. They offer user-friendly interfaces that enable researchers with little knowledge of image processing to perform quantitative image analysis. Image analysis platforms such as MATLAB (Mathworks, USA), ImageJ (National Institutes of Health, USA), and Fiji (an enriched ImageJ version) complement digital pathology solutions and have proven useful for technically proficient users. Thus far, a number of these techniques have been applied to the study of tumor pathological features in cancer samples, leading to significant findings.7, 19, 20, 21, 22 However, the considerable cost of commercial software may prohibit independent studies from evaluating their performance and reproducing their results. On the other hand, the availability and maintenance of non-commercial software described in research papers are not always ensured. Moreover, software, whether commercial or non-commercial, does not always support exporting spatial information for subsequent analysis.

Mapping Individual Microenvironmental Components

Digital pathology has been successfully applied for objective assessment of overall abundance or activation of various microenvironmental components, including immune cells,20, 23 cancer-associated fibroblasts24, 25 and vessels.26, 27, 28 In this section, our primary focus is on methodologies that employ image analysis techniques to assess the spatial context of intra-tumor heterogeneity in three main categories: immune cells, fibroblasts and vessels.

Immune Cells

It has become evident that the spatial context of immune cells is critical for cancer development.18, 29, 30 Therefore, although a number of gene expression signatures have been published revealing high levels of molecular heterogeneity in immune infiltration,31, 32, 33, 34 pathological assessment remains critical for discerning tumor spatial heterogeneity. Lymphocytes can be identified based on their typical morphology of small, round and homogeneously basophilic nuclei which differentiates them from other leukocytes, such as neutrophils with more elongated and segmented nuclei. Thus, in certain breast tumor types, lymphocytes can be differentiated in general from cancer cells, which have larger and more pleomorphic nuclei. This is the principle of a number of image analysis tools to identify lymphocytes in H&E sections.35, 36, 37 Following image analysis, distances between individual cancer cells and lymphocytes can be quantified using spatial data uniquely identifying cell locations.38 Such methods facilitate systematic investigation of cancer-lymphocyte interactions in large patient cohorts.

However, lymphocytes encompass diverse subclasses including helper T cells, regulatory T cells, natural killer cells and B cells. In the right context, different subclasses of lymphocytes may exhibit entirely different phenotypes with pro- or anti-tumor roles.30, 34, 39 It is thus of paramount importance to discriminate them. One of the first studies to apply rigorous spatial statistics on data from fully automated image analysis investigated clustering patterns of B cells and T cells in healthy and tumor-draining lymph nodes in breast cancer patients.40 Specifically, the L-function,41 which is a statistical test of spatial homogeneity, was applied to the location data of B cells and T cells in immunohistochemically (IHC) stained sections generated from the image analysis software GemIdent (Stanford University, USA) developed by the same group.42 The L- and K-functions43 are estimators of spatial homogeneity in a set of points, and can be used to gauge the extent of spatial clustering or dispersion in the data across any scale length (Figure 2b) in two or three dimensions. Not surprisingly, higher cell clustering was found in tumor-draining lymph nodes compared with healthy lymph nodes, possibly due to the aggregated patterns of tumor cells. Nonetheless, T cells exhibited a more pronounced clustering pattern than B cells, suggesting differential dispersion patterns between the two groups. More recent work has demonstrated the use of digital pathology to identify functionally active natural killer cells (CD3−/perforin+) and CD3+ T cells in breast cancer IHC images using a software solution from Definiens.19 Spatial analysis was subsequently performed using k-means clustering directly on the detected tumor and immune cells. k-means clustering44 is a process of partitioning data points into a pre-defined number of groups, k, and is based on minimizing the Euclidean distance between each point and a cluster center (Figure 2a).
As a result, distinct spatial patterns of cancer-immune interactions were revealed within the same tumor, highlighting spatial heterogeneity in the cross-talk between cancer cells and lymphocytes.
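To make the k-means procedure described above concrete, the following is a minimal sketch in Python and not the implementation used in the cited study. It assumes that cell positions are available as (x, y) tuples exported from image analysis; the function name `kmeans` and the deterministic farthest-point initialization are illustrative choices of ours, whereas random or k-means++ seeding is more common in practice.

```python
import math


def kmeans(points, k, max_iter=100):
    """Partition 2D points into k groups with Lloyd's algorithm.

    Returns (centroids, labels), where labels[i] is the index of the
    centroid closest to points[i].
    """
    # Assumption: deterministic farthest-point seeding, chosen here only
    # so that the sketch gives the same result on every run.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )

    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid
        # (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels


# Hypothetical cell coordinates forming two well-separated groups.
cells = [(0, 0), (1, 0), (0, 1), (100, 100), (101, 100), (100, 101)]
centroids, labels = kmeans(cells, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The number of groups, k, has to be chosen in advance, which is one reason why the outcome of such an analysis should be inspected against the tissue image before clinical interpretation.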

Figure 2

Schematic diagrams of k-means cluster analysis and Ripley’s K statistic. (a) K-means clustering used to group data points based on proximity to k (here, 4) centroids in the data, where x1 and x2 are measurements such as cell abundance, immune density and distance between cancer and immune cells, thus grouping them into different categories for subsequent analyses. (b) Ripley’s K can be used to investigate the variation in clustering with distance. Here, the distribution exhibits clustering at small scale (d1) and dispersion at large scale (d2).

Spatial analysis based on automated image processing has led to the identification of new prognostic factors by analyzing lymphocytes as well as other types of immune cells. For example, FOXP3+CD3+ regulatory T cells in the intra- and extra-follicular areas of follicular lymphoma, quantified using software from PerkinElmer,45 are associated with good outcome. In contrast, CD3+CD69+ activated T cells are prognostic only when present in the intra-follicular areas, not in the extra-follicular areas. The concept of functional distance between immune cells was highlighted in a recent paper where double IHC staining was used to study CD8+ (cytotoxic T) and FOXP3+ (Treg) cell infiltration of the tumor epithelium and stroma in 50 patients with gastric cancer.46 Using image analysis for cell count and cell–cell distance measurements determined from a model for heterogeneous spatial distribution fitted using their data, the authors discovered a functional distance range of 30–110 μm between the two types of immune cells that was associated with significant improvement in 10-year survival. The clinical relevance of considering lymphocytes as well as other immune cell types and their spatial context was again emphasized recently.47 Density of major immune cell subclasses in the center of the tumor and invasive margin was quantified separately using the Spot Browser (ALPHELYS, France) software on IHC tissue microarrays of colorectal cancer samples. For the same immune cell type, its association with clinical outcome may differ depending on its exact locale. For example, CXCR5+ cell density in the invasive margin correlates with poor outcome, yet it is associated with good outcome in the center of the tumor. In another study, spatial clusters of dendritic cells were quantitatively evaluated across 59 breast tumor samples using a density-based clustering algorithm,48 and a higher percentage of dendritic cell clustering was found to correlate with better prognosis.
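As an illustration of how a percentage of clustered cells might be derived from cell coordinates, the sketch below uses DBSCAN, one widely used density-based clustering algorithm. This is our assumption for illustration: the text above does not state which density-based algorithm the cited study used. The parameter values (`eps`, `min_pts`) and the coordinates are hypothetical.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise.

    A point is a core point if at least min_pts points (itself included)
    lie within distance eps of it.
    """
    n = len(points)
    # Brute-force neighbourhoods; adequate for a sketch, quadratic in n.
    neighbours = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # core point: keep expanding
    return labels


def clustered_fraction(labels):
    """Fraction of cells assigned to any cluster (i.e. not noise)."""
    return sum(lab != -1 for lab in labels) / len(labels)


# Hypothetical dendritic cell positions: two tight groups and two isolated cells.
cells = [(0, 0), (1, 0), (0, 1), (1, 1),
         (50, 50), (51, 50), (50, 51), (51, 51),
         (200, 200), (300, 0)]
labels = dbscan(cells, eps=1.5, min_pts=3)
print(clustered_fraction(labels))  # 0.8
```

The resulting fraction depends strongly on `eps` and `min_pts`, so any comparison across samples would need these parameters to be fixed in advance and expressed in physical units.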

Cells of macrophage/monocyte lineage represent another important class of immune cells. Compared with lymphocytes, macrophages are larger in size and have a more variable nuclear morphology.

Fibroblasts

The nuclear morphology of fibroblasts has been reported to be a function of spatial distance to mammary glands in normal control and Neu-expressed mice.49 Whilst the methodology and findings of that study need further validation on a larger set of samples, the link is interesting and can be potentially powerful for studying the role of fibroblasts. Despite these efforts, however, the use of spatial analysis to quantify the intra-tumor heterogeneity in the relationship between fibroblasts and cancer cells remains underdeveloped.

Vessels

Defined by their specific architectural arrangement in vascular structures, endothelial cells can display a variable morphology and are more specifically identified as belonging to blood or lymphatic vessels with the help of immunohistological markers. While vessel density has long been associated with clinical outcome in a number of cancers,50, 51, 52 investigation of spatial heterogeneity of vessel distribution has only recently begun. Following image analysis, distance between tumor mass and lymphatic vessels has been quantified in cervical cancer.53 Recently, the same group demonstrated that spatial distribution of lymphatic vessels along tumor edges correlates with lymphovascular space invasion and lymph node metastasis in early cervical cancer.54 The fact that in their study lymphatic vessel density, often measured in clinical studies, does not correlate with these two important clinical parameters of cervical cancer suggests that the spatial relationship between lymphatic vasculature and tumor cells may be a novel prognostic factor. Besides distance measures, approaches that utilize graphical representation of tumor architecture have been proposed.55 Following identification of CD31-stained microvessels, the spatial architecture of microvessels in prostate cancer was analyzed with four different graphical methods including the Voronoi graph and the minimum spanning tree.55 As a result, 50 features were identified representing mean and variations of the microvessel spatial pattern. Specifically, a subset of features that represent intra-tumor heterogeneity in microvessel distribution was correlated with specific Magnetic Resonance Imaging features and Gleason grades, thereby identifying non-invasive ways to monitor microvessel heterogeneity.
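The graph-based approach can be illustrated with a minimum spanning tree built over vessel centroids, from which summary features such as the mean and spread of edge lengths are derived. The sketch below is an illustration under our own assumptions and does not reproduce the 50 features of the cited study; the function names and the coordinates are hypothetical.

```python
import math
import statistics


def mst_edge_lengths(points):
    """Edge lengths of the Euclidean minimum spanning tree (Prim's algorithm)."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # shortest known link from each node to the tree
    best[0] = 0.0          # start the tree at the first point
    lengths = []
    for step in range(n):
        # Add the node outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if step > 0:       # the starting node contributes no edge
            lengths.append(best[u])
        # Relax the links of the remaining nodes through the new node.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return lengths


def mst_features(points):
    """Mean and population standard deviation of MST edge lengths."""
    lengths = mst_edge_lengths(points)
    return statistics.fmean(lengths), statistics.pstdev(lengths)


# Hypothetical microvessel centroids (arbitrary units).
vessels = [(0, 0), (3, 0), (3, 4), (10, 4)]
print(mst_edge_lengths(vessels))  # [3.0, 4.0, 7.0]
```

A large spread of edge lengths relative to the mean would indicate uneven vessel spacing, which is the kind of intra-tumor heterogeneity such features aim to summarize.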

These studies have demonstrated that the use of digital pathology opens new avenues for investigating the spatial architecture of the tumor microenvironment beyond simple cellular enumeration. Often, this involves identification of cancer and microenvironmental cells/components by analyzing images stained by means of H&E, IHC or others, before computationally modeling their spatial patterns.

Synergies between Digital Pathology and High-Throughput Molecular Profiling

Despite the advent of molecular profiling, including next-generation sequencing, histological examination remains critical for clinical diagnostics. Recently, studies have begun to highlight the synergies between molecular profiling and digital pathology techniques for an integrative understanding of cancer. This has led to the development of methods to perform tumor subtyping by data integration,56, 57 to correct molecular data using pathological data36 and to uncover clinically relevant features for outcome prediction.58

Integrative study design requires an in-depth understanding of the sample processing procedure and capacities of available image analysis and molecular profiling techniques. Freezing and formalin-fixation and paraffin-embedding (FFPE) procedures are the two major types of tumor tissue processing techniques with dramatically different implications for subsequent analysis. While frozen samples may present less well-preserved morphology, the quality of nucleic acids is generally better than that in FFPE samples; thus, frozen samples are often preferred for molecular profiling. On the other hand, FFPE sections, due to their well-preserved cell morphology, are often used for developing pathological image analysis tools,3, 4, 5, 7, 8 but obtaining good quality DNA and RNA can be problematic. Thus, for integrative analysis of molecular profiling and digital pathological data, it is a challenge to obtain quality image analysis results on sections adjacent to tumor materials that also offer well-preserved nucleic acids.

Thus far, the integration of digital pathology and omics data powered by large international consortium efforts, such as The Cancer Genome Atlas (TCGA) hosting multiple data types, holds promise for elucidating molecular aberrations and correlates with morphological heterogeneity, and ultimately identifying new integrative subtypes and biomarkers. Although initial integrative omics efforts have been published, the analyses mainly focus on the tumor rather than its microenvironment. One of the first studies to develop computational pipelines to integrate omics with digital pathology aimed at subtyping glioblastoma and renal cell carcinoma using TCGA data.56 Spatial architecture and organization of the tumor sections were considered for subtyping, although the microenvironmental structure was not explicitly modeled. In another study,59 different cell types were also not discriminated for morphometric analysis in glioblastoma. However, glioblastoma subtypes resulting from clustering analysis of morphological data were found to be enriched with specific microenvironmental cells and genomic, methylation and expression patterns. To identify morphological patterns associated with survival outcome in triple-negative breast cancer, whole-tumor section images from TCGA were used for the development of an image analysis scheme, where nuclear features were quantitatively measured using ‘superpixels’ that represent geographically separate areas of tumor mass and the microenvironment.58 Specifically, tumor images were divided into ‘superpixels’ using an entropy-based method and nuclei within the superpixels were further segmented. The average morphological features of nuclei, regardless of their cell types, were found to correlate with gene expression data of the same set of triple-negative breast tumors. Some of these features, such as the standard deviation of nuclear size, were found to have significant association with prognosis.
Although the morphological features were not cell-type specific, such a method, when extended to incorporate cell-type information, is potentially promising for investigating molecular correlates of individual microenvironmental components. Therefore, although the tumor architecture has been considered during morphological analysis and integrated with omics data,56, 57, 58, 59 the spatial arrangement and distribution of specific microenvironmental components are rarely explicitly modeled.

Future Outlook

The tumor microenvironment consists of many different cell types, with different biological roles and therefore unique relationships with cancer cells. Spatial analysis can help to elucidate these relationships in several ways. Here we categorize the spatial analysis techniques reviewed above into four types of methods.

Spatially Defined Tumor Areas

Recent studies45, 47 highlight the heterogeneity of immune infiltration in different areas within a tumor and the importance of separate analysis of these areas to study impact on clinical outcome. The added temporal component in the former study allowed the observation of spatial dynamics of immune cells as the tumor developed. The power of spatial analysis in specific areas of the tumor lies in the ability to follow up any significant findings with further molecular and genomic analyses on the same areas; however, the importance of rigorously defining constituent areas of a tumor cannot be overstated if reproducibility is to be achieved.

Measuring Spatial Distances

Facilitated by image analysis that automates the identification of hundreds of thousands of cells in pathological samples, spatial relationships between cancer and microenvironmental components can be quantitatively measured using Euclidean distance to understand the clinical implications.46, 54 These studies illustrated the benefit of using image analysis to go beyond simple cell counts towards spatial analysis based on geographical distances. Both studies analyze a distance parameter: between tumor and lymphatic vessels in one54 and between two immune cell types in the other,46 with the latter also reporting an association between spatial distance and clinical outcome in their sample set. These reports highlight yet another advantage of digital pathology: the ability to automate spatial analyses that are impossible to carry out visually.
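A distance-based measurement of this kind can be sketched in a few lines of Python. The example below computes, for each cell of one type, the Euclidean distance to the nearest cell of another type, and then the fraction of cells whose nearest-neighbor distance falls within a chosen range. The 30–110 μm range is the one reported for gastric cancer;46 everything else (function names, coordinates in μm, the brute-force search) is our assumption, and the cited study derived its distances from a fitted spatial model rather than from this simple calculation.

```python
import math


def nearest_distances(sources, targets):
    """Distance from each source cell to its nearest target cell.

    Brute force, O(len(sources) * len(targets)); a spatial index such as a
    k-d tree would be preferable for whole-slide data.
    """
    return [min(math.dist(s, t) for t in targets) for s in sources]


def fraction_in_range(distances, low, high):
    """Fraction of distances d with low <= d <= high."""
    return sum(low <= d <= high for d in distances) / len(distances)


# Hypothetical cell centroids, assumed to be in micrometers.
cd8_cells = [(0, 0), (200, 0)]
treg_cells = [(50, 0), (210, 0), (500, 0)]

distances = nearest_distances(treg_cells, cd8_cells)
print(distances)  # [50.0, 10.0, 300.0]
print(fraction_in_range(distances, 30, 110))  # one of three cells is in range
```

Because pixel coordinates must first be converted to physical units, the scanner resolution needs to be recorded alongside the exported cell locations.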

Statistical Methods Including Graphs and Clustering

The complexity of spatial patterns in the tumor microenvironment cannot be fully captured unless we venture beyond simple distance measurements.19, 48, 55 More complex spatial analysis methods include graph and cluster studies, which can shed further light on the interaction patterns of cancer cells with their microenvironment. For example, differences in the roles of the many immune cell types can make the immune response to cancer difficult to elucidate. Clustering analysis can help isolate areas within a tumor where a particular immune cell type is more likely to be found, or highlight any potential clinical correlates of having larger or more numerous cell clusters in specific tumor regions. As reported in the study by Kruger,19 it can also reveal areas of interaction patterns between cancer and immune cells that may have a clinical implication but which, possibly due to a low overall immune score of a sample, could be missed in visual assessment.

Spatial Statistics

A powerful approach to explore the spatial dimension of the tumor ecosystem is to employ statistical tools designed to incorporate spatial data, that is, spatial statistics.36, 40, 60 In ecology, spatial patterns of different species are extensively studied and there exist many well-established tools to aid these investigations. These tools have been widely applied to robustly analyze spatial variability and capture patterns of environmental interaction, e.g., the influence of hurricanes on dolphin foraging in marine ecology.61 These powerful spatial descriptive statistics, commonly used in geographical data analysis, can be applied to pathological images that have been processed by digital pathology to extract spatial data such as cell locations. Studies employing the L-function40 and the K-function statistic36 to measure spatial homogeneity in cell location data have demonstrated initial success. In one study,36 the K-function was applied to investigate the degree of clustering in cell distributions in estrogen receptor-negative breast cancer patients. It was found that extreme levels of clustering (both high and low) in stromal cells were associated with a good prognosis. The availability of spatial locations of heterogeneous ecosystem components in large tumor cohorts will provide unique opportunities to draw on the advances in spatial statistics.
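For readers unfamiliar with these statistics, the sketch below shows a naive estimator of Ripley's K-function and the derived L-function for a set of cell locations in a region of known area. It is a simplified illustration under our own assumptions: it applies no edge correction, which published analyses normally include, and the function names and coordinates are hypothetical. Under complete spatial randomness L(r) is approximately r, so L(r) − r above zero suggests clustering at scale r and below zero suggests dispersion.

```python
import math


def ripley_k(points, r, area):
    """Naive Ripley's K estimate at radius r (no edge correction).

    K(r) = area / (n * (n - 1)) * number of ordered pairs (i, j), i != j,
    whose Euclidean distance is at most r.
    """
    n = len(points)
    pairs = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and math.dist(points[i], points[j]) <= r
    )
    return area * pairs / (n * (n - 1))


def ripley_l(points, r, area):
    """L(r) = sqrt(K(r) / pi), the variance-stabilized form of K."""
    return math.sqrt(ripley_k(points, r, area) / math.pi)


# Hypothetical cell locations in a 100 x 100 field of view.
area = 100 * 100
clustered = [(10, 10), (11, 10), (90, 90), (91, 90)]  # two tight pairs
regular = [(25, 25), (75, 25), (25, 75), (75, 75)]    # evenly spaced

r = 2
print(ripley_l(clustered, r, area) - r > 0)  # True: clustering at small scale
print(ripley_l(regular, r, area) - r < 0)    # True: dispersion at small scale
```

Evaluating L(r) − r over a range of radii, rather than at a single value, is what allows clustering at small scale and dispersion at large scale to be distinguished within the same sample.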

In summary, there is a need for fully automated and quantitative spatial analysis in large sample sets of cancer pathological specimens, where complex spatial patterns of cells as well as cellular characteristics can be systematically explored. Limitations of current technologies encompass financial cost of commercial software, maintenance issues and lack of accessibility of spatial data. With the availability of databases such as TCGA and an increasing number of studies making their data and tools publicly available for reproducibility of their work, it is likely that approaches combining bioinformatics, digital pathology and omics will see rapid growth in the coming years. Rapidly advancing sequencing technology will also enable multi-region sequencing at different spatial points of reference in the same tumor, thereby creating a comprehensive map of morphology and molecular heterogeneity. With the use of multidisciplinary approaches to evaluate morphological and spatial patterns in pathological images, new prognostic indicators to be used in the clinical setting may become a reality and further light may be shed on the biological framework underpinning the relationship between the tumor and its microenvironment.