Abstract
Biomarkers are indispensable for precision medicine. However, focused singlebiomarker development using human tissue has been complicated by sample spatial heterogeneity. To address this challenge, we tested a representation of primary tumor that synergistically integrated multiple in situ biomarkers of extracellular matrix from multiple sampling regions into an intratumor graph neural network. Surprisingly, the differential prognostic value of this computational model over its conventional nongraph counterpart approximated that of combined routine prognostic biomarkers (tumor size, nodal status, histologic grade, molecular subtype, etc.) for 995 breast cancer patients under a retrospective study. This large prognostic value, originated from implicit but interpretable regional interactions among the graphically integrated in situ biomarkers, would otherwise be lost if they were separately developed into single conventional (spatially homogenized) biomarkers. Our study demonstrates an alternative route to cancer prognosis by taping the regional interactions among existing biomarkers rather than developing novel biomarkers.
Similar content being viewed by others
Introduction
Biomarkers improve patient care and impact therapeutics development^{1}. The risk of a disease, e.g., recurrence of cancer, is routinely assessed by prognostic biomarkers^{2}. The development of tissue prognostic biomarkers for breast cancer (exemplary of most diseases) has historically followed the onebiomarkeratatime approach, in which candidate biomarkers underwent representative multiregion sampling to suppress the spatial intratumor heterogeneity highlighted by multiregion sequencing and gene expression^{3}. The multiregion sampling may take a virtual form such as multiple microscopic views of one histologic tissue section, or a physical form such as multiple microdissections from one tissue section, multiple tissue sections/blocks or tumor microarray cores from one primary tumor, and specific multisite sampling^{4}. The former is frequently used in standard (H&E) histology and immunohistochemistry (IHC), whereas the latter in molecular profiling^{5} (MP) such as multigene assays^{6}. Regardless of the associated natures (morphological vs. molecular) and methods (H&E, IHC, and MP), single spatially homogenized biomarkers robust against the intratumor heterogeneity have been sequentially developed from underlying in situ biomarkers and included into various multivariate prognostic models for clinical use (Fig. 1a).
This homogenization of biomarkers is beneficial due to: (i) reliable pathology reports with less interobserver discordance; (ii) simple patient stratification for biomarker validation; and (iii) easy integration with other noninsitu biomarkers into various prognostic models^{7}. On the other hand, it is costly due to: (i) the discordances between primary tumors and distant metastases, e.g., discordance of estrogen receptor (ER) (or progesterone receptor PR) observed in 7–25% (or 1649%) of breast cancer patients^{8}; (ii) the subjectivity of diseasedependent criteria and thresholds, e.g., the histological grade based on averaged tubule formation but the highest mitotic count and degree of nuclear pleomorphism over sampled regions/views^{9}; and (iii) small number of homogenized biomarkers in comparison to in situ biomarkers, which contributes to few reported biomarkers implemented into clinical practice^{10}. A natural question arises whether the homogenization selfevidently leads to a high benefittocost ratio, or whether it is selfevident to conduct breast cancer prognosis using the homogenized (rather than in situ) biomarkers (Fig. 1b).
In the simplest case of 1to1 correspondence between one homogenized biomarker and one in situ biomarker (e.g., regional percentage of ER + tumor nuclei vs. ER overall status, Fig. 1a), the benefits of the homogenization generally outweigh the corresponding costs because the candidate biomarker (ER) is not placed in the context of (does not spatially interact with) other biomarkers. However, ER overall status is rarely used as a standalone prognostic biomarker but routinely combined with other single homogenized biomarkers in a multivariate prognostic model. In this case of NtoN correspondence between homogenized and in situ biomarkers, the balance of benefittocost ratio may tip against the homogenization owing to the loss of information on implicit biomarkerbiomarker interactions from multibiomarker spatial heterogeneity (Fig. 1c). That is, multiple in situ biomarkers with coregistered multiregion sampling may derive differential prognostic value from these interactions over the multivariate nongraph prognostic model wherein the in situ biomarkers are developed into the corresponding homogenized biomarkers (Fig. 1c). This “hidden” prognostic value has not been demonstrated to date possibly due to: (i) technical difficulty to spatially coregister many in situ biomarkers among consecutive sections and different microscopic methods or scales (H&E and IHC); (ii) high cost of multiregion MP for a large cohort of patients; and (iii) lack of a computational representation for the biomarkerbiomarker interactions.
We aim to demonstrate this value by representing the recently reported tumorassociated collagen signatures^{11} (TACS18), i.e., multiple coregistered biomarkers from multiphoton microscopy (MPM), with a graph neural network^{12} of primary breast tumor^{13}. It should be noted that TACSs and other collagenbased biomarkers have been extensively studied^{14} and explored for clinical translation^{15}. The resulting prognostic model of intratumor graph neural network (IGNN) not only forms a highperformance kernel to include additional in situ biomarkers (from H&E, IHC, MP, etc.), but also provides post hoc interpretations of various implicit interactions among the included biomarkers with distinct biological/medical insights (Fig. 1a−c). Our distanceless IGNN model based on MPM may motivate the development of similar models based on other imaging and nonimaging tissue assessment technologies (Supplementary Table 1), as long as the biomarkers possess spatial heterogeneity.
Results
Construction of personized IGNN structure
For a breast cancer patient characterized by one histologic formalinfixed paraffinembedded (FFPE) tissue block, the coregistration between H&E and MPM from two consecutive histologic tissue sections (4 µm) ensured the localization of several local MPM regions in global H&Erevealed tumor area and extraction of in situ TACS data^{11}. One section was stained with H&E for whole slide imaging, in which a pathologist confirmed the presence of tumor cells and their borders. Dependent on the size of tumor area for sufficient sampling, several (4–20) ~2.8mmsized nonoverlapping regions of interest (ROI) were located mainly at the tumor invasive front and then labeled (numbered) in the H&E images. The other (unstained) section was deparaffinized by alcohol and xylene to collect labelfree dualmodal MPM of second harmonic generation (SHG) and twophoton excited (intrinsic) fluorescence (TPEF) images^{16} for all labeled ROIs (Fig. 1d).
Based on the resulting personized MPM regions and heterogeneous regional distribution of TACS18 (8 in situ binary biomarkers), we constructed the IGNN to represent the MPM regions (nodes) with specific TACS18 distributions (node attributes present as 8bit vectors) and their interactions (edges) with learnable parameters (Fig. 1d). This structure cast the prognostic prediction of many patients as a supervised Cox proportional hazards regression learning task by establishing a nonlinear mapping between TACS18 spatial heterogeneity and observed prognosis (Supplementary Fig. 1 and Supplementary Fig. 2). It consisted of multiple stacked nonlinear functional layers (Fig. 1d, Supplementary Fig. 3, and Supplementary Table 2), which were trained endtoend by back propagation algorithm. First, graph convolution was used to represent intratumor graph nodes and infer their regional interactions. Two graph convolution layers followed neighborhood aggregation framework^{17} in which the node attribute embeddings of graph structure were alternately updated by implementing messagepassing mechanism^{18}. Then, attention mechanism and optional gated recurrent units (GRU)^{19} were introduced within graph convolution layers to capture more distinguishing features and reduce the gradient disappearance and oversmoothing problems during the model training phase. Later, the convoluted graph representation from all node attribute embeddings was further aggregated within a global pooling layer and abstracted by a fully connection layer. Finally, the graph representation was converted into IGNN score by a prognostic risk prediction layer to predict patient prognosis.
As a framework based on neural network, our IGNN might incorporate other quantifiable prognostic biomarkers as multimodal input by adding the appropriate multilayer perceptron module^{20} and feature fusion layer before the first full connection layer. To evaluate whether combined IGNN score and traditional clinicopathological factors could synergistically improve the performance of cancer prognosis, we developed an extended model (IGNNE) by incorporating clinical information (routine prognostic biomarkers) to the basic IGNN model (Supplementary Fig. 3 and Supplementary Table 3), just like how an extended model (Nomogram) of multivariate Cox proportional hazard regression was developed by incorporating this clinical information to the basic TACS18 model^{11,21}. The cost of biomarker homogenization was assessed by the differential prognostic value of the IGNN (or IGNNE) model over the TACS18 (or Nomogram) model (Fig. 1c, d).
Differential prognostic value from diverse patients
We compared four prognostic models with different biomarkers (Supplementary Table 4). Using a training cohort of 731 patients, we performed the prevalidation^{22} based on 3fold crossvalidation for each of the four models (Supplementary Fig. 1). Personalized prediction scores for each instance held out during crossvalidation were computed, and then all results from single crossvalidation steps were combined as the modelspecific validation results for the training cohort to assess their risk stratification capabilities (Supplementary Fig. 4 and Supplementary Table 5, see Methods). We subsequently retrained these models using the whole training cohort and finally validated the trained models using an independent validation cohort of 264 patients to further assess their prognostic value (Supplementary Fig. 1). For both cohorts in external validation, a correlation analysis was conducted between model scores and diseasefree survival (DFS) through the Pearson correlation, which indicated stronger association of IGNN (or IGNNE) score with DFS than TACS (or Nomogram) score (Supplementary Fig. 5a). The relative strength of various prognostic biomarkers to predict DFS, which was assessed by multivariate Cox proportional hazard regression analysis (Supplementary Table 6), indicated stronger dominance over other prognostic biomarkers by the IGNN score than the TACS score (Supplementary Fig. 6). Just like the TACS score, the IGNN score functioned as an independent prognostic factor along with tumor size, lymph node status, and molecular subtype (Supplementary Table 6).
We next calculated modeldependent survival curves in which patients in the training (or validation) cohort were stratified into low and highrisk groups via the optimal cutoff point dictated by the twosided logrank test statistics in model training phase. In comparison to the TACS18 (or Nomogram) model, the IGNN (or IGNNE) model produced more diverged survivals for low and highrisk patients and more diverged distributions of predicted score for patients with DFS less versus more than 5 years (Supplementary Fig. 5b). The stronger prognostic strength of the IGNN (or IGNNE) model over the TACS18 (or Nomogram) model was also revealed by larger values of hazard ratio (HR) (stratification ability), concordance index (Cindex), integrated cumulative/dynamic AUC (iAUC), the associated area under receiver operating characteristic curves (AUC) (Fig. 2a), and specificity/sensitivity to predict 5year DFS rate (Supplementary Table 7). The relatively large number of patients from rather disparate populations and institutions in southern (training cohort) and northern China (validation cohort) strengthened the statistical significance of the IGNN model free of new biomarkers.
All these results demonstrated the differential prognostic value of the IGNN over TACS18 model by simply accounting for the biomarkerbiomarker interactions from multibiomarker spatial heterogeneity. With no new biomarker beyond TACS18 (8 in situ biomarkers), our IGNN recovered this differential prognostic value from regional interactions among TACS18, which approximated the differential prognostic value from the combination of routine biomarkers (i.e., the Nomogram over TACS18 model) (Fig. 2a), particularly for patients with a small (<2 cm) tumor (Fig. 2b). These two differential prognostic values were largely additive from the TACS18 to IGNNE model (Fig. 2a, b), indicating rather independent prognosis of these regional interactions from the routine prognostic biomarkers. Thus, the combined IGNN score and traditional clinicopathological factors synergistically empower cancer prognosis.
Assessment on patient subgrouping and informed treatment
To examine how the observed differential prognostic value benefited different subgroups of patients, we combined the patients in both cohorts (n = 995) and divided them into different subgroups according to various clinicopathological factors (Supplementary Table 8). Interestingly, the observed differential prognostic value of patients with a small ≤2 cm tumor approximated that from the combination of routine biomarkers, whereas patients with a moderate 2–5 cm or large >5 cm tumor obtained a less significant result (Fig. 2b). The observed disparity in differential prognostic value was likely caused by the representative (nonrepresentative) multiregion sampling of a small (large) tumor using one section from a FFPE tissue block (~2 × 2 × 0.4 cm^{3}). This disparity might be correlated with similar disparities observed from different histological grades, nodal statuses, and TNMstages (Supplementary Fig. 7).
The single most surprising result was the large differential prognostic value from patients with a small tumor (n = 445). Consistently, the dominance of the IGNN score (in contrast to the TACS score) over other prognostic biomarkers was more pronounced in this subgroup of patients than the whole group (Fig. 2c vs. Supplementary Fig. 6). This implied that the recovery of previously neglected regional biomarkerbiomarker interactions among existing biomarkers could empower cancer prognosis more than the addition of new prognostic biomarkers. Thus, for this subgroup of patients, in situ biomarkers should be chosen over homogenized biomarkers for more accurate cancer prognosis.
To evaluate the benefit of the IGNN model beyond risk assessment, we investigated the corresponding utility to inform the postoperative adjuvant therapy of all 995 patients according to the consensus treatment guideline^{23}. Both TACS score and IGNN score reclassified a significant portion of lowrisk (highrisk) patients according to the guideline into highrisk (lowrisk) patients, and thus could minimize undertreatment (overtreatment) (Fig. 3a). Based on the retrospectively determined 5year DFS rate, in contrast to the TACS score that might spare 125/112 patients from overtreatment/undertreatment according to the guideline, the IGNN score would spare 138/106 patients from this undertreatment/overtreatment (Fig. 3b). As to the 445patient subgroup, in contrast to the TACS score that might overtreat 94 patients, the IGNN score would overtreat 52 patients and thus spare 42 patients from this overtreatment (Fig. 3c, d).
To sum up, the shift from the TACS to IGNN model not only empowered cancer prognosis but also could improve subsequent informed treatment, particularly for patients with a small tumor (representative multiregion sampling).
Post hoc interpretations of biomarkerbiomarker interactions
How the IGNN model learns the implicitly prognostic interactions among TACS18 is valuable to understand the TACS18 themselves beyond morphology. The message passing of the graph convolutional layers allows nodes to progressively aggregate the most relevant messages from their neighborhoods via the attention mechanism. In the training process, observable and understandable model responses are important to investigate how the node states evolve in individual patients. Without loss of generality, we randomly interrupt a training process of the IGNN model after sufficient training without any prior knowledge (see Methods), then obtain the corresponding feature vectors from the input and graph convolutional layers of the trained IGNN (Fig. 1d) and compute the similarity between regional pairs through the heat map of Pearson correlation coefficient matrix (Supplementary Fig. 8). The initial node state is encoded according to regional presence of TACS(s). The correlation of the regional pair with the initial state reflects the extent to which they contain the same TACS(s). However, as nodes and their neighbors interact through the graph convolutional layers, the node state vectors are constantly updated, resulting in the correlation changes among the nodes.
We apply the above method to the training cohort and find that node (ROI) state feature output from graph convolutional layer 2 exhibits distinct pattern and order according to regional interactions among TACSs. The heat map clearly divides thousands of MPM regions from 731 patients into three clusters. The regions with TACS5 or TACS6 accounts for 79% in Cluster 1, the regions with TACS4 accounts for 90% in Cluster 2, and the regions with TACS1 accounts for 92% in Cluster 3 (Fig. 4a). Analysis of similarities (ANOSIM) of the output feature vectors from the graph convolutional layer 2 for the nodes (ROIs) shows significant differences between the three clusters (Fig. 4a). Similar results are obtained from the validation cohort (Fig. 4b) and patients with a small tumor (Fig. 4c), indicating that the IGNN model captures prognostic value of TACS18 interactions by learning from a specific task.
A number of post hoc interpretations of TACS18 emerge in consistency with their morphologies (Fig. 4d): (i) high positive correlation between regions with TACS5 and TACS6 indicates their converging function and possibly synergistic interaction in tumor cell invasion; (ii) low correlation between regions with TACS4 and TACS5,6 implies that their effects on tumor cell invasion are relatively independent or different (TACS4 appears to block tumor cell invasion); (iii) low or negative correlations between regions with TACS1 and regions with TACS4,5,6 confirm that TACS1 is characteristic more of tumorigenesis than tumor cell invasion; and (iv) no clusters with high correlation between intracluster regions contain a high percentage of regions with TACS2, TACS3, TACS7, and TACS8, which may be due to the relative rarity of TACS2,3,7,8 in the samples. Additionally, the proportions of TACS2,3,7,8 are unevenly distributed across different clusters (Fig. 4a–c, middle), and ANOSIM shows that ROIs with TACS3 or TACS8 correlate more with Cluster 1 than other clusters while ROIs with TACS2 (TACS7) correlate more with Cluster 3 (Cluster 2) (Supplementary Fig. 9). Considering collagen morphologies, we hypothesize that TACS2,3,7,8 affect survival by interacting with other TACSs. For example, sparsely distributed collagen fibers at the tumor invasion front (TACS8) is conducive to TACS5,6 that enables tumor cell migration, while densely distributed collagen fibers at the tumor invasion front (TACS7) is conducive to TACS4 that restricts tumor cell migration. It should be noted that these insightful interpretations are not available from the TACS18 model with homogenized TACSs^{11}.
Another post hoc interpretation arises from the correlation analysis between individual TACSs and TACS/IGNN score, which shows a generally lower correlation of individual TACSs with IGNN score in comparison to the TACS score (Fig. 4e). Because the IGNN model accounts for the interactions among TACSs, the prognostic importance (absolute values of Spearman’s rank correlation coefficients) of TACSs becomes less in the IGNN model vs. TACS18 model (Fig. 4e).
Discussion
The results demonstrated in this study promote a shift of disease prognosis from conventional (spatially homogenized) biomarkers to in situ biomarkers that reflect intratumor heterogeneity (Fig. 1a, b). Also, for existing homogenized biomarkers (e.g., ER in Fig. 1a), the corresponding in situ biomarkers may be added to TACS18 to further improve the IGNNE model (Fig. 2a). The required coregistration of in situ biomarkers among MPM, IHC, and H&E can be ensured by collecting and examining consecutive FFPE tissue sections. Moreover, the individual biomarkers of IGNN may be expanded from binary biomarkers (TACS18) to those with multiple or continuous values (e.g., regional percentage of ER+ tumor nuclei, Fig. 1a), in order to retain full prognostic information. Finally, further optimization of region size (2.8 mm in this study) for more informative spatial biomarker distribution (Supplementary Fig. 10) and optimization of model depth for a larger patient size may increase the performance of prognosis. Thus, costeffective precision medicine may be obtained by taping current underappreciated biomarkers rather than developing novel biomarkers, because the hidden prognostic value of regional biomarkerbiomarker interactions can be recovered by an IGNN model.
This study exploits intratumor heterogeneity to improve biomarker utility, defying the conventional wisdom that intratumor heterogeneity is dismal for biomarker development^{24}. A few other studies attempted to derive prognostic values from the spatial heterogeneity of cellular^{25} or genetic biomarkers^{26,27}. However, whether biomarker intratumor heterogeneity can serve as a practical prognostic biomarker remains controversial^{28,29}. Here we avoid this controversy by deriving prognostic values from regional biomarker distributions (TACS18 model) and biomarkerbiomarker interactions (IGNN model) originated from the intratumor heterogeneity, rather than from the diversity of regionally distributed TACS18 corresponding to the intratumor heterogeneity itself.
As an early attempt to exploit the benefits of intratumor heterogeneity, we have chosen a size of ~2.8 mm for ROIs, each of which produces ~1.3 distinct features of TACS18 (Supplementary Table 9). For this size, representative sampling of the tumor invasion front of a typical histological section requires 420 ROIs, depending on the dimension of the tumor area. The number of ROIs seem adequate because random removal of 20% of ROIs for individual patients results in rather small inconsistency in prognosis (Supplementary Table 10). Thus, it is unlikely that our particular choice of ROIs (number and position) introduces an arbitrary level of intratumor heterogeneity. However, our combination of ROI size, sampling site (invasion front), and number of ROIs may not be optimal to attain the full potential of the IGNN model. Future efforts to balance the tradeoff between ROI size (content of TCASs in one region) and ROI number (content of TACSTACS interaction among regions), and the tradeoff of overall TACS content versus cost, may yield better overall performance in a clinical setting. The extension of TACS content from tumor invasion front to tumor center may further improve the performance.
Our multibiomarker IGNN enables unique integrative research on tumor microenvironment from the perspective of cancer risk. The corresponding model based on MPMrevealed extracellular matrix remodeling^{30} achieves a high performance (Figs. 2 and 3) among tumor microenvironmentcentered prognostic models based on MPrevealed stromal gene expression^{31}, H&Erevealed stromal architectures^{32}, H&Erevealed tumorinfiltrating lymphocytes^{33} or stromal cells^{34}, and IHCrevealed specific immune cells^{35}. This high performance may arise from the general role of extracellular matrix in modulating the hallmarks of cancer^{36}. Highperformance cancer prognosis has not been attained in previous studies^{14,37} on TACS13 and aligned collagen possibly due to their emphasis on developing single and/or homogenized biomarkers. To further improve the IGNN model, it is reasonable to complement TACS18 with the above tumor microenvironment components by coregistering the corresponding biomarkers (Fig. 1c). A more complete picture of tumor microenvironment should further include tumor cell components (e.g., histological or molecular type discussed above). The crosstalk among various tumor cell and microenvironment components may be interpreted post hoc from the regional biomarkerbiomarker interactions (Fig. 4). It will be interesting to compare the independent generation of prognostic values under the conditions of single versus multiplebiomarker spatial heterogeneity (Fig. 1c).
The graphbased prognostic model improves the performance of its conventional nongraph counterpart, and at the same time, retains or enhances the interpretability of the underlying biomarkers. In contrast to various prognostic morphological biomarkers derived from graphbased^{38} and nongraph machine/deep learning in computational histopathology^{39,40,41,42}, our in situ biomarkers (regional heterogeneity of TACS18) were identified in a discovery process^{11} independent of subsequently developed IGNN prognostic model (Fig. 1b), so that they are more amendable for human interpretations (Fig. 4). Also, the contextual information connecting sampling regions are conveniently represented by the edges connecting the nodes (Fig. 1d), relieving the somewhat challenging task in typical deep learning to model the spatial correlations between neighboring patches (by combining low and highresolution inputs)^{43}. Our IGNN significantly departs from prior GNN research that centers graph nodes on cells^{44}, which assumes that tumor cellcell distances and interactions have the highest prognostic value^{13}. Strikingly, the “functional” interactions among TACSs recover a differential prognostic value larger than that of combined routine prognostic biomarkers for small tumors (Fig. 2b), without accounting for internode distances in the graph edges (see Methods). It is reasonable to assume that future inclusion of these distances may further improve the performance of cancer prognosis, particularly that associated with large tumors after more representative multiregion sampling.
One perceived weakness of our approach to manually recognize TACS18 may be overcome by an automatic recognition procedure after weakly supervised deep learning^{45}. Also, our IGNN model is restricted to adjuvant therapy patients, and is not applicable to neoadjuvant therapy patients whose tumors and TACS features have been perturbed by the neoadjuvant therapy^{46}. Moreover, the insufficient sampling of a tumor with specific selection of ROIs may introduce prognosis uncertainty at individual level (but not necessarily population level) due to uncertain intratumoral heterogeneity such as stroma versus cellularity ratio. On the other hand, our IGNN highlights an overlooked advantage of biospecimen sampling using multiple lasercapture microdissections and tissue microarray cores from one sample over homogenized tissue sampling^{47} and liquid biopsy. Although this study is limited to a specific disease (breast cancer) and prognostic prediction (predicting DFS), similar methodology can be adopted for other diseases and prognostic predictions, e.g., predicting treatment response to a therapy/drug.
Methods
Patients, samples, image acquisition, and TACS recognition
This research used anonymous data for retrospective study and was conducted under a protocol approved by the Institutional Review Boards (IRB) of Fujian Medical University Union Hospital and Harbin Medical University Cancer Hospital. Our study included a training cohort of 731 patients from Fujian Medical University (FMU) Union Hospital and a validation cohort of 264 patients from Harbin Medical University (HMU) Cancer Hospital. The demographic and clinicopathologic characteristics of patients, sample preparation, MPM image acquisition, and TACS recognition have been reported previously^{11}. Only a small portion (~5%) of patients has multifocal tumors, and in this case, the sample preparation was performed on one of the foci with the largest size. Raw data including TACS coding observed from MPM imaging, clinical factors and followup information of patients is available with this paper.
Graph dataset
As typical graph machine learning models, IGNN and IGNNE were implemented with specific irregular graph structures as input. We developed a toolset based on the PyTorchGeometric library to build special graph structures from raw data of patients and generate specialized graph dataset (TACS_G) containing nodelevel attributes and connectivity matrix from a batch of graph structures, and the toolset and preestablished TACS_G dataset for this study are available within the source code.
Personalized IGNN structure
For each given tissue section (patient), TACS distribution is encoded in a graph data structure defined as G = (V, E), which consists of an Nnode set V and an edge set E. Each ROI in MPM image in the set ROI_{set} = {ROI_{1}, ROI_{2}, … , ROI_{N}} is regarded as a node object in V. An 8bit binary coding vector \({{{\bf{v}}}}_{i}\in {\mathbb{R}}^{1\times 8}\) forms the initial feature vector of the ROI_{i}associated node i ∈ V to describe the TACS18 distribution in ROI_{i} (e.g., ROI_{i} including TACS4 and TACS7 resulted in v_{i} = 01001000), and \({{{\bf{F}}}}={[{{{\bf{v}}}}_{1}^{{{\rm{T}}}},\cdots ,{{{\bf{v}}}}_{N}^{{{\rm{T}}}}]}^{{{\rm{T}}}}\in {\mathbb{R}}^{N\times 8}\) is defined as the initial node feature matrix of G. The edge e_{ij} ∈ E between node i and j is then constructed according to the knearest neighbor algorithm
where the inner product \({{{{{{\bf{v}}}}}}}_{i}\cdot {{{{{{\bf{v}}}}}}}_{j}^{{{{{{\rm{T}}}}}}}[0,\cdots ,8]\) indicates the number of TACS types presented in both ROI_{i} and ROI_{j} and KNN(i) indicates knearest neighbors of node i. Finally, personized TACS information is converted into an IGNN structure.
Architectures of IGNN and IGNNE models
The IGNN model takes the IGNN graph structure data as input and outputs a score for prognostic prediction (IGNN score). Given a set of G = {G_{1}, G_{2}, ... ,G_{N}} with corresponding survival status labels Y = {y_{1}, y_{2}, ... ,y_{N}}, the IGNN model constructs a multilayer nonlinear mapping to extract significant representation from graph structure data for prognostic prediction. It contains three components: (a) graph convolution network module; (b) fully connected network modules; and (c) prognostic regression layer.
The core part of graph convolution network module contains multiple stacked graph convolution layers, designed as a “messagepassing” architecture that drives nodes to aggregate information with their neighbors along edges and perceives their interactions. The graph convolution is separated into two steps: node information propagation and node information aggregation. In the first step, node features are propagated as the information of node state along the edge between neighboring nodes. Attention mechanism is incorporated into the propagation step, which follows a selfattention strategy to aggregate the neighboring node states of node i by attending over them with different weights. The propagation scheme conducted in the attention layer is formulized as
where the set \({{{{{\rm{N}}}}}}(i)=\{\,j\in {{{{{\rm{V}}}}}}{e}_{ij}=1\}\) represents the set of 1hop neighbors of node i ∈ V in the graph, ._{.} represents the calculated weighted aggregation result of state information propagated from the neighbors of node i, \({{{\bf{H}}}}_{j}^{(t)}\in {\mathbb{R}}^{1\times {{{\rm{c}}}}^{(t)}}\) represents the c^{(t)}dimensional feature vector of state information for node j ∈ V in the tth graph convolutional layer, a_{ij} is the attentionbased weight coefficient of state information propagated by node j to node i according to the attention mechanism of
where \({\beta }_{i}^{(t)}\in {\mathbb{R}}\) is learnable normalized weight adjustment parameter, \({{{\bf{W}}}}_{{{\rm{H}}}}^{(t)}\in {\mathbb{R}}^{{{{\rm{c}}}}^{(t)}\times {{{\rm{c}}}}^{(t)}}\) is the learnable weight matrix shared among all the nodes in tth graph convolutional layer, 〈·〉 denotes the inner product operator on two vectors. In the second step, each node in the graph updates the current state \({{{{{{\bf{H}}}}}}}_{i}^{(t+1)}\) by aggregating past state information from itself (\({{{{{{\bf{H}}}}}}}_{i}^{(t)}\)) and its neighbors (\({{{{{{\bf{X}}}}}}}_{i}^{(t)}\)). The attention weight a_{ij} are assigned to different nodes and edges according to the underlying dependencies to direct the network to the most prognostic parts of the TACSbased graph structure.
In addition, an optional aggregator based on Gated Recurrent Units (GRU) is designed to alleviate the potential oversmoothing issue during the information aggregation process as the network depth increases
and the basic recursive formula of GRU aggregator is
where ʘ denotes elementwise multiplication, Tanh (∙) is hyperbolic tangent nonlinear activation function, Selu (⋅) is the nonlinear activation function termed as scaled exponential linear units, W_{xh}, W_{xz}, W_{xr}, W_{hh}, W_{hz}, W_{hr} \(\in {{\mathbb{R}}}^{{{{{{{\rm{c}}}}}}}^{(t)}\times {{{{{{\rm{c}}}}}}}^{(t)}}\) and b_{H}, b_{Z}, b_{R} \(\in {{\mathbb{R}}}^{1{{{{{{\rm{c}}}}}}}^{(t)}}\) indicate learnable weight matrixes and bias parameters, respectively. The reset gate denoted as \({{{\bf{R}}}}^{(t)}\in {\mathbb{R}}^{1{{{\rm{c}}}}^{(t)}}\) is used to control how the previous node state is merged into the current candidate node state, and update gate denoted as \({{{\bf{Z}}}}^{(t)}\in {\mathbb{R}}^{1{{{\rm{c}}}}^{(t)}}\) is used to control how node states are updated by combining past states with current candidate state. Node information propagation and aggregation is performed hierarchically along the layers. The \(t\)th graph convolutional layer takes the node state information feature matrix \({{{\bf{H}}}}^{(t)}={[{({{{\bf{H}}}}_{1}^{(t)})}^{{{\rm{T}}}},{({{{\bf{H}}}}_{2}^{(t)})}^{{{\rm{T}}}},\cdots ,{({{{\bf{H}}}}_{N}^{(t)})}^{{{\rm{T}}}}]}^{{{\rm{T}}}}\in {\mathbb{R}}^{N\times {{{\rm{c}}}}^{(t)}}\)as input, and outputs a new feature matrix \({{{\bf{H}}}}^{(t+1)}={[{({{{\bf{H}}}}_{1}^{(t+1)})}^{{{\rm{T}}}},{({{{\bf{H}}}}_{2}^{(t+1)})}^{{{\rm{T}}}},\cdots ,{({{{\bf{H}}}}_{N}^{(t+1)})}^{{{\rm{T}}}}]}^{{{\rm{T}}}}\in {\mathbb{R}}^{N\times {{{\rm{c}}}}^{(t)}}\)updated via node information propagation and aggregation operations. Specially, in the nonlinearity mapping process for converting the initial node state vectors \({{{\bf{F}}}}={[{{{\bf{v}}}}_{1}^{{{\rm{T}}}},\cdots ,{{{\bf{v}}}}_{N}^{{{\rm{T}}}}]}^{{{\rm{T}}}}\in {\mathbb{R}}^{N\times 8}\) into higherlevel node feature embedding \({{{{{{\bf{H}}}}}}}^{(1)}={{{{{\rm{Selu}}}}}}\;({{{{{\bf{F}}}}}}{{{{{{\bf{W}}}}}}}_{{{{{{\rm{f}}}}}}})\), the feature vectors of F are weighted by a learnable weight matrix \({{{\bf{W}}}}_{{{\rm{f}}}}\in {\mathbb{R}}^{8\times {{{\rm{c}}}}^{(t)}}\) shared among all the nodes to highlight their relative importance in H^{(1)}. The node states are propagated and aggregated across the multiple graph convolutional layers, which lead to the highlevel node state features. A global pooling layer follows the last graph convolution layer to generate graphlevel output for a single graph G by averaging all node state features in G across the node dimension
where \({{{\bf{f}}}}^{(l)}={\mathbb{R}}^{1\times {{{\rm{c}}}}^{(t)}}\) is computed as the comprehensive representation of the graph structure, which extracts multiscale localized patterns of substructure overall G. The Global pooling layer is followed by the fully connected network module consisting of multiple fully connected layers. Fully connected network module weights the portions of feature h^{(t)} with different importance levels via a learnable weight matrix \({{{{{{\bf{W}}}}}}}_{{{{{{\rm{h}}}}}}}^{(t)}\) and recombines them into more discriminative higherlevel feature h^{(t+1)} by the nonlinearity mapping of
where \({{{\bf{h}}}}^{(t)}={\mathbb{R}}^{1\times {h}^{(t)}}\) and \({{{\bf{h}}}}^{(t+1)}={\mathbb{R}}^{1\times {h}^{(t+1)}}\)are the input and output feature vectors of the tth full connected layer, respectively, \({{{{{{\bf{W}}}}}}}_{{{{{{\rm{h}}}}}}}^{(t)}\) \(\in {{\mathbb{R}}}^{{{{{{{\rm{h}}}}}}}^{(t)}\times {{{{{{\rm{h}}}}}}}^{(t+1)}}\) and \({{{\bf{b}}}}_{{{\rm{h}}}}^{(t)}\in {\mathbb{R}}^{1\times {h}^{(t+1)}}\) are the learnable connected weight matrix and bias parameter, respectively. In particular, in the original architecture of IGNN, \({{{{{{\bf{h}}}}}}}^{(1)}={{{{{{\bf{f}}}}}}}^{(l)}\) was used as the input of the first full connected layer, while in the IGNNE model, the feature vectors made up of 8 clinicopathological factors are cascaded with \({{{{{{\bf{f}}}}}}}^{(l)}\) as the input of the first fully connected layer
where ⋃ concatenates two feature vectors along an assigned axis, \({{{\bf{W}}}}_{{{\rm{r}}}}\in {\mathbb{R}}^{6\times r}\) is the learnable connected weight matrix, r represents additional clinicopathological vectors, \({{{\bf{h}}}}^{(l)}={\mathbb{R}}^{1\times {h}^{(t)}}\) denotes the output highlevel feature vector of the fully connected layer. Finally, these features are fed into the prognostic risk prediction layer along with survival data to produce the prediction score
\({{{\bf{W}}}}_{{{\rm{cox}}}}\in {\mathbb{R}}^{{{\rm{h}}}^{(t)}\times 1}\) is learnable weight vector.
The IGNN model is currently a smallscale architecture consisting of only 2 graph convolution layers and 2 fully connected layers, where the learnable weight matrixes \({{{{{{\bf{W}}}}}}}_{{{{{{\rm{H}}}}}}}^{(t)},{{{{{{\bf{W}}}}}}}_{{{{{{\rm{xh}}}}}}},{{{{{{\bf{W}}}}}}}_{{{{{{\rm{xz}}}}}}},{{{{{{\bf{W}}}}}}}_{{{{{{\rm{xr}}}}}}},{{{{{{\bf{W}}}}}}}_{{{{{{\rm{hh}}}}}}},{{{{{{\bf{W}}}}}}}_{{{{{{\rm{hz}}}}}}},{{{{{{\bf{W}}}}}}}_{{{{{{\rm{hr}}}}}}}\) in graph convolutional layers have fixed size (8,8) and each fully connected layer consists of 32 neural units. As to the IGNNE model, the weight matrix \({{{{{{\bf{W}}}}}}}_{r}\) applied to the additional feature vector r has a size of (8,16). The number of parameters for the IGNN (IGNNE) model is ~2300 (~2500). The more learnable parameters of the model, the more training data are required to avoid the overfitting in machine learning. It is thus reasonable to design a smallscale architecture for a relatively small patient number (995 in total). However, with increasing number of patients for training, the original models can be extended to more complex depth models by stacking more graph convolutional layers and fully connected layers.
To evaluate prognostic risk, the learning object of IGNN (IGNNE) was to minimize the negative log partial likelihood of Cox proportional hazards regression loss as follow
where survival status label \({y}_{i}=1\) indicates the occurrence of diseaserelated events (recurrence or death) for patient i, D_{i} denotes the DFS of patient i,θ and b are the set of learnable weights and biases, respectively.
Training and validation
In the prevalidation process, 3fold crossvalidation was adopted to train and evaluate the prognostic models within the training cohort (FMU dataset). The training cohort was first divided into 3 folds (\({n}_{{{{{{\rm{fold1}}}}}}}\) = 243, \({n}_{{{{{{\rm{fold2}}}}}}}\) = 244, \({n}_{{{{{{\rm{fold3}}}}}}}\) = 244). Within each crossvalidation, two folds were used for training and the remaining fold for validation. Since there were significantly more longsurvival cases (patients with DFS > 5 years, \({n}_{P}\) = 470) than shortsurvival cases (patients with DFS ≤ 5 years, n_{N} = 261), in each crossvalidation, we randomly selected the same number of long and shortsurvival cases from the twofold data to compose the training data, in order to avoid the adverse effect of unbalanced training data on the training process. To train the IGNN and IGNNE models, the Xavier initialization was used for all the layers. An adaptive moment estimation (ADAM) optimizer was adopted to learn model parameters with a batch size of 16 and initial learning rate of 0.01. The maximum training iteration were set to 325 for the IGNN model and 59 for the IGNNE model. Due to the difficulty to augment a graph dataset as a traditional image dataset, two strategies were used to prevent overfitting: first, dropout layers (P_{drop} = 0.1) were applied before graph convolutional layers during the training phase, and normalized layer were applied before the full connection layer. Second, an adaptive stopping scheduler (Astopper) was developed to automatically select a suitable epoch for interrupting training process. Specifically, three checkpoints were set up across the training process, and the learning rate and weight decay rate of ADAM optimizer would be updated according to the change of training loss at Checkpoint1 or Checkpoint2, respectively. Starting from Checkpoint3 until the preset maximum iteration, the epoch with the lowest training loss would be called back as the best time to interrupt training and freeze model parameters. All results from single crossvalidation steps were combined as the modelspecific validation results for the training cohort.
In the external validation process, the prognostic models with the same architecture as in the prevalidation were retrained with fullscale cases from the training cohort and then evaluated on both the training and validation cohorts. For IGNN (IGNNE), model parameters were again initialized by the Xavier distribution, and optimized using ADAM optimizer with a batch size of 16(128) and an initial learning rate of 0.005. The maximum training iteration were set to 656 for IGNN and 1000 for IGNNE. During training, the best number of epochs to interrupt training was determined by Astopper under the same strategy as in the prevalidation process.
Interpretability
To demonstrate how IGNN revealed the implicit prognostic interactions among TACS features, we trained IGNN based on the training cohort without hyperparameter tuning. Specifically, the model parameters were randomly initialized and optimized using ADAM optimizer with a batch size of 16 and a learning rate of 0.01. During training, instead of choosing the stopping moment by Astopper, the training process was randomly interrupted at the 558th epoch, and the output feature vectors of each graph convolutional layer were extracted as the observed model responses.
To quantify the relative importance of different TACS features within patientspecific TACS graph data fed into IGNN, we introduced integrated gradient (IG), a post hoc gradientbased feature imputation method that attributes the model prediction to their inputs with different contributing factors. For a straightline path from the input F to the associated baseline F′, IG is defined as the path integral of the gradients along the straightline path from F to F′, which can be efficiently approximated numerically. In practice, IG for the ith element of F is computed following numerical Riemman approximation
where F is the initial node feature matrix of graph data, F′ is the starting point to calculate \({{{{{{\rm{IG}}}}}}}_{i}({{{{{\bf{F}}}}}})\) which is set to an allzero matrix, IGNN(·) is the IGNN prediction, \(\frac{\partial {{{{{\rm{IGNN}}}}}}(\cdot )}{\partial {{{{{{\bf{F}}}}}}}_{i}}\) is the gradient of IGNN prediction function relative to the ith element of F, k is the scaled feature perturbation constant, and m is the number of steps in the Riemann approximation of the integral. With the patientspecific TACS graph data as input, IG reveals the relative prognostic importance of different TACS features at the ROI level.
Training and validation of IGNN and IGNNE models were implemented using PyTorch (version 1.6.0) framework and TorchGeometric library (version 1.6.1) with Windows 10 operating system on a computer equipped with one NVIDIA GTX1080 Ti GPU and one INTEL Core i76700K CPU @ 4.0 GHz.
Statistical analysis
The KaplanMeier and twosided logrank tests were used to plot and compare the survival curves with dichotomized predictive score from the prognostic models, which provided risk stratification assignment. The HRs of different risk groups in survival analysis and the association of clinical factors with patient DFS were measured using Cox proportional hazards regression analysis. Timedependent AUC curves and corresponding iAUC (the measure of prognosis appropriate for censored timetoevent data) were used to assess the performance for DFS prediction of different prognostic models. The receiver operating characteristic (ROC) curves and associated AUCs were adopted to reflect the discriminatory accuracy of the prognostic models to predict 5year DFS rate. The relative strengths of multiple prognostic factors to predict DFS was assessed using the Chisquared test. The subgroups were compared using twosided unpaired ttest with no adjustments made for multiple comparisons. All hypothesis analyzed with a twosided p value < 0.05 were treated as statistically significant. The 95% CIs were calculated by percentile method while tSNE technology was used for dimensionality reduction and visual analysis of highdimensional features. Data and statistical analyses were performed using R software (version 3.6.0). The Cox Ridge regression adopted by TACS model was performed using the “glmnet” package and the Multivariate Cox proportional hazard regression adopted by Nomogram model was performed using the “survival” package.
Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
Data availability
Raw data (wholeslide images and corresponding multiphoton images) supporting the results of this study are not publicly available due to institutional permission through IRB approval. Please email all requests for academic use of the raw data to the corresponding authors [L.L., J.C. or H.T.]. The requests will be evaluated for intellectual property or patient privacy obligations according to institutional and departmental policies. Other raw and processed data that include TACS, clinical, and followup information of patients are publicly available within the article, supplementary information and at [https://github.com/qldqq1984/IGNN/tree/main/experiments/Patients_Information/DataSets_995] and the raw data for all figures and tables are provided at [https://github.com/qldqq1984/IGNN/tree/main/Source%20Data]. Source data are provided with this paper.
Code availability
The custom code related to the training and evaluation of all the prognostic models are publicly available with a detailed guide at [https://github.com/qldqq1984/IGNN]. To maximize the reproducibility, the analytical procedures with R code for Source Data are also provided to reproduce the experimental results and displayable items. These procedures have been deposited in specific folders for all figures/tables within the Source_Data_analysis folder, which can be accessed from [https://github.com/qldqq1984/IGNN/tree/main/Source_Data_analysis]. Source code has also been placed on the Zenodo platform [https://doi.org/10.5281/zenodo.6808920]^{48}.
References
Davis, J. C. et al. The microeconomics of personalized medicine: today’s challenge and tomorrow’s promise. Nat. Rev. Drug Discov. 8, 279–286 (2009).
Weigelt, B., Peterse, J. L. & van’t Veer, L. J. Breast cancer metastasis: markers and models. Nat. Rev. Cancer 5, 591–602 (2005).
Gerlinger, M. et al. Intratumor heterogeneity and branched evolution revealed by multiregion sequencing. N. Engl. J. Med. 366, 883–892 (2012).
Cortés, J. M., De Petris, G. & López, J. I. Detection of intratumor heterogeneity in modern pathology: A multisite tumor sampling perspective. Front. Med. 4, 25 (2017).
Reis, J. S., Weigelt, B., Fumagalli, D. & Sotiriou, C. Molecular profiling: moving away from tumor philately. Sci. Transl. Med. 2, 47ps43 (2010).
Gyanchandani, R. et al. Intratumor heterogeneity affects gene expression profile test prognostic risk stratification in early breast cancer. Clin. Cancer Res 22, 5362–5369 (2016).
Engelhardt, E. G. et al. Predicting and communicating the risk of recurrence and death in women with earlystage breast cancer: a systematic review of risk prediction models. J. Clin. Oncol. 32, 238–250 (2014).
Bedard, P. L., Hansen, A. R., Ratain, M. J. & Siu, L. L. Tumour heterogeneity in the clinic. Nature 501, 355–364 (2013).
Aleskandarany, M. A. et al. Tumour heterogeneity of breast cancer: from morphology to personalised medicine. Pathobiology 85, 23–34 (2018).
Poste, G. Bring on the biomarkers. Nature 469, 156–157 (2011).
Xi, G. et al. Largescale tumorassociated collagen signatures identify highrisk breast cancer patients. Theranostics 11, 3229–3243 (2021).
Scarselli, F., Gori, M., Tsoi, A. C., Hagenbuchner, M. & Monfardini, G. The graph neural network model. IEEE Trans. eural. NetwOr 20, 61–80 (2009).
Gurcan, M. N. et al. Histopathological image analysis: A review. IEEE Rev. Biomed. Eng. 2, 147–171 (2009).
Ouellette, J. N. et al. Navigating the collagen jungle: the biomedical potential of fiber organization in cancer. Bioengineering 8, 17 (2021).
Brett, E. A., Sauter, M. A., Machens, H. G. & Duscher, D. Tumorassociated collagen signatures: pushing tumor boundaries. Cancer Metab. 8, 14 (2020).
Zipfel, W. R., Williams, R. M., Christie, R., Nikitin, A. Y., Hyman, B. T. & Webb, W. W. Live tissue intrinsic emission microscopy using multiphotonexcited native fluorescence and second harmonic generation. Proc. Natl Acad. Sci. USA 100, 7075–7080 (2003).
Wu, Z. et al. A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn Syst. 32, 4–24 (2020).
Schlemper, J. et al. Attention gated networks: learning to leverage salient regions in medical images. Med. Image Anal. 53, 197–207 (2019).
Cho, K. et al. Learning phrase representations using RNN encoderdecoder for statistical machine translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. pp. 1724–1734 (2014).
Ruck, D. W., Rogers, S. K., Kabrisky, M., Oxley, M. E. & Suter, B. W. The multilayer perceptron as an approximation to a Bayes optimal discriminant function. IEEE Trans. Neural Netw. 1, 296–298 (1990).
Wei, L. J., Lin, D. Y. & Weissfeld, L. Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. J. Am. Stat. Assoc. 84, 1065–1073 (1989).
Tibshirani, R. J. & Efron, B. Prevalidation and inference in microarrays. Stat. Appl. Genet. Mol. 1, 1–20 (2002).
Goldhirsch, A. et al. Progress and promise: highlights of the international expert consensus on the primary therapy of early breast cancer 2007. Ann. Oncol. 18, 1133–1144 (2007).
Junttila, M. R. & De Sauvage, F. J. Influence of tumour microenvironment heterogeneity on therapeutic response. Nature 501, 346–354 (2013).
Natrajan, R. et al. Microenvironmental heterogeneity parallels breast cancer progression: a histology–genomic integration analysis. PloS Med 13, e1001961 (2016).
Morris, L. G. et al. Pancancer analysis of intratumor heterogeneity as a prognostic determinant of survival. Oncotarget 7, 10051–10063 (2016).
Andor, N. et al. Pancancer analysis of the extent and consequences of intratumor heterogeneity. Nat. Med. 22, 105–113 (2016).
Hoefflin, R. et al. Spatial niche formation but not malignant progression is a driving force for intratumoural heterogeneity. Nat. Commun. 7, ncomms11845 (2016).
Abécassis, J. et al. Assessing reliability of intratumor heterogeneity estimates from single sample whole exome sequencing data. PloS one 14, e0224143 (2019).
Cox, T. R. & Erler, J. T. Remodeling and homeostasis of the extracellular matrix: implications for fibrotic diseases and cancer. Dis. Model Mech. 4, 165–178 (2011).
Finak, G. et al. Stromal gene expression predicts clinical outcome in breast cancer. Nat. Med. 14, 518–527 (2008).
Beck, A. H. et al. Systematic analysis of breast cancer morphology uncovers stromal features associated with survival. Sci. Transl. Med. 3, 108ra113 (2011).
Savas, P. et al. Clinical relevance of host immunity in breast cancer: from TILs to the clinic. Nat. Rev. Clin. Oncol. 13, 228–241 (2016).
Yuan, Y. et al. Quantitative image analysis of cellular heterogeneity in breast tumors complements genomic profiling. Sci. Transl. Med. 4, 157ra143 (2012).
Pagès, F. et al. International validation of the consensus Immunoscore for the classification of colon cancer: a prognostic and accuracy study. Lancet 391, 2128–2139 (2018).
Pickup, M. W., Mouw, J. K. & Weaver, V. M. The extracellular matrix modulates the hallmarks of cancer. EMBO Rep. 15, 1243–1253 (2014).
Brett, E. A., Sauter, M. A., Machens, H. G. & Duscher, D. Tumorassociated collagen signatures: pushing tumor boundaries. Cancer Metab. 8, 14 (2020).
Li, R., Yao, J., Zhu, X., Li, Y., & Huang, J. Graph CNN for survival analysis on whole slide pathological images. In International Conference on Medical Image Computing and ComputerAssisted Intervention (MICCAI) https://doi.org/10.1007/9783030009342_20 (Springer, 2018).
Yu, K. H. et al. Predicting nonsmall cell lung cancer prognosis by fully automated microscopic pathology image features. Nat. Commun. 7, 12474 (2016).
Mobadersany, P. et al. Predicting cancer outcomes from histology and genomics using convolutional networks. Proc. Natl Acad. Sci. USA 115, E2970–E2979 (2018).
Fu, Y. et al. Pancancer computational histopathology reveals mutations, tumor composition and prognosis. Nat. Cancer 1, 800–810 (2020).
Lu, M. Y. et al. Dataefficient and weakly supervised computational pathology on wholeslide images. Nat. Biomed. Eng. 5, 555–570 (2021).
Srinidhi, C. L., Ciga, O. & Martel, A. L. Deep neural network models for computational histopathology: A survey. Med. Image Anal. 67, 101813 (2021).
Gunduz, C., Yener, B. & Gultekin, S. H. The cell graphs of cancer. Bioinformatics 20, 145–151 (2004).
Campanella, G. et al. Clinicalgrade computational pathology using weakly supervised deep learning on whole slide images. Nat. Med. 25, 1301–1309 (2019).
Selli, C. & Sims, A. H. Neoadjuvant therapy for breast cancer as a model for translational research. Breast CancerBasic. 13, 1178223419829072 (2019).
Litchfield, K. et al. Representative sequencing: unbiased sampling of solid tumor tissue. Cell Rep. 31, 107550 (2020).
Qiu et al. Intratumor graph neural network recovers hidden prognostic value of multibiomarker spatial heterogeneity (Version v1.0.0). Zenodo https://doi.org/10.5281/zenodo.6808920 (2022).
Acknowledgements
We would like to thank Wenjiao Ren, Tingfeng Shen, Wei Wang, Zhong Chen, Xiwen Chen, Ye Fang, Xiahui Han, Na Fang, Xinxing Huang and Zhijun Li for their support in the data acquisition of multiphoton images. This work was supported by the National Natural Science Foundation of China (Grant Nos. 82171991, 82172800, 61972187), Fujian Major Scientific and Technological Special Project for “Social Development” (No. 2020YZ016002), Natural Science Foundation of Fujian Province (Nos. 2019J01269, 2019J01761, 2019J01060044, 2020J011008, 2020J01839), Joint Funds for the Innovation of Science and Technology of Fujian Province (2017Y9038, 2019Y9101), and the special Funds of the Central Government Guiding Local Science and Technology Development (No. 2020L3008). Also, this work was supported, in part, by grants from the National Institutes of Health, U.S. Department of Health and Human Services (R01 CA241618 and R41 GM139528 to H.T.).
Author information
Authors and Affiliations
Contributions
L.Q., L.L., J.C., and H.T. conceived the idea. L.Q., C.W., L.L., Q.W., X.L., and H.T. performed the analysis and wrote the manuscript. D.K., W.G., F.F., and Q.Z. collected and prepared samples. G.X., J.H., and L.Z. performed biological experiments. L.Q., D.K., L.L., J.C., and H.T. obtained funding for this research.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Jakob Kather, Rodrigo de Andrade Natal and the other anonymous reviewer(s) for their contribution to the peer review of this work. Peer review reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Source data
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Qiu, L., Kang, D., Wang, C. et al. Intratumor graph neural network recovers hidden prognostic value of multibiomarker spatial heterogeneity. Nat Commun 13, 4250 (2022). https://doi.org/10.1038/s4146702231771w
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s4146702231771w
This article is cited by

“Patchiness” in mechanical stiffness across a tumor as an earlystage marker for malignancy
BMC Ecology and Evolution (2024)
Comments
By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.