Large-scale pan-cancer analysis reveals broad prognostic association between TGF-β ligands, not Hedgehog, and GLI1/2 expression in tumors

GLI1 expression is broadly accepted as a marker of Hedgehog pathway activation in tumors. Efficacy of Hedgehog inhibitors is essentially limited to tumors bearing activating mutations of the pathway. GLI2, a critical Hedgehog effector, is necessary for GLI1 expression and is a direct transcriptional target of TGF-β/SMAD signaling. We examined the expression correlations of GLI1/2 with TGFB and HH genes in 152 distinct transcriptome datasets totaling over 23,500 patients and representing 37 types of neoplasms. Their prognostic value was measured in over 15,000 clinically annotated tumor samples from 26 tumor types. In most tumor types, GLI1 and GLI2 follow a similar pattern of expression and are equally correlated with HH and TGFB genes. However, GLI1/2 broadly share prognostic value with TGFB genes and a mesenchymal/EMT signature, not with HH genes. Our results provide a likely explanation for the frequent failure of anti-Hedgehog therapies in tumors, as they suggest a key role for TGF-β, not Hedgehog, ligands, in tumors with elevated GLI1/2-expression.

Scientific RepoRtS | (2020) 10:14491 | https://doi.org/10.1038/s41598-020-71559-w www.nature.com/scientificreports/ Clinical studies on HH inhibitors have often relied on increased expression or nuclear localization of GLI1 as a marker of HH pathway activation. It may be argued that in a number of cases, HH activation was stated empirically, with no direct evidence of active upstream HH signaling. Similarly, studies targeting GLI1/2 expression or function in tumor cells in vitro or in mouse models of cancer have shown remarkable anti-tumor efficacy and concluded to a pathogenic role for the HH pathway, although most studies targeted its main downstream effectors, not the pathway itself. Thus, while there is little doubt that GLI transcription factors contribute substantially to cancer progression, direct evidence that would link GLI activity in a given tumor setting to HH ligands activating their receptors is often missing. This may explain the overall lack of anti-tumor therapeutic efficacy of HH inhibitors, most of which target SMO activity 1 .
TGF-β is secreted abundantly by both tumor and stromal cells, allowing tumor evasion from immune surveillance, peri-tumoral angiogenesis and EMT, processes that all contribute to tumor progression 2,3 . TGF-β signals via ubiquitously expressed membrane-bound heteromeric serine-threonine kinase receptor complexes 2 . We identified TGF-β as a powerful inducer of GLI2 and GLI1 expression as well as GLI-dependent transcription, independent from SMO activity 12,13 . We established a role for both TGF-β signaling and GLI2 in driving melanoma invasion and metastasis, that could be targeted with TGF-β receptor inhibitors, the latter inhibiting GLI2 expression in tumor cells, or by knocking down GLI2 expression [14][15][16] . High GLI2 expression in invasive melanoma cells depends largely upon autocrine TGF-β signaling and is associated with a mesenchymal transition and loss of E-cadherin expression, events associated with enhanced cell motility and capacity to metastasize 15,17 . Similar observations have since been reported for other tumor types, including ovarian and oral squamous cell carcinomas, that link GLI2 and TGF-β expression to tumor aggressiveness via various mechanisms such as induction of PTHrP, leading to enhanced osteolytic bone metastases, or that of a stemness-like phenotype that also promotes metastatic progression [18][19][20] .
Herein, we hypothesized that the lack of efficacy of SMO antagonists in numerous tumors occurs because high GLI1 expression and activity may not be linked to HH, but rather to TGF-β, ligand expression, taken as surrogates for HH and TGF-β signaling in tumors, irrespective of the cellular compartment. We compiled data from publicly available gene expression datasets from over 23,500 cancer patients, of which over 15,000 with survival annotations, well above those available from The Cancer Genome Atlas (TCGA). While GLI1/2 expression is correlated with both HH and TGFB expression, their prognostic value is tightly correlated with that of TGFB, not HH. High GLI1/2 and TGFB expression, associated with a mesenchymal/EMT signature, often represent parallel markers of poor clinical outcome. Inversely, high expression of HH is mostly associated with increased survival.

Results
Pan cancer correlation between GLI1, GLI2, TGFB, and HH genes. We hypothesized that the correlation between GLI1 and GLI2 expression with HH (SHH, IHH, DHH) and TGF-β ligands (TGFB1, TGFB2, TGFB3) transcript levels represent adequate surrogates for the respective pathogenic implication of HH and TGF-β ligands in GLI1/2 expression and activity in tumors. Pan-cancer analysis of the correlation of GLI1 and GLI2 with 19,540 genes expressed in 30 tumor types revealed that GLI1 and GLI2 are mutually the most correlated genes (Fig. 1A), and that all HH and TGFB genes are in the top 20 percentiles of most correlated genes with GLI1 and GLI2, with one exception. Expression of at least one of the TGFB genes was more closely related than that of any HH genes to both GLI1 and GLI2 expression.
Positive correlation (arbitrary threshold: r > 0.25) between the GLI1 and GLI2 genes was observed in 33/37 tumor types (Fig. 1B), representing 92.5% (21,825/23,587) of patients. Most tumor types exhibited similar correlation values between GLI1 expression and that of at least one of either HH or TGFB genes (Fig. 1B). The correlation pattern between HH and TGFB genes with GLI2 ( Fig. 1B) was similar to that with GLI1, yet tumors with high GLI2/TGFB correlation could be discriminated into two subgroups: one exhibiting low GLI2/HH correlation (tumor types from ovarian down to bladder), the other exhibiting high GLI2/HH correlation (tumor types from cervix down to thyroid). We did not identify a single neoplasm for which GLI1/2 expression was correlated with that of HH genes without a simultaneous correlation with that of at least one of the TGFB genes.
Expression of GLI1, GLI2, HH and TGFB genes differentially associates with key oncogenic signatures. Cell cycle progression, acquisition of a mesenchymal phenotype through epithelial-to-mesenchymal transition (EMT), and cell stemness are cellular traits considered hallmarks of cancer progression 21 , to which both the HH and TGF-β pathways are linked 2,3 . For each of the 37 tumor types, we generated a multivariate linear model based on the expression of the eight genes of interest taken together (three HH and three TGFB genes, GLI1 and GLI2) to determine whether it may be predictive of these metagenes. To assess the goodness-offit of these models, correlations between the predicted values and observed values for each metagene were calculated in each tumor type. As shown in Fig. 2A, strong correlations were observed in most tumor types for the mesenchymal/EMT and cancer cell stemness metagenes albeit to a lesser extent, while it was seldom observed with the cell cycle metagene.
A simplified multivariate linear model using compounded expression of either GLI1/2, the HH or TGFB genes was next calculated for each metagene. Coefficients for these three predictive variables within each model, presented in Fig. 2B, demonstrate the dominant role of TGFB gene expression, followed by that of GLI1/2, not HH, in predicting mesenchymal and cancer cell stemness metagene expression in a broad array of tumor types. None of them was associated with the cell cycle metagene. Data for each GLI, HH and TGFB gene taken individually in each model are provided in Supplementary Figure S1. It should be noted that the higher correlation observed with the mesenchymal/EMT metagene is consistent with the fact that the latter comprises a number of known TGF-β/SMAD target genes.   (Fig. 3A). At odds with a generalized assumption in the clinical setting that HH signaling is deleterious, expression of all HH genes in tumors was associated with good prognosis (H.R. < 1, green color). Not a single tumor type was found for which expression of any HH gene was associated with morbidity (H.R. > 1, red color) without a parallel pejorative prognostic value for at least one TGFB gene and either GLI1, or GLI2. Also, for each occurrence when high GLI1 or GLI2 expression was of bad prognosis, the same held true for at least one of the TGFB genes. On the other hand, when either GLI1 and GLI2 expression were of good prognosis, the same applied for at least one of the TGFB genes. Noticeably, in bladder, colorectal and kidney papillary carcinoma, GLI2 (and GLI1) expression shared bad prognostic value with that of TGFB genes, while high HH expression was associated with a positive outcome. As expected, the three metagenes were largely of bad prognosis, with mesenchymal/EMT and cancer cell stemness metagenes exhibiting a largely overlapping, yet cancer type-specific pattern of prognostic significance, while the cell cycle metagene was almost universally associated with poor outcome.
In breast cancer, HH expression had no prognostic value while GLI1/2 and TGFB expression were associated with better survival, together with the mesenchymal/EMT and cell stemness metagenes. These results are at odds with most neoplasms where GLI1/2 and TGFB genes share pejorative prognostic value. To understand this discrepancy, a large cohort of breast cancer patients' data 22 was further analyzed. Intra-dataset z-score GLI(1/2), TGFB(1/2/3) and HH(S/I/D) and metagene expression values were sorted according to increasing GLI2 expression, then aligned to the molecular subtypes. High GLI1/2/TGFB and mesenchymal/EMT metagene expression was associated with the normal-like subgroup of tumors with better prognosis, and inversely correlated with luminal-type tumors of poor prognosis (Supplementary Figure S2). A hypothesis may be that the mesenchymal/

Discussion
HH inhibitors have failed to fulfill expectations placed on them as powerful anti-cancer drugs 2 . Despite widespread expression of GLI1 in tumors, considered to be a marker of HH activation, HH inhibitors have solely been granted FDA approval for the treatment of advanced basal cell carcinoma of the skin, a type of tumor that bears activating mutations of the HH pathway 25 . Clinical trials on other solid (or non-solid) tumors have overall failed 2 . Based on our earlier work that identified TGF-β as a potent transcriptional inducer of GLI2, consequently leading to SMO-independent induction of GLI1 12,13 , we hypothesized that GLI1 expression in tumors may be driven by TGF-β, not HH, which would explain the lack of efficacy of anti-HH approaches. Large-scale datamining from gene expression datasets representing 23,500 patients and 37 types of neoplasms addressed the issue of whether GLI1/2 expression in tumors warrants therapeutic approaches targeting upstream HH signaling. While significant correlation was found between the expression of GLI1/2 and HH genes in tumors, correlation with that of TGFB genes was as strong. Thus, not only GLI1 is not a relevant marker to predict HH pathway activity, but neither GLI1 nor GLI2 expression in tumors discriminate between HH and TGF-β ligands as potential upstream inducers of their expression.
Strikingly, the prognostic value of GLI1/2 expression in tumors was mostly at odds with that of HH genes, while paralleling that of one or more TGFB genes, as well as that of mesenchymal/EMT and cell stemness metagenes. Contrary to the pejorative prognosis associated with high GLI1/2/TGFB gene expression, high HH expression was mostly associated with good prognosis. In the rare occurrences when HH expression was associated with poor outcome, at least one TGFB gene shared the prognosis. This is schematized in Fig. 4.
Our broad pan-cancer analysis herein indicates that GLI1/2 functions parallel or match those of TGF-β ligands, not HH, as identified both in linear models of oncogenic functions and in prognostic analyses of tumors. Multivariate analysis demonstrated the dominant role of TGFB, followed by that of GLI1/2, not HH, in predicting mesenchymal/EMT and cell stemness metagene expression in a broad array of tumor types. Noteworthy, these analyses were performed using gene expression data from whole tumors, that do not discriminate between tumor and stromal cells. These data fit a mechanistic model whereby TGF-β, acts in a paracrine or autocrine manner to control GLI2 expression which, in turn and depending upon context, allows for GLI1 expression in either a HH-dependent or -independent fashion, as proposed previously 24 .
Our data hint that HH ligand-driven signaling in tumors leading to GLI1 expression without an overlap and contribution of the TGF-β pathway is not only a rare event but is also unlikely to be pathogenic. The divergence between the prognostic value of GLI1/2 expression and that of HH ligands indicates that there is no sensible justification for targeting HH for cancer treatment if GLI1 expression is used as a prognostic variable. We speculate that patients' selection based upon an inadequate marker of HH pathway activation may therefore contribute to the lack of clinical efficacy of SMO antagonists in various neoplasms. atients' selection based upon an inadequate marker of HH pathway activation may therefore contribute to the lack of clinical efficacy of SMO antagonists in various neoplasms. The uniqueness of the cell cycle metagene prognostic value is independent from GLI1/2, TGFB and HH gene signatures. Together with its broad pan-cancer pejorative prognostic value distinct from that of TGFB/GLI1/2/EMT/Stemness, suggests a potential therapeutic benefit for the combination of cytostatic drugs together with anti TGF-β/GLI inhibitors. Both tumor and stromal cells can express TGF-β ligand with is able to act in a paracrine or autocrine manner. This is highly context-specific and highly variable. Since the data originate from non-dissected tumors, we have to assume that the cellular origin of ligands does not affect the outcome of our analyses. 2. Driver mutations in either HH or TGF-β signaling are likely to influence GLI1/2 expression. Yet, we would like to contend that the outcome of HH signaling from driver mutations would not modify the conclusions drawn from our analyses, as those driver mutations are rare and restricted to a limited number of cancer types. 3. While a number of published mechanistic studies fit with our model, it would be interesting for our analyses to be functionally validated at the protein level in patients' samples.  Table S3. For each of these signatures and each dataset, the average zero-centered expression of the genes, both measured and included in the signature, was calculated for each sample.

Material and
Linear models. Within each dataset, based upon the metagene values for each of the three oncogenic signatures, the lm function from the stats R package was used to perform a linear regression of the metagene variable, using GLI1, GLI2, SHH, IHH, DHH, TGFB1, TGFB2 and TGFB3 expression as predictive variables. To allow for inter-datasets and inter-variables comparisons, all variables were z-scored within each dataset before linear modeling (common unit = standard deviation). For each model in each dataset, correlations between predicted values and observed metagene values (z-scores) was recorded and averaged across datasets by cancer type, then represented as heatmaps. For each linear model in each dataset, coefficients of the predictive variables were recorded and averaged across datasets by cancer type and metagene. Numerical values are provided per series and per cancer type (Supplementary Tables S4A and S4B, respectively).

Survival analyses.
Univariate Cox models of overall survival in 26 tumor types with cohorts of more than 50 patients and at least 10 death events were calculated using the survival R package. Aggregation of Hazard Ratios (H.R.) and related confidence intervals across datasets of a given cancer type for a given variable were calculated using the meta R package. All genes and metagenes were z-scored intra-dataset prior to modeling, to allow for inter-datasets comparisons (common unit = standard deviation). Numerical values are provided per series and per cancer type (Supplementary Tables S5A and S5B, respectively).