Identification of transcription factor co-regulators that drive prostate cancer progression

In prostate cancer (PCa), and many other hormone-dependent cancers, there is clear evidence for distorted transcriptional control as disease driver mechanisms. Defining which transcription factor (TF) and coregulators are altered and combine to become oncogenic drivers remains a challenge, in part because of the multitude of TFs and coregulators and the diverse genomic space on which they function. The current study was undertaken to identify which TFs and coregulators are commonly altered in PCa. We generated unique lists of TFs (n = 2662), coactivators (COA; n = 766); corepressors (COR; n = 599); mixed function coregulators (MIXED; n = 511), and to address the challenge of defining how these genes are altered we tested how expression, copy number alterations and mutation status varied across seven prostate cancer (PCa) cohorts (three of localized and four advanced disease). Testing of significant changes was undertaken by bootstrapping approaches and the most significant changes were identified. For one commonly and significantly altered gene were stably knocked-down expression and undertook cell biology experiments and RNA-Seq to identify differentially altered gene networks and their association with PCa progression risks. COAS, CORS, MIXED and TFs all displayed significant down-regulated expression (q.value < 0.1) and correlated with protein expression (r 0.4–0.55). In localized PCa, stringent expression filtering identified commonly altered TFs and coregulator genes, including well-established (e.g. ERG) and underexplored (e.g. PPARGC1A, encodes PGC1α). Reduced PPARGC1A expression significantly associated with worse disease-free survival in two cohorts of localized PCa. Stable PGC1α knockdown in LNCaP cells increased growth rates and invasiveness and RNA-Seq revealed a profound basal impact on gene expression (~ 2300 genes; FDR < 0.05, logFC > 1.5), but only modestly impacted PPARγ responses. GSEA analyses of the PGC1α transcriptome revealed that it significantly altered the AR-dependent transcriptome, and was enriched for epigenetic modifiers. PGC1α-dependent genes were overlapped with PGC1α-ChIP-Seq genes and significantly associated in TCGA with higher grade tumors and worse disease-free survival. These methods and data demonstrate an approach to identify cancer-driver coregulators in cancer, and that PGC1α expression is clinically significant yet underexplored coregulator in aggressive early stage PCa.

Scientific Reports | (2020) 10:20332 | https://doi.org/10.1038/s41598-020-77055-5 www.nature.com/scientificreports/ to a more invasive phenotype, and profoundly changed gene expression patterns both positively and negatively. Combining these data with differentially regulated genes between tumors with high or low PPARGC1A expression alongside PGC1α ChIP-Seq data identified a network of 60 genes that are both significantly bound and regulated by PGC1α and significantly associated with higher Gleason Grade tumors. In parallel, these approaches establish a relatively generic work-flow for analyses of gene families or functional groupings to allow investigators to identify clinically-significant relationships which in turn support functional analyses.

Materials and methods
Data analyses, integration and code availability. PCa data was downloaded from cBioPortal 40 . Analyses was undertaken using R (version 3.6.2) 48 .
Cell viability and scratch assays. Cellular viability was measured by bioluminescent detection of ATP (CellTitre-Glo assay kit (Promega)). Cells were plated at 5 × 10 3 cells per well in 96-well, white-walled plates, allowed to adhere and treated with ETYA (10 µM) or EtOH (vehicle control) to a final volume of 100 µl for 96 h. Each experiment was performed in triplicate in triplicate wells. For scratch assay, cells were seeded at 1 × 10 6 cells per well in a 6-well plate, allowed to adhere for 48 h to a confluence of about 80% and then wounded by scratching with p200 sterile pipette tip. The debris were removed, and cells washed to make ensure the edges were smoothed with the same dimensions for experimental and control cells. Cells were incubated, and cell migration was assessed by monolayer gap closure after 48 h.

Western immunoblotting. Total cellular protein was isolated from exponentially growing cells and lysed
in ice cold RIPA buffer containing 1 × cOmplete Mini Protease Inhibitor Tablets (Roche). Protein concentration were quantified and 75 ug resolved (SDS-PAGE) using 10% polyacrylamide gels, transferred to PVDF membrane and probed with primary antibody against PGC1α (PA5-72948, Invitrogen) and Beta-Actin (ab8229, abcam) overnight at 4 °C. Primary antibody was detected with HRP-linked goat anti-rabbit IgG (abcam) and signal captured (ChemiDoc XRS + system (Bio-Rad)).

Unique lists of annotated transcription factors and co-regulators.
A comprehensive list of TFs and co-regulators genes was developed by text-mining 49 Gene Ontology (GO) terms that contained phrases including "positive control of transcription", "negative control of transcription", "co-activator", "co-repressor". From these GO terms, the HGNC gene name and ENSEMBL transcript ids were retrieved using biomaRt 50 and combined with canonical lists of TFs from UniProt (n = 1994) and FANTOM 67 (n = 1988). Groups were cross-referenced for uniqueness and annotated as following. TFs (n = 2662); coactivators were genes that exclusively associated with positive regulation of transcription (COA; n = 766); corepressors were genes that exclusively associated with negative regulation of transcription (COR; n = 599); mixed function coregulators were genes with evidence of context dependent negative and positive regulation of gene expression (MIXED; n = 511) (Supplementary Table S1).
Testing family-wide changes in each gene category. Bootstrapping permutation approaches (boot), were used to test if the proportions of each gene group was altered more than predicted by chance. For mRNA and protein expression, Z-scores were calculated for gene expression data. In the cohorts of local tumor (MSKCC, TCGA) gene expression was calculated as Z-scores of tumor-normal comparisons with genes detectible in at least 80% of samples. In the other data sets the median expression data frame of RNA-Seq or Mass-Spec was converted to Z-scores. For copy number alterations (CNA) (via the GISTIC 2 method 51 ) and mutations the data were used as obtained from cBioPortal 40 . To establish mutation frequency, the coding length of all exons for a given gene (including all alternative exons; BioMart) was calculated to yield the total gene CDS length. The mutation frequency was then calculated as the number of gene-length normalized mutations per gene and the square root of the summed mutation rate to control for the number of tumors in a cohort.
To test whether observed alteration frequencies within gene classes (e.g. COA, COR) was altered significantly, a vector of changes for all genes was calculated and the observed values (e.g. mean Z-score of expression) of a given class tested compared to all genes detected in each cancer cohort. Random sampling method was applied 100,000 times to select gene sets of equivalent size to simulate the distribution of changes across the genome for comparison 52 . Empirical p-values were calculated based on the group position relative to the sampling distribution of the genome.
Testing relationships between the most significantly altered transcription factor and co-regulators and clinical outcome. Gene expression levels in each gene family were filtered (genefilter) to select for genes that were commonly and significantly altered e.g. 2 Zscores in x % of tumors (figure legends). The expression and clustering of genes were then visualized with pheatmap, and the intersections visualized with Analyses of the PGC1α-dependent transcriptome and cistrome in cells and TCGA cohorts. LNCaP_shPGC1A and LNCaP_shCtrl cells (1 × 10 6 cells per well in a 6-well plate) adhered for 24 h then dosed with ETYA (10 µM) or EtOH (vehicle control) and RNA extracted. All the samples were prepared in triplicates for RNA sequencing. Paired end sequence reads were aligned to the human genome (hg38) using Rsubread 55 , (> 90% of ~ 25 × 10 6 unique reads/sample mapped) and translated to expression counts via featurecounts, followed by a standard edgeR pipeline 56 to determine differentially expressed genes (DEGs), visualized with volcano plots (ggplot2) and interpreted by gene set enrichment 57 and in particular the enrichment in the Hallmarks, Curated, GO and Reactome sets was analyzed. Similarly, in the TCGA PRAD cohort tumors in the upper and lower quartile by PPARGC1A expression were identified and DEGs established. These DEGs were combined with publicly available PGC1α ChIP-Seq data 58 , and the enriched regions overlapped with PGC1α-dependent DEGs as indicated.
Consent for publication. All authors have read and approved the contents of the manuscript and consent to its publication.

Results
Transcription factor and coregulator groups are significantly down-regulated in localized prostate cancer cohorts. We generated unique gene lists of TFs (n = 2662); coactivators, (COA; n = 766); corepressors, (COR; n = 599); and mixed function coregulators, (MIXED; n = 511) (Supplementary Table S1). These groups were used to test the family-wide significance of changes in expression (mRNA or protein), CNA and mutation of each group in three local cancer (MSKCC, PRAD, OICR) and four advanced (MICH, FH, NEURO and SU2C) cancer cohorts (Table 1).
Hypergeometric tests revealed that COAS, CORS and MIXED genes, but not TFs, were significantly protected (p < 0.01) from loss of function structural variation in normal tissues (GNOMAD 59 ) and significantly enriched in Pan-Cancer fitness genes 60 .
Next, a bootstrapping permutation approach was used to test if observed family-wide changes were more than predicted by chance 61 . We calculated empirical p-values based on the position of the family gene set (e.g. TF, COA etc.) relative to the distribution of random gene sets of the same size. The heatmap in Supplementary Figure S1A shows CORS expression in the TCGA cohort, indicating the genes are commonly down-regulated. To test CORS down-regulation we identified the proportion of these genes altered by > 2 Z-scores (vertical dashed red line, Supplementary Figure S1A, right panel) and compared this to the proportion of all gene families downregulated in sets of the same size (green histogram); a similar analyses was applied to the up-regulated genes (red histogram). The observed proportion of CORS down-regulated by > 2 Z-scores was significantly more (and hence, to the right) than the estimated proportion of down-regulated groups of the same size and randomly sampled (p = 1e -05 ). The CNA data and normalized exome mutation rate were treated in the same manner. Positive controls included gene groups known to be significantly altered by expression (nuclear hormone receptor (NR) down-regulation; HOX family (HOX) 61 up-regulation), mutation (Cosmic_mutant) (Supplementary Figure S1B) or CNA (COSMIC_CNA) (data not shown) 62 .
To visualize significant results, we plotted the negative log 10 of the FDR-corrected empirical pvalues for each test across cohorts (Fig. 1). The mRNA expression in MSKCC and TCGA, and protein expression in OICR were significantly more down-regulated and/or less up-regulated. For example, CORS were significantly more down-regulated (MSKCC, TCGA, FH) and less up-regulated (MSKCC, FH, OICR) than predicted by chance. Indeed, all groups were more down-regulated or less-upregulated in at least 4 cohorts. Only two group/cohort tests revealed significant up-regulation; COR genes in MICH, and MIXED genes in OICR.
Therefore down-regulation of these groups was more common than predicted by chance in local tumors. The OICR cohort displayed significant protein down-regulation of TFs, although the protein detection of all genes is significantly reduced in number (Table 1) and therefore caution is needed to compare between patterns of Table 1. Summary of the prostate cancer cohorts. The number of tumors and the number for genes detected from each category in the unique gene lists (TFs (n = 2662); coactivator (COA; n = 766); corepressor (COR; n = 599); mixed function coregulator (MIXED; n = 511). www.nature.com/scientificreports/ RNA and protein expression. Nonetheless, the correlation between RNA (TCGA or SU2C) and protein (OICR) when considering RNA transcripts for detected proteins ranged from r 0.4 to 0.55 (Supplementary Figure S2). By contrast to the findings on RNA and protein expression, the mutation and CNA data were largely negative; COAS and MIXED genes were more mutated than predicted by chance in the FH cohort only (Supplementary Figure S1C). These findings suggest that changing the stoichiometry of TF and coregulator interactions is potentially more impactful in local tumors than disruption through either mutation or CNA.
Reduced expression of transcription factors and coregulators associates with more aggressive PCa. Next, from each of the 15 significant RNA expression-cohort relationships we identified those genes with the most frequent and greatest expression change. For example, COAS in the TCGA cohort, were filtered for genes altered by > 2 Z-scores in 35% of TCGA tumors. The filtered RNA (Fig. 2) and protein expression (Supplementary Figure S3) were visualized as heatmaps.
Shared targets between the TCGA and MSKCC cohorts were investigated further (Fig. 3A). PubMed analyses (search terms given in Table Legend) revealed an uneven representation of these genes in the context of PCa (Table 2). These genes included those known to be commonly altered, including ERG, a frequently upregulated a TF in PCa 35 and common target of translocations with TMPRRS2 63,64 . By contrast, to date, the  www.nature.com/scientificreports/ RNA-Seq revealed the impact of reduced PGC1α expression in LNCaP cells had a profound impact on gene expression, reflecting the significant impact on proliferation the knockdown of PGC1α (Fig. 4D,E). These changes included 37 microRNA, 243 intra.microRNA, 343 lncRNA and 3600 protein coding genes significantly altered. By contrast the impact of ETYA was very modest. This suggests that PGC1α regulation of ligand-activated PPARγ does not appear to be significant, at least with respect to the impact of ETYA exposure (Fig. 4D).
Supporting a coactivator function, PGC1α knockdown down-regulated more than up-regulated genes. For example, 30 miRNA were down-regulated (e.g. miR17 host gene) and only 7 up-regulated, and similarly 1993 protein-coding genes were down-regulated and 1607 up-regulated. The fold change was also skewed to downregulation, with the mean logFC for down-regulated protein-coding genes being − 1.11, and 0.83 for up-regulated genes and similarly the mean logFC for down-regulated lncRNA was − 1.20, and 0.87 for up-regulated genes  www.nature.com/scientificreports/  www.nature.com/scientificreports/ (Fig. 4D). LISA cistrome analyses 73 of the PGC1α-dependent genes (Supplementary Figure S4) revealed that the top 50 transcription factors associated with these DEGs included 5 nuclear receptors (AR, NR3C1, ESR1, PR, PPARG) as well as known pioneer and lineage-determining transcription factors implicated in PCa, including FOXA1, GATA2, HOXB13, FOXA2, CEBPB and SOX4. We undertook GSEA analyses to classify the PGC1α-dependent changes in gene expression 74 . Supplementary Figure S5 illustrates the top eight positive and negative significant (FDR < 0.1) normalized enrichment scores (NES) in four gene set categories; Hallmarks, Curated, GO and Reactome sets. These highly enriched gene patterns illuminate the cancer biology impact of reduced PGC1α expression. The NES plots and expression of the altered genes in each term is shown in Fig. 4E.
Analyses of the Hallmark gene sets revealed androgen response genes were negatively and peroxisome genes positively enriched, respectively, suggesting that PGC1α expression represses AR signaling and enhances the activity of the peroxisome. AR target genes notably associated with energy production were repressed, including the putative tumor suppressor hydroxyprostaglandin dehydrogenase (HPGD) 75 , as well as Acyl-CoA synthetase long chain family member 3 (ACSL3) 76 and Alpha-2-glycoprotein 1, zinc-binding (AZGP1) 77 . Up-regulated peroxisome genes included retinoic acid anabolizing enzyme, Dehydrogenase/Reductase 3 (DHRS3) 78 , and the steroid catabolizing enzyme UDP Glucuronosyltransferase Family 2 Member B17 (UGT2B17) 79 . These changes in hallmark gene sets suggests that altered PGC1α disrupts the energetic utilization of the cell and distorts AR signaling.
This was also borne out by changes in Curated gene sets, with repression of Interferon signaling genes including retinoic acid inducible DExD/H-Box Helicase 58 (DDX58) 80 , and upregulation of IGF-signaling gene targets including the AR-target genes Homocysteine Inducible ER Protein With Ubiquitin Like Domain 1 (HERPUD1) 81 and Growth Differentiation Factor 15 (GDF15), known to be a prostate mitogen 82 . Similarly, Reactome gene set enrichment revealed repression of Interferon regulated genes including Interferon Induced Protein With Tetratricopeptide Repeats 1 (IFIT1) 83 , which is actually repressed by PPARα 84 . Interestingly, this was accompanied by upregulation of a number of histone modifying enzymes including lysine demethylases 84 (KDM1A, KDM4A, KDM6A) suggesting a potential for significant changes in the methylation status of histone lysine modifications such as H3K4me3 and H3K27me3, and thereby impacting the epigenome. Genes bound and regulated by PGC1α associate with more aggressive PCa. Reduced PGC1α expression impacted gene expression positively and negatively. Whether PGC1α exerts both coactivator and corepressor functions are obscured by direct (cis) and indirect (trans) relationships between PGC1α and the regulated genes captured by RNA-Seq. To identify PGC1α cis-dependent gene expression we examined how genes bound by PGC1α binding relate to changes in gene expression.
PGC1α ChIP-Seq data 58 were combined with the PGC1α-dependent gene expression patterns and shared genes were identified. Given each protein-coding genes is regulated by multiple enhancers 85,86 and promotercapture Hi-C experiments revealed the median distance between enhancer and target gene is 158 kb 87 . Therefore PGC1α ChIP-Seq peaks were segregated into different chromatin states identified in LNCaP cells (e.g. Promoter, Active Enhancer) 88 , and annotated to known genes within 100 kb. Thus, the 1304 PGC1α ChIP-Seq peaks overlapped with 914 chromatin states identified in LNCaP, and annotated to 2381 peak:gene relationships.
In the first instance we examined how PPARGC1A correlated either to genes that were differentially regulated by PGC1α knockdown in LNCaP cells (n = 4187, Fig. 4D) and bound by PGC1α overlapping with LNCaP chromatin states (n = 2381). We compared the relationships between PGC1α binding and gene expression in the TCGA cohort. In the first instance cumulative correlation plots were generated, which revealed that the empirical correlation between the expression of PPARGC1A and genes regulated by PPARGC1A shRNA were significantly more positive than the background of all expressed genes (KS test; p < 1e-9) ( Supplementary Figure S6, green symbol). This suggests that there is a stronger positive correlation between PGC1α expression and PGC1α-dependent genes than predicted by chance, and fits with a model of coactivator function. By contrast, the correlation between the expression of PPARGC1A and genes directly bound by PGC1α were significantly more negative than predicted by chance and suggested the correlations were significantly more negative (KS test; p < 1e-9). This suggests that direct binding of PGC1α is more nuanced in its direct relationship with gene levels, with a more even distribution of negative and positive correlations with expression.
Next, these 2381 PGC1α peak:gene relationships linked to LNCaP chromatin states were filtered to include genes that were: (1) differentially regulated by PPARGC1A shRNA in LNCaP cells (n = 4187, Fig. 4); (2) differentially expressed in the TCGA cohort between the top and bottom quartile expressing PPARGC1A expression (n = 7324). This identified 160 genes bound by PGC1α and significantly altered by PGC1α knockdown in LNCaP Figure 5. Expression of PGC1α-regulated and PGC1α-bound genes in the TCGA cohort. (A) PGC1α ChIP-Seq data (GSE75193) were annotated to genes within 100 kb to yield 2381 PGC1α peak:gene relationships that were filtered to include only those that overlapped by 1 bp with chromatin states in LNCaP derived using ChromHMM to yield PGC1α peak:gene relationships. In turn these further filtered to include only those genes that were; 1. differentially regulated by PPARGC1A shRNA in LNCaP cells (n = 4187, Fig. 4); 2. differentially expressed in the TCGA cohort between the top and bottom quartile expressing PPARGC1A expression (n = 7324), resulting in 160 genes that were significantly bound by PGC1α and whose expression was significantly altered by both shRNA PPARGC1A in LNCaP cells and significantly altered in TCGA tumors in PPARGC1A-dependent manner. The top 63 genes are illustrated (pheatmap) and their expression patterns clustered higher Gleason grade tumors (X-squared = 5.0601, p-value = 0.025). (B) Kaplan-Meier plots of the relationship between tumors with lower and upper quartile expression of the indicated genes and time to biochemical progression.

Scientific Reports
| (2020) 10:20332 | https://doi.org/10.1038/s41598-020-77055-5 www.nature.com/scientificreports/ cells and significantly altered PPARGC1A-dependent expression in TCGA, with the most altered 63 genes in the TCGA cohort shown in Fig. 5A. Expression genes clustered higher Gleason grade tumors (X-squared = 5.0601, p-value = 0.025). Interestingly, amongst the PGC1α -dependent gene signature 25 genes were upregulated, and 38 genes were actually downregulated, again suggesting that the direct impact of PGC1α on gene expression is more nuanced and not exclusively a coactivator. Finally, we screened how expression of these 63 PGC1α-dependent genes related to tumor outcome by generating Kaplan-Meier plots. The top five significant genes were FAM57B, EME2, DDB2, PCP4L1 and PBXIP1 and the plots for an upregulated gene (FAM57B) and a downregulated gene (PCP4L1) are shown in Fig. 5B. Interestingly, FAM57B is an established PGC1α target gene that regulates adipocyte differentiation 89 . Similarly, PCP4L1 is related to obesity induced phenotypes 90 and PBXIP1 plays a role in regulation of differentiation 91 in part by regulating ERα signaling 92 . By contrast EME2 and DDB2 relate to DNA repair 93,94 .

Discussion
The current study was undertaken to identify the most significantly altered and clinically-relevant TF and coregulators in PCa. It was reasoned that this knowledge may help to explain disease progression risks, or to highlight novel therapeutic opportunities. Assessment of family-wide significance across seven PCa cohorts revealed that TFs, COAS, CORs and MIXED were more down-regulated and less up-regulated at the mRNA and protein level, most clearly in local tumor cohorts. In contrast, family-wide CNA and mutation levels were not significant. Filtering those group/cohort relationships identified the most altered genes in the local tumor cohorts (MSKCC and TCGA). Overlapping identified commonly altered genes associated with PCa including ERG 35 (which was also altered at the protein level in the OICR cohort) and FGFR2 95 , but also identified others that were relatively under-investigated such as PPARGC1A 37,38,96 but associated with worse disease free survival, or uninvestigated in PCa, such as the metastasis suppressor PDLIM1 97 . Stable knockdown of PGC1α increased proliferation and invasion of LNCaP cells, and profoundly altered the transcriptome. Finally, PGC1α bound and regulated genes associated with higher grade tumors in the TCGA cohort and individual genes such as the PPARγ target gene FAM57B associated with worse disease-free survival.
The global down-regulation of TFs and co-regulators suggests that an initial event in PCa is disrupting the flexibility of gene regulatory signaling that may limit the permutations of TF co-regulator interactions or lessen the ability of the TF and co-regulators to form effective stoichiometric interactions for correct functioning. This may suggest that although these genes are reduced in expression they are not altered by structural variation, and therefore they remain functional, and in turn this may support the disrupted stoichiometry argument. The current findings would suggest that globally the flexibility of TF actions is disrupted. This may suggest that signaling is more restricted in terms of capacity, and this is seen in the PCa AR transcriptome 27,29 , but also in signaling via MYC and AP1 (reviewed in 73 ).
The stringent filtering of these data and overlap between TCGA and MSKCC revealed commonly and significantly altered genes but are unevenly investigated in PCa ( Table 2). Genes that are only identified in one cohort and whose expression is associated with more aggressive disease (Table 3) reveals a number of genes that are potentially impactful in disease but are under-investigated or have not been investigated at all in the context of PCa. PPARGC1A was selected for mechanistic study, given it was down-regulated and significantly associated with worse disease-free survival in both MSKCC and TCGA cohorts. Knockdown of PGC1α in LNCaP cells increased proliferation and migration, coupled with a profound impact on the transcriptome. The PGC1α transcriptome was enriched for terms associated with AR signaling as well overlaps with regulation of the peroxisome, interferon signaling and epigenetic modifiers, as well as being significantly enriched for the regulatory actions of AR, NR3C1, PPARg and several pioneer factors. These findings were further supported when considering target genes that were bound by PGC1α and differentially expressed in the TCGA cohort. These genes distinguished aggressive tumors and included RARγ/TACC1 24 , which we have previously established to antagonize AR signaling, as well as genes associated PPARγ signaling. Furthermore, motif analyses revealed that basic helix-loop-helix (b-HLH) motifs were enriched in the PGC1α ChromHMM sites, as were sites for LXR binding. These findings suggest that PGC1α regulates signaling by a cohort of nuclear receptors including AR, RARγ, PPARγ as well epigenetic process and interferon signaling. Interestingly, we examined expression of PPARGC1A in a broader panel of PCa cell lines in the Broad Cancer Cell Line Encyclopedia 98 , which revealed that in seven out of eight models the expression was downregulated by more than two Z-scores (Supplementary Figure S7). This is interesting given that the cell lines represent different aspects of PCa (either early late or stage) and specifically, which either express AR (LNCaP), or are AR negative (PC-3). PPARGC1A was also down-regulated in cells lines expressing the AR variant (AR-V7) associated with aggressive disease (22Rv1, VCaP). This suggests that down-regulation of PPARGC1A is both common and sustained in PCa cell line models.
Earlier studies by Carracedo and colleagues identified reduced expression of PGC1α in PCa, and these workers pursued a strategy to over-express PGC1α in advanced PCa cell (PC-3) cells line and analyzed these effects in two impactful studies 37,38 . Their findings support a role for PGC1α to regulate a transcriptome that is regulated by PGC1α and ERRα cross-talk, which regulates MYC signaling to regulate invasiveness; indeed in the current study b-HLH motifs were identified in the PGC1α-binding sites identified in LNCaP cells. The PGC1α-dependent transcriptome in PC-3 cells is ~ 900 genes (GSE75193) and few (n = 119) appear to be shared with the RNA-Seq data generated in the current study, but none of the genes proposed by the authors as a PGC1α-dependent signature overlap with the 63 PGC1α-dependent and PGC1α-bound genes significantly altered in TCGA cohort associated with more aggressive disease.
Thus, whilst the current study and the Carracedo studies both identify important roles for PGC1α to regulate tumor aggressiveness, the mechanisms appear to be different. This may arise for technical reasons (for example, cell background and transcriptomic approach) and may reflect the differential impact of knockdown www.nature.com/scientificreports/ versus over-expression selecting for different transcriptionaly-dependent events. Certainly, the fact that PGC1αdependent and PGC1α-bound genes favor down-regulated over up-regulated genes (~ 2:1) suggests that PGC1α participates in diverse transcriptional events. Outside of PCa, in renal disease it has been proposed that PGC1α determines phenotypic consequences by selecting which TFs to interact with, and that RARs/RXRs, PPARs is associated with fatty acid metabolism 99 . It is tempting to speculate the phenotype is current study is associated with similar energetic changes.

Conclusion
The approach applied in the current study identified both well-established factors, such as ERG, and relatively under-investigated factors, notably PPARGC1A, that associate with PCa progression risks. Indeed, the approaches and data integration are relatively generic and considers all genes in a class or superfamily to identify which may have merit to investigate. This is an important step for many wet-lab approaches, given that it can often take considerable resources to dissect a gene function. For example, the 20,000 ft view approach to cancer genomics in PCa integrates genome-wide data to identify novel networks 18,58 . Traditionally trained wet-lab based investigators often face challenges in assimilating these findings and may rather search for gene(s) in the supplementary data S1 that are from the family on which they study; this can be considered as a 200 ft view. The current approach is neither the 200 ft nor 20,000 ft views, but rather a 2000 ft view. That is, the approach allows an investigator to remain focused in their research arena, for example TF co-regulators, but ask the global question over how these are altered and at what stage of cancer progression. This is a relatively generic approach and may find traction with investigators across disease types.

Data availability
The PPARGC1A shRNA dependent RNA-Seq will be deposited on GEO.
Received: 17 August 2020; Accepted: 5 November 2020 www.nature.com/scientificreports/ Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creat iveco mmons .org/licen ses/by/4.0/.