Pan-cancer analyses of human nuclear receptors reveal transcriptome diversity and prognostic value across cancer types

The human nuclear receptor (NR) superfamily comprises 48 ligand-dependent transcription factors that play regulatory roles in physiology and pathophysiology. In cancer, NRs have long served as predictors of disease stratification, treatment response, and clinical outcome. The Cancer Genome Atlas (TCGA) Pan-Cancer project provides a wealth of genetic data for a large number of human cancer types. Here, we examined NR transcriptional activity in 8,526 patient samples from 33 TCGA ‘Pan-Cancer’ diseases and 11 ‘Pan-Cancer’ organ systems using RNA sequencing data. The web-based Kaplan-Meier (KM) plotter tool was then used to evaluate the prognostic potential of NR gene expression in 21/33 cancer types. Although, most NRs were significantly underexpressed in cancer, NR expression (moderate to high expression levels) was predominantly restricted (46%) to specific tissues, particularly cancers representing gynecologic, urologic, and gastrointestinal ‘Pan-Cancer’ organ systems. Intriguingly, a relationship emerged between recurrent positive pairwise correlation of Class IV NRs in most cancers. NR expression was also revealed to play a profound effect on patient overall survival rates, with ≥5 prognostic NRs identified per cancer type. Taken together, these findings highlighted the complexity of NR transcriptional networks in cancer and identified novel therapeutic targets for specific cancer types.

receptor alpha (ERα) antagonist, was first introduced as a palliative agent for advanced breast cancer during the 1970s, but was later proven to be an effective adjuvant therapy for ERα-positive breast cancer patients 13 . Other pharmaceutical agents have since been FDA-approved for prostate cancer (androgen receptor (AR) antagonists), acute promyelocytic leukemia (retinoic acid receptor (RAR) agonists), AIDS-related Kaposi's sarcoma (RAR and retinoid X receptor (RXR) agonists), and cutaneous T-cell lymphoma (RXR modulators), while others are currently in clinical trials (ER, AR, RAR, RXR, glucocorticoid receptor (GR), RAR-related orphan receptor (ROR), Disease name and pan-organ system Cohort Number of samples RNA-seq data a KM plotter b

Central nervous system (CNS)
Glioblastoma multiforme GBM 166 0 Continued vitamin D receptor (VDR), peroxisome proliferator-activated receptor (PPAR), liver X receptor (LXR), and farnesoid X receptor (FXR)) 10 . Other less well-known NRs, such as NR1I2 (PXR) and NR1I3 (CAR), have been shown to have an effect on the pharmacokinetics and pharmacodynamics of anticancer drugs 14,15 . High-throughput sequencing technologies have been used to develop comprehensive insights into NR function and potential interplay between different NRs in cancer 8,16 . A pan-cancer study in six cancer types recently demonstrated that recurrent downregulation of NRs in cancer is only partially due to deletion or mutation 17 . Yet, our understanding of the impact global NR gene expression patterns have on patient clinical outcome is still limited in most cancer forms. Here, NR gene expression patterns were systematically mapped in relation to prognosis in 33 cancer types for 8,526 patients using genomic and clinical data from The Cancer Genome Atlas (TCGA), thereby pinpointing a number of interesting NR targets for future cancer drug development.

RNA-seq analysis defines four main NR expression patterns in cancer. RNA-seq data for 8,526
TCGA patient samples were used to evaluate mRNA expression patterns for the 48 human NRs across 33 cancer types and 11 pan-organ groups (Tables 1-2). Evaluation of the genome-wide gene expression profiles revealed four main expression patterns in the different neoplastic tissues, i.e. absent (absent to low expression in 100% of tissues), restricted (expressed (defined as moderate to high expression levels) in <50% of tissues), widespread (expressed in >50%, but <100% of tissues), and ubiquitous (expressed in 100% of tissues). In total, five NRs (10%; ESR2, ESRRB, NR2E3, NR6A1, RORB) were not expressed in any tissue, whereas 22 NRs (46%; AR, ESR1, ESRRG, HNF4A, HNF4G, NR0B1, NR0B2, NR1H4, NR1I2, NR1I3, NR2E1, NR2F1, NR3C2, NR4A3, NR5A1, NR5A2, PGR, PPARG, RARB, RORC, RXRG, THRB) showed restricted expression patterns in specific 'Pan-Cancer' organ systems, e.g. gynecologic, endocrine, urologic, central nervous system, gastrointestinal, and thoracic (Supplementary Table 1   Differential gene expression reveals cancer-associated human NRs. To identify cancer-related NRs, differential gene expression was assessed in cancer (n = 5,507) and corresponding normal tissue (n = 627) for 16 of the 33 pan-cancers with available gene expression data for normal samples. On average, 33.9 ± 1.60 ( ± SEM, range 23-42) NRs were differentially expressed per cancer type and 11.4 ± 0.40 (range 5-16) cancer types were associated with each NR ( Fig. 2A-C). In addition, lower NR expression levels were prevalent in cancer compared with normal tissue. Interestingly, NR3C2, PGR, RORA were differentially expressed in all 16 cancer types, while HNF4G was differentially expressed in only 5/16 cancers (31.3%; Fig. 2C). The highest number of cancer-related NRs was found in LUSC (42 NRs, Fig. 3 Fig. 4A). However, individual cancer types were also found to exhibit distinct NR co-expression patterns (Fig. 4B-D). NR expression patterns were generally shown to be weakly to moderately correlated (correlation coefficient values (r) between |0.2| and |0.6|) with the expression of other NRs in most neoplastic tissues. As expected, strong positive correlation (r > 0.6) was observed between the ESR1, AR, PGR, and RARA genes in BRCA. Intriguingly, evidence of NR crosstalk was found between NR class IV genes (NR4A1, NR4A2, NR4A3) in 20/21 cancer types (absent in SKCM). Only 6/21 cancer types (GI pan-organ system: ESCA, PAAD, STAD; Urologic pan-organ system: KIHC, PRAD; Hematologic/lymphatic pan-organ system: THYM) contained ≥20 strongly correlated (r > |0.6|) NR gene pairs (Supplementary Table 2). In total, 32 NR gene pairs were co-expressed in ESCA, several of which were comprised of the HNF4A, HNF4G, NR0B2, NR1I2, NR3C2, NR4A1, and NR5A2 genes. Additionally, KIHC, PAAD, PRAD, and STAD cancers were found to be associated www.nature.com/scientificreports www.nature.com/scientificreports/ with a number of NR gene pairs containing at least one NR class III genes (estrogen receptor-like NRs, e.g. AR, ESR1, ESRRA, ESRRB, ESRRG, NR3C1, NR3C2, and PGR), whereas THYMs were strongly associated with NR class I genes (thyroid hormone receptor-like NRs, e.g. PPAR, RAR, ROR genes).  www.nature.com/scientificreports www.nature.com/scientificreports/ range 6-37) NRs were significantly associated with overall survival per cancer type and 9.2 ± 0.3 (range 5-13) cancer types were associated with each prognostic NR (Fig. 5B). Although NRs were generally found to be underexpressed in cancer compared with corresponding normal tissue, both high and low NR expression correlated with adverse clinical outcome (Fig. 5C). Consequently, the prognostic significance of an individual NR frequently differed for cancer types in the same 'Pan-Cancer' organ system. NR2E1 was the only NR to demonstrate an association between similar expression patterns and shorter overall survival rates in all cancer types within a 'Pan-Cancer' organ system (high NR2E1 expression in BRCA, CESC, OV, and UCEC among gynecologic pan-cancers). In contrast, the prognostic potential of the remaining 47 NRs was frequently found to be connected with diverse expression patterns in different cancer types within a 'Pan-Cancer' organ system and NR class. For example, high PPARG expression was shown to be associated with decreased risk for BRCA and increased risk for CESC in the gynecologic organ system, and decreased risk for READ/STAD and increased risk for LIHC/PAAD cancers in the GI organ system. Furthermore, high PPARG expression was found to have a protective effect in five cancer types, e.g. BLCA (HR = 0.5, 95% CI 0.35-0.7, P = 6.7e-05; Fig. 6A), and an adverse effect in seven cancer types, e.g. LIHC (HR = 2.18, 95% CI 1.51-3.14, P = 2e-05; Fig. 6B). The effect of PPARG expression on patient clinical outcome thereby depended on the cancer type (Fig. 6C).

Discussion
Although the spatial (expression in different tissues) and temporal (circadian regulation) effects of NR function have been studied extensively in normal mouse tissues, no large-scale studies have currently been conducted in human cancers [17][18][19] . Therefore, publicly accessible TCGA data containing genome-wide molecular datasets and matching clinical information offers a unique opportunity to examine the relationship between NR expression and prognosis in a range of cancer forms. This comprehensive analysis of global NR gene expression patterns in 33 TCGA cancer types provides a detailed description of NR expression and potential co-expression in specific neoplastic tissues and pan-cancer organ groups. Here, the vast majority of NRs were shown to either be expressed in specific neoplastic tissues (restricted), most tissues (widespread), or all tissues (ubiquitous). Consistent with a previous report, NR expression was generally down-regulated in cancer compared with corresponding normal tissue 17 . However, this is the first report, to evaluate the clinical utility of the NR superfamily in cancer using survival analysis.
Crystollographic studies have improved our knowledge of how one or more NR polypeptides form dimers (mono-, homo-or heterodimeric NRs) that eventually bind to DNA response elements (DNA direct repeats, palindromic repeats or monomeric sites) in the nucleus 9 . Intriguingly, gene expression analysis showed that approximately 20% of NRs (ESRRA, NR1D2, NR1H2, NR2C1, NR2C2, NR4A1, PPARD, RARA, RXRA, RXRB) were ubiquitously expressed in all 33 pan-cancers. Furthermore, pairwise Pearson correlation for 21/33 cancer types revealed recurrent correlation between the expression patterns for multiple class I-III NRs and retinoid X receptors (RXRs), as well as, strong positive correlation between class IV NRs (NR4A1, NR4A2, NR4A3) in 20/21 pan-cancers. Although less prevalent, negative correlation was also observed, e.g. RARA and PPARA, ESR1 and www.nature.com/scientificreports www.nature.com/scientificreports/ PPARA, NR2E1 and RARA/AR/ESR1/PGR, NR2C2 and NR2F6/NR1H2 in BRCA, which is in line with previous reports 20 . Correlation between expression patterns for RXR genes and other NRs is not surprising since RXRs are common heterodimer partners with class I/II NRs (e.g. RAR, TR, VDR, LXR, PPAR, FXR, PXR, CAR) 7,9 . Indeed, class IV NRs have been previously associated with urologic malignancies such as bladder urothelial carcinoma, kidney renal carcinoma, and prostate adenocarcinoma, but this is the first report of widespread coordinated expression between these NRs in cancer 8 .
Survival analysis demonstrated that the prognostic potential of NR expression is predominantly dependent on cancer type, rather than on NR class. Each cancer type and NR were shown to be associated with ≥6 prognostic NRs and ≥5 different cancers, respectively. However, the expression levels (low or high expression) of individual prognostic NRs frequently differed between cancer types. For example, high PPARG expression correlated with decreased mortality risk for five cancer types (BLCA, BRCA, KIRC, READ, and STAD) and increased risk for eight cancer types (CESC, HNSC, LIHC, LUAD, PAAD, PCPG, SARC, THCA). Surprisingly, NR2E1 was the only NR to display similar expression levels (high expression) in association with overall survival in all cancer types (BRCA, CESC, OV, and UCEC) within a pan-cancer organ system (gynecologic pan-cancers).
In summary, this integrative pan-cancer analysis provides a detailed overview of the effects of NR expression on clinical outcome, thereby highlighting the importance of NRs in cancer. This work confirmed previously identified relationships between individual NRs and specific cancer types and revealed novel clinically relevant NRs. Taken together, these findings may therefore prompt a reevaluation of certain NRs as potential actionable targets for various cancer forms.  Statistical analysis. Statistical analyses were performed using a 0.05 p-value cutoff in R/Bioconductor (version 3.6.0). All p-values are two-sided. The distribution of NR gene expression levels was evaluated in each cancer type by calculating quantile expression (Q1-Q4) using log10-transformed RNA-seq data. Expression levels were then classified as not expressed (Q1 (0-25%: -Inf to 0.98) were defined as absent and Q2 (25-50%: 0.98 to 2.32) Figure 6. Gene expression of the PPARG nuclear receptor is significantly associated with overall survival in cancer. (A,B) Kaplan-Meier analysis of PPARG expression in the BLCA and LIHC cohorts. Estimates of the probability of overall survival according to quantile expression (low or high expression). P-values, hazard ratios (HR), and 95% confidence intervals (95% CI) were calculated using the log-rank test and Cox proportional hazards regression, respectively. The x-axes depict months after initial diagnosis and the y-axes depict overall survival. (C) Forest plots illustrating univariate Cox regression analysis of the prognostic impact of PPARG expression on overall survival in 19 'Pan-Cancer' forms. The x-axis is in log scale. HR <1 depicts the association between high PPARG expression and decreased risk, whereas HR >1 illustrates the association between high PPARG expression and increased risk.