Kidney cancers, or renal cell carcinomas (RCC), are a common group of chemotherapy-resistant diseases that can be distinguished by histopathological features and underlying gene mutations1. Inherited predisposition to RCC has been shown to arise from genes involved in regulating cellular metabolism, making RCC a model for the role of an oncologic-metabolic shift, commonly referred to as the ‘Warburg effect’, leading to malignancy2. The most common type of RCC, clear cell renal cell carcinoma (ccRCC), is closely associated with VHL gene mutations that lead to stabilization of hypoxia inducible factors (HIF-1α and HIF-2α, also known as HIF1A and EPAS1) in both sporadic and familial forms. PBRM1, a subunit of the PBAF SWI/SNF chromatin remodelling complex, as well as histone deubiquitinase BAP1 and histone methyltransferase SETD2, were recently found to be altered in ccRCC3,4,5, implicating major roles for epigenetic regulation of additional functional pathways participating in the development and progression of the disease. Oncogenic metabolism and epigenetic reprogramming have thus emerged as central features of ccRCC.

In the present study, clinical and pathological features, genomic alterations, DNA methylation profiles, and RNA and proteomic signatures were evaluated in ccRCC. We accrued more than 500 primary nephrectomy specimens from patients with histologically confirmed ccRCC that conformed to the requirements for genomic study defined by the Cancer Genome Atlas (TCGA), together with matching ‘normal’ genomic material. Samples were restricted to those that contained at least 60% tumour nuclei (median 85%) by pathological review (clinical data summary provided in Supplementary Table 1). A data freeze representing 446 samples was generated from at least one analytical platform (‘Extended’ data set) and data from all platforms were available for 372 samples for coordinated, integrative analyses (‘Core’ data set) (Supplementary Data 1, Supplementary Table 2). No substantial batch effects in the data that might confound analyses were detected (Supplementary Figs 1–20).

Somatic alterations

The global pattern of somatic alterations, determined from analysis of 417 samples, is shown in Fig. 1a. DNA hybridizations showed that recurrent arm-level and focal somatic copy number alterations (SCNAs) occurred at fewer sites than is generally observed in other cancers (P < 0.0004; Supplementary Figs 21–22 and Supplementary Table 3). However, the SCNAs that were observed more commonly involved entire chromosomes or chromosome arms, rather than focal events (17% vs 0.4%, Fig. 1b). Notably, the most frequent arm-level event was loss of chromosome 3p (ref. 6; 91% of samples), encompassing all of the four most commonly mutated genes (VHL, PBRM1, BAP1 and SETD2).

Figure 1: Somatic alterations in ccRCC.

a, Top histogram, mutation events per sample; left histogram, samples affected per alteration. Upper heat map, distribution of fusion transcripts and VHL methylation across samples (n = 385 samples, with overlapping exome/SCNA/RNA-seq/methylation data); middle heat map, mutation events; bottom heat map, copy number gains (red) and losses (blue). Lower chart, mutation spectrum by indicated categories. b, Left panel, frequency of arm-level copy-number alterations versus focal copy number alterations. Right panel, comparison of the average numbers of arm-level and focal copy-number changes in ccRCC, colon cancer (CRC), glioblastoma (GBM), breast cancer (BRCA) and ovarian cancer (OVCA). c, Circos plot of fusion transcripts identified in 416 samples of ccRCC, with recurrent fusions highlighted.

The data also suggested lower and more variable tumour cellularity7 in the accrued samples, compared to conventional pathological review (median 54% ± 14%). This may reflect stromal or endothelial cell contributions, or tumour cell heterogeneity. A recent study of multiple samples from single tumours has demonstrated significant regional genomic heterogeneity, but with shared mutations in frequently mutated genes and convergent evolution of other common gene level events8. The mutation frequencies of key genes (VHL, PBRM1 and so on), as well as copy number gains and losses found here, were, however, consistent with previous reports. Tumour purity was therefore not determined to be a limitation in the current study.

Arm-level losses on chromosome 14q, associated with loss of HIF1A, which has been predicted to drive more aggressive disease9, were also frequent (45% of samples). Gains of 5q were observed (67% of samples) and additional focal amplifications refined the region of interest to 60 genes in 5q35, which was particularly informative as little has been known about the importance of this region in ccRCC since the 5q gain was initially described. Focal amplification also implicated the protein kinase C member PRKCI (ref. 10), and the MDS1 and EVI1 complex locus MECOM at 3q26, the p53 regulator MDM4 at 1q32, MYC at 8q24 and JAK2 on 9p24. Focally deleted regions included the tumour suppressor genes CDKN2A at 9p21 and PTEN at 10q23, putative tumour suppressor genes NEGR1 at 1p31, QKI at 6q26, and CADM2 at 3p12 and the genes that are frequently deleted in cancer, PTPRD at 9p23 and NRXN3 at 14q24 (ref. 11).

Whole-exome sequencing (WES) of tumours from 417 patients identified 36,353 putative somatic mutations, including 16,821 missense mutations, 6,383 silent mutations and 2,999 indels, with an average of 1.1 ± 0.5 non-silent mutations per megabase (Supplementary Figs 23–25). Mutations from 50 genes with high apparent somatic mutation frequencies (Supplementary Table 4) were independently validated using alternative sequencing instrumentation (Supplementary Fig. 26). In tumours from 22 patients, whole-genome sequencing was also used to validate and calibrate the WES data and confirmed 83% of the WES mutation-calls (Supplementary Tables 5 and 6). In line with results of previous studies (Supplementary Tables 7 and 8), the validated mutation data identified nineteen significantly mutated genes (SMGs) (false discovery rate (FDR) < 0.1), with VHL, PBRM1, SETD2, KDM5C, PTEN, BAP1, MTOR and TP53 representing the eight most extreme members (q < 0.00001) (Fig. 1a). Eleven additional SMGs were of considerably lower significance (q < 0.1–0.5) but included known cancer genes. Among all SMGs, only mutation of BAP1 correlated with poor survival outcome (Supplementary Fig. 27)12. Approximately 20% of cases had none of the 19 recorded SMGs, although many contained rare mutations in other known oncogenes or tumour suppressors, involving survival associations, illustrating the genetic complexity of ccRCC8 (Supplementary Figs 28–30 and Supplementary Table 9).

Eighty-four putative RNA fusions were identified in 416 ccRCC samples13. Eleven of thirteen predicted events (Fig. 1c) were validated using targeted methods, consistent with an 85% true-positive rate (Supplementary Table 10 and Supplementary Figs 31–35). A recurrent SFPQ–TFE3 fusion (previously linked to non-clear cell translocation-associated RCC14) was found in five samples, all of which were VHL wild type, indicating either that these tumours are a clear cell variant or that translocation-associated renal tumours may be histologically indistinguishable from conventional ccRCC. Furthermore, the TFE3 protein as well as an X(p11) rearrangement was found in three of those samples, where there were available slides.

DNA methylation profiles

We observed epigenetic silencing of VHL in about 7% of ccRCC tumours, which was mutually exclusive with mutation of VHL (Fig. 1a), reflecting the central role of this locus in ccRCC15. An additional 289 genes showed evidence of epigenetic silencing in at least 5% of tumours. The top-ranked gene by inverse correlation between gene expression and DNA methylation was UQCRH, hypermethylated in 36% of the tumours. UQCRH has been previously suggested to be a tumour suppressor16, but not linked to ccRCC. Interestingly, increasing promoter hypermethylation frequency correlated with higher stage and grade (Fig. 2a, b).

Figure 2: DNA methylation and ccRCC.

a, b, Overall promoter DNA hypermethylation frequency in the tumour increases with rising stage (a) and grade (b). The promoter DNA hypermethylation frequency is calculated as the percentage of CpG loci hypermethylated among 15,101 loci which are unmethylated in the normal kidney tissue and normal white blood cells (boxplots, median with 95% confidence interval). c, Volcano plots showing a comparison of DNA methylation for SETD2 mutant versus non-mutant tumours (n = 224, HumanMethylation450 platform). Unshaded area: CpG loci with Benjamini–Hochberg (B–H) FDR = 0.001 and difference in mean beta value > 0.1 (n = 2,557). d, Heat map showing CpG loci with SETD2 mutation-associated DNA methylation (from part c); blue to red indicates low to high DNA methylation. The loci are split into those hypomethylated (top panel; n = 1,251) or hypermethylated (bottom panel; n = 1,306) in SETD2 mutants. Top colour bars indicate SETD2 mRNA expression (red: high, green: low) and SETD2 mutation status. Grey-scale row-side colour bar on left-hand side represents the relative number of overlapping reads, based on H3K36me3 ChIP-seq experiment in normal adult kidney (black, high read count). DNA methylation patterns include 14 normal kidney samples. Among the tumours without SETD2 mutations, six (arrowhead) have both the signature pattern of SETD2 mutation and low SETD2 mRNA expression.

We also evaluated the global consequences of mutation in specific epigenetic modifiers. Mutations in SETD2, a non-redundant H3K36 methyltransferase, were associated with increased loss of DNA methylation at non-promoter regions (Fig. 2c, d). This discovery is consistent with the emerging view that H3K36 trimethylation may be involved in the maintenance of a heterochromatic state17, whereby DNA methyltransferase 3A (DNMT3A) binds H3K36me3 and methylates nearby DNA18. Thus, reductions of H3K36me3 through SETD2 inactivation could lead indirectly to regional loss of DNA methylation.
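The locus selection described in the Fig. 2c legend (a Benjamini–Hochberg FDR threshold combined with a minimum difference in mean beta value) can be illustrated with a minimal sketch. The function names, the input layout and the use of "less than or equal to" for the FDR cutoff are assumptions made for illustration; the statistical test that produced the per-locus P values is not specified here and is taken as given.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted q-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def select_loci(pvalues, mean_beta_mutant, mean_beta_wildtype,
                fdr_cutoff=0.001, min_delta=0.1):
    """Return (index, direction) for loci passing both filters.

    fdr_cutoff and min_delta default to the values quoted in the legend;
    the argument names themselves are hypothetical.
    """
    q = benjamini_hochberg(pvalues)
    selected = []
    for i, q_i in enumerate(q):
        delta = mean_beta_mutant[i] - mean_beta_wildtype[i]
        if q_i <= fdr_cutoff and abs(delta) > min_delta:
            selected.append((i, "hyper" if delta > 0 else "hypo"))
    return selected
```

Splitting the selected loci by the sign of the difference mirrors the hypomethylated and hypermethylated panels of Fig. 2d.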

RNA expression

Unsupervised clustering methods identified four stable subsets in both mRNA (m1–m4) and miRNA (mi1–mi4) expression data sets (Fig. 3a and Supplementary Figs 36–39). Supervised clustering revealed the similarity of these new mRNA classes to the previously reported ccA and ccB expression subtypes19, with cluster m1 corresponding to ccA and ccB divided between m2 and m3 (Supplementary Table 11). Cluster m4 probably accounts for the roughly 15% of tumours previously unclassified in the ccA/ccB classification scheme. Similarly, the survival advantage previously observed for ccA cases was again identified for m1 tumours (Fig. 3b).

Figure 3: mRNA and miRNA patterns reflect molecular subtypes of ccRCC.

a, Tumours were separated into four sample groups (that is, ‘clusters’) by unsupervised analyses, on the basis of either differentially expressed mRNA patterns (left panel, showing 500 representative genes: m1–m4) or differentially expressed miRNA patterns (right panel, showing 26 representative miRNAs: mi1–mi4). b, Significant differences in patient survival were identified for both the mRNA-based clusters (left panel) and the miRNA-based clusters (right panel). c, Numbers of samples overlapping between the two sets of clusters, with significant concordance observed between m1 and mi3 and between m3 and mi2; red, significant overlap (P < 10⁻⁵, chi-squared test). d, mRNA–miRNA correlations, for predicted targeting interactions. Rows indicate miRNAs from a (indicated by cluster-specific colour bar); columns, mRNAs (5,000 differentially regulated genes selected for average RPKM > 10 and at least one predicted miRNA interaction); mRNA–miRNA entries with no predicted targeting are white. To the right of the correlation matrix, t statistics (Spearman’s rank) indicate group target enrichment.

The m1 subtype was characterized by gene sets associated with chromatin remodelling processes and a higher frequency of PBRM1 mutations (39% in m1 vs 27% in others, P = 0.027). Deletion of CDKN2A (53% vs 26%; P < 0.0001) and mutations in PTEN (11% vs 1%; P < 0.0001) were more frequent in m3 tumours (Supplementary Fig. 5). The m4 group showed a higher frequency of BAP1 mutations (17% vs 7%; P = 0.002) and was associated with base-excision repair; however, this group also harboured more MTOR mutations (12% vs 4%; P = 0.01) and was associated with ribosomal gene sets.

Survival differences evident in miRNA-based subtypes (Supplementary Figs 40–44) correlated with the mRNA data (Fig. 3b–d). For example, miR-21, previously shown to demonstrate strong regulatory interactions in ccRCC20 and with established roles in metabolism17,21,22, correlated strongly with worse outcome, and DNA promoter methylation levels inversely correlated with expression of miR-21, miR-10b and miR-30a (Supplementary Tables 12–14). miRNA interactions thus represent a significant component of the epigenetic regulation observed in ccRCC.
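The concordance between mRNA and miRNA clusters reported in Fig. 3c rests on a chi-squared test. As a hedged illustration, the sketch below computes a Pearson chi-squared statistic for a single 2 × 2 table (for example, membership in one mRNA cluster against membership in one miRNA cluster). Reducing the comparison to a 2 × 2 table, the function name and the counts in the usage note are assumptions, not details taken from the analysis.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic and P value for the table [[a, b], [c, d]].

    a: samples in both clusters, b: first cluster only,
    c: second cluster only, d: neither cluster.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one sample")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With one degree of freedom the upper-tail probability
    # of the chi-squared distribution is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value
```

With purely hypothetical counts, `chi_squared_2x2(60, 40, 30, 270)` gives a statistic of about 107.5 and a P value far below the 10⁻⁵ threshold, whereas a perfectly balanced table such as `chi_squared_2x2(10, 10, 10, 10)` gives a statistic of 0.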

Integrative data analyses

We used a combination of approaches for integrative pathway analysis. The HotNet23 algorithm uses a heat diffusion model to find sub-networks distinguished by both the frequency of mutation in genes (nodes in the network) and the topology of interactions between genes (edges in the network). In ccRCC, HotNet identified twenty-five sub-networks of genes within a genome-scale protein–protein interaction network (Supplementary Table 15 and Supplementary Fig. 45). The largest and most frequently mutated network contained VHL and interacting partners. The second most frequently mutated sub-network included PBRM1, ARID1A and SMARCA4, key genes in the PBAF SWI/SNF chromatin remodelling complex.
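The idea of diffusing mutation 'heat' over an interaction network can be illustrated with a toy example. The sketch below is not the HotNet implementation: it substitutes a simple random walk with restart for HotNet's diffusion kernel and omits the statistical test used to call significant sub-networks. The function name, the `restart` parameter and the three-gene network in the usage note are hypothetical.

```python
def diffuse_heat(edges, initial_heat, restart=0.4, iterations=100):
    """Spread per-gene heat (e.g. mutation frequency) along network edges.

    edges: iterable of (gene, gene) pairs; every gene must be a key of
    initial_heat. At each step a gene retains `restart` of its source
    heat and receives an equal share of each neighbour's current heat.
    """
    neighbours = {node: set() for node in initial_heat}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    heat = dict(initial_heat)
    for _ in range(iterations):
        updated = {}
        for node in heat:
            incoming = sum(heat[nb] / len(neighbours[nb])
                           for nb in neighbours[node])
            updated[node] = (restart * initial_heat[node]
                             + (1 - restart) * incoming)
        heat = updated
    return heat
```

For a chain VHL–A–B in which only VHL carries heat, `diffuse_heat([("VHL", "A"), ("A", "B")], {"VHL": 0.5, "A": 0.0, "B": 0.0})` leaves most heat on VHL, less on its direct partner A and least on B, which is the intuition behind ranking sub-networks by both mutation frequency and topology.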

We also inferred activities for known pathways by using the PARADIGM algorithm to incorporate mutation, copy number and mRNA expression data with pathway information catalogued in public databases. This method identified a highly significant sub-network of 2,398 known regulatory interactions, connecting 1,218 molecular features (645 distinct proteins) (Supplementary Figs 46–49 and Supplementary Tables 16 and 17). Several ‘active’ transcriptional ‘hubs’ were identified by searching for transcription factors with targets that were inferred to be active in the PARADIGM network. The active hubs found included HIF1A/ARNT, the transcription factor program activated by VHL mutation, as well as MYC/MAX, SP1, FOXM1, JUN and FOS. These hubs, together with several other less well-studied transcription factors, interlink much of the transcriptional program promoting glycolytic shift, de-differentiation and growth promotion in ccRCC.

We next searched for causal regulatory interactions connecting ccRCC somatic mutations to these transcriptional hubs, using a bi-directional extension to HotNet (‘TieDIE’) and identified a chromatin-specific sub-network (Fig. 4a and Supplementary Figs 50–52). TieDIE defines a set of transcriptional targets, whose state in the tumour cells is proposed to be influenced by one or more of the significantly mutated genes. The chromatin modification pathway intersects a wide variety of processes, including the regulation of hormone receptors (for example, ESR1), RAS signalling via the SRC homologue (SHC1), immune-related signalling (for example, NFKB1 and IL6)24, transcriptional output (for example, HIF1A, JUN, FOS and SP1), DNA repair (via BAP1) and beta-catenin (CTNNB1) and transforming growth factor (TGF)-β (TGFBR2) signalling via interactions with a SMARC–PBRM1–ARID1A complex. The complexity of these interactions reflects the potential for highly pleiotropic effects following primary events in chromatin modification genes.

Figure 4: Genomically-altered pathways in ccRCC.

a, Alterations in chromatin remodelling genes were predicted to affect a large network of genes and pathways (larger implicated network in Supplementary Information). Each gene is depicted as a multi-ring circle with various levels of data, plotted such that each ‘spoke’ in the ring represents a single patient sample (same sample ordering for all genes). ‘PARADIGM’ ring, bioinformatically inferred levels of gene activity (red, higher activity); ‘Expression’, mRNA levels relative to normal (red, high); ‘Mutation’, somatic event; centre, correlation of gene expression or activity to mutation events in chromatin-related genes (red, positive). Protein–protein relationships inferred using public resources. b, For the PI(3)K/AKT/MTOR pathway (altered in 28% of tumours), the MEMo algorithm identified a pattern of mutually exclusive gene alterations (somatic mutations, copy alterations and aberrant mRNA expression) targeting multiple components, including two genes from the recurrent amplicon on 5q35.3. The alteration frequency and inferred alteration type (blue for inactivation and red for activation) is shown for each gene in the pathway diagram.

The mutations in the chromatin regulators PBRM1, BAP1 and SETD2 were differentially associated with altered expression patterns of large numbers of genes when compared to samples bearing a background of VHL mutation (Supplementary Tables 18–21 and Supplementary Fig. 53). Each chromatin regulator had a distinct set of downstream effects, reflecting diverse roles for chromatin remodelling in the transcriptome.

Additionally, an unsupervised pathway analysis using the MEMo algorithm25 identified mutually exclusive patterns of alterations targeting multiple components of the PI(3)K/AKT/MTOR pathway in 28% of the tumours (Fig. 4b and Supplementary Table 22). Interestingly, the altered gene module included two genes from the broad amplicon on 5q35.3: GNB2L1 and SQSTM1. Both these genes have previously been associated with activation of PI(3)K signalling26,27. Furthermore, mRNA expression levels of these two genes were correlated with both DNA copy number increases and alteration status of the PI(3)K pathway (Supplementary Figs 54–55). The mutual exclusivity module also includes frequent overexpression of EGFR, which correlates with increased phosphorylation of the receptor (Supplementary Fig. 56), and which has been previously associated with lapatinib response in ccRCC28.
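To make the notion of mutual exclusivity concrete, the following sketch scores a single pair of genes with a one-sided hypergeometric test: given how many samples carry each alteration, it returns the probability of seeing as few co-altered samples as observed. This pairwise test is a simplified stand-in and should not be read as the MEMo procedure, which evaluates whole gene modules; the function name and the sample sets in the usage note are hypothetical.

```python
from math import comb


def exclusivity_p_value(altered_a, altered_b, n_samples):
    """P(overlap <= observed) for two sets of altered sample IDs.

    altered_a, altered_b: sets of sample identifiers altered in each gene.
    n_samples: total number of profiled samples.
    """
    a, b = len(altered_a), len(altered_b)
    observed = len(altered_a & altered_b)
    total = comb(n_samples, b)
    # Smallest overlap that is possible given the two set sizes.
    lowest = max(0, a + b - n_samples)
    tail = sum(comb(a, k) * comb(n_samples - a, b - k)
               for k in range(lowest, observed + 1))
    return tail / total
```

For example, if 10 of 20 samples are altered in one gene and the other 10 in a second gene, with no sample altered in both, `exclusivity_p_value(set(range(10)), set(range(10, 20)), 20)` returns 1/184756, indicating an overlap far smaller than expected by chance.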

Correlations with survival

Where unsupervised analyses had indicated that common molecular patterns were associated with patient survival, we sought to further define molecular prognostic signatures at the levels of mRNA, miRNA, DNA methylation and protein. Data were divided into ‘discovery’ (n = 193) and ‘validation’ (n = 253) sets and platform-specific signatures were defined using Cox analyses24. Kaplan–Meier analysis for each signature showed statistically significant associations with survival in the validation subset (Fig. 5a and Supplementary Fig. 57). Multivariate Cox analyses, incorporating established clinical variables, showed that the mRNA, miRNA and protein signatures provided additional prognostic power (Supplementary Table 23). In addition, these signatures could provide molecular clues as to the drivers of aggressive cancers.
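The evaluation step described above (splitting validation samples into risk groups by signature score and comparing their survival curves) can be sketched as follows. The split into thirds follows the Fig. 5a legend; the function names, the tie handling and the input layout are assumptions, and the Cox modelling used to derive the signature scores is not reproduced.

```python
def tertile_groups(scores):
    """Label each score 'low', 'intermediate' or 'high' by rank-based thirds."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [None] * n
    for rank, i in enumerate(order):
        if rank < n // 3:
            labels[i] = "low"
        elif rank >= n - n // 3:
            labels[i] = "high"
        else:
            labels[i] = "intermediate"
    return labels


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with an event.

    events[i] is True for a death and False for a censored observation.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored samples both leave the risk set after time t.
        at_risk -= leaving
    return curve
```

In practice `kaplan_meier` would be applied separately to the samples in each group returned by `tertile_groups`, and the resulting curves compared with a log-rank test or a similar procedure.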

Figure 5: Molecular correlates of patient survival involve metabolic pathways.

a, Sample profiles were separated into discovery and validation subsets, with the top survival correlates within the discovery subset being defined for each of the four platforms examined (mRNA, microRNA, protein, DNA methylation). Kaplan–Meier plots show results of applying the four prognostic signatures to the validation subset, comparing survival for patients with predicted higher risk (red, top third of signature scores), lower risk (blue, bottom third) or intermediate risk (grey, middle third); successful predictions were observed in each case. b, When viewed in the context of metabolism, the molecular survival correlates highlight a widespread metabolic shift, with tumours altering their usage of key pathways and metabolites (red and blue shading representing the correlation of increased gene expression with worse or better survival respectively, univariate Cox based on extended cohort). Worse survival correlates with upregulation of pentose phosphate pathway genes (G6PD, PGLS, TALDO and TKT), fatty acid synthesis genes (ACC and FASN), and PI(3)K pathway enhancing genes (MIR21). Better survival correlates with upregulation of AMPK complex genes, multiple Krebs cycle genes and PI(3)K pathway inhibitors (PTEN, TSC2). Additionally, specific promoter methylation events, including hypermethylation of PI(3)K pathway repressor GRB10, associate with outcome. c, Heat map of selected key features from the metabolic shift schematic (b) demonstrating coordinate expression by stage at DNA methylation, RNA, and protein levels (data from validation subset).

Top protein correlates of worse survival included reduced AMP-activated kinase (AMPK) and increased acetyl-CoA carboxylase (ACC) (Supplementary Fig. 58). Together, downregulation of AMPK and upregulation of ACC activity contribute to a metabolic shift towards increased fatty acid synthesis29. A metabolic shift to an altered use of key metabolites and pathways was also apparent when considering the full set of genes involved in the core metabolic processes, including a shift towards a ‘Warburg effect’-like state (Fig. 5b). Poor prognosis correlated with downregulation of AMPK complex and the Krebs cycle genes, and with upregulation of genes involved in the pentose phosphate pathway (G6PD, PGLS, TALDO (also known as TALDO1P1), TKT) and fatty acid synthesis (FASN, ACC (also known as ACACA)).

Examination of potential genetic or epigenetic drivers of a glycolytic shift led us to identify methylation events involving MIR21 and GRB10, with decreased promoter methylation of each gene (and thereby higher expression) being associated with worse or better outcome, respectively (Fig. 5b, Supplementary Fig. 59 and Supplementary Table 24). Both genes regulate the PI(3)K pathway: miR-21 is inducible by high glucose levels and downregulates PTEN22, whereas the tumour suppressor GRB10 negatively regulates PI(3)K and insulin signalling30. Promoter methylation of MIR21 and GRB10 was coordinated with their mRNA expression patterns, as well as with the mRNA expression of other key genes and protein expression in the metabolic pathways (Fig. 5c and Supplementary Fig. 60). In addition to the PI(3)K pathway (Fig. 5b and Supplementary Fig. 61), molecular survival correlations involved several pro-metastatic matrix metalloproteinases (Supplementary Fig. 62).


Our study sampled a single site of the primary tumour, in a disease with a potentially high level of tumour heterogeneity8. The extent to which convergent evolutionary events are a common theme in ccRCC remains to be determined, but may indicate that critical genes will be represented across the tumour landscape for an individual mass. In general, the large sample size seemed to overcome the intrinsic challenges of studying a genetically complex disease, revealing rare variants at rates similar to what has been described previously3. The samples, taken from primary tumour specimens, were reflective of patients fit for either definitive or cytoreductive nephrectomy, whereas future work could explore the genomic landscape of metastatic lesions.

Pathway and integrated analyses highlighted the importance of the well-known VHL/HIF pathway, the newly emerging chromatin remodelling/histone methylation pathway, and the PI(3)K/AKT pathway. The observation of chromatin modifier genes being frequently mutated in ccRCC strongly supports a model in which nucleosome dynamics have a key function in renal tumorigenesis. Although the mechanistic details remain to be defined as to how such modulation promotes tumour formation, the data presented here revealed alterations in DNA methylation associated with SETD2 mutations. As an epigenetic process that can potently modify many transcriptional outputs, these mutational events have the potential to change the landscape of the tumour genome through altered expression of global sets of genes and genetic elements. Molecular correlates of patient survival further implicated PI(3)K/AKT as having a role in tumour progression, involving specific DNA methylation events. The PI(3)K/AKT pathway presents a strong therapeutic target in ccRCC, supporting the potential value of MTOR and/or related pathway inhibitor drugs for this cancer31,32.

Cross-platform molecular analyses indicated a correlation between worsened prognosis in patients with ccRCC and a metabolic shift involving increased dependence on the pentose phosphate shunt, decreased AMPK, decreased Krebs cycle activity, increased glutamine transport and fatty acid production. These findings are consistent with the isotopomer spectral analysis of a pair of VHL−/− clear cell kidney cancer cell lines, both of which were notably derived from patients with aggressive, metastatic disease, which revealed a dependence on reductive glutamine metabolism for lipid biosynthesis33. The metabolic shift identified in poor prognosis ccRCC remarkably mirrors the Warburg metabolic phenotype (increased glycolysis, decreased AMPK, glutamine-dependent lipogenesis) identified in type 2 papillary kidney cancer characterized by mutation of the Krebs cycle enzyme, fumarate hydratase33. Further studies to dissect out the role of the commonly mutated chromosome 3 chromatin remodelling genes, PBRM1, SETD2 and BAP1, in ccRCC tumorigenesis and their potential role in the metabolic remodelling associated with progression of this disease will hopefully provide the foundation for the development of effective forms of therapy for this disease.

Methods Summary

Specimens were obtained from patients, with appropriate consent from institutional review boards. Using a co-isolation protocol, DNA and RNA were purified. In total, 446 patients were assayed on at least one molecular profiling platform. The platforms were: (1) RNA sequencing, (2) DNA methylation arrays, (3) miRNA sequencing, (4) Affymetrix single nucleotide polymorphism (SNP) arrays, (5) exome sequencing, and (6) reverse phase protein arrays. As described above and in the Supplementary Methods, both single platform analyses and integrated cross-platform analyses were performed.