Urothelial carcinoma of the bladder is a common malignancy that causes approximately 150,000 deaths per year worldwide. So far, no molecularly targeted agents have been approved for treatment of the disease. As part of The Cancer Genome Atlas project, we report here an integrated analysis of 131 urothelial carcinomas to provide a comprehensive landscape of molecular alterations. There were statistically significant recurrent mutations in 32 genes, including multiple genes involved in cell-cycle regulation, chromatin regulation, and kinase signalling pathways, as well as 9 genes not previously reported as significantly mutated in any cancer. RNA sequencing revealed four expression subtypes, two of which (papillary-like and basal/squamous-like) were also evident in microRNA sequencing and protein data. Whole-genome and RNA sequencing identified recurrent in-frame activating FGFR3–TACC3 fusions and expression or integration of several viruses (including HPV16) that are associated with gene inactivation. Our analyses identified potential therapeutic targets in 69% of the tumours, including 42% with targets in the phosphatidylinositol-3-OH kinase/AKT/mTOR pathway and 45% with targets (including ERBB2) in the RTK/MAPK pathway. Chromatin regulatory genes were more frequently mutated in urothelial carcinoma than in any other common cancer studied so far, indicating the future possibility of targeted therapy for chromatin abnormalities.
Urothelial carcinoma of the bladder is a major cause of morbidity and mortality worldwide, causing an estimated 150,000 deaths per year1. Previous studies have identified multiple regions of somatic copy number alteration, including amplification of PPARG, E2F3, EGFR, CCND1 and MDM2, as well as loss of CDKN2A and RB1 (refs 2, 3). Sequencing of candidate pathways has identified recurrent mutations in TP53, FGFR3, PIK3CA, TSC1, RB1 and HRAS (refs 2, 3). Whole-exome sequencing of nine bladder cancers, followed by a replication analysis of 88 cancers, identified mutations at >10% frequency in several chromatin remodelling genes: KDM6A, CREBBP, EP300 and ARID1A (ref. 4). Focused molecular analyses5,6 have delineated tumour subtypes and identified kinase-activating FGFR3 gene fusions7,8.
We report here a comprehensive, integrated study of 131 high-grade muscle-invasive urothelial bladder carcinomas as part of The Cancer Genome Atlas (TCGA) project. Included are data on DNA copy number, somatic mutation, messenger RNA and microRNA (miRNA) expression, protein and phosphorylated protein expression, DNA methylation, transcript splice variation, gene fusion, viral integration, pathway perturbation, clinical correlates and histopathology to characterize the molecular landscape of urothelial carcinoma. This study identifies a number of mutations and regions of copy number variation that involve genes not previously reported as altered in a significant fraction of bladder cancers. It also identifies potential therapeutic targets in most of the samples analysed.
Demographic, clinical and pathological data
Samples (from 19 tissue source sites) consisted of 131 chemotherapy-naive, muscle-invasive, high-grade urothelial tumours (T2-T4a, Nx, Mx), as well as peripheral blood (n = 118) and/or tumour-adjacent, histologically normal-appearing bladder tissue (n = 23). Cases were retained only if they met the following criteria: tumour nuclei constituted ≥60% of all nuclei; tumour necrosis was ≤20% of the specimen; and variant histologies (squamous or small cell) were ≤50% (Supplementary Information, section ‘Biospecimen collection and clinical data’). Clinical and demographic characteristics are described in Supplementary Data 1.1. Five expert genitourinary pathologists re-reviewed all of the cases for multiple parameters, including the extent of variant histology (Supplementary Fig. 1.1a and Supplementary Information, section ‘Biospecimen collection and clinical data’).
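The three retention criteria above can be expressed as a simple predicate. The following is a minimal sketch only; the function and argument names are hypothetical and do not come from the TCGA pipeline, and percentages are assumed to be on a 0–100 scale.

```python
def meets_inclusion_criteria(tumour_nuclei_pct, necrosis_pct, variant_histology_pct):
    """Return True if a case passes the three thresholds stated in the text.

    All arguments are percentages (0-100). Names are illustrative assumptions.
    """
    return (
        tumour_nuclei_pct >= 60          # tumour nuclei >= 60% of all nuclei
        and necrosis_pct <= 20           # tumour necrosis <= 20% of the specimen
        and variant_histology_pct <= 50  # squamous or small cell histology <= 50%
    )


# Hypothetical cases: (tumour nuclei %, necrosis %, variant histology %)
cases = {"case_A": (75, 10, 30), "case_B": (55, 5, 0), "case_C": (80, 25, 10)}
retained = [name for name, values in cases.items() if meets_inclusion_criteria(*values)]
```

In this toy input only `case_A` is retained; `case_B` fails on tumour nuclei content and `case_C` on necrosis.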
Somatic DNA alterations
The tumours displayed a large number of DNA alterations, slightly fewer than in lung cancer and melanoma, but more than in other adult malignancies studied by TCGA (Fig. 1)9. On average, there were 302 exonic mutations, 204 segmental alterations in genomic copy number and 22 genomic rearrangements per sample. We analysed somatic copy number alterations (CNAs) using both SNP 6.0 arrays and low-pass whole-genome sequencing; the two were strongly concordant (Supplementary Methods 6.1 and Supplementary Fig. 6.1). There were 22 significant arm-level copy number changes (Supplementary Data 6.1.1), and GISTIC (genomic identification of significant targets in cancer) (Supplementary Methods 6.2) identified 27 amplified and 30 deleted recurrent focal somatic CNAs (Supplementary Data 6.2.1 and 6.3.1). Focal amplifications involved genes previously reported to be altered in bladder cancer (Fig. 1c and Supplementary Fig. 6.2.1) and some not previously implicated. The latter included PVRL4, BCL2L1 and ZNF703. The most common recurrent focal deletion, seen in 47% of samples, contained CDKN2A (9p21.3) and correlated with reduced expression (Fig. 1 and Supplementary Fig. 2.7). Other focal deletions containing <10 genes appeared to target PDE4D, RB1, FHIT, CREBBP, IKZF2, FOXQ1, FAM190A (also called CCSER1), LRP1B and WWOX.
Whole-exome sequencing of 130 tumours and matched normal samples targeted 186,260 exons in 18,091 genes (mean coverage 100-fold, with 82% of target bases covered >30×). MuTect10 identified 39,312 somatic mutations (including 38,012 point mutations and 1,138 indels (insertions or deletions)), yielding mean and median somatic mutation rates of 7.7 and 5.5 per megabase (Mb), respectively (Fig. 1a and Supplementary Table 2.1.1). Thirty-two genes showed statistically significant levels of recurrent somatic mutation (Fig. 1b and Supplementary Table 2.1.2) by analysis using MutSig 1.5 (refs 9, 11) (Supplementary Methods 2.2). Three other genes identified by MutSig were not considered further because of low or undetectable expression (Supplementary Fig. 2.1.1). A similar analysis considering only mutations in the COSMIC database2 identified three more significantly mutated genes: ERBB2, ATM and CTNNB1 (Supplementary Table 2.1.3). We validated the mutation findings in three ways: targeted re-sequencing of all significantly mutated gene mutations, comparison with RNA-seq data for 123 samples and comparison with whole-genome sequence data for 18 samples. Overall, the validation rate was >99% in selected mutations by a combination of the methods (Supplementary Methods 2.4).
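A per-sample mutation rate of the kind quoted above is the number of somatic mutations divided by the number of adequately covered megabases. The sketch below illustrates the arithmetic and why the mean exceeds the median when a few samples are heavily mutated. The sample counts and the covered territory are invented for illustration and are not the study's data.

```python
from statistics import mean, median


def mutation_rate_per_mb(n_mutations, covered_bases):
    """Somatic mutations per megabase of covered sequence."""
    return n_mutations / (covered_bases / 1e6)


# Hypothetical mutation counts per sample and a hypothetical 30 Mb covered territory.
mutation_counts = {"S1": 150, "S2": 300, "S3": 900}
covered_bases = 30_000_000

rates = [mutation_rate_per_mb(n, covered_bases) for n in mutation_counts.values()]
mean_rate = mean(rates)      # pulled upwards by the heavily mutated sample
median_rate = median(rates)  # less sensitive to that sample
```

A right-skewed distribution of mutation counts gives a mean rate above the median, which is the pattern reported here (7.7 versus 5.5 per Mb).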
Nearly half (49%) of the samples had TP53 mutations (Fig. 1b), which were mutually exclusive with amplification (9%) and overexpression (29%) of MDM2; hence, TP53 function was inactivated in 76% of samples. Most RB1 mutations were inactivating, were associated with significantly reduced mRNA levels (Supplementary Fig. 2.7) and were mutually exclusive with CDKN2A deletions (Supplementary Fig. 2.8 and Supplementary Table 2.8.1). FGFR3 mutations (12%) typically affected known kinase-activating sites. PIK3CA mutations were relatively common (20%), clustering in the helical domain near E545 (Supplementary Fig. 2.4). Most TSC1 mutations (8%) were truncating, and six were homozygous (allele fraction >0.5).
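Mutual exclusivity between two alterations is often assessed from a 2 × 2 table of sample counts. The sketch below uses a one-sided Fisher exact test, asking how likely it is to see this few co-altered samples by chance. This is an assumption for illustration: the text here does not state which test was used, and the function name is hypothetical.

```python
from math import comb


def fisher_exclusivity_p(both, only_a, only_b, neither):
    """One-sided Fisher exact P value for observing `both` or fewer co-altered samples.

    Arguments are the four cells of the 2 x 2 table of sample counts for
    alterations A and B. A small P value suggests mutual exclusivity.
    """
    n = both + only_a + only_b + neither
    a_total = both + only_a  # samples carrying alteration A
    b_total = both + only_b  # samples carrying alteration B
    smallest_overlap = max(0, a_total + b_total - n)
    denominator = comb(n, b_total)
    # Sum hypergeometric probabilities for overlaps no larger than the observed one.
    return sum(
        comb(a_total, k) * comb(n - a_total, b_total - k)
        for k in range(smallest_overlap, both + 1)
    ) / denominator


# Toy table: 10 samples, 5 with A only, 5 with B only, none with both.
p_value = fisher_exclusivity_p(both=0, only_a=5, only_b=5, neither=0)
```

For the toy table, only one of the 252 possible arrangements has no overlap, so the P value is 1/252.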
Many of the 32 genes identified in Fig. 1b have not previously been reported as statistically significantly mutated in bladder cancer: MLL2 (also called KMT2D; 27%), CDKN1A* (14%), ERCC2* (12%), STAG2 (11%), RXRA* (9%), ELF3* (8%), NFE2L2 (8%), KLF5* (8%), TXNIP (7%), FOXQ1* (5%), RHOB* (5%), FOXA1 (5%), PAIP1* (5%), BTG2* (5%), ZFP36L1 (5%), RHOA (4%) and CCND3 (4%). The nine genes marked with asterisks have not been reported as significantly mutated genes in any other TCGA cancer type or reported in another study as mutated at >3% frequency2. CDKN1A (p21CIP1), a cyclin-dependent kinase inhibitor12, had predominantly null or truncating mutations, indicating loss of function. Fifteen of sixteen mutations in ERCC2, a nucleotide excision repair gene13, were deleterious missense mutations, suggesting dominant-negative effects. ERCC2-mutant tumours also had significantly fewer C>G mutations than did ERCC2-wild-type tumours (Supplementary Figs 2.3.1 and 2.3.2), and they trended towards higher overall mutation rate (Supplementary Fig. 2.12). Seven of twelve mutations in RXRA (retinoid X nuclear receptor alpha)14 occurred at the same amino acid (five S427F; two S427Y) in the ligand-binding domain. Those seven tumours showed increased expression of genes involved in adipogenesis and lipid metabolism (Supplementary Fig. 2.6 and Supplementary Data 2.6.1–2.6.3), suggesting that the mutations cause constitutive activation.
Eleven tumours (8%) had deleterious missense mutations in the Neh2 domain of NFE2L2, a transcription factor that regulates the anti-oxidant program in response to oxidative stress15. Those tumours showed markedly increased expression of genes involved in genotoxic metabolism and the reactive oxygen species (ROS) response (Supplementary Figs 2.5.1–2.5.3 and Supplementary Data 2.5.2). Furthermore, nine samples had mutations in the redox regulator TXNIP (ref. 16), five of them inactivating; these mutations were mutually exclusive with NFE2L2 mutations, providing another mechanism for dysregulation of redox metabolism. Predominantly inactivating mutations were seen in STAG2, an X-linked cohesin complex component required for separation of sister chromatids during cell division17 (Supplementary Fig. 2.4).
Unsupervised clustering by non-negative matrix factorization of mutations and focal somatic CNAs in 125 samples identified three distinct groups (Fig. 1a and Supplementary Fig. 2.1.2). Group A (red), classified as ‘focally amplified’, is highly enriched in focal somatic CNAs in several genes, as well as mutations in MLL2 (Fig. 1 and Supplementary Tables 2.1.4 and 2.1.5). Group B (blue), classified as ‘papillary CDKN2A-deficient FGFR3 mutant’, is enriched in papillary histology. Nearly all group B samples show loss of CDKN2A, and most have one or more alterations in FGFR3. Group C (green), classified as ‘TP53/cell-cycle-mutant’, shows TP53 mutations in nearly all samples, as well as enrichment with RB1 mutations and amplifications of E2F3 and CCNE1 (Fig. 1 and Supplementary Table 2.1.4). These differences in pattern of mutation suggest the possibility of different oncogenic mechanisms.
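To illustrate the general idea of clustering by non-negative matrix factorization, the sketch below factorizes a small binary alteration matrix (rows are alterations, columns are samples) as V ≈ WH and assigns each sample to the factor with the largest weight in H. It uses the classic multiplicative update rules, which is an assumption: the algorithm variant, rank selection and input encoding used in the study are described only in the Supplementary Information, and all names and the toy matrix here are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared reconstruction error ||V - WH||^2."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factorize non-negative V (features x samples) into W (features x rank) and H (rank x samples)."""
    rng = random.Random(seed)
    n_features, n_samples = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_features)]
    h = [[rng.random() + 0.1 for _ in range(n_samples)] for _ in range(rank)]
    for _ in range(iterations):
        # Update H: H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_samples)] for k in range(rank)]
        # Update W: W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)] for i in range(n_features)]
    return w, h


def assign_groups(h):
    """Assign each sample (column of H) to the factor with the largest coefficient."""
    return [max(range(len(h)), key=lambda k: h[k][j]) for j in range(len(h[0]))]


# Toy matrix: 4 alterations (rows) x 4 samples (columns), 1 = altered.
toy = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
w, h = nmf(toy, rank=2)
groups = assign_groups(h)
```

In practice the rank (three groups in the analysis above) would be chosen by a stability or goodness-of-fit criterion rather than fixed in advance as in this sketch.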
Seventy-two per cent of the cancers in this study were from current or past smokers, consistent with extensive epidemiological studies indicating an association between smoking and urothelial cancer risk. In contrast with lung cancer, however, there was no statistically significant association between smoking status and the mutational spectrum, frequency of mutation in any significantly mutated gene, occurrence of focal somatic CNAs or expression subtype (Supplementary Tables 2.9.1 and 2.9.2). Never-smokers did have a slightly higher fraction of C>G mutations than did current/former smokers (28.5% versus 23.8%, P = 0.032; Supplementary Figs 2.3.2 and 2.3.3). Unsupervised clustering of promoter CpG island DNA methylation data revealed a major subgroup (34%) of tumours (CIMP) characterized by cancer-specific DNA hypermethylation (Supplementary Fig. 7.1). Multivariate regression analysis with age, sex and tumour stage as covariates identified smoking pack-years as the only significant predictor of CIMP phenotype, as has also been reported for colorectal cancer18.
Fifty-one per cent of mutations overall were Tp*C→(T/G) (Supplementary Table 2.1.1), a class of mutation recently reported to be mediated by one of the DNA cytosine deaminases, APOBEC (refs 19, 20). APOBEC3B was expressed at high levels in all of the tumours, suggesting a major role for APOBEC-mediated mutagenesis in bladder carcinogenesis (Supplementary Figs 12.1 and 12.2).
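The fraction of mutations in this sequence context can be computed from each mutation's trinucleotide context. The sketch below is illustrative only: it assumes each mutation is given as a three-base reference context centred on the mutated base plus the mutant base, and it folds mutations reported at a G onto the complementary strand so that the mutated base is always read as a cytosine. The function names and toy data are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_tpc_to_t_or_g(context, alt):
    """True if a mutation is C>T or C>G at a cytosine preceded by T.

    context: three reference bases, 5' to 3', centred on the mutated base.
    alt: the mutant base on the same strand as `context`.
    """
    if context[1] == "G":
        # The mutated cytosine is on the opposite strand; switch strands.
        context, alt = reverse_complement(context), COMPLEMENT[alt]
    return context[0] == "T" and context[1] == "C" and alt in ("T", "G")


def tpc_fraction(mutations):
    """Fraction of (context, alt) pairs that fall in the TpC>(T/G) class."""
    hits = sum(1 for context, alt in mutations if is_tpc_to_t_or_g(context, alt))
    return hits / len(mutations)


# Toy input: two mutations in the class (one reported on each strand), two outside it.
toy_mutations = [("TCA", "T"), ("TGA", "A"), ("ACG", "T"), ("TCT", "A")]
fraction = tpc_fraction(toy_mutations)
```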
Four genes involved in epigenetic regulation were significantly mutated: MLL2, ARID1A, KDM6A and EP300 (Fig. 1). Truncating mutations were significantly enriched in each of those genes (Supplementary Fig. 2.2 and Supplementary Data 2.2.1 and 2.2.2). Three of the genes had previously been identified as mutated in urothelial cancers4, but mutation of MLL2, which encodes a histone H3 lysine 4 (H3K4) methyltransferase, is a novel finding. Several other chromatin-regulating genes had mutation rates ≥10% but were not statistically significant by MutSig analysis: MLL3, MLL, CREBBP, CHD7 and SRCAP. Many other epigenetic regulators were mutated at lower frequency but were also enriched with truncating mutations, indicating functional significance (Supplementary Fig. 2.2 and Supplementary Data 2.2.1 and 2.2.2). Non-silent mutations in chromatin regulatory genes overall were significantly enriched in bladder cancer in comparison with the entire exome, in contrast with all other epithelial cancers studied so far in the TCGA project (Supplementary Table 2.10). Mutations in MLL2 and KDM6A (the latter encoding a histone H3 lysine 27 (H3K27) demethylase) were mutually exclusive (Supplementary Fig. 2.8 and Supplementary Table 2.8.1), suggesting that mutations in the two genes have redundant downstream effects on carcinogenesis or that the combined loss is synthetically lethal.
mRNA, miRNA and protein expression
Analysis of RNA-seq data from 129 tumours identified four clusters (clusters I–IV) (Fig. 3 and Supplementary Fig. 4.1). Cluster I (‘papillary-like’) is enriched in tumours with papillary morphology (P = 0.0002), FGFR3 mutations (P = 0.0007, q = 0.02), FGFR3 copy number gain (P = 0.04, q = 0.1) and elevated FGFR3 expression (P < 0.0001) (Fig. 3a). It includes all three samples with FGFR3–TACC3 fusions. Cluster I samples also show significantly lower expression of miR-99a and miR-100, miRNAs that downregulate FGFR3 expression (P = 0.0002; Fig. 3a and Supplementary Fig. 5.3)22, as well as lower expression of miR-145 and miR-125b, which have been reported as frequently downregulated in bladder cancer23. Tumours with FGFR3 alterations, and perhaps other tumours that share the cluster I expression profile, may respond to inhibitors of FGFR or its downstream targets.
Reverse-phase protein array (RPPA) data indicate that clusters I and II express high HER2 (ERBB2) levels and an elevated oestrogen receptor beta (ESR2) signalling signature, indicating potential targets for hormone therapies such as tamoxifen or raloxifene (Fig. 3d). In fact, HER2 protein levels in a subset of the tumours are comparable to those found in TCGA HER2-positive breast cancers23.
For comparison, we asked whether any of the four clusters show gene signatures similar to those identified in any other tumour type(s) among the first 11 analysed by TCGA. We found that the signature of bladder cancer cluster III (‘basal/squamous-like’) is similar to that of basal-like breast cancers, as well as squamous cell cancers of the head and neck and lung (Supplementary Fig. 4.2)24,25. All four of those cancer types express characteristic epithelial lineage genes, including KRT14, KRT5, KRT6A and EGFR. Basal-like subtype26 and squamous cell subtype27 of urothelial carcinoma have been independently reported. Many of the samples in bladder cluster III express cytokeratins (that is, KRT14 and KRT5) that were recently reported to mark stem/progenitor cells26. Some of those samples also show a level of variant squamous histology (Fig. 3b). Bladder clusters I and II show features similar to those of luminal A breast cancer, with high mRNA and protein expression of luminal breast differentiation markers, including GATA3 and FOXA1 (Fig. 3c). Markers of urothelial differentiation such as the uroplakins (for example, UPK3A) are also highly expressed in clusters I and II, as are the epithelial marker E-cadherin and members of the miR-200 family of miRNAs (which target multiple regulators of epithelial–mesenchymal transition)28 (Fig. 3c). Taken together, these observations indicate that, despite their diverse tissue origins, some bladder, breast, head and neck and lung cancers share common pathways of tumour development.
To determine whether the expression-based clusters could be seen in other data sets, we used the muscle-invasive bladder cancer samples from ref. 27, hierarchically clustering them with the genes used in our analysis. From the sample dendrogram, we identified four groups (Supplementary Fig. 4.3a). The four groups identified in the data set of ref. 27 correlated well with the four clusters identified in our TCGA data (Supplementary Fig. 4.3b).
When we analysed the RNA-seq data for transcript splice variation using SpliceSeq29 (Supplementary Information, section 11), one finding of interest was an average of 3% PKM1 and 97% PKM2 transcripts in the tumour samples. The PKM2 isoform of pyruvate kinase is the principal driver of a shift to aerobic glycolysis in tumours (the Warburg effect)30. Therefore, urothelial bladder cancers (and other cancer types) may prove sensitive to inhibition of glycolysis or related metabolic pathways.
Pathway analysis and therapeutic targeting
Integrated analysis of the mutation and copy-number data revealed three main pathways as frequently dysregulated in bladder cancer: cell cycle regulation (altered in 93% of cases); kinase and phosphatidylinositol-3-OH kinase (PI(3)K) signalling (72%); and chromatin remodelling, including mutations/somatic CNAs in histone-modifying genes (89%) and components of the SWI/SNF nucleosome remodelling complex (64%) (Fig. 4a). To complement these results for well-defined pathways, we applied network analysis methods to examine other possible interactions between genes and pathways (Fig. 4b). In particular, we used the TieDIE algorithm to search for causal regulatory interactions within the PARADIGM network, which connects mutated genes to active transcriptional hubs31,32. The analysis identified a sub-network linking mutated histone-modifying genes to a large array of activated transcription factors, indicating potential far-reaching effects of histone modification on other pathways (Supplementary Fig. 8.2.1) converging on MYC/MAX regulation. Both MYC and MAX showed similar levels of pathway activity, independent of mutations in chromatin genes, suggesting that mutations in histone-modifying genes provide just one mechanism for disruption of the MYC/MAX hub. By contrast, tumours with chromatin-related mutations showed differential activity of transcription factors FOXA2 and SP1, implicating de-differentiation processes as a result of the mutations. Our network analysis also identified HSP90AA1 as a critical signalling hub, indicating that inhibitors of HSP90 may have therapeutic value in urothelial carcinoma. Although the linkages between mutations and transcriptional changes were statistically significant in terms of their proximity in the network (as determined by permutation tests; see Supplementary Fig. 8.2), further studies will be needed to assess the biological relevance of the findings.
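The idea of testing network proximity by permutation can be sketched generically as follows. This is not the TieDIE algorithm and does not use the PARADIGM network. It is a simplified illustration under stated assumptions: an unweighted, undirected graph; proximity measured as the mean hop distance from each target node to its nearest source node; and a null distribution obtained by drawing random source sets of the same size. All names and the toy graph are hypothetical.

```python
import random
from collections import deque


def bfs_distances(graph, start):
    """Hop distances from `start` to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def mean_nearest_distance(graph, sources, targets):
    """Mean, over targets, of the hop distance to the closest source."""
    per_source = [bfs_distances(graph, s) for s in sources]
    total = 0
    for t in targets:
        total += min(d.get(t, float("inf")) for d in per_source)
    return total / len(targets)


def proximity_p_value(graph, sources, targets, n_perm=1000, seed=0):
    """Empirical P value that random source sets are at least as close to the targets."""
    rng = random.Random(seed)
    observed = mean_nearest_distance(graph, sources, targets)
    nodes = sorted(graph)
    hits = 0
    for _ in range(n_perm):
        random_sources = rng.sample(nodes, len(sources))
        if mean_nearest_distance(graph, random_sources, targets) <= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy network: a chain of 10 nodes, 0 - 1 - 2 - ... - 9.
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}
observed, p_value = proximity_p_value(chain, sources=[0], targets=[1])
```

A small empirical P value indicates that the chosen sources lie closer to the targets than most randomly chosen node sets. A more careful null model would typically preserve node degree, which this sketch does not attempt.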
Integrated analysis also identified mutations, copy number alterations or RNA expression changes affecting the PI(3)K/AKT/mTOR pathway in 42% of the tumours (Fig. 5a). Included were activating point mutations in PIK3CA (17%; potentially responsive to PI(3)K inhibitors), mutation or deletion of TSC1 or TSC2 (9%; potentially responsive to mTOR inhibitors) and overexpression of AKT3 (10%; potentially responsive to AKT inhibitors). We also observed mutations, genomic amplifications or gene fusions that affect the RTK/RAS pathway in 44% of the tumours (Fig. 5b, c). Included were events that can activate FGFR3 (17%; potentially responsive to FGFR inhibitors or antibodies), amplification of EGFR (9%; potentially responsive to EGFR antibodies or inhibitors), mutations of ERBB3 (6%; potentially sensitive to ERBB kinase inhibitors) and mutation or amplification of ERBB2 (9%; potentially sensitive to ERBB2 kinase inhibitors or antibodies). ERBB3 mutations in bladder cancer have been noted previously4, but statistically significant mutation of ERBB2 in bladder cancer has not been reported. Both genes are potential therapeutic targets in other diseases33,34,35. Notably, ERBB2 alterations were approximately as frequent in this study as in TCGA breast cancers, but with fewer amplifications and more mutations (Fig. 5d)24.
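Pathway-level frequencies such as those above count a tumour once if it carries at least one qualifying alteration in any gene of the pathway, so the pathway frequency is the size of a union rather than the sum of the per-gene frequencies. A minimal sketch, with invented sample data and a hypothetical function name:

```python
def pathway_alteration_fraction(alterations_by_sample, pathway_genes):
    """Fraction of samples with at least one altered gene in the pathway."""
    pathway = set(pathway_genes)
    altered = sum(1 for genes in alterations_by_sample.values() if pathway & set(genes))
    return altered / len(alterations_by_sample)


# Gene names follow the text; the per-sample alterations are invented for illustration.
pi3k_pathway = ["PIK3CA", "TSC1", "TSC2", "AKT3"]
toy_samples = {
    "S1": ["PIK3CA", "TSC1"],  # two pathway genes altered, counted once
    "S2": ["TP53"],
    "S3": ["AKT3", "RB1"],
    "S4": [],
}
fraction = pathway_alteration_fraction(toy_samples, pi3k_pathway)
```

Because sample `S1` carries two pathway alterations but is counted once, the result is 2 of 4 samples.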
This integrated study of 131 invasive urothelial bladder carcinomas provides numerous novel insights into disease biology and delineates multiple potential opportunities for therapeutic intervention. Treatment for muscle-invasive bladder cancer has not advanced beyond cisplatin-based combination chemotherapy and surgery in the past 30 years36, and no new drugs for the disease have been approved in that time. Median survival for patients with recurrent or metastatic bladder cancer remains 14–15 months with cisplatin-based chemotherapy, and there is no widely recognized second-line therapy37. With the exception of a single case report, there is also no known benefit from treatment with newer, targeted agents38. Several of the genomic alterations identified in this study, particularly those involving the PI(3)K/AKT/mTOR, CDKN2A/CDK4/CCND1 and RTK/RAS pathways, including ERBB2 (Her-2), ERBB3 and FGFR3, are amenable in principle to therapeutic targeting. Clinical trials based on patients with relevant druggable genomic alterations are warranted.
FGFR3 mutation is a common feature of low-grade non-invasive papillary urothelial bladder cancer, but it occurs at a much lower frequency in high-grade invasive bladder cancer. The cluster analysis in Fig. 3 highlights multiple mechanisms of FGFR3 activation, and its strong association with papillary morphology. The data presented here suggest a subset of muscle-invasive cancers that can potentially be targeted through FGFR3. Similarly, ERBB2 amplification may be targetable by strategies used in breast cancer, by small-molecule tyrosine kinase inhibitors or by novel immunotherapeutic approaches (NCT01353222)34. The data here provide further support for several on-going ERBB2-targeted trials in bladder cancer and further define the subpopulation of cancers suited to that approach. Finally, cluster III of the integrated expression profiling analysis reveals the existence of a urothelial carcinoma subtype with cancer stem-cell expression features (including KRT14 and KRT5), perhaps providing another avenue for therapeutic targeting.
The alterations identified in epigenetic pathways also suggest new possibilities for bladder cancer treatment. Ninety-nine (76%) of the tumours analysed here had an inactivating mutation in one or more of the chromatin regulatory genes, and 53 (41%) had at least two such mutations. Overall, the bladder cancers showed a mutational spectrum highly enriched with mutations in chromatin regulatory genes (Supplementary Table 2.10). Furthermore, integrated network analyses revealed a profound impact of those mutations on the activity levels of various transcription factors and pathways implicated in cancer. Drugs that target chromatin modifications—for example, recently developed agents that bind acetyl-lysine binding motifs (bromodomains)—might prove useful for treatment of the subset of bladder tumours that exhibit abnormalities in chromatin-modifying enzymes39. Our findings overall indicate bladder cancer as a prime candidate for exploration of that approach to therapy.
Tumour and normal samples were obtained with institutional-review-board-approved consent and processed using a modified AllPrep kit (Qiagen) to obtain purified DNA and RNA. Quality-control analyses revealed only modest batch effects (Supplementary Information, section ‘Batch effects’). The tumours were profiled using Affymetrix SNP 6.0 microarrays for somatic CNAs, low-pass WGS (HiSeq) for somatic CNAs and translocations, RNA-seq (HiSeq) for mRNA and miRNA expression, Illumina Infinium (HumanMethylation450) arrays for DNA methylation, HiSeq for exome sequencing and RPPA for protein expression and phosphorylation. Statistical analysis and biological interpretation of the data were spearheaded by the TCGA genome data analysis centres. Sequence files are in CGHub (https://cghub.ucsc.edu/). All other molecular, clinical and pathological data are available through the TCGA Data Portal (https://tcga-data.nci.nih.gov/tcga/). The data can be explored through a compendium of next-generation clustered heat maps (http://bioinformatics.mdanderson.org/TCGA/NGCHMPortal/), the cBio Cancer Genomics Portal (http://cbioportal.org), TieDIE (http://sysbiowiki.soe.ucsc.edu/tiedie), SpliceSeq (http://bioinformatics.mdanderson.org/main/SpliceSeq:Overview), MBatch batch effects assessor (http://bioinformatics.mdanderson.org/tcgambatch/) and Regulome Explorer (http://explorer.cancerregulome.org/). Also see Supplementary Information.
We are grateful to all of the patients and families who contributed to this study, as well as C. Gunter and L. Chastain for scientific editing and M. Sheth, J. Zhang and C. Ron Bouchard for administrative support. This work was supported by the following grants from the United States National Institutes of Health: U54 HG003273, U54 HG003067, U54 HG003079, U24 CA143799, U24 CA143835, U24 CA143840, U24 CA143843, U24 CA143845, U24 CA143848, U24 CA143858, U24 CA143866, U24 CA143867, U24 CA143882, U24 CA143883, U24 CA144025 and P01 CA120964. Additional personnel and funding sources are acknowledged in the Supplementary Information.