Abstract
Distinguishing bladder urothelial carcinoma from prostate adenocarcinoma in poorly differentiated carcinomas arising at the bladder neck relies on a panel of lineage markers. Publicly available gene expression data from The Cancer Genome Atlas (TCGA) provide an avenue to examine the utility of these markers. This study aimed to verify the expression of urothelial and prostate lineage markers in the respective carcinomas and to determine the relative importance of these markers in making this distinction. Gene expression data for these markers were downloaded from the TCGA Pan-Cancer database for bladder and prostate carcinomas, and differential expression of these genes was analyzed. Standard linear discriminant analyses were applied to establish the relative importance of these markers in lineage determination and to construct the model that best makes the distinction. All urothelial lineage genes except the gene for uroplakin III had significantly higher expression in bladder urothelial carcinomas (p < 0.001). In descending order of importance for distinguishing these from prostate adenocarcinomas, genes for uroplakin II, S100P, GATA3 and thrombomodulin had high discriminant loadings (> 0.3). All prostate lineage genes had significantly higher expression in prostate adenocarcinomas (p < 0.001). In descending order of importance for distinguishing these from bladder urothelial carcinomas, genes for NKX3.1, prostate-specific antigen (PSA), prostate-specific acid phosphatase, prostein, and prostate-specific membrane antigen had high discriminant loadings (> 0.3). A combination of gene expressions for uroplakin II, S100P, NKX3.1 and PSA approached 100% accuracy in tumor classification in both the training and validation sets. Based on mined gene expression data, a combination of four lineage markers helps distinguish between bladder urothelial carcinomas and prostate adenocarcinomas.
Introduction
Histological examination of a carcinoma from transurethral resection specimens, especially from the bladder neck, always triggers diagnostic consideration for the origin of the carcinoma as either bladder or prostate. The distinction is crucial as it impacts further management and prognosis. For advanced bladder urothelial carcinomas, the treatment options include neoadjuvant chemotherapy followed by cystectomy1, whereas for advanced prostate adenocarcinomas, the treatment options include radiotherapy and androgen deprivation therapy2.
For low-grade carcinomas, distinction between bladder urothelial carcinomas and prostate adenocarcinomas is usually possible based on morphological features. However, for high-grade bladder urothelial carcinomas and prostate adenocarcinomas, conclusive distinction based on morphology alone is difficult due to overlapping morphological features between these two types of carcinomas. In such cases, immunohistochemistry is performed, employing a panel of antibodies to interrogate the presence of certain proteins that act as urothelial lineage or prostate lineage markers3. A number of urothelial lineage markers such as GATA3 and p63, and prostate lineage markers such as prostate-specific antigen (PSA) and prostate acid phosphatase (PAP) are routinely used, acknowledging the variable sensitivities and specificities of these markers4,5.
For the past decades, the joint effort between the National Cancer Institute and the National Human Genome Research Institute has uncovered the genomic profiles of different types of cancers via large-scale genome sequencing and integrated multi-dimensional analyses. In particular, the Pan-Cancer analysis project under The Cancer Genome Atlas (TCGA) research network incorporates datasets across tumor types as well as across platforms by broad normalization efforts, enabling analyses for commonalities, differences and emergent themes6. Capitalizing on the publicly available transcriptomic data for bladder urothelial carcinomas and prostate adenocarcinomas, firstly, this study aims to verify that genes corresponding to urothelial lineage and prostate lineage markers employed in diagnostic immunohistochemistry are indeed significantly expressed in the corresponding groups of carcinomas. Secondly, this study aims to establish the relative importance of expressions of these genes in distinguishing between bladder urothelial carcinomas and prostate adenocarcinomas. Lastly, a model incorporating expressions of urothelial lineage and prostate lineage genes is constructed to best distinguish between bladder urothelial carcinomas and prostate adenocarcinomas.
Methods
Using the Xena Browser online portal (https://xenabrowser.net/)7, TCGA Pan-Cancer database was filtered on primary tumor sites of bladder urothelial carcinoma or prostate adenocarcinoma. Lineage markers of contemporary diagnostic immunohistochemistry were pre-determined: GATA3, uroplakin III, thrombomodulin, p63, CK5/6, S100 calcium-binding protein P (S100P) and uroplakin II for urothelial lineage5, and prostate specific antigen (PSA), prostate-specific acid phosphatase (PSAP), prostein (P501S), prostate-specific membrane antigen (PSMA), NKX3.1, androgen receptor (AR), and alpha-methylacyl-CoA racemase (AMACR) for prostate lineage4. Gene expressions of these corresponding markers were downloaded, excluding cases without gene expression data. Relevant clinical data were downloaded from TCGA Prostate Cancer and TCGA Bladder Cancer databases.
Heat maps of these genes were drawn in Xena Browser. Differential gene expression analyses, using RNA-seq data in units of log(TPM + 0.001), were performed between the two groups of carcinomas. Graphical display was done in R version 4.0.3 with the ggplot2 and ggpubr packages8,9. Welch's t-test was applied in SPSS version 24.0. To address the multiple testing problem, the significance level α was adjusted by the Bonferroni correction (corrected α = 0.05/14 tests ≈ 0.0036)10.
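The comparison described above can be sketched in a few lines of Python. This is an illustration only, not the SPSS procedure used in the study: the function names and the expression values are invented, and the sketch stops at the Welch t statistic and its Welch-Satterthwaite degrees of freedom, leaving the p-value to a statistics package.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


# Invented log-transformed expression values for one gene (not TCGA data).
bladder = [5.1, 6.3, 4.8, 7.0, 5.9]
prostate = [1.2, 0.4, 1.9, 0.8, 1.1]

t_stat, dof = welch_t(bladder, prostate)
alpha_corrected = bonferroni_alpha(0.05, 14)  # 14 genes tested in this study
```

A gene would be called differentially expressed when the p-value obtained from `t_stat` and `dof` falls below `alpha_corrected`.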
The cases were randomly divided into a training set (about 70% of cases) and a validation set (the remainder) using randomly generated Bernoulli variates with probability parameter 0.7. To determine which gene expressions best distinguish between bladder urothelial carcinomas and prostate adenocarcinomas, standard linear discriminant analysis was performed on the training set and then validated on the validation set in SPSS version 24.0.
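A Bernoulli split of this kind can be sketched as follows. The case identifiers, the seed and the function name are hypothetical, and the study performed this step in SPSS, so the sketch only illustrates the idea: each case is assigned independently, which is why the training fraction is about, rather than exactly, 70%.

```python
import random


def bernoulli_split(case_ids, p_train=0.7, seed=0):
    """Assign each case to the training set with probability p_train."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, valid = [], []
    for case_id in case_ids:
        if rng.random() < p_train:
            train.append(case_id)
        else:
            valid.append(case_id)
    return train, valid


# Placeholder identifiers; 902 matches the 407 + 495 samples in this study.
case_ids = [f"case_{i:03d}" for i in range(902)]
train_ids, valid_ids = bernoulli_split(case_ids)
```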
Results
A total of 407 bladder urothelial carcinoma samples and 495 prostate adenocarcinoma samples were included in this study. Relevant clinical data of these bladder and prostate carcinoma samples are summarized in Table 1.
A heat map was drawn for expressions of genes corresponding to the urothelial lineage markers in both bladder urothelial carcinomas and prostate adenocarcinomas (Fig. 1A). The corresponding genes for GATA3, uroplakin III, thrombomodulin, p63, CK5/6, S100P and uroplakin II are GATA3, UPK3A, THBD, TP63, KRT5, S100P and UPK2, respectively. For CK5/6, only KRT5 gene expression was included. Similarly, a heat map for expressions of genes corresponding to the prostate lineage markers was drawn (Fig. 1B). The corresponding genes for PSA, PSAP, P501S, PSMA, NKX3.1, AR and AMACR are KLK3, ACPP, SLC45A3, FOLH1, NKX3-1, AR and AMACR, respectively.
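For readers reproducing the gene selection, the marker-to-gene correspondence listed above can be written as a simple lookup. The dictionary and variable names are illustrative; the gene symbols are those given in the text.

```python
# Immunohistochemical marker -> gene symbol, as listed in the text.
UROTHELIAL_GENES = {
    "GATA3": "GATA3",
    "uroplakin III": "UPK3A",
    "thrombomodulin": "THBD",
    "p63": "TP63",
    "CK5/6": "KRT5",  # only KRT5 expression was included for CK5/6
    "S100P": "S100P",
    "uroplakin II": "UPK2",
}

PROSTATE_GENES = {
    "PSA": "KLK3",
    "PSAP": "ACPP",
    "P501S": "SLC45A3",
    "PSMA": "FOLH1",
    "NKX3.1": "NKX3-1",
    "AR": "AR",
    "AMACR": "AMACR",
}

# Combined list of gene symbols to query; its length equals the number of
# tests used in the Bonferroni correction.
ALL_GENES = sorted(set(UROTHELIAL_GENES.values()) | set(PROSTATE_GENES.values()))
```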
Figure 2 displays the boxplots of urothelial and prostate lineage gene expressions, comparing bladder urothelial carcinomas with prostate adenocarcinomas. All urothelial lineage genes had significantly higher expressions in bladder urothelial carcinomas except UPK3A, which had significantly higher expression in prostate adenocarcinomas than in bladder urothelial carcinomas (all p < 0.001). All prostate lineage genes had significantly higher expressions in prostate adenocarcinomas than in bladder urothelial carcinomas (all p < 0.001).
Standard discriminant analysis was used to see if the model could predict the group membership of the dependent variable, either bladder urothelial carcinoma or prostate adenocarcinoma, based on urothelial lineage gene expressions except UPK3A. This was first analyzed in the training set and then validated in the validation set. Table 2 shows the hit ratios for the training set and the validation set; predictive accuracies of the model for the training set and the validation set were 93.1% and 93.6% respectively. In descending order of importance for the urothelial lineage gene expressions, UPK2, S100P, GATA3 and THBD were the most important predictors for bladder urothelial carcinoma based on a discriminant loading > 0.3 (Tables 3, 4).
Similarly, standard discriminant analysis was performed based on prostate lineage gene expressions to see if the model could predict the group membership of the dependent variable of either bladder urothelial carcinoma or prostate adenocarcinoma. Table 5 shows the hit ratios for the training set and the validation set; predictive accuracies of the model for the training set and the validation set were 99.8% and 100.0% respectively. In descending order of importance for the prostate lineage genes, NKX3-1, KLK3, ACPP, SLC45A3 and FOLH1 were the most important predictors for prostate adenocarcinoma based on the discriminant loading > 0.3 (Tables 6, 7).
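The notion of a discriminant loading can be illustrated with a minimal two-gene sketch. It assumes that a loading is the pooled within-group correlation between a predictor and the discriminant score, which is how structure-matrix loadings are commonly defined; the data are invented, the function names are hypothetical, and this is not the SPSS procedure used in the study.

```python
import math


def group_means(rows):
    """Column means of a list of [x0, x1] rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def center(rows):
    """Subtract the group's own column means from every row."""
    m = group_means(rows)
    return [[r[0] - m[0], r[1] - m[1]] for r in rows]


def fisher_lda_2d(group_a, group_b):
    """Fisher discriminant weights w = Sw^-1 (mean_a - mean_b) for two predictors."""
    pooled = center(group_a) + center(group_b)  # within-group centered rows
    dof = len(pooled) - 2
    s11 = sum(r[0] * r[0] for r in pooled) / dof
    s12 = sum(r[0] * r[1] for r in pooled) / dof
    s22 = sum(r[1] * r[1] for r in pooled) / dof
    det = s11 * s22 - s12 * s12
    ma, mb = group_means(group_a), group_means(group_b)
    d0, d1 = ma[0] - mb[0], ma[1] - mb[1]
    # Closed-form inverse of the 2 x 2 pooled covariance matrix applied to d.
    w = [(s22 * d0 - s12 * d1) / det, (-s12 * d0 + s11 * d1) / det]
    return w, pooled


def discriminant_loadings(w, pooled):
    """Pooled within-group correlation of each predictor with the score."""
    scores = [w[0] * r[0] + w[1] * r[1] for r in pooled]
    syy = sum(s * s for s in scores)
    loadings = []
    for j in range(2):
        sxy = sum(r[j] * s for r, s in zip(pooled, scores))
        sxx = sum(r[j] * r[j] for r in pooled)
        loadings.append(sxy / math.sqrt(sxx * syy))
    return loadings


# Invented expression values: [urothelial gene, prostate gene] per case.
bladder = [[8.0, 1.0], [7.5, 0.5], [9.0, 1.5], [8.5, 0.2], [7.0, 1.1]]
prostate = [[1.0, 9.0], [0.5, 8.0], [1.5, 9.5], [2.0, 8.2], [0.8, 10.0]]

weights, pooled_rows = fisher_lda_2d(bladder, prostate)
loadings = discriminant_loadings(weights, pooled_rows)
```

Under this definition, predictors whose absolute loading exceeds 0.3 would be the ones reported as important in the tables.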
Standard discriminant analysis was performed based on two most important urothelial lineage genes and two most important prostate lineage genes to see if the model could predict the group membership of the dependent variable of either bladder urothelial carcinoma or prostate adenocarcinoma. Table 8 shows the hit ratios for the training set and the validation set; predictive accuracies of the model for the training set and the validation set were 99.8% and 100.0% respectively. Prostate lineage genes of NKX3-1 and KLK3 appeared to be more important predictors as compared to urothelial lineage genes of UPK2 and S100P (Tables 9, 10).
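The hit ratios reported in these tables are the proportions of cases assigned to the correct group. A minimal sketch of that calculation follows; the discriminant scores, the cutoff of zero and the function names are invented for illustration and do not come from the study's model.

```python
def classify(scores, cutoff=0.0):
    """Label a case 'bladder' when its discriminant score exceeds the cutoff."""
    return ["bladder" if s > cutoff else "prostate" for s in scores]


def hit_ratio(true_labels, predicted_labels):
    """Proportion of cases whose predicted group matches the true group."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


# Invented scores for six cases (not study data).
scores = [2.1, 1.4, -0.3, -1.8, -2.2, 0.4]
truth = ["bladder", "bladder", "bladder", "prostate", "prostate", "prostate"]

predicted = classify(scores)
ratio = hit_ratio(truth, predicted)
```

Computing the ratio separately on the training and validation cases gives the two accuracies quoted for each model.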
Discussion
To distinguish urothelial carcinomas from prostate adenocarcinomas, many studies have employed immunohistochemistry to investigate the use of several lineage markers. GATA3, uroplakin III, thrombomodulin, S100P, and uroplakin II are commonly recommended as urothelial lineage markers5. In addition, urothelium expresses squamous cell-associated markers such as CK5/6 and p63; expression of these markers is of value in distinguishing urothelial carcinomas from adenocarcinomas5. This study showed that genes corresponding to these urothelial lineage markers, with the exception of UPK3A, had significantly higher expression in urothelial carcinomas than in prostate adenocarcinomas. Surprisingly, the gene for uroplakin III, UPK3A, was more highly expressed in prostate adenocarcinomas than in urothelial carcinomas. In contrast, no expression of uroplakin III was observed by immunohistochemistry in prostate adenocarcinomas across many studies11,12,13,14, yielding a specificity of 100% in determining the origin as the bladder. This discrepancy between UPK3A transcripts and uroplakin III protein expression in the prostate has been documented previously15. The presence of UPK3A transcripts in the absence of uroplakin III protein is likely related to interactions between UPK1B gene expression and translation of UPK3A transcripts15.
Standard discriminant analysis in this study demonstrated that, in descending order of importance for the urothelial lineage markers, UPK2, S100P, GATA3 and THBD were the most important predictors for urothelial carcinoma by gene expression. These results corroborate studies in which expressions of these urothelial lineage markers were examined immunohistochemically12,14,16,17. Among these, GATA3 has been widely studied as a urothelial lineage marker and has a wide range of sensitivities (67–100%) across different studies16. Although most studies reported 0% staining in prostate adenocarcinomas, GATA3 generally lacks specificity because a variety of other tumors express this protein, especially breast carcinomas, cutaneous basal cell carcinomas, and trophoblastic and endodermal sinus tumors18. The corresponding protein for UPK2, uroplakin II, is a relatively new marker for urothelial lineage. The reported sensitivities and specificities for uroplakin II to differentiate urothelial carcinomas from prostate adenocarcinomas were 66–78% and 95–100%, respectively12,19,20,21. For S100P, the sensitivities and specificities were 71–100% and > 95% respectively in cases where antibody clone 16 was used16. Thrombomodulin has been used as a urothelial lineage marker with sensitivities of 46–81% and specificities of 95–100% to differentiate from prostate adenocarcinomas16,17. Thrombomodulin also stains a small number of carcinomas from the lung, breast, ovary, and pancreas14.
On the other hand, the recommended prostate lineage markers are PSA, PSAP, P501S, PSMA, NKX3.1, AR, and AMACR4. This study confirms that genes corresponding to these prostate lineage markers had significantly higher expression in prostate adenocarcinomas than in urothelial carcinomas. Standard discriminant analysis in this study demonstrated that many of the prostate lineage marker genes were important predictors for prostate adenocarcinoma, namely NKX3-1, KLK3, ACPP, SLC45A3 and FOLH1, corresponding to NKX3.1, PSA, PSAP, P501S, and PSMA respectively. Among these, PSA is a sensitive and specific marker for the prostatic lineage, with sensitivities and specificities of 85–100% and 88–100%, respectively, for differentiation from urothelial carcinomas17. PSAP is another conventional prostate lineage marker with high sensitivities and specificities of 92–95% and 81–100% respectively17. PSMA has a similar range of sensitivities (87–100%) and specificities (83–100%) as a prostate lineage marker3,17,22. However, PSMA is also expressed in a few other tumor tissues such as squamous cell carcinomas and adenocarcinomas from stomach, colon and pancreas22. NKX3.1 and P501S are relatively newer prostate lineage markers. Sensitivities and specificities for NKX3.1 were 69–100% and 99–100%, and for P501S were 94–100% and 99–100%, respectively3,17,23. NKX3.1 is especially useful as it is expressed in many PSA-negative prostate adenocarcinomas24.
This study showed that with a combination of the four lineage markers with the highest discriminant loadings, i.e. UPK2 and S100P for urothelial lineage and NKX3-1 and KLK3 for prostate lineage, classification accuracy in the training set and the validation set approached 100%. Importantly, the prostate lineage genes took precedence over the urothelial lineage genes as major predictors. A combination of NKX3.1, PSA, uroplakin II and S100P is therefore proposed as the favored immunohistochemical panel to resolve the dilemma of distinguishing between bladder urothelial carcinoma and prostate adenocarcinoma. This is in line with the recommendations of the International Society of Urologic Pathology that a combination of both lineage markers should be applied in such a scenario, with the weighting inclined towards prostate lineage markers4.
A few limitations of this study are acknowledged. Although the findings of this study generally support the results of previous studies, this study employed gene expression data from tumor tissue rather than visual evaluation of lineage markers expressed on tumor cells by immunohistochemistry. Thus, discrepancies in expression between gene transcripts and proteins may arise, as quantification of transcripts depends on tumor cellularity in the tumor tissue. Furthermore, in this study, 5.2% of bladder urothelial carcinomas were low grade and 9.1% of prostate adenocarcinomas had a Gleason score of six. Inclusion of these low-grade carcinomas, as retrieved from the public databases, differs from studies focusing on high-grade carcinomas. Nevertheless, the findings of this study should remain valid, as total loss of expression of all lineage markers in high-grade carcinomas is a rare event. Although this study provides a combination of four lineage gene expressions as an algorithm to resolve the distinction between bladder urothelial carcinomas and prostate adenocarcinomas, transition to application by immunohistochemistry in routine diagnostic practice requires future validation.
Conclusions
By mining TCGA expression data for urothelial and prostate lineage markers, this study establishes that, in descending order of importance, genes for uroplakin II, S100P, GATA3 and thrombomodulin are the most important urothelial lineage markers for distinguishing bladder urothelial carcinoma from prostate adenocarcinoma. In descending order of importance, genes for NKX3.1, PSA, PSAP, P501S and PSMA are the most important prostate lineage markers. Classification of a carcinoma as either bladder urothelial carcinoma or prostate adenocarcinoma approaches 100% accuracy with a combination of gene expressions of uroplakin II, S100P, NKX3.1, and PSA. This combination is readily applicable to clinical diagnostic immunohistochemistry to resolve the dilemma in assigning the origin of a carcinoma as either bladder or prostate.
Data availability
The data of this study are available on public databases at Xena Browser online portal (https://xenabrowser.net/).
References
Witjes, J. A. et al. European Association of Urology guidelines on muscle-invasive and metastatic bladder cancer: Summary of the 2020 guidelines. Eur. Urol. 79, 82–104 (2021).
Gillessen, S. et al. Management of patients with advanced prostate cancer: Report of the advanced prostate cancer consensus conference 2019. Eur. Urol. 77, 508–547 (2020).
Sanguedolce, F. et al. Morphological and immunohistochemical biomarkers in distinguishing prostate carcinoma and urothelial carcinoma: A comprehensive review. Int. J. Surg. Pathol. 27, 120–133 (2019).
Epstein, J. I., Egevad, L., Humphrey, P. A. & Montironi, R. Best practices recommendations in the application of immunohistochemistry in the prostate: Report from the International Society of Urologic Pathology consensus conference. Am. J. Surg. Pathol. 38, e6–e19 (2014).
Amin, M., Trpkov, K., Lopez-Beltran, A. & Grignon, D. Best practices recommendations in the application of immunohistochemistry in the bladder lesions: Report from the International Society of Urologic Pathology consensus conference. Am. J. Surg. Pathol. 38, 20–34 (2014).
Weinstein, J. N. et al. The cancer genome atlas pan-cancer analysis project. Nat. Genet. 45, 1113–1120 (2013).
Goldman, M. J. et al. Visualizing and interpreting cancer genomics data via the Xena platform. Nat. Biotechnol. 38, 675–678 (2020).
Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer, 2016).
Kassambara, A. ggpubr: ‘ggplot2’ Based Publication Ready Plots. R Package Version 0.4.0. https://CRAN.R-project.org/package=ggpubr. (2020).
Sedgwick, P. Multiple significance tests: The Bonferroni correction. BMJ 344, 1–2 (2012).
Kaufmann, O., Volmerig, J. & Dietel, M. Uroplakin III is a highly specific and moderately sensitive immunohistochemical marker for primary and metastatic urothelial carcinomas. Am. J. Clin. Pathol. 113, 683–687 (2000).
Smith, S. C. et al. Uroplakin II outperforms uroplakin III in diagnostically challenging settings. Histopathology 65, 132–138 (2014).
Moll, R., Laufer, J., Wu, X. R. & Sun, T. T. Uroplakin III, a specific membrane protein of urothelial umbrella cells, as a histological markers for metastatic transitional cell carcinomas. Verh. Dtsch. Ges. Pathol. 77, 260–265 (1993).
Parker, D. C. et al. Potential utility of uroplakin III, thrombomodulin, high molecular weight cytokeratin, and cytokeratin 20 in noninvasive, invasive, and metastatic urothelial (transitional cell) carcinomas. Am. J. Surg. Pathol. 27, 1–10 (2003).
Olsburgh, J. et al. Uroplakin gene expression in normal human tissues and locally advanced bladder cancer. J. Pathol. 199, 41–49 (2003).
Suryavanshi, M. et al. S100P as a marker for urothelial histogenesis: A critical review and comparison with novel and traditional urothelial immunohistochemical markers. Adv. Anat. Pathol. 24, 151–160 (2017).
Oh, W. J. et al. Differential immunohistochemical profiles for distinguishing prostate carcinoma and urothelial carcinoma. J. Pathol. Transl. Med. 50, 345–354 (2016).
Miettinen, M. et al. GATA3: A multispecific but potentially useful marker in surgical pathology: A systematic analysis of 2500 epithelial and nonepithelial tumors. Am. J. Surg. Pathol. 38, 13–22 (2014).
Hoang, L. L., Tacha, D., Bremer, R. E., Haas, T. S. & Cheng, L. Uroplakin II (UPII), GATA3, and p40 are highly sensitive markers for the differential diagnosis of invasive urothelial carcinoma. Appl. Immunohistochem. Mol. Morphol. AIMM 23, 711–716 (2015).
Tian, W. et al. Utility of uroplakin II expression as a marker of urothelial carcinoma. Hum. Pathol. 46, 58–64 (2015).
Li, W. et al. Uroplakin II is a more sensitive immunohistochemical marker than uroplakin III in urothelial carcinoma and its variants. Am. J. Clin. Pathol. 142, 864–871 (2014).
Mhawech-Fauceglia, P. et al. Prostate-specific membrane antigen (PSMA) protein expression in normal and neoplastic tissues and its sensitivity and specificity in prostate adenocarcinoma: An immunohistochemical study using mutiple tumour tissue microarray technique. Histopathology 50, 472–483 (2007).
Chuang, A. Y. et al. Immunohistochemical differentiation of high-grade prostate carcinoma from urothelial carcinoma. Am. J. Surg. Pathol. 31, 1246–1255 (2007).
McDonald, T. M. & Epstein, J. I. Aberrant GATA3 staining in prostatic adenocarcinoma: A potential diagnostic pitfall. Am. J. Surg. Pathol. 45, 341–346 (2021).
Contributions
E.S.C. conceived the idea, analyzed the data, wrote the main manuscript, and prepared the figures.
Ethics declarations
Competing interests
The author declares no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ch’ng, E. Mining The Cancer Genome Atlas gene expression data for lineage markers in distinguishing bladder urothelial carcinoma and prostate adenocarcinoma. Sci Rep 11, 6765 (2021). https://doi.org/10.1038/s41598-021-85993-x
DOI: https://doi.org/10.1038/s41598-021-85993-x