Predicting biochemical recurrence of prostate cancer with artificial intelligence

Pinckaers, Hans; van Ipenburg, Jolique; Melamed, Jonathan; De Marzo, Angelo; Platz, Elizabeth A.; van Ginneken, Bram; van der Laak, Jeroen; Litjens, Geert

doi:10.1038/s43856-022-00126-3

Article
Open access
Published: 08 June 2022

Communications Medicine volume 2, Article number: 64 (2022)

Abstract

Background

The first sign of metastatic prostate cancer after radical prostatectomy is rising PSA levels in the blood, termed biochemical recurrence. The prediction of recurrence relies mainly on the morphological assessment of prostate cancer using the Gleason grading system. However, in this system, within-grade morphological patterns and subtle histopathological features are currently omitted, leaving a significant amount of prognostic potential unexplored.

Methods

To discover additional prognostic information using artificial intelligence, we trained a deep learning system to predict biochemical recurrence from tissue in H&E-stained microarray cores directly. We developed a morphological biomarker using convolutional neural networks leveraging a nested case-control study of 685 patients and validated on an independent cohort of 204 patients. We use concept-based explainability methods to interpret the learned tissue patterns.

Results

The biomarker provides a strong correlation with biochemical recurrence in two sets (n = 182 and n = 204) from separate institutions. Concept-based explanations provided tissue patterns interpretable by pathologists.

Conclusions

These results show that the model finds predictive power in the tissue beyond the morphological ISUP grading.

Plain language summary

To determine the prognosis of patients with prostate cancer, several clinical factors are taken into account. One of these is the cancer grade, assigned by a pathologist based on the cancer’s appearance under a microscope. The grade ranges from 1 to 5, where 5 is the most aggressive tumour type. This study explored whether deep learning—a technique in which computer software learns patterns from multiple examples—can learn to predict the risk of patients’ cancers recurring from microscopic images of the tumours. We show, on two clinical datasets from different institutions, that such a system can help to better predict prognosis, beyond the information provided by grade alone. In the future, this type of method could help clinicians to predict the prognosis of individual prostate cancer patients.

Introduction

Prostate cancer is a common malignancy among men, affecting 1.4 million per year¹. A significant proportion of these men will receive the primary curative treatment of a prostatectomy. This surgery’s success can partly be judged by the concentration of prostate-specific antigen (PSA) in the blood. While it has a dubious role in prostate cancer screening^2,3, this protein is a valuable biomarker in the follow-up of prostate cancer patients after prostatectomy. After a successful surgery, the concentration will mostly be undetectable (<0.1 ng/mL) after 4–6 weeks⁴.

However, in ~30% of the patients^5,6,7, PSA will rise again after surgery, called biochemical recurrence, pointing to regrowth of prostate cancer cells. Biochemical recurrence is a prognostic indicator for subsequent progression to clinical metastases and prostate cancer death⁸. Estimating chances of biochemical recurrence could help to better stratify patients for specific adjuvant treatments.

The risk of biochemical recurrence of prostate cancer is currently assessed in clinical practice through a combination of the ISUP grade⁹, the PSA value at diagnosis and the TNM staging criteria. In a recent European consensus guideline, these factors were proposed to separate the patients into a low-risk, intermediate-risk and high-risk group¹⁰. A high ISUP grade can, independently of other factors, assign a patient to the intermediate (grade 2/3) or high-risk group (grade 4/5).

Based on the distribution of the Gleason growth patterns¹¹, which are prognostically predictive morphological patterns of prostate cancer, pathologists assign cancerous tissue obtained via biopsy or prostatectomy into one of five groups. They are commonly referred to as International Society of Urological Pathology (ISUP) grade groups, the ISUP grade, Gleason grade groups, or just grade groups^9,12,13,14. Throughout this paper, we will use the term ISUP grade. The ISUP grade suffers from several well-known limitations. For example, there is substantial disagreement in the grading using the Gleason scheme^14,15. Furthermore, although the Gleason growth patterns have seen significant updates and additions since their inception in the 1960s, they remain relatively coarse descriptors of tissue morphology. As such, the prognostic potential of more fine-grained morphological features has been underexplored. We hypothesize that artificial intelligence, and more specifically deep learning, has the potential to discover such information and unlock the true prognostic value of morphological assessment of cancer. Specifically, we developed a deep learning system (DLS), trained on H&E-stained histopathological tissue sections, yielding a score for the likelihood of early biochemical recurrence.

Deep learning is a recent class of machine learning algorithms that encompasses models called neural networks. These networks are optimized using training data: images with labels, such as recurrence information. From the training data, relevant features to predict the labels are automatically inferred. During development, the generalization of these features is tested on separate validation data, which is not used for learning. Afterwards, a third independent set of data, the test set, is used to ensure generalization. Since features are inferred, handcrafted feature engineering is no longer needed to develop machine learning models. Neural networks are the current state-of-the-art in image classification¹⁶.

Deep learning has previously been shown to find visual patterns to predict genetic mutations from morphology, for example, in lymphoma¹⁷ and lung cancer¹⁸. Additionally, deep learning has been used for feature discovery in colorectal cancer¹⁹ and intrahepatic cholangiocarcinoma²⁰ using survival data. Although deep learning has been used with biochemical recurrence data on prostate cancer, Leo et al.²¹ assumed manual feature selection beforehand, strongly limiting the extent of new features to be discovered. Yamamoto et al.²² used whole slide images and a deep-learning-based encoding of the slides to tackle the slides’ high resolution. They leveraged classical regression techniques and support-vector machine models on these encodings. The deep learning model was not directly trained on the outcome, limiting the feature discovery in this work as well.

A common critique of deep learning is its black-box nature of the inferred features²³. Especially in the medical field, decisions based on these algorithms should be extensively validated and be explainable. Besides making the algorithms’ prediction trustworthy and transparent, from a research perspective, it would be beneficial to visualize the data patterns which the model learned, allowing insight into the inferred features. We can visualize the patterns learned by the network leveraging a new technique called Automatic Concept Explanations (ACE)²⁴. ACE clusters patches of the input image using their intermediate inferred features showing common patterns inferred by the network. We were interested in finding these common concepts over a range of images to unravel patterns that the model has identified.

This study aimed to use deep learning to develop a new prognostic biomarker based on tissue morphology for recurrence in patients with prostate cancer treated by radical prostatectomy. As training data, we used a nested case-control study²⁵. This study design ensured we could evaluate whether the network learned differentiating patterns independent of Gleason patterns. The prognostic biomarker provides a strong correlation with biochemical recurrence in two sets (n = 182 and n = 204) from separate institutions. Furthermore, the Automatic Concept-based Explanations provided tissue patterns interpretable by our pathologist.

Methods

Cohorts

Two independent cohorts of patients who underwent prostatectomy for clinically localized prostate cancer were used in this study. Patients were treated at either the Johns Hopkins Hospital in Baltimore or New York Langone Medical Centre. Both cohorts were accessed via the Prostate Cancer Biorepository Network²⁶. The Johns Hopkins University School of Medicine Institutional Review Board and The New York University School of Medicine Institutional Review Board provided ethical regulatory approval for collection and disbursement of data and materials from the respective institutions. The need for acquiring informed consent was waived by the institutional ethical review boards.

For the development of the novel deep-learning-based biomarker (further referred to as DLS biomarker), we used a nested case-control study of patients from Johns Hopkins. This study consists of 524 matched pairs (724 unique patients) containing four tissue spots per patient. They were sampled from 4860 prostate cancer patients with clinically localized prostate cancer who received radical retropubic prostatectomy between 1993 and 2001. Men were routinely checked after prostatectomy at 3 months and at least yearly thereafter. Surveillance for recurrence was conducted using digital rectal examination and measurement of serum PSA concentration. Patients were followed for outcome until 2005, with a median follow-up of 4.0 years. The outcome was defined as recurrence, based on biochemical recurrence (serum PSA > 0.2 ng/mL on 2 or more occasions after a previously undetectable level after prostatectomy), or events indicating biochemical recurrence before this was measured; local recurrence, systemic metastases, or death from prostate cancer. Controls were paired to cases with recurrence using incidence density sampling²⁷. For each case, a control was selected who had not experienced recurrence by the date of the case’s recurrence and was additionally matched based on age at surgery, race, pathologic stage, and Gleason sum in the prostatectomy specimen based on the pathology reports. Given the incidence density sampling of controls, some men were used as controls for multiple cases, and some controls developed recurrence later and became cases for that time period.
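The incidence density sampling described above can be illustrated with a minimal sketch. It is not the study's sampling code: the `Patient` fields, the `match_key` tuple and the function name are hypothetical, and the example assumes exact matching on the key. For each case, the risk set holds patients in the same matching stratum who were still recurrence-free at the case's event time, so a patient can serve as a control for several cases and can later become a case.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    pid: str
    followup_years: float   # time to recurrence, or to end of follow-up if none
    recurred: bool
    match_key: tuple        # hypothetical: (age band, race, stage, Gleason sum)


def sample_controls(patients, seed=0):
    """Return (case_id, control_id) pairs using incidence density sampling."""
    rng = random.Random(seed)
    pairs = []
    cases = sorted((p for p in patients if p.recurred),
                   key=lambda p: p.followup_years)
    for case in cases:
        # Risk set: same stratum and still event-free when the case recurred.
        risk_set = [
            p for p in patients
            if p.pid != case.pid
            and p.match_key == case.match_key
            and p.followup_years > case.followup_years
        ]
        if risk_set:  # cases without an eligible control are skipped
            pairs.append((case.pid, rng.choice(risk_set).pid))
    return pairs


key = ("60-64", "white", "T2", 7)
cohort = [
    Patient("A", 1.0, True, key),
    Patient("B", 3.0, True, key),            # control candidate for A, later a case
    Patient("C", 5.0, False, key),
    Patient("D", 6.0, False, ("65-69", "white", "T3", 8)),  # other stratum
]
pairs = sample_controls(cohort)
```

In this toy cohort, patient B is eligible as a control for A and is also a case in its own right, which mirrors the situation described in the text.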

The TMA spots were cores (0.6 mm in diameter) from the highest-grade tumour nodule. Random subsamples were taken in quadruplicate for each case. The whole slides were scanned using a Hamamatsu NanoZoomer-XR slide scanner at 0.23 μm/px. TMA core images were extracted using QuPath (v0.2.3)²⁸. We excluded cores with <25% tissue from the analysis. The cores were manually checked (HP) for prostate cancer, excluding 535 cores without clear cancer cells present in the TMA cross-section, resulting in a total of 2343 TMA spots. The nested case-control set was split based on the matched pairs into a development set (268 unique pairs), and a test set (91 pairs); the latter was used for evaluation only. We leveraged cross-validation by subdividing the development set into three folds to tune the models on different parts of the development set. We divided the paired patients randomly, taking into account the distribution of the matched variables. The random assignment was done using the scikit-multilearn package²⁹, specifically the ‘IterativeStratification’ method in ‘skmultilearn.model_selection’. After splitting the dataset into training and test sets, we split the training set into three folds using the same method for the cross-validation.

To validate the DLS biomarker on a fully independent external set, we used the cohort from New York Langone Medical Centre. This external validation cohort consists of 204 patients with localized prostate cancer treated with radical prostatectomy between 2001 and 2003. Patients were followed for outcome until 2019, with a median follow-up of 5 years. Biochemical recurrence was defined as either a single PSA measurement of ≥0.4 ng/mL or a PSA level of ≥0.2 ng/mL followed by increasing PSA values in subsequent follow-up. Cores were sampled from the largest tumour focus or any higher-grade focus (>3 mm). Subsamples were taken in quadruplicate for each case. Images were scanned using a Leica Aperio AT2 slide scanner at 0.25 μm/px.

Model details

For developing the convolutional neural networks (CNNs) we used PyTorch³⁰. As an architecture, we used ResNet50-D³¹ pretrained on ImageNet from PyTorch Image Models³². We used the Lookahead optimizer³³ with RAdam³⁴, with a learning rate of 2e-4 and mini-batch size of 16 images. We used weight decay (7e-3), and a dropout layer (p = 0.15) before the final fully-connected layer. We used EfficientNet-style³⁵ dropping of residual connections (p = 0.3) as implemented in PyTorch Image Models. We used Bayesian Optimization to find the optimal values (see Supplementary Notes 1 for details about the search space).

We resized the TMAs to 1.0 μm/pixel spacing and cropped to 768 × 768 pixels. Extensive data augmentations were used to promote generalization. The transformations were: flipping, rotations, warping, random crop, HSV colour augmentations, jpeg compression, elastic transformations, Gaussian blurring, contrast alterations, gamma alterations, brightness alterations, embossing, sharpening, Gaussian noise and cutout³⁶. Augmentations were implemented using albumentations³⁷ and fast.ai³⁸.
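As a worked example of the resampling arithmetic, the sketch below relates the scanner spacing to the model input spacing. The helper names are hypothetical; only the spacings, the core diameter and the crop size come from the text. At the target spacing, a 0.6 mm core spans roughly 600 pixels, so it fits inside the 768 × 768 crop.

```python
def rescale_factor(source_um_per_px: float, target_um_per_px: float) -> float:
    """Factor applied to image width and height when resampling."""
    return source_um_per_px / target_um_per_px


def size_in_pixels(length_um: float, um_per_px: float) -> int:
    """Number of pixels needed to cover a physical length."""
    return round(length_um / um_per_px)


CORE_DIAMETER_UM = 600.0   # 0.6 mm TMA core
TARGET_SPACING = 1.0       # um per pixel used for the model input
CROP_PX = 768

factor = rescale_factor(0.23, TARGET_SPACING)               # development cohort scanner
core_px_scanned = size_in_pixels(CORE_DIAMETER_UM, 0.23)    # core width as scanned
core_px_model = size_in_pixels(CORE_DIAMETER_UM, TARGET_SPACING)
crop_side_um = CROP_PX * TARGET_SPACING                     # physical side of the crop
```

The same helper gives a factor of 0.25 for the external cohort's scanner spacing.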

TMA spots from cases experiencing recurrence were assigned a value of 0–4, depending on the year in which the first event, either biochemical recurrence, metastases, or prostate cancer-related death, was recorded, with 0 meaning recurrence within a year and 4 meaning after 4+ years. TMA spots from cases without an event were also assigned the label 4.
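The labelling rule can be written as a small function. This is a sketch under one assumption that the text does not spell out: the year of the event is taken as the number of whole years elapsed since surgery. The function name and the use of `None` for patients without an event are illustrative choices.

```python
import math
from typing import Optional


def recurrence_label(years_to_event: Optional[float]) -> int:
    """Map time to first event (years since surgery) to the target 0-4.

    None means that no event was recorded for the patient.
    """
    if years_to_event is None:
        return 4                      # no event: same label as 4+ years
    # Assumption: label = completed years before the event, capped at 4.
    return min(math.floor(years_to_event), 4)


labels = [recurrence_label(t) for t in (0.5, 1.2, 3.9, 4.0, 7.0, None)]
```

Because patients without an event share the label 4 with late recurrences, the target cannot distinguish these two groups.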

We validated the model on the development validation fold each epoch with a moving average of the weights from five subsequent epochs. We used the concordance index as a metric to decide which model performed the best.
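For illustration, a concordance index between predicted scores and the ordinal targets can be computed as below. This is a simplified sketch, not the authors' implementation: it treats the targets as fully observed and ignores censoring, skips pairs with tied targets, and counts tied predictions as half concordant.

```python
from itertools import combinations


def concordance_index(targets, predictions):
    """Fraction of comparable pairs ordered the same way by both sequences."""
    concordant = 0.0
    comparable = 0
    for (t_i, p_i), (t_j, p_j) in combinations(zip(targets, predictions), 2):
        if t_i == t_j:
            continue                  # tied targets carry no ordering information
        comparable += 1
        if p_i == p_j:
            concordant += 0.5         # tied predictions count as half
        elif (p_i < p_j) == (t_i < t_j):
            concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


perfect = concordance_index([0, 1, 2, 4], [0.1, 0.9, 2.2, 3.5])
reversed_order = concordance_index([0, 1, 2, 4], [3.5, 2.2, 0.9, 0.1])
uninformative = concordance_index([0, 4], [1.0, 1.0])
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to random ordering, and 0.0 to a fully inverted ranking.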

As the final prediction at the patient level, the TMA spot with the highest score was used. The final DLS consists of an ensemble of 15 convolutional neural networks. Using cross-validation as described above, 15 networks were trained for each fold, of which the five best performing were used for the DLS. See Fig. 1 for a graphical overview of the methods; further details can be found in the Supplementary Methods.
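The patient-level aggregation can be sketched as follows. The text states that the highest-scoring spot is used, but not how the 15 network outputs are combined. Averaging them per spot is an assumption made here for illustration, and the function name and input layout are hypothetical.

```python
from statistics import mean


def patient_score(spot_scores_per_model):
    """Pick the highest-scoring TMA spot for one patient.

    spot_scores_per_model maps a spot id to the list of scores that the
    individual networks produced for that spot.
    """
    # Assumption: the ensemble output for a spot is the mean over networks.
    ensemble = {spot: mean(scores)
                for spot, scores in spot_scores_per_model.items()}
    top_spot = max(ensemble, key=ensemble.get)
    return top_spot, ensemble[top_spot]


top_spot, score = patient_score({
    "spot_a": [1.0, 2.0, 3.0],   # mean 2.0
    "spot_b": [2.5, 2.5, 2.5],   # mean 2.5
})
```

Whether "highest" refers to the raw network output or to the sign-flipped score used in the statistical analysis is not specified in the text, so the sketch leaves the sign convention to the caller.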

**Fig. 1: Overview of the methods summarizing the biomarker development and the Automatic Concept Explanations (ACE) process.**

Statistics and reproducibility

For primary analysis of the nested case-control study, odds ratios (OR) and 95% confidence intervals (CI) were calculated using conditional logistic regression, following Dluzniewski et al.³⁹. Due to the study design, calculating hazard ratios using a Cox proportional hazard regression is not appropriate. For the primary analysis, the continuous DLS marker was given as the only variable. For a secondary analysis, we added the non-matched variables PSA, positive surgical margins, and a binned indicator variable for year of surgery. Since matching was done on Gleason sum, and our goal was to identify patterns beyond currently used Gleason patterns, we corrected for the residual differences of the ISUP grade between cases and controls (see Table 1). The correction was performed by adding a continuous covariate since, due to the small differences, an indicator covariate did not converge. Analysis was done using the lifelines Python package (v. 0.25.10)⁴⁰ with Python (v. 3.7.8). P-values were calculated as a Wald test per single parameter. Since the DLS predicts the time-to-recurrence, high values indicate a low probability of recurrence. We multiplied the DLS output by −1 to make the analysis more interpretable. For three patients (1 from the Johns Hopkins cohort and 2 from the New York Langone cohort), PSA values were missing and were therefore replaced by the median.
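For readers less familiar with conditional logistic regression, the standard likelihood for 1:1 matched pairs is written out below. The notation is an assumption for illustration and does not appear in the original: N is the number of matched pairs, x with subscripts denotes the covariate vectors of the case and the control in pair i, and β is the coefficient vector. The odds ratio per unit increase, its confidence interval and the Wald statistic follow from the fitted coefficient and its standard error.

```latex
\begin{align*}
L(\beta) &= \prod_{i=1}^{N}
  \frac{\exp\!\left(\beta^{\top} x_{i,\mathrm{case}}\right)}
       {\exp\!\left(\beta^{\top} x_{i,\mathrm{case}}\right)
        + \exp\!\left(\beta^{\top} x_{i,\mathrm{control}}\right)}
  = \prod_{i=1}^{N}
  \frac{1}{1 + \exp\!\left(-\beta^{\top}\left(x_{i,\mathrm{case}} - x_{i,\mathrm{control}}\right)\right)} \\[4pt]
\mathrm{OR}_j &= \exp\!\left(\hat{\beta}_j\right), \qquad
95\%\ \mathrm{CI} = \exp\!\left(\hat{\beta}_j \pm 1.96\,\mathrm{SE}(\hat{\beta}_j)\right), \qquad
z_j = \frac{\hat{\beta}_j}{\mathrm{SE}(\hat{\beta}_j)}
\end{align*}
```

Only the within-pair difference of the covariates enters the likelihood, which is why variables that are identical within a pair by matching drop out of the model.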

Table 1 Baseline characteristics of test set and development set from the Johns Hopkins Hospital, prostate cancer recurrence cases and controls, men who underwent radical prostatectomy for clinically localized disease between 1993 and 2001.

For primary analysis of the New York Langone cohort, we calculated hazard ratios (HR) using a Cox proportional hazards regression. We report a secondary multivariable analysis including indicator variables for relevant clinical covariates, Gleason sum, pathological stage, and surgical margin status. We tested the proportional hazards assumption as satisfactory (every p-value > 0.01) using the Pearson correlation between the residuals and the rank of follow-up time. Kaplan–Meier plots were generated for the New York Langone cohort. Due to the nested case-control design for the Johns Hopkins set, this set could not be visualized in a Kaplan–Meier plot.
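The Kaplan–Meier estimate used for the plots can be illustrated with a short, dependency-free sketch. It is a generic implementation of the estimator, not the analysis code of the study, and the toy data are invented for the example.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times: follow-up time per patient; events: True if recurrence was
    observed at that time, False if the patient was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_removed
    return curve


# Toy data: recurrences at 1 and 3 years, censoring at 2 and 4 years.
curve = kaplan_meier([1, 2, 3, 4], [True, False, True, False])
```

To compare risk groups, the estimator would be applied separately to the patients above and below a chosen cut-off of the marker.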

Automatic concept explanations

To generate concepts, we picked the best performing single CNN from the DLS based on its validation set fold. We used a combination of the methods of Yeh et al., 2020⁴¹ and Ghorbani et al., 2019²⁴.

We tiled the TMA images into 256 × 256 patches within the tissue, discarding patches with >50% whitespace. These patches were padded to the original input shape of the CNN (768 × 768 pixels). The latent space of layer 42 of 50 was saved for each tile. Afterwards, we used PCA (50 components) to lower the dimensionality and then performed k-means (k = 15) to cluster the latent spaces.
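The tiling and whitespace filter can be sketched as below. The sketch assumes non-overlapping tiles and a precomputed background mask, neither of which is specified in the text, and the function name is hypothetical. Padding, feature extraction, PCA and k-means are not shown.

```python
def tissue_tiles(is_white, tile=256, max_white=0.5):
    """Return the (row, col) origin of every tile that is kept.

    is_white is a 2D list of booleans where True marks a background pixel.
    Tiles with more than max_white background are discarded.
    """
    height, width = len(is_white), len(is_white[0])
    kept = []
    # Assumption: a non-overlapping grid; partial tiles at the border are skipped.
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            white = sum(
                is_white[r][c]
                for r in range(top, top + tile)
                for c in range(left, left + tile)
            )
            if white / (tile * tile) <= max_white:
                kept.append((top, left))
    return kept


T, F = True, False
mask = [
    [T, T, F, F],
    [T, T, F, F],
    [T, F, F, F],
    [F, F, T, T],
]
# Small tile size so the toy mask can be checked by hand.
kept = tissue_tiles(mask, tile=2)
```

In the toy mask, the fully white top-left tile is dropped, while the tile with exactly 50% background is kept because only tiles above the threshold are discarded.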

In contrast to Yeh et al. and Ghorbani et al., we did not sort the concepts on completeness of the explanations or importance for prediction of individual samples. We sorted the concepts to find interesting new patterns related to recurrence across images by ranking the concepts based on the DLS score of the TMA spot from which they originated.
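The ranking step can be sketched as follows. The text says concepts are ranked by the DLS score of the TMA spot from which their patches originate, but does not name the summary statistic. Using the mean per concept is an assumption, and all names in the sketch are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def rank_concepts(patch_clusters, patch_spots, spot_scores):
    """Order cluster ids from highest to lowest mean originating-spot score.

    patch_clusters[i] is the cluster id of patch i, patch_spots[i] the spot
    it was cut from, and spot_scores maps a spot id to its DLS score.
    """
    per_concept = defaultdict(list)
    for cluster, spot in zip(patch_clusters, patch_spots):
        per_concept[cluster].append(spot_scores[spot])
    # Assumption: concepts are summarized by the mean score of their patches.
    return sorted(per_concept, key=lambda c: mean(per_concept[c]), reverse=True)


ranking = rank_concepts(
    patch_clusters=[0, 0, 1, 1, 2],
    patch_spots=["a", "b", "b", "c", "c"],
    spot_scores={"a": 0.2, "b": 1.0, "c": 3.0},
)
```

Which end of the ranking corresponds to early recurrence depends on the sign convention of the score passed in.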

For each concept, 25 examples were randomly picked and visually inspected by a pathologist (JvI), with a special interest in uropathology, blinded to the case characteristics and prediction of the network.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Results

The DLS was developed on the Johns Hopkins cohort with 2343 TMA spots of 685 included unique patients (39 patients were excluded due to insufficient tumour amount in the cores). Four hundred ninety-two patients were recurrence cases (72%). The 685 included patients were split into a development set of 503 unique patients and a test set of 91 matched pairs of cases and controls (182 unique patients).

In the external validation cohort, 38 out of the 204 patients (19%) had biochemical recurrence after complete remission, PSA nadir after 3 months post-prostatectomy. From the 204 patients, 620 TMA spots were included. Clinical characteristics of the cohorts can be found in Table 1 and Table 2.

Table 2 Baseline characteristics of the cohort from New York Langone hospital, prostate cancer recurrence cases and controls, men who underwent radical prostatectomy between 2001 and 2003.

The DLS marker showed a strong association in the primary analyses on the test set of the Johns Hopkins cohort with an OR of recurrence of 3.28 (95% CI 1.73–6.23; p < 0.005) per unit increase, with the continuous DLS output ranging from 0 to 3, apart from two cases below 0 (−0.27 and −0.24) (Table 3).

Table 3 Conditional logistic regression analyses of the Johns Hopkins test set.

In addition, for the Johns Hopkins cohort, we checked for confounding by ISUP grade, PSA level at diagnosis, positive surgical margins, and year of prostatectomy. None of these covariates was found to bias the estimates of effect substantially. The biomarker maintained a strong association with an OR of 3.32 (CI 1.63–6.77; p = 0.001) per unit increase, adjusting for these factors and the continuous term for the residual difference between cases and controls in the ISUP grade.

In the univariable analysis, the DLS marker was strongly associated with recurrence in the New York Langone external validation cohort with an HR of 5.78 (95% CI 2.44–13.72; p < 0.005) per unit increase. In the multivariate model, including ISUP grade and the other prognostic indicators in addition to the DLS biomarker, the DLS biomarker was still strongly associated with recurrence with an HR of 3.02 (CI 1.10–8.29; p = 0.03) per unit increase (Table 4). Kaplan–Meier curves based on a median cut-off, and four-group categorization, show a clear separation of the low-risk and high-risk groups (Fig. 2).

Table 4 Cox proportional hazard analyses of New York Langone external validation cohort.

**Fig. 2: Kaplan–Meier plot for New York Langone external validation cohort.**

Automatic Concept Explanations provided semantically meaningful concepts (Fig. 3). Concepts were identified that correlated with either a relatively rapid or slow biochemical recurrence. Visual inspection by JvI reveals that generally, the concepts with adverse behaviour show mainly Gleason pattern 4 and some Gleason pattern 5, with cribriform configuration in TMAs within the concepts with most adverse behaviour. The two intermediate concepts show mainly stroma and less aggressive growth patterns. The two concepts predicted to be part of late recurrence cases show mainly Gleason 3 patterns, with readily recognizable well-formed glands. See the Supplementary Notes 2 for a detailed analysis.

**Fig. 3: Examples of automatic concepts explanations.**

Discussion

We have developed a deep-learning-based morphological biomarker for the prediction of prostate cancer biochemical recurrence based on prostatectomy tissue microarrays. Using a nested case-control study, we trained convolutional neural networks end-to-end with biochemical recurrence data. The DLS marker provides a continuous score based on the speed of biochemical recurrence it perceived. The DLS marker had an OR of 3.32 (CI 1.63–6.77; p = 0.001) per unit increase for the test set, and an HR of 3.02 (CI 1.10–8.29; p = 0.03) per unit increase for the external validation set. These findings support our hypothesis that there is more morphological information in the tissue besides the ISUP grade.

In the Kaplan–Meier plot (Fig. 2), the biomarker especially seems able to separate men with relatively rapid recurrence (<5 years) from men without. However, the separation decreases at long-term time points. We hypothesize that this is partly due to the training cohort having a median follow-up of 4 years. Furthermore, we chose to group together patients with >4 years without biochemical recurrence, which limits the model’s capability to differentiate patients with very late recurrence. Additionally, the morphology of the present tumour is inherently limited in what it can reveal about long-term outcomes (e.g., cells that escaped the primary tumour may subsequently acquire genomic changes that influence recurrence). Finally, it should be noted that the number of at-risk patients was small at these long-term time points.

The nested case-control study contained follow-up information in timespans of years, which limited the use of survival-based loss functions⁴². When more granular follow-up information is at hand, future work could investigate the use of Cox-regression-based loss functions to better leverage the information of the clinical cohort.

The DLS marker showed strong and similar association in both cohorts prepared at different pathology laboratories, which supports the robustness to differences in tissue preparation, staining protocols and scanners.

We showed that Automatic Concept Explanation may be helpful to find concepts correlated with good and poor prognosis. The most discriminatory concepts followed the morphological patterns of Gleason grading. Well-defined prostate cancer glands were predicted to undergo biochemical recurrence later than disorganized sheets of prostate cancer cells. These concepts support the DLS system capturing the expected morphological patterns in support of the validity of the DLS approach.

This study focused on the use of deep learning to automatically discover features relevant for biochemical recurrence prediction. Compared to before-mentioned studies on prostate cancer prognostics models^21,22, as far as we know, we report the first paper to directly optimize a neural network from prostatectomy tissue towards biochemical recurrence. Additionally, we report that training towards the biochemical recurrence endpoint results in patterns in the networks’ features aligning with the ISUP grading.

With the increasing digitalisation of pathology labs, our DLS marker may be applied on digitally chosen regions of interest. Our marker is trained on tissue microarray spots that were selected at the highest-grade cancer focus. Furthermore, it has to be noted that a TMA core allows for only limited assessment of the overall prostate cancer growth patterns. Since these tissue cores represent only limited samples from what is usually a much larger tumour lesion, potentially more aggressive patterns may still be present outside of the chosen regions, including regions of potential extraprostatic extension and perineural invasion. Validation will need to be done on entire prostatectomy sections and across cancer foci.

There have been improvements to prostate cancer grading^11,13, and recently the cribriform pattern is suggested to be important for prognostics^14,43. However, the evaluation of this pattern can show a range of inter-observer variability⁴⁴, although a recent consensus approach could help decrease this variability⁴⁵. Although we certainly have to keep in mind all the before-mentioned limitations, our findings are in line with outcomes concerning adverse behaviour in earlier work. The DLS system identified a concept that consisted of fields with cribriform-like growth patterns. This cribriform-like growth pattern was found to be part of the concept that was most associated with early recurrent cases.

The results in this study are limited with respect to newer insights into prostate cancer growth: information on cribriform growth and intraductal carcinoma was not readily available for use in the multivariate analysis, although the external validation cohort was graded using the 2005 ISUP consensus⁴⁶, which partly encodes the presence of cribriform growth in the ISUP grade.

Although biochemical recurrence is a common endpoint to study prostate cancer progression, a clinical utility would be mostly found in assessing time-to-metastases or death. However, time-wise, they are typically significantly further separated from the surgical event, making it harder to identify relationships between tissue morphology and these endpoints. Nevertheless, we would like to investigate them in the future.

Conclusions

In summary, we have developed a deep-learning-based visual biomarker for prostate cancer recurrence based on tissue microarray hotspots of prostatectomies. The DLS marker provides a continuous score predicting the speed of biochemical recurrence. We obtained an odds ratio of 3.32 (CI 1.63–6.77; p = 0.001) for a nested case-control study from Johns Hopkins Hospital, matched on Gleason sum and other factors. Additionally, we obtained an HR of 3.02 (CI 1.10–8.29; p = 0.03) for an external validation cohort from the New York Langone hospital, adjusted for ISUP grade, pathological stage, preoperative PSA concentration, and surgical margin status. Thus, this visual biomarker may provide prognostic information in addition to the current morphological ISUP grade.

Data availability

The data that support the findings of this study are available from the Prostate Cancer Biorepository Network²⁶ but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of the Prostate Cancer Biorepository Network²⁶. Source data for Figs. 2 a, b, and 3 and Supplementary Fig. 1 can be accessed as Supplementary Data 1, 2, 3 and 4, respectively.

Code availability

The code to replicate the DLS biomarker can be found at https://zenodo.org/record/6480481⁴⁷.

References

Sung, H. et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. https://doi.org/10.3322/caac.21660 (2021).
Grossman, D. C. et al. Screening for Prostate Cancer: US preventive services task force recommendation statement. JAMA 319, 1901–1913 (2018).
Heijnsdijk, E. A. M. et al. Summary statement on screening for prostate cancer in Europe. Int. J. Cancer 142, 741–746 (2018).
Goonewardene, S. S., Phull, J. S., Bahl, A. & Persad, R. A. Interpretation of PSA levels after radical therapy for prostate cancer. Trends Urol. Men's Health 5, 30–34 (2014).
Amling, C. L. et al. Long-term hazard of progression after radical prostatectomy for clinically localized prostate cancer: continued risk of biochemical failure after 5 years. J. Urol. 164, 101–105 (2000).
Freedland, S. J. et al. Risk of prostate cancer–specific mortality following biochemical recurrence after radical prostatectomy. JAMA 294, 433–439 (2005).
Han, M., Partin, A. W., Pound, C. R., Epstein, J. I. & Walsh, P. C. Long-term biochemical disease-free and cancer-specific survival following anatomic radical retropubic prostatectomy. The 15-year Johns Hopkins experience. Urol. Clin. North Am. 28, 555–565 (2001).
Van den Broeck, T. et al. Prognostic value of biochemical recurrence following treatment with curative intent for prostate cancer: a systematic review. Eur. Urol. 75, 967–987 (2019).
Epstein, J. I. et al. The 2014 International Society of Urological Pathology (ISUP) consensus conference on Gleason grading of prostatic carcinoma. Am. J. Surg. Pathol. 40, 244–252 (2016).
Mottet, N. et al. EAU-EANM-ESTRO-ESUR-SIOG Guidelines on Prostate Cancer—2020 Update. Part 1: screening, diagnosis, and local treatment with curative intent. Eur. Urol. 79, 243–62. (2021).
Article CAS Google Scholar
Epstein, J. I. An update of the Gleason grading system. J. Urol. 183, 433–440 (2010).
Article Google Scholar
Pierorazio, P. M., Walsh, P. C., Partin, A. W. & Epstein, J. I. Prognostic Gleason grade grouping: data based on the modified Gleason scoring system. BJU Int. 111, 753–60. (2013).
Article Google Scholar
Epstein, J. I. et al. A Contemporary Prostate Cancer Grading System: a validated alternative to the Gleason score. Eur. Urol. 69, 428–35. (2016).
Article Google Scholar
van Leenders, G. J. L. H. et al. The 2019 International Society of Urological Pathology (ISUP) consensus conference on Grading of prostatic carcinoma. Am. J. Surg. Pathol. 44, e87–e99 (2020).
Ozkan, T. A. et al. Interobserver variability in Gleason histological grading of prostate cancer. Scand. J. Urol. 50, 420–424 (2016).
Krizhevsky, A., Sutskever, I. & Hinton, G. E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25, 1097–1105 (2012).
Swiderska-Chadaj, Z., Hebeda, K. M., van den Brand, M. & Litjens, G. Artificial intelligence to detect MYC translocation in slides of diffuse large B-cell lymphoma. Virchows Arch. https://doi.org/10.1007/s00428-020-02931-4 (2020).
Coudray, N. et al. Classification and mutation prediction from non–small cell lung cancer histopathology images using deep learning. Nat. Med. 24, 1559–1567 (2018).
Wulczyn, E. et al. Interpretable survival prediction for colorectal cancer using deep learning. NPJ Digit. Med. 4, 71 (2021).
Muhammad, H. et al. EPIC-Survival: End-to-end part inferred clustering for survival analysis, featuring prognostic stratification boosting. arXiv https://doi.org/10.48550/arXiv.2101.11085 (2021).
Leo, P. et al. Computer extracted gland features from H&E predicts prostate cancer recurrence comparably to a genomic companion diagnostic test: a large multi-site study. Npj Precis. Oncol. https://doi.org/10.1038/s41698-021-00174-3 (2021).
Yamamoto, Y. et al. Automated acquisition of explainable knowledge from unannotated histopathology images. Nat. Commun. 10, 5642 (2019).
Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 1, 206–215 (2019).
Ghorbani, A., Wexler, J., Zou, J. & Kim, B. Towards automatic concept-based explanations. arXiv https://doi.org/10.48550/arXiv.1902.03129 (2019).
Toubaji, A. et al. Increased gene copy number of ERG on chromosome 21 but not TMPRSS2-ERG fusion predicts outcome in prostatic adenocarcinomas. Mod. Pathol. 24, 1511–1520 (2011).
PCBN. Prostate Cancer Biorepository Network https://prostatebiorepository.org/ (2021).
Wang, M.-H., Shugart, Y. Y., Cole, S. R. & Platz, E. A. A simulation study of control sampling methods for nested case-control studies of genetic and molecular biomarkers and prostate cancer progression. Cancer Epidemiol. Biomarkers Prev. 18, 706–711 (2009).
Bankhead, P. et al. QuPath: Open source software for digital pathology image analysis. Sci. Rep. 7, 16878 (2017).
Szymanski, P. & Kajdanowicz, T. Scikit-multilearn: a scikit-based Python environment for performing multi-label classification. J. Mach. Learn. Res. 20, 209–230 (2019).
Paszke, A. et al. PyTorch: An imperative style, high-performance deep learning library. arXiv https://doi.org/10.48550/arXiv.1912.01703 (2019).
He, T. et al. Bag of tricks for image classification with convolutional neural networks. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition. 558–567 (IEEE, 2019).
Wightman, R. PyTorch image models. GitHub https://doi.org/10.5281/ZENODO.4414861 (2021).
Zhang, M. R., Lucas, J., Hinton, G. & Ba, J. Lookahead optimizer: k steps forward, 1 step back. arXiv https://doi.org/10.48550/arXiv.1907.08610 (2019).
Liu, L. et al. On the variance of the adaptive learning rate and beyond. arXiv https://doi.org/10.48550/arXiv.1908.03265 (2019).
Tan, M. & Le, Q. Efficientnet: Rethinking model scaling for convolutional neural networks. arXiv https://doi.org/10.48550/arXiv.1905.11946 (2019).
DeVries, T. & Taylor, G. W. Improved regularization of convolutional neural networks with cutout. arXiv https://doi.org/10.48550/arXiv.1708.04552 (2017).
Buslaev, A. et al. Albumentations: fast and flexible image augmentations. Information 11, 125 (2020).
Howard, J. & Gugger, S. Fastai: A layered API for deep learning. Information 11, 108 (2020).
Dluzniewski, P. J. et al. Variation in IL10 and other genes involved in the immune response and in oxidation and prostate cancer recurrence. Cancer Epidemiol. Biomarkers Prev. 21, 1774–1782 (2012).
Davidson-Pilon, C. et al. CamDavidsonPilon/lifelines: 0.25.10. Zenodo https://doi.org/10.5281/ZENODO.4579431 (2021).
Yeh, C.-K. et al. On completeness-aware concept-based explanations in deep neural networks. Adv. Neural Inf. Process. Syst. https://doi.org/10.48550/arXiv.1910.07969 (2020).
Kvamme, H., Borgan, Ø. & Scheel, I. Time-to-event prediction with neural networks and cox regression. J. Mach. Learn. Res. 20, 1–30 (2019).
Hollemans, E. et al. Cribriform architecture in radical prostatectomies predicts oncological outcome in Gleason score 8 prostate cancer patients. Mod. Pathol. 34, 184–193 (2021).
van der Slot, M. A. et al. Inter-observer variability of cribriform architecture and percent Gleason pattern 4 in prostate cancer: relation to clinical outcome. Virchows Arch. 478, 249–256 (2021).
van der Kwast, T. H. et al. ISUP consensus definition of cribriform pattern prostate cancer. Am. J. Surg. Pathol. https://doi.org/10.1097/PAS.0000000000001728 (2021).
Epstein, J. I., Allsbrook, W. C. Jr, Amin, M. B. & Egevad, L. L., ISUP Grading Committee. The 2005 International Society of Urological Pathology (ISUP) consensus conference on Gleason grading of prostatic carcinoma. Am. J. Surg. Pathol. 29, 1228–1242 (2005).
Pinckaers, H. Source Code for “Predicting Biochemical Recurrence of Prostate Cancer with Artificial Intelligence”. https://doi.org/10.5281/zenodo.6480481 (2022).

Acknowledgements

This work was supported by the Dutch Cancer Society under Grant KUN 2015-7970. This work was additionally supported by the Department of Defense Prostate Cancer Research Program, DOD Award No W81XWH-18-2-0013, W81XWH-18-2-0015, W81XWH-18-2-0016, W81XWH-18-2-0017, W81XWH-18-2-0018, W81XWH-18-2-0019 PCRP Prostate Cancer Biorepository Network (PCBN), DAMD17-03-1-0273, and supported by Prostate Cancer NCI-NIH grant (P50 CA58236).

Author information

Authors and Affiliations

Department of Pathology, Radboud Institute for Health Sciences, Radboud University Medical Center, Nijmegen, The Netherlands
Hans Pinckaers, Jolique van Ipenburg, Bram van Ginneken, Jeroen van der Laak & Geert Litjens
Department of Pathology, New York University Langone Medical Center, New York, NY, USA
Jonathan Melamed
Departments of Pathology, Urology and Oncology, The Brady Urological Research Institute and the Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins, Baltimore, MD, USA
Angelo De Marzo
Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
Elizabeth A. Platz
Center for Medical Image Science and Visualization, Linköping University, Linköping, Sweden
Jeroen van der Laak

Contributions

H.P. developed the study design and drafted the manuscript. H.P. and E.A.P. analyzed and interpreted the data. J.v.I., J.M., A.D.M., and E.A.P. assisted in the acquisition of data. G.L., J.v.I., and B.v.G. supervised the development of the study design and assisted with writing the manuscript.

Corresponding author

Correspondence to Hans Pinckaers.

Ethics declarations

Competing interests

B.v.G. receives funding and royalties from MeVis Medical Solutions AG (Bremen, Germany), and reports grants and stock/royalties from Thirona, and grants and royalties from Delft Imaging Systems, all outside the submitted work. J.v.d.L. is a member of the advisory boards of Philips, the Netherlands, and ContextVision, Sweden; and received research funding from Philips, the Netherlands; ContextVision, Sweden; and Sectra, Sweden, all outside the submitted work. G.L. reports research grants from the Dutch Cancer Society, the Netherlands Organization for Scientific Research (NWO), and HealthHolland during the conduct of the study, and grants from Philips Digital Pathology Solutions, and consultancy fees from Novartis and Vital Imaging, outside the submitted work. J.M. is supported by the Department of Defense Prostate Cancer Research Program, DOD Award No W81XWH-18-2-0016, PCRP Prostate Cancer Biorepository Network. A.D.M. is a paid consultant to Cepheid LLC and Merck & Co. A.D.M. has also received research support from Myriad Genetics and Janssen R&D for other studies. All other authors have no competing interests.

Peer review

Peer review information

Communications Medicine thanks Patrick Leo and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Description of Additional Supplementary Files

Supplementary Data 1

Supplementary Data 2

Supplementary Data 3

Supplementary Data 4

Supplementary Information

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Pinckaers, H., van Ipenburg, J., Melamed, J. et al. Predicting biochemical recurrence of prostate cancer with artificial intelligence. Commun Med 2, 64 (2022). https://doi.org/10.1038/s43856-022-00126-3

Received: 01 September 2021
Accepted: 18 May 2022
Published: 08 June 2022
DOI: https://doi.org/10.1038/s43856-022-00126-3
