Integrating inflammatory serum biomarkers into a risk calculator for prostate cancer detection

Improved prostate cancer detection methods would avoid over-diagnosis of clinically indolent disease and inform appropriate treatment decisions. The aim of this study was to investigate whether a panel of inflammatory biomarkers can inform the need for a biopsy to diagnose prostate cancer. Peripheral blood serum obtained from 436 men undergoing transrectal ultrasound (TRUS) guided biopsy was assessed for a panel of 18 inflammatory serum biomarkers in addition to total and free prostate specific antigen (PSA). This panel was integrated into a previously developed Irish clinical risk calculator (IPRC) for the detection of prostate cancer and high-grade prostate cancer (Gleason score ≥ 7). Using logistic regression and multinomial regression methods, two models (Logst-RC and Multi-RC) were developed considering linear and nonlinear effects of the panel in conjunction with clinical and demographic parameters for determination of the two endpoints. Both models significantly improved the predictive ability of the clinical model, measured as area under the curve, for detection of prostate cancer (from 0.656 to 0.731 for Logst-RC and 0.713 for Multi-RC) and high-grade prostate cancer (from 0.716 to 0.785 for Logst-RC and 0.767 for Multi-RC) and demonstrated higher clinical net benefit. This improved discriminatory power and clinical utility may allow for individualised risk stratification, improving clinical decision making.

Prostate cancer (PCa) is the most common non-cutaneous cancer in men in the Western world 1 . A prostate tissue biopsy is a key step in the diagnosis of PCa. However, the decision to refer a patient for a biopsy is challenging, as transrectal ultrasound (TRUS) guided biopsies are associated with significant morbidity 2 . Clinicians usually base this decision on serum prostate specific antigen (PSA), abnormal digital rectal exam (DRE) 3 and, increasingly, multiparametric magnetic resonance imaging (mpMRI), as well as other factors such as family history and previous biopsy results. PSA lacks specificity 1 and has led to over-diagnosis and over-treatment of clinically indolent disease and to a large number of unnecessary biopsies in men 4 . The PROMIS trial of mpMRI showed that there is still a chance of missing clinically significant disease at PI-RADS scores of 1 or 2 5 . Improved detection methods for high-grade significant disease that would reduce unnecessary biopsies are highly sought. Risk stratification of men suspected of having PCa and high-grade significant disease would allow clinicians and patients to make a more informed decision on whether or not to biopsy. Risk calculators that utilise patient clinical data have already been developed for cardiovascular disease 6 and stroke 7 . Several guidelines suggest that a risk-adapted approach, which considers clinical information along with serum PSA, should be used to predict PCa risk 8 . Risk calculators such as the European Randomised Study of Screening for Prostate Cancer Risk Calculator (ERSPC-RC) 9 and the Prostate Cancer Prevention Trial Risk Calculator (PCPT-RC) 10,11 have been developed and assessed in large multi-institutional cohorts. However, the accuracy of these risk score nomograms in detecting high-grade cancer (Gleason score ≥ 7) is ~ 69-79% for both the PCPT-RC and the ERSPC-RC.
The use of multiple serum biomarkers may be the key to improving the identification of significant disease. Commercial tests that utilise serum biomarkers are already on the market, such as the Prostate Health Index (PHI) 12 , which comprises total PSA, free PSA and [-2] proPSA 13 , and the 4Kscore, which assesses total PSA, free PSA, intact PSA and human kallikrein-related peptidase 2 14 . Although promising, these commercial tests are not yet in wide routine use, as there is still uncertainty as to their utility and interpretation in a clinical setting. Previous studies have shown that the inclusion of serum and urine biomarkers improves risk calculators. Our own studies have shown that the addition of the PHI score to an Irish clinical risk calculator improved its accuracy 15 . The urinary biomarkers prostate cancer antigen 3 and the gene fusion product of transmembrane protease serine 2 with the transcription factor v-ets erythroblastosis virus E26 oncogene homolog (TMPRSS2-ERG) also improved the accuracy of the ERSPC-RC 16 . These findings indicate that the use of multiple biomarkers can improve the accuracy of risk calculators. It is well recognised that inflammation plays a causal role in the development of several types of cancer 17 , and there is direct evidence of an inflammatory microenvironment 18 and of higher inflammatory marker levels conferring a greater PCa risk 19 . This environment is associated with impaired differentiation of prostate epithelial cells 20 and aberrant basal to luminal differentiation promoting cancer initiation 21 .
The aim of this study was to investigate the utility of inflammatory serum biomarkers combined with clinical information for the detection of (i) PCa and (ii) high-grade PCa in patients who are suspected of having PCa.

Materials and methods
Patient cohort and sample collection. The study cohort consisted of 436 Caucasian Irish men referred for a TRUS biopsy on the basis of an elevated PSA and/or abnormal DRE between April 2012 and June 2016. Blood samples (9 mL) were collected in a serum separator tube prior to biopsy and processed within 3 h of collection. Samples were centrifuged at 1500×g for 15 min at room temperature. Serum (~ 3 mL) was removed and stored at − 80 °C until further analysis. Patients were classified as either biopsy-negative (having no detectable PCa) or biopsy-positive (having detectable PCa), and biopsy-positive patients were further sub-divided into low-grade (Gleason score 6) and high-grade (Gleason score 7 or above) disease 15 . The clinicopathologic details of the cohort are summarised in Table 1.
Sample collection and processing were ethically approved by the St James Hospital and Mater Misericordiae University Hospital ethics committees. The patient information leaflet and consent form were written in line with best practice, the EU Data Protection Directive and the Data Protection Acts 1988 and 2018, and were approved by the two ethics committees. All patients gave written informed consent to participate in the study. All steps were carried out in accordance with national guidelines and regulations.  23 . Biochip analyses were performed according to the manufacturer's instructions. Briefly, serum samples (diluted where appropriate), reconstituted calibrators and assay-specific quality controls, utilised in duplicate, were incubated on the biochip. Following washing, detector conjugate solution was applied to the biochip and binding was revealed using chemiluminescent detection. Concentrations of all analytes in the samples were calculated by the Evidence Investigator analyser using a nine-point calibration curve. Individual assay runs were deemed to have passed if the measured values for the quality control samples were within the specified range for each of the target values for each analyte, as per kit instructions. Values measured below the lower limit of detection were taken as zero. IL-18 was measured in all serum samples by ELISA (Cat. No. ILE10068, Randox) according to the manufacturer's instructions. Serum samples, calibrators and quality control samples were added to each well in duplicate. The IL-18 standard provided in the kit was reconstituted in deionised water to make up the calibrators and the two quality control samples (312.5 pg mL −1 and 56.25 pg mL −1 ). Total PSA and free PSA were determined in all samples using the Roche COBAS 8000 system according to the manufacturer's instructions at Randox Clinical Laboratory Services (Antrim, UK).
Although PSA values were available for each patient, we repeated the PSA analysis because the patients were recruited from different clinics and their pre-biopsy PSA values had been obtained using various platforms. Serum PSA was therefore reanalysed for all patients on a single platform.
Clinical information: age, family history, DRE and prior negative biopsy were collected as part of the study from the patient's chart or at the time of recruitment.

Statistical analysis. Basic analysis of patient information.
Basic statistical analysis of the study population's characteristics was performed using GraphPad Prism (ver. 5.0). Descriptive statistics were calculated for the dataset, which was divided into those with and without a PCa diagnosis, and into those with high-grade PCa (Gleason score ≥ 7) versus all other patients. The unpaired Student's t-test and the Wilcoxon Rank test were used to test for significant differences in the means and medians of continuous variables, respectively. Pearson's chi-squared test was used to test for significant differences in categorical variables.
Risk calculator model development and performance. Development of the risk calculator for the prediction of PCa and high-grade PCa was performed in R software version 3.4.3 24 . Logistic and multinomial regression methods were used to model the linear and nonlinear effects of serum biomarkers combined with clinical information (age, DRE, family history of PCa, previous negative biopsy). These two modelling strategies are relevant approaches for stratifying patients into those with high-grade PCa, those with low-grade PCa and those without PCa. Stepwise variable selection was applied to integrate potentially relevant biomarkers into the risk calculator. In both methods, the log odds for each patient were modelled as a function of the risk factors and then transformed into a probability, giving a percentage risk for each patient. Internal validation was performed using tenfold cross-validation to guard against overfitting.
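The transformation from log odds to a percentage risk described above can be illustrated with a minimal sketch. The coefficient values, the variable names and the helper function `logistic_risk` are hypothetical and chosen only for illustration; they are not the fitted values reported in Table 3, and the study itself used R rather than Python.

```python
import math

def logistic_risk(intercept, coefficients, features):
    """Convert a linear predictor (the log odds) into a probability."""
    log_odds = intercept + sum(
        coefficients[name] * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical coefficients for illustration only (NOT the values in Table 3).
INTERCEPT = -4.0
COEFFS = {"age": 0.03, "abnormal_dre": 0.8, "log_psa": 0.9}

# One hypothetical patient: 65 years old, abnormal DRE, PSA of 6.5 ng/mL
# entered on the log scale to represent a nonlinear effect.
patient = {"age": 65, "abnormal_dre": 1, "log_psa": math.log(6.5)}

risk = logistic_risk(INTERCEPT, COEFFS, patient)
print(f"Predicted risk: {100 * risk:.1f}%")
```

A linear predictor of zero corresponds to a risk of exactly 50%, and each unit increase in a predictor multiplies the odds by the exponential of its coefficient, which is the odds ratio.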
The final models for diagnosis of PCa and/or high-grade PCa were compared to the Irish prostate risk calculator (IPRC), which was developed previously and outperformed the available risk calculators in the Irish population 25 . Accuracy of the models was determined using the area under the curve (AUC) of the receiver operating characteristic (ROC) curve, obtained by plotting sensitivity against 1 − specificity at each risk threshold. ROC curves were compared using the method described by DeLong et al. 26 . Decision-curve analysis was undertaken to examine the potential net benefit of applying each model over the strategies of performing a biopsy in all patients and performing a biopsy in none 27 . Calibration plots were used to visualise the agreement between the observed incidence of cancer and the predicted risk 28 . The Hosmer-Lemeshow chi-square test was used to assess the goodness of fit of the models, where p < 0.05 indicates poor agreement between the predicted risk and the observed incidence of cancer, i.e. a poorly calibrated model.
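As an illustration of two of the performance measures named above, the following sketch computes the AUC as the probability that a randomly chosen patient with cancer receives a higher predicted risk than a randomly chosen patient without cancer, and the net benefit used in decision-curve analysis (true positives minus false positives weighted by the odds of the threshold, per patient). The toy risks and outcomes are invented for illustration, the function names are assumptions, and this is not the R code used in the study.

```python
def auc(scores, labels):
    """AUC as the fraction of (case, non-case) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

def net_benefit(scores, labels, threshold):
    """Net benefit of biopsying every patient whose predicted risk is >= threshold."""
    n = len(labels)
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)

# Toy data for illustration only (not study data).
toy_risks = [0.10, 0.20, 0.35, 0.40, 0.60, 0.80]
toy_outcomes = [0, 0, 1, 0, 1, 1]

model_auc = auc(toy_risks, toy_outcomes)
nb_model = net_benefit(toy_risks, toy_outcomes, 0.3)
# "Biopsy all" is equivalent to assigning every patient a risk of 1.0.
nb_biopsy_all = net_benefit([1.0] * len(toy_outcomes), toy_outcomes, 0.3)
print(model_auc, nb_model, nb_biopsy_all)
```

The "biopsy none" strategy always has a net benefit of zero, so a model is clinically useful at a given threshold only if its net benefit exceeds both zero and the "biopsy all" value.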

Results
Baseline cohort characteristics. This was a retrospective biomarker study intended to improve the detection of clinically relevant disease. The clinical endpoint of the study was the histopathological findings from the TRUS biopsy. The study cohort consisted of 436 patient biopsies, of which 211 (48%) were diagnosed with PCa with different Gleason scores (Table 1).
In Table 1, the univariate effects of 'DRE' and 'Previous negative biopsy' were statistically significant in detecting PCa, and 'age' for detecting high-grade PCa. The effect of 'PSA' was also significant in both cases. This implies that patients who are older, have higher PSA, have an abnormal DRE or have not had a previous negative biopsy have (on average) a higher chance of PCa and high-grade PCa.

Statistical modelling. Descriptive analysis of all biomarkers assessed is presented in Table 2.
Integrating serum biomarkers with the clinical risk factors using multinomial and logistic models identified 8 biomarkers (TNF-α, VEGF, IL-1α, IL-1β, ICAM-1, E-selectin, P-selectin, L-selectin) that conferred significant additional predictive ability, 4 of which (IL-1α, IL-1β, E-selectin, P-selectin) were identified in both models. Two separate risk calculators were developed to predict PCa and high-grade PCa using a multinomial model (Multi-RC) and a logistic model (Logst-RC). Table 3 presents the models and Table 4 evaluates the model performances. We also built models using the biomarkers alone, both multinomial (Multi-bio) and logistic (Logst-bio); these are presented in Table 4 and showed no significant improvement over the clinical model (IPRC). The odds ratios of the Multi-RC model for detecting low-grade (column A) and high-grade PCa (column B) compared to not detecting PCa are presented in Table 3. We combined the odds ratios of low-grade and high-grade PCa to evaluate the performance of the Multi-RC model for detecting PCa and high-grade PCa; the results are presented in Table 4 and show a significant improvement over the IPRC model. The model variables consist of age, DRE, family history, previous biopsy, PSA, TNF-α, IL-1α, IL-1β, ICAM-1, E-selectin, P-selectin, free PSA (FPSA) and free to total PSA (FTPSA).
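One plausible way to obtain PCa and high-grade PCa risks from a three-outcome multinomial model such as Multi-RC is sketched below. It assumes that "no PCa" is the reference category and that the risk of any PCa is the sum of the low-grade and high-grade probabilities; the text does not spell out the exact combination used, so this is an illustration under those assumptions. The linear predictor values and the function name `multinomial_risks` are hypothetical.

```python
import math

def multinomial_risks(linear_low, linear_high):
    """Three-class probabilities with 'no PCa' as the reference category.

    linear_low and linear_high are the log odds of low-grade and high-grade
    PCa, each relative to no PCa (columns A and B in a multinomial model).
    """
    exp_low = math.exp(linear_low)
    exp_high = math.exp(linear_high)
    denom = 1.0 + exp_low + exp_high
    return {
        "no_pca": 1.0 / denom,
        "low_grade": exp_low / denom,
        "high_grade": exp_high / denom,
    }

# Hypothetical linear predictors for one patient (illustration only).
probs = multinomial_risks(linear_low=-0.4, linear_high=-0.9)

# Assumed combination: any PCa = low-grade + high-grade.
risk_any_pca = probs["low_grade"] + probs["high_grade"]
risk_high_grade = probs["high_grade"]
print(f"PCa risk: {risk_any_pca:.3f}, high-grade risk: {risk_high_grade:.3f}")
```

Because the three probabilities sum to one, a single multinomial fit yields both endpoints at once, whereas the logistic approach fits a separate model for each endpoint.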
The odds ratios of the Logst-RC model for detecting PCa compared to not detecting PCa (column C) and for detecting high-grade PCa compared to low-grade or not detecting PCa (column D) are presented in Table 3. The model performance for detecting PCa and high-grade PCa is presented in Table 4 and showed a significant improvement over the IPRC model. The model variables consist of age, DRE, family history, previous biopsy, PSA, VEGF, IL-1α, IL-1β, E-selectin, P-selectin, L-selectin and FPSA.
To give some insight into the clinical significance of the study, we selected thresholds manually based on the Youden index criterion. Using a threshold of 0.3 for high-grade Logst-RC (0.275 for high-grade Multi-RC) would have saved 71.2% (72.6%) of the biopsies at the cost of delaying the diagnosis of 27.9% (33.8%) of the high-grade cancers. The negative predictive value of test results below this threshold would be 0.833 (0.828). Table 4 presents the discriminative abilities of both risk calculators for the diagnosis of PCa and high-grade PCa using AUC. Multi-RC showed an AUC of 0.7126 and 0.7671, and Logst-RC an AUC of 0.7308 and 0.7847, for diagnosis of PCa and high-grade PCa respectively. Both models significantly improved on the predictive ability of the IPRC model, as demonstrated in the ROC curves in Fig. 1A,D. Figure 1 also shows the decision curve analyses of the clinical utility of both models in detecting PCa (Fig. 1B) or high-grade PCa (Fig. 1E). Compared to the IPRC clinical model alone, there was an improved net benefit over the threshold range 0.35 to 1.0 for detecting PCa and over the threshold range 0.15 to 1.0 for detecting high-grade PCa. The calibration curves (Fig. 1C,F) show good agreement between predicted probabilities and the actual outcome, indicating that all models are well calibrated, which was confirmed by the (non-significant) Hosmer-Lemeshow results.
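The threshold-based quantities reported above (Youden index, proportion of biopsies saved, proportion of cancers whose diagnosis is delayed, and negative predictive value) can be computed as in the following sketch. The toy data and function names are assumptions made for illustration; the sketch assumes both outcome classes are present and that a risk at or above the threshold means referral for biopsy.

```python
def confusion_counts(risks, outcomes, threshold):
    """Count TP, FP, TN, FN when risk >= threshold means 'refer for biopsy'."""
    tp = fp = tn = fn = 0
    for risk, outcome in zip(risks, outcomes):
        if risk >= threshold:
            tp += outcome == 1
            fp += outcome == 0
        else:
            fn += outcome == 1
            tn += outcome == 0
    return tp, fp, tn, fn

def youden_j(risks, outcomes, threshold):
    """Youden index J = sensitivity + specificity - 1."""
    tp, fp, tn, fn = confusion_counts(risks, outcomes, threshold)
    return tp / (tp + fn) + tn / (tn + fp) - 1.0

def threshold_summary(risks, outcomes, threshold):
    """Clinical consequences of withholding biopsy below the threshold."""
    tp, fp, tn, fn = confusion_counts(risks, outcomes, threshold)
    below = tn + fn
    return {
        "biopsies_avoided": below / len(outcomes),
        "cancers_delayed": fn / (tp + fn),
        "npv": tn / below if below else float("nan"),
    }

# Toy data for illustration only (not study data).
toy_risks = [0.05, 0.10, 0.20, 0.25, 0.35, 0.40, 0.60, 0.80]
toy_outcomes = [0, 0, 0, 1, 0, 1, 1, 1]

# Candidate thresholds are the observed risks; pick one maximising J.
best_threshold = max(
    sorted(set(toy_risks)),
    key=lambda t: youden_j(toy_risks, toy_outcomes, t),
)
summary = threshold_summary(toy_risks, toy_outcomes, 0.3)
print(best_threshold, summary)
```

Maximising the Youden index weights sensitivity and specificity equally, which is why a manually chosen, clinically acceptable threshold may differ from the purely statistical optimum.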

Model performance.
Integrating the panel of serum biomarkers with the clinical risk factors outperforms the previously developed Irish risk calculator 25 . Logst-RC showed a slightly greater improvement when internally validated; however, further validation in an independent cohort will be required to confirm the improvements, identify the most appropriate model and select the best clinically acceptable threshold for use in practice.
Table 2. Descriptive analysis of serum biomarkers (median and interquartile range) grouped by biopsy and grading outcomes. The p-value of the Wilcoxon Rank test indicates whether the observed difference in medians for each biomarker is significant.

Discussion
In this study, we have used a retrospective approach to show that integrating inflammatory serum biomarkers with the clinical risk factors significantly improves on the discriminatory power and clinical utility of the clinical risk factors alone for PCa and high-grade PCa (Gleason score ≥ 7). This suggests that the Multi-RC or Logst-RC models would improve the detection rate and/or reduce unnecessary biopsies compared to the IPRC risk calculator based on clinical features alone. These models demonstrated consistently higher net benefits over different preferences of wanting or avoiding a biopsy 29 and, following further validation and threshold selection, could have clinical utility. Chronic inflammation is associated with the development of many cancers, including PCa, and possibly plays a role in its formation and development 30 . In the current study we identified a number of inflammatory mediators that improved the prediction of PCa and high-grade PCa when integrated with the current clinical features compared to clinical features alone. These included TNF-α, VEGF, IL-1α, IL-1β, ICAM-1, E-selectin, P-selectin and L-selectin. There is evidence in the literature that some of these mediators are associated with tumour development and progression, including in PCa. VEGF has been shown to be overexpressed in patients with colorectal cancer 31 and PCa. Fryczkowski et al. demonstrated that VEGF concentrations were significantly higher in the PCa groups than in the BPH patient group; however, on multiple logistic regression analysis, VEGF was not an independent predictor of PCa and did not add to the clinical features alone 32 .
Table 3. Summary of Multi-RC and Logst-RC models using odds ratio, standard error and p-value for each risk factor in the model. ¥ The non-linear effect of the predictor using a log transformation.
Soluble adhesion molecules ICAM-1 and the selectins have been shown to be increased in breast 33 and colorectal cancer 31 , but there is no evidence in PCa to date. TNF-α levels have also been correlated with disease stage in breast cancer 34 , but there is no evidence that serum levels of TNF-α, IL-1α or IL-1β are associated with high-grade or lethal PCa at the time of diagnosis of localised disease 35 . The strength of our study is that we evaluated a number of inflammatory serum mediators and built a model selecting the biomarkers that gave the best prediction of PCa and high-grade PCa.

The multinomial regression modelling approach identified a single combination of biomarkers for the risk assessment of PCa and high-grade PCa. In contrast, two different sets were selected to estimate the risk of PCa and high-grade PCa in the logistic regression approach. The use of logistic regression helps to assess the partial effect of the biomarkers on detecting either PCa or high-grade PCa, while the use of multinomial regression reduces the standard errors 36 . Both methods have been employed in previous studies: the European risk calculator (ERSPC-RC 9 ) used the logistic regression approach, and two American risk calculators (PCPT-RC 10 and PBCG-RC 11 ) were developed using multinomial regression.
The use of a logarithm transformation for some biomarkers in the model (e.g. IL-1β) represents a nonlinear effect of the biomarker on the risk, which indicates that a small change in the biomarker is critical, particularly at low values. In contrast, the linear effect of some biomarkers in the model (e.g. E-selectin) means that a given change in the biomarker has the same importance at any value. The use of both linear and logarithmic effects of a biomarker in the model (e.g. IL-1α) indicates that, although any change in the biomarker is important, small changes in its value are more critical. A limitation of the study is the absence of PSA density as a variable, which is part of the ERSPC-RC 8 . We did not have access to prostate volume data at the time of patient recruitment, as the Irish health care setting did not facilitate the collection of prostate volume until the TRUS biopsy was carried out.
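The difference between a linear and a log-transformed biomarker effect discussed above can be made concrete with a short sketch. Under a linear term, a fixed absolute change multiplies the odds by the same factor wherever it occurs; under a log term, the multiplier depends on the ratio of the new to the old value, so the same absolute change matters more at low concentrations. The coefficients and concentrations below are hypothetical and the function names are assumptions.

```python
import math

def odds_multiplier_linear(beta, old, new):
    """Factor by which the odds change for a linear term beta * x."""
    return math.exp(beta * (new - old))

def odds_multiplier_log(beta, old, new):
    """Factor by which the odds change for a log term beta * ln(x).

    exp(beta * (ln(new) - ln(old))) simplifies to (new / old) ** beta.
    """
    return (new / old) ** beta

# Hypothetical coefficients (illustration only, not the values in Table 3).
BETA_LINEAR = 0.1
BETA_LOG = 0.5

# The same +1 unit change at a low and at a high starting concentration.
linear_low = odds_multiplier_linear(BETA_LINEAR, 1.0, 2.0)
linear_high = odds_multiplier_linear(BETA_LINEAR, 10.0, 11.0)
log_low = odds_multiplier_log(BETA_LOG, 1.0, 2.0)
log_high = odds_multiplier_log(BETA_LOG, 10.0, 11.0)

print(f"linear: {linear_low:.3f} vs {linear_high:.3f}")
print(f"log:    {log_low:.3f} vs {log_high:.3f}")
```

A model containing both a linear and a log term for the same biomarker combines the two behaviours: every change contributes, but changes at low values contribute proportionally more.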

Conclusion
Our study has demonstrated that both models are well calibrated and use variables that are readily available from the patient (age, family history, DRE and previous biopsy) or measured in their blood sample, making them appropriate for individualised risk assessment. Both models show a statistically significant improvement over the IPRC, supporting the addition of the serum biomarkers and their clinical use. Selecting the best model requires additional cohorts for independent validation, which would also allow the selection of clinically acceptable thresholds that maximise discrimination and clinical benefit.

Data availability
Data are available to other researchers on written request to the corresponding author.