Clinical characteristics and overall survival nomogram of second primary malignancies after prostate cancer, a SEER population-based study

Prostate cancer (PCa) is the most prevalent cancer among males, and the survival of PCa patients has been significantly extended. However, the probability of developing second primary malignancies (SPMs) has also increased. We therefore downloaded SPM samples from the SEER database and retrospectively analyzed the general characteristics of 34,891 PCa patients diagnosed between 2000 and 2016. After excluding cases with unknown clinical information, 2203 patients were used to construct and validate an overall survival (OS) nomogram for SPM patients after PCa. We found that approximately 3.69% of PCa patients were subsequently diagnosed with SPMs. The three most prevalent sites of SPM were the respiratory and intrathoracic organs, the skin, and the hematopoietic system. The top three histological types of SPM were squamous cell carcinoma, adenoma and adenocarcinoma, and nevi and melanoma. Through univariate and multivariate Cox regression analysis, we found that the site of SPM, age, TNM stage, SPM surgery history, and PCa stage were associated with the OS of SPM. Using these factors, we constructed a nomogram to predict the OS of SPM. The C-index was 0.824 (95% CI, 0.806–0.842) in the training set and 0.862 (95% CI, 0.840–0.884) in the validation set. Furthermore, the area under the receiver operating characteristic (ROC) curve (AUC) showed that our model performed well in assessing the 3-year (0.861 and 0.887) and 5-year (0.837 and 0.842) OS of SPMs in the training and validation sets, respectively. In summary, we investigated the general characteristics of SPMs and constructed a nomogram to predict the prognosis of SPM following PCa.

www.nature.com/scientificreports/ (e.g., smoking) 10, genetic factors 11, and treatment-related exposures (e.g., radiotherapy (RT)) 12,13. Although the mechanisms underlying SPMs remain unclear, survival is shortened once a patient is diagnosed with an SPM, and a previous study showed that adolescents and young adults with SPMs have worse survival than those with only a primary cancer 14.
Nomograms built from regression analysis have been widely used to predict the prognosis of diverse cancers 15 because they are simple, intuitive, and practical. They have been applied to bladder cancer 16, cervical cancer 17, primary gliosarcoma 18, and many other diseases. Their predictive performance has been demonstrated repeatedly, and they have even been proposed as a new standard.
Understanding the incidence and prognosis of SPMs after PCa is of great significance for treatment providers and PCa survivors. Therefore, in this study, we aimed to investigate the general characteristics of SPMs and to construct a nomogram to predict the 3-year and 5-year survival of SPMs following PCa.

Materials and methods
Data source and study design. We extracted SPM cases from 18 population-based registries (2000-2016) in the Surveillance, Epidemiology, and End Results (SEER) database using SEER*Stat version 8.3.6. Clinicopathological data of interest were extracted, including age, race, TNM stage, site of SPM, histological type of SPM and PCa, surgery history of SPM and PCa, marital status, follow-up time, and latency time between PCa and SPM. To identify SPMs accurately, we adopted the Warren criterion. SPMs were defined as cancers histologically different from the initial primary cancer (IPC), with a latency period of not less than 6 months to exclude errors caused by metastasis and recurrence 19.
First, we downloaded a total of 68,954 PCa cases from the SEER database. The inclusion criteria were as follows: (1) age at diagnosis greater than 18 years; (2) a record of malignant behavior; (3) complete survival data and follow-up information. The exclusion criteria were as follows: (1) latency period between IPC and SPM shorter than 6 months; (2) only autopsy or death certificate records. Then, after excluding 33,702 patients with the same histology as PCa, there remained 34,891 patients diagnosed with SPM. Patients with unknown information were also excluded: no TNM stage (n = 24,452), unknown history of surgery (n = 184), unknown marital status (n = 121), unknown lymph node removed (LNR) (n = 10), and no stage of PCa (n = 7921). Ultimately, we identified 2203 qualified cases, which were then divided into the training set (n = 1543) and the validation set (n = 660). The training set was used to identify prognostic factors and to build a nomogram based on these factors. The training set and validation set were used for internal and external validation, respectively.
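The cohort flow above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' code: it assumes the five exclusion counts are applied one after another without overlap, and the shuffling routine and seed are illustrative assumptions (the text gives only the resulting set sizes).

```python
import random

# Counts reported in the text above.
SPM_PATIENTS = 34_891
EXCLUSIONS = {
    "no TNM stage": 24_452,
    "unknown surgery history": 184,
    "unknown marital status": 121,
    "unknown LNR": 10,
    "no PCa stage": 7_921,
}


def remaining_after_exclusions(start, exclusions):
    """Subtract every exclusion count from the starting cohort size."""
    return start - sum(exclusions.values())


def random_split(ids, n_train, seed=0):
    """Shuffle ids reproducibly and cut them into two disjoint sets.

    The seed and this particular shuffle are assumptions for illustration.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


eligible = remaining_after_exclusions(SPM_PATIENTS, EXCLUSIONS)
train_ids, valid_ids = random_split(range(eligible), n_train=1543)
print(eligible, len(train_ids), len(valid_ids))  # 2203 1543 660
```

The 1543/660 division corresponds to roughly a 70/30 split of the 2203 eligible cases.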

Statistical analysis.
To explore the association between clinicopathological variables and the OS of SPM, we performed univariate and multivariate Cox proportional hazards regression analysis in the training set to identify significant factors. Using these screened factors, we calculated the risk score of each patient according to the following formula: $\text{risk score} = \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n$ ($\beta$, regression coefficient; $X$, prognostic factor) 18. Patients were divided into a high-risk group and a low-risk group according to the median risk score. Next, we chose factors with p value < 0.001 to develop a nomogram to predict the 3- and 5-year survival rates of SPM patients. To evaluate the prognostic ability of our model, we calculated the concordance index (C-index). The receiver operating characteristic (ROC) curve was also plotted and the area under the curve (AUC) was assessed. Calibration curves were drawn to estimate whether the actual outcomes were consistent with the predicted probabilities. Each cohort was divided into three groups according to sample size. Bootstrapping with 1,000 resamples was used to evaluate discrimination and calibration. Kaplan-Meier curves were plotted and the log-rank test was applied to compare OS across different prognostic factors.
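The risk score and median split described above can be sketched in a few lines. This is an illustrative example only: the coefficients and patients below are made-up values, not estimates from the study, and assigning patients whose score equals the median to the low-risk group is an assumption.

```python
from statistics import median


def risk_score(betas, factors):
    """Linear predictor: beta_1*X_1 + beta_2*X_2 + ... + beta_n*X_n."""
    if len(betas) != len(factors):
        raise ValueError("one coefficient is needed per prognostic factor")
    return sum(b * x for b, x in zip(betas, factors))


def split_by_median(scores):
    """Label each score 'high' if it exceeds the median, else 'low'."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


# Hypothetical coefficients for (age in years, advanced-stage indicator).
betas = [0.04, 0.9]
# Hypothetical patients, encoded in the same order as the coefficients.
patients = [(60, 0), (70, 1), (80, 1), (65, 0)]

scores = [risk_score(betas, p) for p in patients]
groups = split_by_median(scores)
print(groups)  # ['low', 'high', 'high', 'low']
```

In practice the coefficients would come from the fitted multivariate Cox model, and categorical factors would be encoded as indicator variables.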
Ethical statement. The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. Institutional review board approval was waived for this study because the SEER database is a public, anonymized database. The author Y Liu obtained access to the SEER database (accession number: 16704-Nov2018).

Results
Characteristics of SPM. We downloaded 68,954 PCa patients diagnosed during 2000-2016 from the SEER database. To exclude bias caused by PCa recurrence and metastasis, we ruled out cases with the same histological type as PCa. Cases with a latency period of less than 6 months between PCa and SPM were also excluded. Finally, a total of 34,891 patients diagnosed with SPMs were identified. According to the SEER database, 945,196 men were diagnosed with PCa during 2000-2016, so approximately 3.69% of PCa patients were subsequently diagnosed with SPMs in this period. The median interval between the diagnosis of PCa and SPM was 57.0 months, and the median age at SPM diagnosis was 74.0 years. The sites and histological types of SPM that exceeded 1% are shown in Fig. 2A,B. The three most prevalent sites of SPM were the respiratory and intrathoracic organs, the skin, and the hematopoietic system (Table 1). In addition, bronchial and lung cancers accounted for the majority of cancers in the respiratory and intrathoracic organs (Table S1). As shown in Table 2

Baseline characteristics of patients. A total of 34,891 cases diagnosed with SPMs were identified from the original data downloaded from the SEER database. After excluding patients with unknown clinical information, 2203 cases were ultimately enrolled for further analysis. These cases were randomly divided into the training set (n = 1543) and the validation set (n = 660). There were no significant differences (p > 0.05) between the two sets in the site of SPM, SPM histology, age, race, T stage, M stage, LNR, PCa surgery, PCa stage, or marital status (Table 3). The training set was used to construct the nomogram and validate the model internally, while the validation set was used for external validation. In the entire cohort, approximately 32.73% (n = 721) of SPM patients died after a median follow-up of 56 months.
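The proportions quoted in this section follow directly from the reported counts. A small sketch of that arithmetic, where the helper name `percent` is an illustrative choice:

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


spm_rate = percent(34_891, 945_196)  # SPM patients among men diagnosed with PCa
death_rate = percent(721, 2_203)     # deaths in the analysed cohort
train_share = percent(1_543, 2_203)  # share of cases in the training set

print(spm_rate, death_rate, train_share)  # 3.69 32.73 70.04
```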

Prognostic factors for the overall survival of SPM.
To identify factors associated with the OS of SPM, we applied univariate and multivariate Cox regression analysis. The results are listed in Table 4. Univariate Cox regression analysis demonstrated that age (p < 0.001), race (p < 0.001), TNM stage (p < 0.001), LNR (p < 0.001), histology of SPM (p < 0.001), site of SPM (p < 0.001), marital status (p < 0.001), SPM surgical history (p < 0.001), PCa surgical history (p < 0.001), and PCa stage (p < 0.001) were associated with the OS of SPM. Next, using the factors identified by univariate Cox regression analysis, multivariate Cox regression analysis revealed that age (p < 0.001), TNM stage (p < 0.001), histology of SPM (p = 0.002), site of SPM (p < 0.001), marital status (p = 0.038), SPM surgical history (p < 0.001), and PCa stage (p < 0.001) were independent prognostic factors for the OS of SPM.
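For reference, the Cox proportional hazards model behind these analyses can be written in its standard textbook form. The baseline hazard $h_0(t)$ and the hazard ratio $\mathrm{HR}_j$ are notation introduced here for illustration; $\beta$ and $X$ are the regression coefficients and prognostic factors that enter the risk score.

```latex
h(t \mid X) = h_0(t)\,\exp\left(\beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n\right),
\qquad
\mathrm{HR}_j = \exp(\beta_j)
```

Under this form, the risk score is the exponent of the model, so a one-unit increase in factor $X_j$ multiplies the hazard by $\exp(\beta_j)$ when the other factors are held fixed.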

Kaplan-Meier analysis for prognostic factors.
We first calculated the risk score of each case according to the following formula: $\text{risk score} = \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n$ ($\beta$, regression coefficient; $X$, prognostic factor). Then, we divided the samples into a high-risk group and a low-risk group based on the median risk score. Kaplan-Meier (K-M) analysis showed significant differences in prognosis between these two groups in the training set and validation set (Fig. 3A,B), and high-risk patients tended to have worse survival than low-risk patients (p < 0.001). Significant differences were also observed for site of SPM (p < 0.001), age, TNM stage (p < 0.001), SPM surgery history (p < 0.001), and PCa stage (p < 0.001). Patients with higher age, TNM stage, or PCa stage had worse survival (Fig. 3C,G). Patients who received surgery for SPM tended to have longer survival (Fig. 3H). SPM of the skin had significantly better survival than other kinds of SPM (Fig. 3I).
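The Kaplan-Meier curves compared above are built from the product-limit estimator. The following is a minimal sketch of that estimator using the standard textbook definition; the follow-up times and event flags are made-up values, and a real analysis would normally rely on a survival analysis package.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  -- follow-up time of each patient
    events -- 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up in months; 0 marks a censored patient.
curve = kaplan_meier([5, 8, 12, 12, 20], [1, 0, 1, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

Computing one such curve per risk group and comparing the curves with a log-rank test mirrors the comparison reported in this section.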

Construction and validation of OS nomogram. According to the results of univariate and multivariate Cox analysis, we chose the factors with p value < 0.001 to establish a nomogram to predict the 3-year and 5-year survival rates (Fig. 4). Seven clinical indicators, including site of SPM, age, TNM stage, SPM surgical history, and PCa stage, were included in our nomogram. To evaluate the discriminative ability of the nomogram, we calculated the C-index in the training set (0.824, 95% CI: 0.806-0.842) and validation set (0.862, 95% CI: 0.840-0.884). The ROC curve was plotted and the AUC was analyzed for both the training set and validation set (Fig. 5A-D). In the training set, the AUCs for 3-year and 5-year OS prediction were 0.861 and 0.837, respectively. In the validation set, the AUCs for 3-year and 5-year OS prediction were 0.887 and 0.842, respectively. Both the C-index and the ROC analysis indicated that the nomogram performed well in predicting the OS of SPM. To evaluate the accuracy of our model, we also used calibration plots to judge the consistency of our predictions with actual outcomes (Fig. 6A-D). The plots showed acceptable agreement in the training cohort and excellent agreement in the SEER validation cohort between the nomogram predictions and actual observations for 3-year and 5-year OS.
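To make the C-index concrete, the sketch below computes Harrell's concordance index by direct pair counting. It is a simplified illustration with made-up data, not the procedure used in the study: pairs with tied survival times are skipped, tied risk scores count as half-concordant, and no confidence interval is estimated.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the patient who died earlier
    also has the higher predicted risk score."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot be the earlier death
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: higher risk score should mean earlier death.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.7, 0.3, 0.1]
print(harrell_c_index(times, events, risks))  # 1.0
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported values of 0.824 and 0.862 should be read.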

Discussions
PCa is the most common cancer among males, and the survival time of PCa patients has been significantly extended owing to early detection and effective therapeutic strategies. Prostate-specific antigen (PSA) screening is helpful for early diagnosis and can significantly reduce the mortality rate of PCa 20. For decades, androgen deprivation therapy (ADT) through surgical or medical castration has been part of the standard treatment for PCa. Newly launched second-generation androgen receptor (AR) inhibitors for castration-resistant prostate cancer (CRPC), such as enzalutamide, also significantly improve the prognosis of PCa 21. Recently, the use of ADT in combination with second-generation AR-targeting agents or chemotherapy has significantly prolonged the survival of metastatic hormone-sensitive prostate cancer (mHSPC) patients. The addition of abiraterone acetate to ADT has shown a survival advantage compared with ADT alone 22,23. Two clinical trials have shown that, compared with ADT alone, ADT plus docetaxel can improve the survival rate for adequately fit men 24,25. Together, these advanced treatments have contributed to the prolonged survival of PCa patients. Previous studies have shown that patients with in situ melanoma have an increased risk of developing PCa 26 and that young men among colorectal cancer survivors have an excessively high risk of developing SPMs 27. This evidence indicates that cancer patients are at risk of developing SPMs. Studies in South Korea and Taiwan show that, compared with the general population, PCa patients have a lower risk of SPMs, but once they develop SPMs, their survival time is greatly shortened 28,29. To gain better insight into SPMs after PCa, we investigated the characteristics of SPM following PCa and constructed a model based on clinicopathologic characteristics to predict the prognosis of SPM following PCa.
As a result of the extended survival period of PCa patients, recurrence, metastasis, and SPMs are expected to increase. In clinical practice, SPMs or multiple primary malignancies are frequently difficult to distinguish from metastases of the initial malignancy, leading to misdiagnosis and improper treatment. In contrast to multiple primary malignancies, SPMs can affect the same organ but are anatomically distinct from the primary tumor, and represent neither a metastatic nor a recurrent tumor from the initial malignancy. Through a strict screening process, we distinguished SPMs from multiple primary malignancies, metastasis, and recurrence. After accurate identification, 3.69% of PCa patients were diagnosed with SPMs, which was much lower than the previously reported 11.3% 8. Compared with previous studies, our investigation covered a much larger population of 945,196 PCa patients. Our study showed that the three most prevalent sites of SPM were the respiratory and intrathoracic organs (mainly bronchus and lung), the skin, and the hematopoietic system. Similar to our results, previous studies in Sweden reported that the most common SPMs were colorectal cancer, skin cancer, bladder cancer, lung cancer, melanoma, and non-Hodgkin lymphoma 8. Another study also showed that the most common SPMs after PCa were lung and colon cancer 30. In addition to these three most prevalent sites, a significant increase of SPM in the urinary tract was also observed in our study. It has been reported that there is an increased risk of developing SPMs in the bladder 13,29. Shared etiology within the urinary system, such as common carcinogenic pathways, chronic inflammatory stimulation, and genetic mutations 31, may explain this trend. The top three histological types of SPM were squamous cell carcinoma, adenoma and adenocarcinoma, and nevi and melanoma, consistent with the histology of the most prevalent sites. These results indicate that these susceptible sites should be carefully monitored.
Previous studies have established nomograms to predict the probability of developing SPMs in lung cancer survivors 32,33 and in esophageal adenocarcinoma and squamous cell carcinoma patients 34. However, as far as we know, there is no literature on prognosis across the spectrum of PCa patients subsequently diagnosed with SPMs. To explore the outcome of SPM following PCa, we identified 7 parameters, including the site of SPM, age, TNM stage, SPM surgical history, and PCa stage, to predict the 3-year and 5-year OS of SPM patients. According to our assessment, our model performed well in predicting the outcomes of SPM patients. Of all the factors examined, surgical history of PCa and histological type of PCa showed only a weak correlation with the outcome of SPM, which might suggest that SPM, rather than PCa, mainly accounted for deaths among patients with SPM following PCa. Other researchers have also found that most deaths were caused by SPM rather than PCa 11,35. PCa stage was included in our nomogram; together with TNM stage, it has also been used to construct prediction models for metastatic PCa 36,37. This suggests that PCa still affects the OS of SPM patients. However, we did not investigate the relationship between RT and SPM. According to earlier reports, PCa patients receiving RT have a higher risk of developing SPMs 38-40. A meta-analysis also revealed that PCa patients receiving RT had an increased risk of developing SPM of the bladder, colon, and rectum 41. Other studies have shown no difference in the incidence of SPM between patients receiving RT and those receiving other therapies 13,42. The role of RT in the initiation of SPM needs further exploration, and the effect of RT on the survival of PCa patients remains unclear. Genetic factors are another important intrinsic contributor to the tumorigenesis of SPM, but the genotype-phenotype correlation of SPMs is still unclear.
A significantly increased risk of SPM has been observed in survivors of hereditary retinoblastoma with high RB1 mutations 43. P53, whose polymorphisms are associated with an increased risk of SPM, is another extensively studied gene 44-46. In contrast, Anette E et al. considered the correlation between P53 mutation and the incidence of SPM doubtful 47. Only limited evidence about the SPM genotype has been explored, and more studies are needed to explain the relationship between SPMs and gene mutations. Some other factors, such as smoking and obesity, were not investigated because of the nature of the SEER database. We are still exploring the association between cancer-specific survival and clinicopathologic characteristics, but the cause of cancer death remains unclear for many patients. Despite these limitations, our study still has implications for PCa survivors.
In conclusion, we described the general characteristics of SPM following PCa and identified 7 clinical indicators to build a nomogram to predict the survival of SPM. The model performed well in assessing the prognosis of SPM, but its actual performance should be evaluated in larger-scale studies. In addition, more studies should focus on the initiation, development, and prognosis of SPM.