Development and validation of nomograms to predict survival of primary adrenal lymphoma: a population-based retrospective study

While it is known that accurate evaluation of overall survival (OS) and disease-specific survival (DSS) for patients with primary adrenal lymphoma (PAL) can affect their prognosis, no stable and effective prediction model exists. This study aimed to develop prediction models to evaluate survival. This study enrolled 5448 patients with adrenal masses from the SEER Program. The influencing factors were selected using the least absolute shrinkage and selection operator regression model (LASSO) and Fine and Gray model (FGM). In addition, nomograms were constructed. Receiver operating characteristic curves and bootstrap self-sampling methods were used to verify the discrimination and consistency of the nomograms. The independent influencing factors for PAL survival were selected by LASSO and FGM, and three models were built: the OS, DSS, and FGS (DSS analysis by FGM) model. The areas under the curve and decision curve analyses indicated that the models were valid. This study developed survival prediction models to predict OS and DSS of patients with PAL. The FGS model was more accurate than the DSS model in the short term. Above all, these models should offer benefits to patients with PAL in terms of the treatment modality choice and survival evaluation.


Statistical analysis.
The data were processed using R 4.1.4 (R Foundation for Statistical Computing, Vienna, Austria) and the survival package 14. The choice between the Shapiro-Wilk normality test and the Kolmogorov-Smirnov test was based on whether the sample size was less than 3000. As none of the continuous variables conformed to a normal distribution, they were represented by the median (interquartile range). Categorical variables were expressed as frequencies and percentages (%). Survival between PAL and other adrenal tumors was compared using the log-rank test and Kaplan-Meier (K-M) curves before and after propensity score matching (PSM). Patients with PAL were randomly divided into training and validation sets in a ratio of 7:3. Using the survival and glmnet packages, the univariate CoxPH model and LASSO were used to identify the independent factors influencing OS and DSS, respectively. The impact of each variable was analyzed using the log-rank test and K-M curves. The nomogram model was then established according to the multivariate CoxPH model, and the receiver operating characteristic (ROC) curve and the area under the curve (AUC) were used to verify the prediction efficiency of the model. The bootstrap resampling method and calibration curve were used to evaluate the consistency of the model, and the net benefit to patients was assessed using clinical decision curve analysis (DCA). Statistical significance was set at P < 0.05 and marked with * in the tables.
Ethics declarations. This study did not involve human or animal subjects.
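The 7:3 random split described above can be illustrated with a minimal sketch. It is not the authors' R code: the function name, the seed, and the use of integer patient identifiers are assumptions made for illustration, and the cohort size of 297 is taken from the WHO classification counts reported for PAL (8 + 280 + 9).

```python
import random


def split_train_validation(patient_ids, train_fraction=0.7, seed=2023):
    """Randomly partition patient identifiers into training and validation sets.

    The seed is a hypothetical choice that only makes the split reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# 297 PAL cases, identified here by the hypothetical ids 0..296.
train_ids, validation_ids = split_train_validation(range(297))
print(len(train_ids), len(validation_ids))  # 208 89
```

Fixing the seed keeps the two sets stable across runs, which matters when the training set is used for variable selection and the validation set only for checking the fitted model.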
Patients with PAL were older at diagnosis than those with other masses, had a higher incidence of male sex, were more frequently white, had a higher proportion of bilateral masses, and had lower SEER stage grades. Additionally, fewer patients underwent surgery and radiotherapy, whereas more patients underwent chemotherapy and systemic therapy.
In a comparison of the OS and DSS of patients with PAL and those with other adrenal masses (Supplementary Fig. S2), the OS of PAL was lower than that of the others, while the difference in DSS between the two groups was not statistically significant. After PSM, both OS and DSS of PAL were higher than those of others, which might be because patients with PAL were older at diagnosis and had a higher proportion of bilateral masses than others, and PSM eliminated these effects.
Baseline data for PAL (Table 2) showed that patients who died were generally older, male, married, in poverty, living in rural areas, had bilateral masses, underwent treatment less frequently (including surgery, chemotherapy, radiotherapy, and systemic therapy), and had higher grades of SEER and AAC stages. According to the World Health Organization (WHO) classification criteria for lymphoma 15, eight (2.7%) cases were Hodgkin's lymphoma (HL), 280 (94.3%) were non-Hodgkin's lymphoma (NHL), and nine (3%) were not otherwise specified (NOS). B-cell lymphoma (BCL) was predominant in NHL, of which the most common type was diffuse large B-cell lymphoma (DLBCL), with 223 cases (75.1%) (Table 3).

LASSO.
We converted the multi-categorical variables into binary-categorical variables by using dummy variables. Table 4 presents the final variable assignments. A tenfold cross-validation was performed for all variables using LASSO CoxPH regression over the log(lambda) value of the tuning parameter. The partial likelihood deviance of the model changed with a change in lambda. The corresponding number of variables filtered by the OS model is shown in Supplementary Fig. S7a, whereas that of the DSS is shown in Supplementary Fig. S7b. We constructed an influencing factor classifier using the LASSO regression model (Supplementary Fig. S7c,d). After LASSO, 11 factors influencing OS were selected, including age, sex, side, surgery, systemic, income, pathology, SEER stage = regional, SEER stage = distant, AAC stage = stage II, and AAC stage = stage IV. Six influencing factors of DSS were selected: age, side, surgery, income, SEER stage (distant), and AAC stage (stage IV) (Table 5).
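The dummy-variable conversion can be sketched as follows. This is a minimal illustration rather than the coding used in Table 4: the function name is hypothetical, and treating "localized" as the reference level of SEER stage is an assumption, chosen because only the regional and distant indicators appear among the selected factors.

```python
def dummy_encode(variable, value, levels, reference):
    """Return one 0/1 indicator per non-reference level of a categorical variable."""
    if value not in levels:
        raise ValueError(f"unknown level for {variable}: {value!r}")
    return {
        f"{variable} = {level}": int(value == level)
        for level in levels
        if level != reference
    }


# SEER stage has three levels; "localized" as reference is an assumption.
SEER_LEVELS = ["localized", "regional", "distant"]

print(dummy_encode("SEER stage", "distant", SEER_LEVELS, reference="localized"))
# {'SEER stage = regional': 0, 'SEER stage = distant': 1}
```

Encoding each level as its own indicator is what allows LASSO to retain some levels of a variable (for example, AAC stage = stage II and stage IV) while shrinking others to zero.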
Nomogram. The models (OS, DSS, and FGS models) built by LASSO and FGM are represented by nomograms (Figs. 1a, 2a, and 3a). We utilized the "regplot" package in R to generate nomograms, with the "center" parameter set to "T". The nomograms were created based on the results of the LASSO and multivariable CoxPH. The principle underlying nomograms involves assigning 100 points to the variable with the largest product of coefficient and variable span (Age) and scaling the other variables by the same ratio. This approach enables simplified calculations while maintaining the prediction accuracy. For example, in Fig. 1a, the maximum and minimum values of Age (91 and 36) were mapped to 100 and 0, respectively; HL and NHL in Pathology corresponded to 61 and 65, respectively; the presence or absence of Systemic therapy corresponded to 51 and [...] NHL without surgery or systemic therapy. The SEER stage was distant, and the AAC stage was stage IV. Thus, the scores corresponding to each index of the OS analysis were 65, 61, 61, 61, 61, 82, 75, and 71; the total score was 537; and the corresponding 1-, 3-, and 5-year OS rates were 27.2%, 8.84%, and 4.16%, respectively. The scores corresponding to each analysis index of DSS were 61, 83, 61, 82, and 71; the total score was 358; and the corresponding 1-, 3-, and 5-year DSS were 16.00%, 4.55%, and 3.18%, respectively. The scores corresponding to each FGS index were 93, 61, 61, 99, and 71, with a total score of 385, and the corresponding 1-, 3-, and 5-year FGS were 28.90%, 9.24%, and 6.45%, respectively.
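The scoring arithmetic in this example can be checked with a short sketch. It covers only what the text states: the linear mapping of Age from the range 36 to 91 onto 0 to 100, the rescaling of the largest contribution to 100, and the summation of per-variable points. It does not reproduce the regplot package, and the function names and the contribution values passed to `scale_to_points` are hypothetical.

```python
def age_points(age, age_min=36, age_max=91):
    """Linearly map age onto the 0-100 reference axis (36 -> 0, 91 -> 100)."""
    return 100 * (age - age_min) / (age_max - age_min)


def scale_to_points(contributions):
    """Rescale each variable's largest contribution (|coefficient| x span)
    so that the biggest one maps to 100 and the rest keep the same ratio."""
    largest = max(contributions.values())
    return {name: 100 * value / largest for name, value in contributions.items()}


# Hypothetical contributions, used only to show the rescaling rule.
print(scale_to_points({"Age": 4.0, "Surgery": 2.0, "Side": 1.0}))

# Per-variable points quoted in the worked example.
os_points = [65, 61, 61, 61, 61, 82, 75, 71]
dss_points = [61, 83, 61, 82, 71]
fgs_points = [93, 61, 61, 99, 71]

print(sum(os_points), sum(dss_points), sum(fgs_points))  # 537 358 385
```

The three sums agree with the total scores of 537, 358, and 385 reported in the example; the survival probabilities are then read off the nomogram's total-points axis.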

Validation and performance of nomogram.
The C-index of the predicted model for OS was 0.727 (95% CI 0.680-0.774), and after 1000 resampling internal validations, the calibration curve fitted well with the ideal curve, indicating that the predicted probability of the model had good uniformity and stability compared with the actual probability (Fig. 1b-d). ROC curves were drawn according to the model-fitting results. The AUC values of the 1-year OS were 0.797 (95% CI 0.748-0.846) and 0.848 (95% CI 0.787-0.908) for the training and validation sets, respectively. The AUC values of 3-year OS were 0.775 (95% CI 0.727-0.824) and 0.868 (95% CI 0.813-0.923).
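For readers unfamiliar with the C-index, a minimal sketch of Harrell's concordance for right-censored data is given here. It assumes that the reported C-index is of this type, which the text does not state explicitly; the function name and the toy data are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C for right-censored data.

    A pair is comparable when the subject with the shorter follow-up time
    had the event (events[i] == 1). It is concordant when that subject also
    has the higher predicted risk; ties in predicted risk count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: follow-up in months, 1 = death observed, 0 = censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.6, 0.1]
print(harrell_c_index(toy_times, toy_events, toy_risks))  # 0.8
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so a C-index of about 0.73 means the model ranks roughly 73% of comparable patient pairs correctly.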
The subgroup was divided according to the risk predicted by the model, and the median OS was 237 months (95% CI 151-not reached [NR]) for the low-risk group and 14 months (95% CI 8-26) for the high-risk group (Fig. 1g). In addition, the model had a sensitivity of 0.832, specificity of 0.696, and Youden index of 0.528, resulting in a high prediction accuracy (Fig. 1h).
The C-index of the predicted model for DSS was 0.732 (95% CI 0.679-0.785), and after 1000 resampling internal validations, the calibration curve fitted well with the ideal curve, indicating that the predicted probability of the model had good uniformity and stability with the actual probability (Fig. 2b-d). ROC curves were drawn according to the model-fitting results. The AUC values of the 1-year DSS were 0.805 (95% CI 0.758-0.852) and 0.801 (95% CI 0.719-0.883) for the training and validation sets, respectively. The AUC values of 3-year DSS were 0.802 (95% CI 0.755-0.849) and 0.804 (95% CI 0.731-0.877). The AUC values of 5-year DSS were 0.868 (95% CI 0.830-0.906) and 0.861 (95% CI 0.799-0.924). These results indicate that the model has high predictive ability (Fig. 2e). DCA according to the model showed that the model could improve the net benefit to patients by up to 18%, 31%, and 40% at 1, 3, and 5 years, respectively (Fig. 2f). The subgroup was divided according to the risk predicted by the model, and the median DSS was NR (95% CI NR-NR) for the low-risk group and 13 months (95% CI 7-26) for the high-risk group (Fig. 2g). In addition, the model had a sensitivity of 0.861, specificity of 0.604, and Youden index of 0.465, resulting in a high prediction accuracy (Fig. 2h).
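The net benefit plotted in a decision curve can be sketched with the standard formula for a binary outcome, net benefit = TP/n - (FP/n) x pt/(1 - pt), where pt is the threshold probability. This is an illustration under stated assumptions: the study's DCA was applied to survival at fixed time points and may have handled censoring differently, and the function name and toy data are hypothetical.

```python
def net_benefit(predicted_risks, outcomes, threshold):
    """Net benefit of treating every patient whose predicted risk is at or
    above the threshold probability, for a binary outcome (1 = event)."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcomes)
    pairs = list(zip(predicted_risks, outcomes))
    true_pos = sum(1 for risk, event in pairs if risk >= threshold and event == 1)
    false_pos = sum(1 for risk, event in pairs if risk >= threshold and event == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


# Toy data: five patients, two of whom had the event.
toy_pred = [0.9, 0.8, 0.3, 0.2, 0.6]
toy_outcome = [1, 1, 0, 0, 0]
print(net_benefit(toy_pred, toy_outcome, threshold=0.5))
```

With these toy values, two true positives and one false positive among five patients give a net benefit of about 0.2 at a threshold of 0.5. A decision curve repeats this calculation across a range of thresholds and compares the model with the treat-all and treat-none strategies.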
The C-index of the predicted model for FGS was 0.727 (95% CI 0.674-0.780), and after 1000 resampling internal validations, the calibration curve fitted the ideal curve well, indicating that the predicted probability of the model was in good agreement with the actual probability (Fig. 3b-d). ROC curves were drawn according to the model-fitting results. The AUC values of the 1-year FGS were 0.819 (95% CI 0.782-0.856) and 0.742 (95% CI 0.668-0.815) for the training and validation sets, respectively. The AUC values of 3-year FGS were 0.805 (95% CI 0.767-0.843) and 0.765 (95% CI 0.701-0.830). The AUC values of 5-year FGS were 0.857 (95% CI 0.823-0.890) and 0.762 (95% CI 0.697-0.828). These results indicate that the model has a high predictive ability (Fig. 3e). DCA, according to the model, showed that the model could improve the net benefit of patients by approximately 20%, 32%, and 38% at 1, 3, and 5 years, respectively (Fig. 3f). The subgroup was divided according to the risk predicted by the model, and the median FGS was NR (95% CI NR-NR) for the low-risk group and 8 months (95% CI 5-14) for the high-risk group (Fig. 3g). In addition, there was a significant difference in survival between patients who died due to PAL in the low- and high-risk groups (F = 39.616, P < 0.001), but not in those who died due to other causes (F = 0.192, P = 0.661). Moreover, the model had a sensitivity of 0.871, specificity of 0.679, and Youden index of 0.550, indicating a high prediction accuracy (Fig. 3h).
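The Youden index reported for each model follows directly from its sensitivity and specificity (J = sensitivity + specificity - 1). The short sketch below recomputes it from the values quoted in this section; the function and variable names are illustrative.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1


# (sensitivity, specificity) as reported for the three models.
reported = {
    "OS": (0.832, 0.696),
    "DSS": (0.861, 0.604),
    "FGS": (0.871, 0.679),
}

youden = {model: round(youden_index(*values), 3) for model, values in reported.items()}
print(youden)  # {'OS': 0.528, 'DSS': 0.465, 'FGS': 0.55}
```

The recomputed values match the reported Youden indices of 0.528, 0.465, and 0.550.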

Discussion
PAL, a rare malignant mass, is a hot topic in the medical community. The prognosis of the patients was poor, the median OS was less than 3 years, and the 10-year OS was less than 30%. At present, most English language publications are case reports of DLBCL, and survival analysis is limited 6.
In recent years, analyses based on large public databases have become a trend in the medical field 17. Public database analysis has several advantages. First, it provides shared data resources, saving the time and cost of collecting and generating data. Second, the data in public databases have been validated and verified, which supports the reliability of the research results. Third, new research directions and associations can be discovered by analyzing these databases. Finally, it promotes collaboration and data sharing among researchers, thereby accelerating research progress and improving research efficiency 13,18.
This study compared the survival of PAL with that of other adrenal masses using PSM based on SEER data. Factors influencing the OS and DSS of patients were analyzed using Cox regression, and the prognosis of patients was analyzed using LASSO and FGM to construct a prediction model of OS, including age, sex, side, surgery, systemic therapy, income, pathology, and AAC stage (SEER stage), which was compared with a DSS model including age, side, surgery, income, and AAC stage. The accuracy of the visualized model was high based on ten-fold cross-validation.
Age is a significant factor that influences the prognosis and survival of almost all patients with carcinomas. Older patients tend to have poor nutritional status and tolerability. In addition, their masses tended to be at a higher stage at diagnosis [19][20][21]. There is some controversy regarding the prognosis of carcinoma according to sex. Some researchers suggest that men may have more risky lifestyles (e.g., smoking) and that men and women differ in hormone levels, leading to a worse prognosis [22][23][24]. A systematic review and meta-analysis of cancer immunotherapy efficacy and patient sex showed that men and women have different sensitivities to immune checkpoint inhibitors; thus, their prognosis may differ 25. A review found that poor individuals had a lower OS than MCA individuals. This is not only due to higher rates of smoking, obesity, and substance abuse, but also because of unequal access to technological innovation, increased geographic isolation by income, reduced economic mobility, mass incarceration, and increased healthcare costs 26.
In addition to the basic characteristics mentioned above, oncological characteristics also affect patient survival. Most extranodal lymphomas are NHL, whereas HL tends to progress to extranodal lymphoma with very few extreme malignancies. Therefore, among the pathological types of PAL, HL has a significantly worse prognosis than NHL 27. Patients with bilateral masses tended to have lower survival rates, which is similar to other carcinomas and consistent with many reported results 6,7,28,29. AAC stage is a lymphoma stage classification system approved by the WHO for both HL and NHL. A higher grade represents a greater extent of infiltration, and more sites are involved 30. Based on the extent of invasion, the SEER stage classifies masses into three grades: localized, regional, and distant. The two classification systems were consistent.
Treatment modality similarly affected prognosis. In this study, treated patients had a much better prognosis than untreated patients, showing that surgery and systemic therapy are particularly important. Surgery included local tumor excision, simple/partial surgery, total surgery, and radical surgical removal of the primary site. Early surgery can improve patient survival. In addition, for adrenal masses, minimally invasive surgery can be of benefit [31][32][33][34]. Systemic therapy is drug treatment that travels through the bloodstream to reach tumor cells throughout the body, as opposed to local treatments such as surgery and radiotherapy. It reduces the unintended risk to the patient and significantly improves OS; however, the impact of systemic therapy on DSS is not obvious. Radiotherapy does not seem to play a role in PAL prognosis. A clinical trial comparing chemotherapy with or without radiotherapy in DLBCL showed that the OS and DSS did not differ between the two groups 35. In this study, chemotherapy appeared to improve the median OS (28 months [95% CI 18-41] vs. 23 months [95% CI 9-53]), but the difference was not significant. On further analysis, patients with DLBCL had significantly improved OS after chemotherapy (27 months [95% CI 17-50] vs. 9 months [95% CI 2-53], P = 0.033). DLBCL was more sensitive to chemotherapy than other tumors, a result similar to those of other studies 6.
Currently, only a few survival studies have been conducted on PAL. The innovation of this study is that the factors influencing survival in PAL were analyzed using LASSO, and FGM excluded the interference of other events. The prediction models for OS and DSS were established and validated, making this model more meaningful. According to the predicted prognostic model designed by the FGM, the predicted FGS was not statistically different from the DSS model, and the predicted survival accuracy was higher in the short term (less than 3 years). Therefore, it is more accurate to exclude the influence of death from other factors.
The limitations of this study are as follows: (1) as a rare malignant mass, the number of cases was small; (2) the period of cases was extensive, and the living environments and treatment conditions experienced by patients in different periods varied; and (3) LASSO and FGM were adopted in this study, which will be compared with other machine learning algorithms in a subsequent study, leading to the best model.
In conclusion, age, sex, side, surgery, systemic therapy, income, pathology, and AAC stage (SEER stage) affected OS in patients with PAL. Age, side, surgery, income, and AAC stage affected DSS. The prediction models for OS and DSS built using LASSO and FGM showed good predictive performance. The model created by FGM is more accurate in the short term (less than 3 years). Using this model, the survival expectations of patients with PAL can be effectively evaluated, enabling clinicians to individualize the design of treatment regimens, improve the expected survival of patients, and further benefit patients.

Figure 1. Nomogram of OS in patients with PAL. (A) Nomogram; (B) calibration curves in 1 year; (C) calibration curves in 3 years; (D) calibration curves in 5 years; (E) ROC curves of the nomogram in the training and validation sets; (F) clinical decision curve analysis of the prediction model; (G) Kaplan-Meier curves for predicting risk subgroups according to the nomogram; (H) the calculated risk scores for each patient within the combined training and validation sets.

Figure 2. Nomogram of DSS in patients with PAL. (A) Nomogram; (B) calibration curves in 1 year; (C) calibration curves in 3 years; (D) calibration curves in 5 years; (E) ROC curves of the nomogram in the training and validation sets; (F) clinical decision curve analysis of the prediction model; (G) Kaplan-Meier curves for predicting risk subgroups according to the nomogram; (H) the calculated risk scores for each patient within the combined training and validation sets.

Figure 3. Nomogram of survival according to the Fine and Gray model (FGS) in patients with PAL. (A) Nomogram; (B) calibration curves in 1 year; (C) calibration curves in 3 years; (D) calibration curves in 5 years; (E) ROC curves of the nomogram in the training and validation sets; (F) clinical decision curve analysis of the prediction model; (G) Fine and Gray model for predicting risk subgroups according to the nomogram; (H) the calculated risk scores for each patient within the combined training and validation sets.

Table 1. Baseline characteristics of the patients with adrenal tumors.

Table 2. Baseline characteristics of the patients with primary adrenal lymphomas.

Table 5. Risk factors selected by LASSO.