Introduction

Brain tumors are abnormal proliferations of cells in the brain and are the most common malignant tumors of the central nervous system1,2. Clinically, they are divided into primary and metastatic brain tumors3. Primary brain tumors arise from cells of the central nervous system; they account for about 1% of new cancer cases and about 2% of cancer deaths in the United States, and the main type is glioma4. Previous studies have also found that childhood brain tumors have a great impact on both morbidity and mortality in children5. At present, the conventional clinical treatments for brain tumors are surgery, radiotherapy, and chemotherapy6,7,8. Astrocytoma is an aggressive tumor with a poor prognosis; reasonable treatment can slightly improve survival, but the risk factors for this tumor have rarely been studied clearly9. This study used data from the Surveillance, Epidemiology, and End Results (SEER) database, a U.S. national public database, to analyze risk factors affecting the survival of patients with brain astrocytoma. The SEER database is currently the largest public cancer database, covering approximately 28% of the U.S. population, and it includes basic demographic information and information on relevant cancer characteristics10.

In recent years, nomograms have been widely used to predict the outcome of various diseases, especially tumors. They meet the need for integrated models and play an important role in the current "digital medicine" environment, helping clinicians make prognostic predictions11,12,13. Therefore, this study aims to use data from the SEER database to screen for risk factors affecting the survival of patients with brain astrocytoma and to establish nomogram models of 3-year and 5-year survival, so as to assist clinicians in predicting patient prognosis.

Materials and methods

Data source

The data for this study came from the SEER database established by the National Cancer Institute. We selected the database containing 13 registries with radiotherapy data, which provided the data needed for this study. A total of 6154 patients diagnosed with astrocytoma of the brain from 2004 to 2015 were extracted from the database, and 2214 patients remained after applying the inclusion and exclusion criteria. Five types of astrocytoma were included: diffuse astrocytoma, anaplastic astrocytoma, pilocytic astrocytoma, unique astrocytoma variants, and astrocytoma, NOS.

Inclusion and exclusion criteria

Inclusion criteria

(i) Patients with astrocytoma of the brain diagnosed in 2004–2015; (ii) ICD-O-3 primary site codes C70.0–C75.3, including the brain, frontal lobe, parietal lobe, temporal lobe, occipital lobe, etc.; (iii) complete clinical information.

Exclusion criteria

(i) Baseline information (e.g., race) is unknown; (ii) tumor size or tumor number is missing; (iii) survival time is unknown; (iv) diagnosis confirmed only at autopsy or at death.
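The screening rules above can be expressed as a single eligibility check. The following is a minimal sketch only: the `Case` record and all of its field names are hypothetical stand-ins rather than actual SEER column names, and the four example records are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Case:
    """One registry record; field names are hypothetical, not SEER columns."""
    year_of_diagnosis: int
    site_code: str                  # ICD-O-3 site code, e.g. "C71.1"
    race: Optional[str]             # None means unknown
    tumor_size_mm: Optional[int]
    tumor_number: Optional[int]
    survival_months: Optional[int]
    autopsy_or_death_only: bool     # diagnosis made only at autopsy or death


def is_eligible(case: Case) -> bool:
    """Apply the inclusion criteria first, then the exclusion criteria."""
    # Inclusion (i): diagnosed in 2004-2015.
    if not 2004 <= case.year_of_diagnosis <= 2015:
        return False
    # Inclusion (ii): site code within C70.0-C75.3. All codes share the fixed
    # "Cdd.d" layout, so plain string comparison orders them correctly.
    if not "C70.0" <= case.site_code <= "C75.3":
        return False
    # Exclusion (i)-(iii): unknown baseline data, size, number, or survival.
    required = (case.race, case.tumor_size_mm,
                case.tumor_number, case.survival_months)
    if any(value is None for value in required):
        return False
    # Exclusion (iv): diagnosis established only at autopsy or death.
    return not case.autopsy_or_death_only


# Invented example records (not real patients).
cases = [
    Case(2010, "C71.1", "White", 45, 1, 60, False),  # eligible
    Case(2003, "C71.1", "White", 45, 1, 60, False),  # diagnosed too early
    Case(2010, "C71.6", None, 30, 1, 24, False),     # race unknown
    Case(2012, "C71.7", "Black", 20, 2, 12, True),   # autopsy/death only
]
cohort = [c for c in cases if is_eligible(c)]
```

Keeping the rules in one function makes it easy to count how many records each criterion removes, which is what a patient flow diagram would report.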

Grouping methods

To make the analysis more intuitive and standardized, the study data were transformed into dichotomous or multi-category variables. Age was classified into five groups: < 20, 20–39, 40–59, 60–79, and ≥ 80 years; Race into Black, White, and other; and Histological type into five categories: diffuse astrocytoma, anaplastic astrocytoma, pilocytic astrocytoma, unique astrocytoma variants, and astrocytoma, NOS. The Primary site was divided into brain (C71.0–C71.5), cerebellum (C71.6), brainstem (C71.7), spinal cord (C72.0), and others (C70.0, C71.8, C71.9, C72.3, C72.5, C72.8, C75.1, C75.3); Laterality was unilateral or bilateral; and Grade was I–IV. For the continuous variables Tumor size and Extension, X-tile was used to select the best cut-off points: Tumor size was classified as ≤ 60 mm and ≥ 61 mm, and Extension as 10–30 mm and 40–75 mm. Surgery, Radiation, and Chemotherapy were each coded as yes/no, and Tumor number was grouped as 1 and > 1.
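The grouping rules above can be sketched as small mapping functions. This is an illustration under stated assumptions: the function names and group labels are hypothetical, and tumor size is assumed to be recorded in whole millimetres, so that ≤ 60 and ≥ 61 cover every value.

```python
def age_group(age_years: int) -> str:
    """Map age in years to the five age groups used in the study."""
    if age_years < 20:
        return "<20"
    if age_years < 40:
        return "20-39"
    if age_years < 60:
        return "40-59"
    if age_years < 80:
        return "60-79"
    return ">=80"


def size_group(size_mm: int) -> str:
    """Dichotomize tumor size at the reported X-tile cut-off of 60 mm."""
    return "<=60 mm" if size_mm <= 60 else ">=61 mm"


def site_group(site_code: str) -> str:
    """Map an ICD-O-3 site code to the primary-site categories."""
    if site_code in {"C71.0", "C71.1", "C71.2", "C71.3", "C71.4", "C71.5"}:
        return "brain"
    if site_code == "C71.6":
        return "cerebellum"
    if site_code == "C71.7":
        return "brainstem"
    if site_code == "C72.0":
        return "spinal cord"
    return "others"  # the remaining listed codes, e.g. C70.0, C71.8, C71.9


def number_group(tumor_number: int) -> str:
    """Group the number of tumors as 1 versus more than 1."""
    return "1" if tumor_number == 1 else ">1"


example = {
    "age": age_group(63),
    "size": size_group(61),
    "site": site_group("C71.6"),
    "number": number_group(2),
}
```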

Statistical methods

The data extracted from the SEER database were first organized in Excel according to the inclusion and exclusion criteria and classified into low-grade and high-grade brain astrocytoma according to the WHO classification. Survival rates were calculated separately for low-grade and high-grade patients by the Kaplan–Meier method using R-studio 4.2.2; the effect of each included factor on survival was displayed with K–M curves, and the log-rank test was used to compare groups within each variable. The low-grade and high-grade data were each randomly divided into a training set and a validation set in a 7:3 ratio with R-studio, and χ2 tests comparing each variable between the training and validation sets were performed in SPSS. Univariate and multivariate Cox regression analyses were performed on the training set data using R-studio 4.1.1, and a nomogram of the variables retained in the final model was created using the R packages 'rms', 'foreign', and 'survival'. The area under the ROC curve (AUC) and the C-index were used to evaluate the accuracy of the model; both range from 0 to 1, and values closer to 1 indicate a more accurate model. The calibration curve was used to evaluate the calibration of the model; the closer the calibration curve is to the reference line, the better the predicted survival agrees with the observed survival. Differences were considered statistically significant at P < 0.05, except in the univariate Cox regression, where P < 0.1 was used.
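To make the Kaplan–Meier and log-rank steps concrete, the sketch below implements the product-limit estimator and a two-group log-rank test in plain Python. It is an illustrative re-implementation and not the authors' R code; the function names and the toy data are assumptions. The two-group test has one degree of freedom, so its P value can be obtained from the complementary error function.

```python
import math


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (event=1, censored=0)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = leaving = 0
        while i < len(data) and data[i][0] == t:  # handle tied times together
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P value)."""
    pooled = list(zip(times_a, events_a)) + list(zip(times_b, events_b))
    event_times = sorted({t for t, e in pooled if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)   # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)   # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:  # hypergeometric variance of the deaths in group A
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("log-rank variance is zero; groups cannot be compared")
    chi2 = observed_minus_expected ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))      # chi-square tail, 1 df
    return chi2, p_value


# Toy data (months, event indicator); not taken from the study.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
chi2, p_value = logrank_test([1, 2, 3, 4, 5], [1] * 5,
                             [10, 11, 12, 13, 14], [1] * 5)
```

For variables with more than two groups, such as the five age groups, the log-rank statistic generalizes to a chi-square test with (number of groups − 1) degrees of freedom, which this sketch does not cover.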

Results

Comparison of patient baseline features

A total of 2214 patients were included in this study: 539 with low-grade astrocytoma and 1675 with high-grade astrocytoma. Each group was randomly split into a training set and a validation set in a 7:3 ratio using R-studio 4.2.2, giving 379 patients in the low-grade training set and 160 in the low-grade validation set, and 1175 patients in the high-grade training set and 500 in the high-grade validation set. When the variables were compared between the training and validation sets, all χ2 tests gave P > 0.05, so the differences were not statistically significant, which is consistent with random assignment. Information on the two grades and the results of the χ2 tests are shown in Tables 1 and 2.

Table 1 General data on training and validation sets for patients with low-grade astrocytoma n (%).
Table 2 General data on training and validation sets for patients with high-grade astrocytoma n (%).
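The 7:3 random split within each grade can be sketched as follows. This is a minimal illustration with a hypothetical helper name and arbitrary seed, not the authors' R procedure. Note that the reported set sizes (379/160 and 1175/500) are close to, but not exactly, a strict 70% cut, so the original sampling call evidently differed in detail from this sketch.

```python
import random


def split_train_validation(patient_ids, train_fraction=0.7, seed=0):
    """Shuffle the IDs reproducibly and cut them into training/validation."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)        # seeded for reproducibility
    cut = int(train_fraction * len(ids))    # size of the training set
    return ids[:cut], ids[cut:]


# Split each grade separately so both sets keep the same grade mix.
low_train, low_val = split_train_validation(range(539), seed=1)
high_train, high_val = split_train_validation(range(1675), seed=1)
```

Splitting within each grade, rather than splitting the pooled cohort, guarantees that the low-grade and high-grade models each have their own training and validation sets.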

Impact of different factors on patient survival

Risk factor analysis affecting survival in patients with low-grade astrocytoma

Univariate Kaplan–Meier survival curves and log-rank tests showed that Age (P < 0.0001), Primary site (P < 0.0001), Tumor histological type (P < 0.0001), Tumor size (P = 0.01), Extension (P = 0.00013), Surgery (P = 0.00016), Radiation (P < 0.0001), Chemotherapy (P < 0.0001), and Tumor number (P = 0.015) were risk factors affecting the prognosis of patients with low-grade astrocytoma. The K–M survival curves and log-rank test results showed that Age ≥ 80 years, Primary site at the brainstem, Diffuse astrocytoma, Tumor size ≥ 61 mm, deeper Extension, Bilateral tumors, no Surgery, receipt of Radiotherapy, receipt of Chemotherapy, and Tumor number > 1 were all related to poorer survival (Fig. 1).

Figure 1

Kaplan–Meier survival curves in low-grade astrocytoma patients. (a) Age; (b) sex; (c) race; (d) primary site; (e) histology type; (f) grade; (g) tumor size; (h) extension; (i) laterality; (j) surgery; (k) radiation; (l) chemotherapy; (m) tumor number.

Risk factor analysis affecting survival in patients with high-grade astrocytoma

Univariate Kaplan–Meier survival curves and log-rank tests showed that Age (P < 0.0001), Primary site (P < 0.0001), Tumor histology type (P < 0.0001), Tumor size (P < 0.0001), Extension (P < 0.0001), Laterality (P = 0.01), Surgery (P < 0.0001), Radiation (P < 0.0001), Chemotherapy (P < 0.0001), and Tumor number (P < 0.0001) were risk factors for the prognosis of patients with high-grade astrocytoma. The K–M survival curves and log-rank tests showed that Age ≥ 80 years, Primary site at the brainstem, Astrocytoma, NOS, Tumor size ≤ 60 mm, deeper Extension, Bilateral tumors, no Surgery, no Radiotherapy, no Chemotherapy, and Tumor number > 1 were all associated with poorer survival (Fig. 2).

Figure 2

Kaplan–Meier survival curves in high-grade astrocytoma patients. (a) Age; (b) sex; (c) race; (d) primary site; (e) histology type; (f) grade; (g) tumor size; (h) extension; (i) laterality; (j) surgery; (k) radiation; (l) chemotherapy; (m) tumor number.

Univariate and multivariate Cox regression results

Univariate and multivariate Cox regression results for low-grade astrocytoma

Patient data from the low-grade astrocytoma training set (13 variables) were entered into the univariate Cox regression analysis, which excluded the sex variable (P > 0.1). To avoid omitting important variables, the 12 variables with P < 0.1 in the univariate Cox regression were entered into the multivariate Cox regression. A factor with P < 0.1 in the univariate analysis was considered to be associated with patient survival, and a factor with P < 0.05 in the multivariate analysis was considered an independent factor affecting patient survival. The univariate Cox regression showed that age greater than 40 years, White race, histological type, primary site, bilateral tumors, grade II, larger tumor size, deeper extension into the brain, surgery, radiotherapy, chemotherapy, and the number of tumors were related to patient prognosis and survival; the multivariate Cox regression showed that older age, bilateral tumors, radiotherapy, and chemotherapy were independent factors affecting patient survival (Table 3).

Table 3 Univariate and multivariate Cox regression analysis of low-grade astrocytoma training set patient survival.

Univariate and multivariate Cox regression results for high-grade astrocytoma

The univariate Cox regression for high-grade astrocytoma showed that age greater than 60 years, diffuse astrocytoma, primary site, bilateral tumors, tumor size, deeper extension into the brain, surgery, radiotherapy, chemotherapy, and tumor number were related to patient prognosis and survival. The multivariate Cox regression showed that older age, diffuse astrocytoma, primary site, bilateral tumors, tumor size, deeper extension into the brain, surgery, radiotherapy, and chemotherapy were independent factors affecting patient survival (Table 4).

Table 4 Univariate and multivariate Cox regression analysis of high-grade astrocytoma training set patient survival.

Creation of nomogram

The variables retained in the multivariate Cox regression analysis (P < 0.05) were used in R-studio to create a nomogram model. Each value of each variable corresponds to a score; the scores of all variables are added to obtain a total score, and the 3-year and 5-year survival rates of a patient can be predicted from the total score (Figs. 3, 4).
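The scoring logic of a nomogram can be illustrated with a short sketch. Everything numeric here is hypothetical: the coefficients and the baseline survival value are placeholders, not the estimates from Tables 3 and 4, and the point scaling is a simplified version of what the R package 'rms' does (which, among other differences, centers the linear predictor). The variable whose coefficients span the widest range is scaled to 0–100 points, and survival follows the Cox relation S(t | x) = S0(t)^exp(lp), where S0(t) is the baseline survival of a patient at all reference levels.

```python
import math

# Hypothetical log hazard ratios per level (reference level = 0.0).
# These are placeholders, NOT the coefficients estimated in the study.
COEFFICIENTS = {
    "age": {"<20": 0.0, "20-39": 0.4, "40-59": 0.9, "60-79": 1.6, ">=80": 2.2},
    "laterality": {"unilateral": 0.0, "bilateral": 0.7},
    "radiation": {"no": 0.0, "yes": 0.5},
}


def build_points(coefficients):
    """Scale coefficients so the widest-ranging variable spans 0-100 points."""
    widest = max(max(levels.values()) - min(levels.values())
                 for levels in coefficients.values())
    return {
        variable: {level: 100 * ((beta - min(levels.values())) / widest)
                   for level, beta in levels.items()}
        for variable, levels in coefficients.items()
    }


def total_points(points, patient):
    """Add up the points for each of the patient's variable levels."""
    return sum(points[variable][patient[variable]] for variable in points)


def predicted_survival(coefficients, patient, baseline_survival):
    """Cox model prediction: S(t | x) = S0(t) ** exp(linear predictor)."""
    linear_predictor = sum(coefficients[v][patient[v]] for v in coefficients)
    return baseline_survival ** math.exp(linear_predictor)


points = build_points(COEFFICIENTS)
patient = {"age": "60-79", "laterality": "bilateral", "radiation": "no"}
score = total_points(points, patient)
# 0.9 is a made-up 3-year baseline survival for the reference patient.
survival_3y = predicted_survival(COEFFICIENTS, patient, baseline_survival=0.9)
```

Because the points are a linear rescaling of the linear predictor, a higher total score always corresponds to a lower predicted survival, which is what the bottom axes of a printed nomogram encode.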

Figure 3

Nomogram for predicting 3-year and 5-year survival in patients with low-grade brain astrocytoma.

Figure 4

Nomogram for predicting 3-year and 5-year survival in patients with high-grade brain astrocytoma.

Validation of nomogram

The area under the ROC curve and C-index were used to evaluate the discrimination of the model, and the calibration curve was used to evaluate the calibration of the model.
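As an illustration of how discrimination is measured, the sketch below computes Harrell's C-index directly from its definition. It is not the routine used in the study (the 'rms' and 'survival' packages provide their own implementations), and the toy data are invented. A pair of patients is comparable when the one with the shorter follow-up time had an observed event; the pair is concordant when that patient also has the higher predicted risk.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue                      # censored patients cannot anchor a pair
        for j in range(n):
            if times[i] < times[j]:       # patient i failed before time of j
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0     # higher risk, shorter survival
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5     # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: survival in months, event indicator, model risk score.
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]
c_perfect = concordance_index(times, events, [4, 3, 2, 1])
c_reversed = concordance_index(times, events, [1, 2, 3, 4])
c_uninformative = concordance_index(times, events, [1, 1, 1, 1])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why values around 0.8 are read as good discrimination.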

Validation of nomogram in patients with low-grade astrocytoma

In the low-grade astrocytoma training set, the AUC values for 3-year and 5-year survival were 0.829 and 0.801, respectively; in the validation set, they were 0.902 and 0.829, respectively (Fig. 5). The C-index was 0.818 (95% CI 0.779, 0.857) in the training set and 0.834 (95% CI 0.785, 0.883) in the validation set. In addition, the predicted 3-year and 5-year survival curves in the training and validation sets were close to the actual curves (Fig. 6), indicating that the model is well calibrated.

Figure 5

ROC curves of 3-year and 5-year survival prediction in patients with low-grade astrocytoma (a and b are the training set; c and d are the validation set).

Figure 6

3-year and 5-year survival calibration curves for patients with low-grade astrocytoma (a and b are the training set; c and d are the validation set).

Validation of nomogram in patients with high-grade astrocytoma

In the high-grade astrocytoma training set, the AUC values for 3-year and 5-year survival were 0.814 and 0.806, respectively; in the validation set, they were 0.802 and 0.823, respectively (Fig. 7). The C-index was 0.774 (95% CI 0.758, 0.790) in the training set and 0.766 (95% CI 0.752, 0.780) in the validation set. The predicted 3-year and 5-year survival curves in the training and validation sets were in line with the actual curves (Fig. 8).

Figure 7

ROC curves of 3-year and 5-year survival prediction in patients with high-grade astrocytoma (a and b are the training set; c and d are the validation set).

Figure 8

3-year and 5-year survival calibration curves for patients with high-grade astrocytoma (a and b are the training set; c and d are the validation set).

Discussion

Under the current trend of "digital medicine", it is important for both doctors and patients to combine clinical diagnosis with intelligent tools to determine a patient's condition and the risk factors related to prognosis. On the one hand, this helps doctors understand the patient's condition in time and choose more appropriate treatment; on the other hand, it gives patients a clearer understanding of their own condition, which can greatly promote communication between doctors and patients. In recent years, more and more scholars have conducted tumor research by mining the SEER database, generating a variety of tumor prediction models, which may become a new direction for tumor research14. Patients with astrocytoma of the brain diagnosed from 2004 to 2015 in the SEER database were included in this study, and a total of 2214 patients were retained after applying the inclusion and exclusion criteria. Within each grade, patients were randomly divided into a training set and a validation set in a ratio of 7:3. The univariate Kaplan–Meier survival analysis showed that, for both low-grade and high-grade brain astrocytoma, the included factors had an impact on patient survival, with the exception of sex and race. The univariate and multivariate Cox regression analyses of the training set data for both grades showed that not receiving radiotherapy or chemotherapy was a protective factor for patients with low-grade brain astrocytoma (OR less than 1), whereas the opposite was true for high-grade astrocytoma. This may indicate that patients with certain low-grade tumors survive longer without radiotherapy, while patients with high-grade astrocytoma need radiotherapy to survive longer. This result is consistent with clinical experience and has some clinical significance. Meanwhile, the Cox regression results were consistent with the K–M curves, which supports the reliability of the results.

Age was found to be an important factor affecting survival in both low-grade and high-grade brain astrocytoma, a result consistent with the findings of other scholars. Previous studies have also found a strong relationship between brain tumors and age8, with older age predicting a higher risk of disease15,16. However, some scholars studying advanced age and brain tumors have found that elderly people may have slower tumor progression17, whereas the log-rank test and Cox regression results for both low-grade and high-grade tumors in this study indicated that older patients are more likely to have lower survival rates. In conclusion, age is an extremely important factor in the prognosis of patients with brain tumors and deserves further study. The sex distribution in this study was relatively balanced, while in terms of race, White patients were overwhelmingly represented. The K–M curves and Cox regression results showed that the differences by sex and race were not statistically significant (P > 0.5). Studies have found that the incidence and mortality of brain tumors in both men and women have decreased year by year in recent years, but no significant differences have been found between sexes or races18.

The primary site of the brain astrocytoma is also an important factor affecting survival. Comparing the K–M survival curves of low-grade and high-grade brain astrocytoma shows that the survival rate of patients with low-grade tumors is significantly higher than that of patients with high-grade tumors, which is consistent with clinical reality. In this study, most tumors were located in the cerebrum, whereas experts who have studied childhood brain tumors have found that these tumors, especially astrocytoma, are more common in the cerebellum19; the difference may be related to the wider age distribution in the data of this study. The age of the patient may therefore affect the distribution of astrocytoma in the brain. The K–M survival curves by histological type showed that patients with pilocytic astrocytoma, a slow-growing benign tumor that generally does not require radiotherapy, had the highest survival rate in both low-grade and high-grade brain astrocytoma. The Cox regression results showed that diffuse astrocytoma was a major risk factor for patient survival, and astrocytoma has a poor prognosis9,20. The K–M survival curves also showed that, among patients with high-grade brain astrocytoma, those with bilateral tumors had lower survival than those with unilateral tumors, and that a greater number of tumors, deeper extension, and sequence number were associated with poorer patient survival. However, this study found that the smaller the tumor, the lower the survival rate, whereas studies on breast cancer21, adult glioma16, and peripheral schwannoma22 have found that larger tumors are related to poor prognosis. This inconsistency with clinical experience may arise because the classification of astrocytic tumors used in this study is not the latest standard and the 2004–2015 database contains no molecular typing-related classification. In the current treatment of high-grade or malignant brain tumors23, surgery combined with concurrent radiotherapy and chemotherapy can benefit patient survival7,14. In this study, the OR for patients without surgery relative to patients with surgery was greater than 1 for both low-grade and high-grade tumors, indicating that surgery is associated with a better prognosis, which is consistent with the current conventional clinical treatment of brain tumors. The present study also has some limitations. First, the SEER database provides a limited amount of information and none on genes, so we could not study the prognostic factors of brain tumors at the genetic level19. Second, with the development of gene sequencing, brain tumors have entered the era of molecular typing; the data extracted in this study predate 2016, when the database contained no molecular typing, so molecular typing analysis could not be performed. Because different tumor subtypes would change patient prognosis, this is worth further research in the future.

In conclusion, this study screened the risk factors for patients with low-grade and high-grade brain astrocytoma separately using univariate Kaplan–Meier survival curves, included the risk factors affecting prognosis in both grades comprehensively, and successfully established nomograms. The AUC and C-index were high in the training and validation sets for both grades, and the calibration curves fit well, indicating that the nomograms have a strong ability to predict 3-year and 5-year survival. However, since the data were obtained from the United States, more studies are needed to verify whether the results can be applied to the Chinese population. The results of this study can provide some reference for clinicians.