Introduction

Recently, Beetz et al.1 reported that normal tissue complication probability (NTCP) models developed in a population treated with a specific technique could not be generalized and extrapolated to a population treated with another technique without external validation. They showed that 3D conformal radiotherapy (3D-CRT)-based models for patient-reported xerostomia among head and neck cancer (HNC) patients treated with primary radiotherapy (RT) turned out to be less valid for patients treated with intensity-modulated radiotherapy (IMRT), so the 3D-CRT NTCP models cannot be used for IMRT cohorts.

In addition, we performed a validation test of the Quantitative Analyses of Normal Tissue Effects in the Clinic (QUANTEC) guidelines against quality of life (QoL) questionnaire datasets collected prospectively from patients with HNC, including head and neck squamous cell carcinoma (HNSCC) and nasopharyngeal carcinoma (NPC)2. We have found that the QoL datasets validate the QUANTEC guidelines and suggest that the modified QUANTEC 20/20-Gy spared-gland guideline is suitable for clinical use in HNSCC cohorts to effectively avoid xerostomia and that the QUANTEC 25-Gy guideline is justified for NPC cohorts, implying that a difference exists between the two cohorts that needs to be investigated. NPC is a specific entity different from head and neck carcinoma3.

HNSCC develops from the mucosal linings of the upper aerodigestive tract, comprising 1) the nasal cavity and paranasal sinuses, 2) the oropharynx, 3) the hypopharynx and larynx and 4) the oral cavity. NPC is a carcinoma arising in the nasopharynx that shows light microscopic or ultrastructural evidence of squamous differentiation. It encompasses squamous cell carcinoma, non-keratinizing carcinoma (differentiated or undifferentiated) and basaloid squamous cell carcinoma4,5. The disease behavior of NPC differs from that of HNSCC, and so do the treatment strategies. Approximately 90% of NPC patients develop lymphadenopathy and 50% have bilateral lymph node involvement. Because the nasopharynx is immediately adjacent to the base of the skull, surgical resection with an acceptable margin is impossible, and radiation therapy is the mainstay of treatment for NPC6,7. For HNSCC, by contrast, surgical resection with a safe margin is the treatment of choice. The doses and fields of radiation therapy therefore differ between NPC and HNSCC. With advances in adjuvant treatment, concurrent chemotherapy may be considered according to the patient's disease status to improve the control rate in both HNSCC and NPC. Regardless, the disease itself should not affect salivary flow, or the patient's perception of salivary flow, independently of the radiation dose to the salivary glands. To ensure that xerostomia was induced primarily by the radiation treatment, patients with moderate-to-severe xerostomia at baseline need to be excluded from the analysis8,9,10,11.

Developing a multivariable logistic regression model requires deciding how many predictive factors to include. Clinical and treatment-related factors that may have important effects on the risk of radiation-induced complications need to be taken into consideration. Xu et al.12,13,14 introduced the least absolute shrinkage and selection operator (LASSO) to build NTCP models of xerostomia after 3D-CRT for HNC. De Ruyck et al.15 developed a multicomponent prediction model for acute esophagitis in lung cancer patients using LASSO. Our previous study developed a multivariate logistic regression model with LASSO to make valid predictions about the incidence of patient-reported xerostomia for HNC patients10. These reports all recommended the LASSO method for multivariable logistic regression NTCP modeling12,15.

The goals of this study were to characterize the incidence of moderate-to-severe patient-reported xerostomia among HNSCC and NPC patients treated with curative-intent IMRT and to find clinical and dosimetric factors associated with the toxicity. Specifically, we sought to explore the use of LASSO that incorporates the bootstrapping technique to develop multivariable logistic regression models that can be used to predict the incidence of moderate-to-severe patient-reported xerostomia for HNSCC and NPC patients. On the basis of the associations identified, it would then be possible to offer an efficient set of predictive factors to limit the risk of xerostomia for HNSCC and NPC patients treated with IMRT.

Results

One hundred and fifty-two HNSCC and 84 NPC patients completed QoL questionnaires at three time points (before RT, during RT and at 3 months after RT). Ninety-two HNSCC and 66 NPC patients completed QoL questionnaires at 12 months after IMRT. At the 3-month time point (for acute toxicity evaluation), 19 HNSCC and 6 NPC patients already suffering from moderate-to-severe xerostomia at baseline were excluded, leaving 133 HNSCC and 78 NPC patients to be analyzed. At the 12-month time point (for late toxicity evaluation), ten HNSCC and five NPC patients already suffering from moderate-to-severe xerostomia at baseline were excluded, leaving 82 HNSCC and 61 NPC patients to be analyzed.

The scatter plots of the mean dose and the differences in dose distributions to both parotid glands between the HNSCC and NPC cohorts are shown in Fig. 1. At 3 months after treatment, 32.9% of the HNSCC and 56.0% of the NPC patients reported moderate-to-severe xerostomia. After 12 months, 29.3% of the HNSCC and 37.9% of the NPC patients reported moderate-to-severe xerostomia (Table 1).

Table 1 Characteristics of patients with HNSCC and NPC treated by IMRT
Figure 1

The scatter plots of the mean dose (a, b) and the differences in dose distributions to the contralateral and the ipsilateral parotid glands between the HNSCC and NPC cohorts (c, d).

Abbreviations: HNSCC: head and neck squamous cell carcinoma; NPC: nasopharyngeal carcinoma.

The initial candidate predictive factors for HNSCC and NPC patients are shown in Appendixes 1 and 2 (supplementary information), respectively. LASSO with bootstrapping in the multivariable logistic regression analysis ranked the predictive factors in descending order, as shown in Table 2 for HNSCC and NPC patients at the 3- and 12-month time points. For all four models, the two dosimetric factors, the mean dose to the contralateral parotid gland and the mean dose to the ipsilateral parotid gland (Gy), were selected as the first two significant predictors. These were followed by different clinical and socio-economic factors, namely age, financial status, T stage and education, depending on the cohort and time point. All corresponding coefficients of the multivariable logistic regression NTCP models are shown in Table 3. The NTCP value for each individual patient can be calculated using the following logistic regression formulae:

Table 2 Predictive factors correlation ranking for the 3- and 12-month time points by LASSO
Table 3 Multivariable logistic regression coefficients and odds ratios for the NTCP models for patient-reported xerostomia 3 and 12 months after treatment

All four NTCP models take the logistic form

$$\mathrm{NTCP} = \frac{1}{1 + e^{-S}},$$

where the linear predictor S depends on the cohort and the time point:

HNSCC, 3-month time point: S = −32.29 + (Dmean-c*0.637) + (Dmean-i*0.185) + (age*0.202)

NPC, 3-month time point: S = −44.98 + (Dmean-c*0.218) + (Dmean-i*0.185) + (financial status*corresponding coefficient) + (age*0.158)

HNSCC, 12-month time point: S = −44.87 + (Dmean-c*1.400) + (Dmean-i*0.358) + (T stage*corresponding coefficient)

NPC, 12-month time point: S = −45.96 + (Dmean-c*0.558) + (Dmean-i*0.538) + (education*corresponding coefficient)
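The four models above can be evaluated numerically. The following is a minimal sketch, assuming the logistic form NTCP = 1/(1 + e^(−S)) and using only the numeric coefficients quoted in the text. The coefficients of the categorical factors (financial status, T stage, education) are listed in Table 3 and are not reproduced here, so they enter through a hypothetical `categorical_term` argument; the function names and the example patient values are illustrative assumptions as well.

```python
import math

# Numeric coefficients quoted in the text, keyed by (cohort, months).
MODELS = {
    ("HNSCC", 3): {"intercept": -32.29, "dmean_c": 0.637, "dmean_i": 0.185, "age": 0.202},
    ("NPC", 3): {"intercept": -44.98, "dmean_c": 0.218, "dmean_i": 0.185, "age": 0.158},
    ("HNSCC", 12): {"intercept": -44.87, "dmean_c": 1.400, "dmean_i": 0.358},
    ("NPC", 12): {"intercept": -45.96, "dmean_c": 0.558, "dmean_i": 0.538},
}


def linear_predictor(cohort, months, dmean_c, dmean_i, age=0.0, categorical_term=0.0):
    """Return S for one patient.

    categorical_term is a hypothetical placeholder for the contribution of
    financial status, T stage or education (coefficient times coded level),
    whose values are given in Table 3 and not in the text.
    """
    m = MODELS[(cohort, months)]
    return (m["intercept"]
            + m["dmean_c"] * dmean_c
            + m["dmean_i"] * dmean_i
            + m.get("age", 0.0) * age  # age only enters the 3-month models
            + categorical_term)


def ntcp(s):
    """Logistic link: NTCP = 1 / (1 + exp(-S))."""
    return 1.0 / (1.0 + math.exp(-s))


# Hypothetical patient: 26 Gy contralateral, 40 Gy ipsilateral, 60 years old.
s_example = linear_predictor("HNSCC", 3, dmean_c=26.0, dmean_i=40.0, age=60.0)
risk_example = ntcp(s_example)
```

Keeping the categorical contribution as a separate argument avoids guessing how those factors were coded.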

The overall performance for both time points of the NTCP model for patient-reported xerostomia in terms of scaled Brier score, Omnibus and Nagelkerke R2 was satisfactory and corresponded well with the expected values (Table 4). The AUC for the HNSCC model was 0.88 and 0.98 for the time points of 3 and 12 months, respectively. For the NPC model, the AUC was 0.87 and 0.96 for the time points of 3 and 12 months, respectively. Finally, the Hosmer-Lemeshow test indicated agreement between predicted risk and observed outcome for both models16 (Table 4). External validation results are shown in Table 5; performance was worse than that of the original models in Table 4.

Table 4 System performance evaluation
Table 5 External validation between HNSCC and NPC cohorts
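The performance measures reported in Tables 4 and 5 can be computed from observed outcomes and predicted probabilities. The sketch below uses their usual textbook definitions; it is not the SPSS procedure used in the study, and the function names and the four-patient toy data are assumptions for illustration. The Hosmer-Lemeshow and Omnibus tests are not covered.

```python
import math


def auc(y_true, p_pred):
    """Area under the ROC curve as the Mann-Whitney statistic (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def scaled_brier(y_true, p_pred):
    """1 - Brier / Brier_max, with Brier_max from predicting the observed incidence."""
    n = len(y_true)
    brier = sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / n
    incidence = sum(y_true) / n
    return 1.0 - brier / (incidence * (1.0 - incidence))


def nagelkerke_r2(y_true, p_pred):
    """Cox-Snell R^2 divided by its maximum attainable value."""
    n = len(y_true)
    incidence = sum(y_true) / n
    ll_model = sum(math.log(p if y == 1 else 1.0 - p) for y, p in zip(y_true, p_pred))
    ll_null = sum(math.log(incidence if y == 1 else 1.0 - incidence) for y in y_true)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    return cox_snell / (1.0 - math.exp(2.0 * ll_null / n))


# Toy data (hypothetical): 1 = moderate-to-severe xerostomia reported.
y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(y, p)
toy_brier = scaled_brier(y, p)
toy_r2 = nagelkerke_r2(y, p)
```

For an external validation, the same functions would be applied to the predictions that one cohort's model produces for the other cohort's patients.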

The parameters for the univariate NTCP regression analysis, shown in Table 6, were calculated by using the Dmean-c and Dmean-i for both cohorts. The long-term tolerance dose, defined as the contralateral parotid mean dose producing a 50% complication rate (TD50), was 25.4 Gy and 40.0 Gy for the HNSCC and NPC cohorts, respectively, 12 months after IMRT.

Table 6 Parameter estimates from the univariate logistic regression NTCP model
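For a univariate logistic model with a single dose variable, TD50 follows directly from the two fitted parameters: NTCP = 0.5 exactly when the linear predictor is zero, so TD50 = −b0/b1. The sketch below illustrates this relation. The coefficients are hypothetical, chosen only so that TD50 equals the 25.4 Gy reported for the HNSCC cohort; the fitted values are those in Table 6.

```python
import math


def univariate_ntcp(dose_gy, b0, b1):
    """Univariate logistic NTCP as a function of mean parotid dose (Gy)."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * dose_gy)))


def td50(b0, b1):
    """Dose at which NTCP = 0.5, obtained from b0 + b1 * D = 0."""
    return -b0 / b1


# Hypothetical coefficients (not taken from Table 6).
B0, B1 = -2.54, 0.1
dose_50 = td50(B0, B1)
risk_at_td50 = univariate_ntcp(dose_50, B0, B1)
```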

Discussion

Xerostomia is one of the most important side effects of high-dose radiotherapy for HNSCC and NPC16,17,18. Currently, the prediction of this side effect is generally based on the parotid gland mean doses only. However, this parameter lacks sensitivity and specificity for estimating patient-specific treatment outcome correctly. To increase the predictive performance, additional parameters are required1,8,9,17. Therefore, this study combined clinical data and treatment parameters to develop a predictive multicomponent model for xerostomia.

The predictive models were built with LASSO for two reasons. First, it selects models with the smallest number of factors while preserving predictive value, with higher AUC performance than in the previous study2. This is useful for clinical practice in consideration of time and cost efficiency. Second, the technique includes factors based on predictive value as opposed to statistical significance after correlation analysis. This is an important feature since univariate correlation analyses have to be followed by correction for multiple testing, which holds the risk of eliminating true positive results15.

Early NTCP models, like the LKB19 and the univariate logistic regression model20, are based on information derived from DVHs generated from dose distributions in the target volumes and the surrounding organs at risk (OARs). For example, the mean dose received by the parotid glands is the only predictive factor for xerostomia in univariate models. Recently, we reported the results of a prospective study that was conducted to develop a univariate NTCP model for patient-reported moderate-to-severe xerostomia among HNC patients treated with IMRT2. The AUC values for the model were 0.68 (95% CI 0.61–0.74) and 0.72 (95% CI 0.64–0.80) for the 3- and 12-month time points, respectively. The only predictive factor was the mean dose to the parotid glands. For the HNSCC patients in this study, when the number of predictive factors used was increased to three, the AUC values improved from 0.68 to 0.88 for the 3-month time point and from 0.72 to 0.98 for the 12-month time point. For the NPC patients, the AUC values improved from 0.68 to 0.87 for the 3-month time point and from 0.72 to 0.96 for the 12-month time point. The multivariable approach allowed the integration of different predictive factors in estimating the risk of xerostomia at 3 and 12 months in individual patients. All AUC values were high (≥0.85). Dmean-c and Dmean-i were selected for both HNSCC and NPC patients, with different corresponding coefficients for the individual models. In parallel, published data demonstrate that the best-studied predictive parameter with a high level of association with xerostomia is the mean dose to the contralateral parotid gland (Gy)8.

In this multivariable model study, Dmean-c and Dmean-i to the parotid glands were the principal factors associated with xerostomia; however, age, T stage, financial status and education were also selected. The result is similar to the previous study10. We likewise found that elderly patients have a higher probability of suffering from xerostomia than younger patients. Beetz et al. stated that older patients are more likely to use medication and to have co-morbidities that may influence and reduce saliva production at rest8,10,16. In this study, those who had a higher financial status or a higher level of education tended to avoid the inconvenience of xerostomia. Similarly, Ramsey et al. showed that lower financial status in colorectal cancer patients was associated with a worse outcome for reported pain21. Fang et al. found that NPC survivors with a higher annual family income and level of education presented a significantly better outcome on QoL scores22. These findings suggest that the patient's individual abilities and the resources available to cope with the threat of treatment complications are powerful variables that affect their future quality of life. Financial status and education remain two of the most significant variables correlated with patient-reported xerostomia in this study10. The risk of complications thus appears to depend on more factors than the dose to a single organ. Clinical datasets on normal tissue complications often include a large number of variables, many of which need to be investigated and incorporated into a model because they are possibly related to a given complication. As reported by El Naqa et al., the prediction of endpoints can be improved by mixing clinical and dose-volume factors, while bootstrap-based variable selection analysis increases the reliability of the predictive models17.
Indeed, our results showed better performance of the multivariable model compared with the univariate relationships between dose-volume prognostic factors and XER3m or XER12m (patient-reported moderate-to-severe xerostomia at 3 or 12 months after treatment). In this regard, it should be stressed that dose-effect relationships for this endpoint should be described by multiple NTCP curves rather than by one single NTCP curve. However, whether the gain is worth the increased complexity needs further investigation. Increased complexity is a potential limitation of this study, since a large number of selected predictive factors may make the models unstable.

In the HNSCC cohort, 32.9% and 29.3% of patients reported moderate-to-severe xerostomia at 3 and 12 months after treatment, respectively. In the NPC cohort, the corresponding rates were 60.2% and 40.9%. The prescribed doses generally exceeded 70 Gy in most NPC patients, for whom a higher dose was used because of the curative aim of treatment. These doses might have led to the higher incidence of xerostomia in NPC patients than in HNSCC patients. The recovery rate, however, showed the opposite pattern: more NPC patients than HNSCC patients recovered. This implies that sparing both parotid glands may give a better recovery rate than sparing one gland when the QUANTEC guideline is met. Whether this response of the irradiated glands is real needs further investigation.

From an anatomical point of view, the dose distributions in the relevant organs at risk for NPC patients, in particular in the parotid glands, differ from those obtained for HNSCC patients. The question arises as to whether predictive models developed among patients with HNSCC are also valid among NPC patients and vice versa. In the external validation, when the HNSCC dataset was input to the NPC NTCP model, overall model performance was markedly lower in terms of the AUC, scaled Brier score and Nagelkerke R2. Likewise, when the NPC dataset was input to the HNSCC NTCP model, performance was worse than that of the original models, indicating that the performance observed in the HNSCC and NPC cohorts cannot be explained well by each other's models. We therefore recommend that predictive models developed in an HNSCC cohort not be generalized to an NPC cohort without external validation, and vice versa. A similar conclusion was reported by Beetz et al.1, who showed that NTCP models developed for 3D-CRT could not be generalized and extrapolated to a cohort treated with IMRT.

Because Dmean-c and Dmean-i were the two most significant dosimetric predictors in all four models, univariate NTCP regression models based on Dmean-c or Dmean-i alone were also considered for convenience. To our knowledge, no univariate NTCP models have been presented for Dmean-c and Dmean-i. For the univariate NTCP analysis, the TD50 for Dmean-c (50% cutoff point) was 25.4 Gy and 40.0 Gy for the HNSCC and NPC cohorts, respectively. These results are similar to those reported for the mean dose to the parotid glands by Miah et al.23 (26.3 Gy for HNSCC) and Kam et al.24 (42 Gy for NPC). They make it clearer that a difference exists between the two cohorts, the reason for which needs to be investigated separately.

Prediction of patient-reported moderate-to-severe xerostomia for HNSCC and NPC patients can be improved by using multivariable logistic regression models with the LASSO technique. On the basis of the associations identified, it is possible to offer an efficient set of predictive factors to limit the risk of xerostomia for HNSCC and NPC patients treated with IMRT. The predictive factors included in the models are useful to further optimize current IMRT treatment with regard to patient-reported xerostomia and to indicate which predictive factors are the most important to spare as much as possible. Moreover, the predictive model developed in HNSCC cannot be generalized to an NPC cohort treated with IMRT without validation, and vice versa.

The possibility that chemotherapy, a non-dosimetric patient factor, may affect the risk of moderate-to-severe xerostomia is an issue of special concern10. Moiseenko et al.25 and Deasy et al.26 reported that the use of chemotherapy was not typically related to xerostomia. This is consistent with our results, as chemotherapy was not significant among the candidate predictive factors used in this study and there was no association between chemotherapy and the risk of patient-reported moderate-to-severe xerostomia. Nevertheless, specific chemotherapy regimens may be a factor for xerostomia. This study screened for general factors and was not designed to analyze the effect of individual regimens, which may be studied further in the future.

There are a number of potential weaknesses of this study. Treatment methods may differ among nations and institutions. Differences in radiation modality may produce different kinds and different levels of xerostomia. The model was established in a relatively homogeneous population (patients in one hospital), and it will be useful to determine whether the findings hold for other patients. The major weakness of this study is the lack of examination of the dose to other structures, including the submandibular glands and the oral cavity. The risk of xerostomia may be influenced by the treatment technique or by co-irradiation of other organs, which needs further investigation.

Methods

Study population

QoL questionnaire datasets from 152 patients with HNSCC and 84 patients with NPC were analyzed. All participants were treated with IMRT at the Kaohsiung Chang Gung Memorial Hospital between September 2007 and May 2011. The QLQ-H&N35 and QLQ-C30 questionnaires were used for the endpoint evaluation. The characteristics of the patients with HNSCC and NPC are listed in Table 1. Missing values were imputed by applying the stochastic expectation maximization (EM) algorithm27. This study was approved by the Chang Gung Medical Foundation Institutional Review Board (99-1420B, 96-1231B), all participants gave written informed consent, and all experiments were performed in accordance with relevant guidelines and regulations.

IMRT techniques

All patients were treated with IMRT as described in detail in previous publications2. For the IMRT planning goal, the mean dose to each parotid gland should be kept as low as possible, consistent with the desired clinical target volume coverage. The IMRT technique reduces the mean parotid dose, reducing xerostomia, as assessed by the Radiation Therapy Oncology Group (RTOG) xerostomia-related questionnaire score28. Sparing at least one parotid gland appears to eliminate complications25. Dose distributions were calculated and dose-volume histograms (DVHs) were generated separately for each parotid gland, enabling separate analysis. Two IMRT techniques were used: simultaneous integrated boost (SIB) and sequential mode (SQM). The prescribed total dose ranged from 54.0 to 77.4 Gy (median, 70.0 Gy). Details about the prescribed dose and fractions for the SIB and SQM techniques can be found in previous studies29,30.

Chemotherapy

Ninety-four HNSCC patients and seventy-five NPC patients received concurrent chemotherapy for XER3m. The regimens were a weekly cisplatin (CDDP) regimen, a PF regimen (cisplatin + fluorouracil) for 2–6 courses, or modified regimens chosen by the medical oncologist according to the patient's disease status.

QoL evaluation

A prospective survey of QoL using the European Organization for Research and Treatment of Cancer (EORTC) C30 and H&N35 QoL questionnaires (QLQ-C30 and QLQ-H&N35) was performed on 152 patients with HNSCC and 84 patients with NPC. Details about the QoL evaluation can be found in previous studies2,10. The patients were asked to complete the questionnaire prior to treatment and 3 months, 6 months, 1 year and 2 years after IMRT. For the purposes of this analysis, the 3-month and 12-month follow-up time points were used. Chinese versions of the EORTC QLQ-C30 and QLQ-H&N35 questionnaires were obtained from the Quality of Life Unit, EORTC Data Center, Brussels, Belgium2,31. For each item on the EORTC QLQ-C30 and QLQ-H&N35 questionnaires, the following four-point Likert scale was used: none (0), a little (33), quite a lot (66) and a lot (100). All QoL scores are given in the text. A high score on the functional or global QoL scale represents a relatively high/healthy level of functioning or global QoL, whereas a high score on the symptom scale represents the presence of a symptom or problem. The EORTC QLQ-H&N35 questionnaire was used to evaluate the analytical endpoint for xerostomia, and only the dry mouth item was used for this study. The primary endpoint was defined as moderate (66) to severe (100) xerostomia at 3 (XER3m) and 12 months (XER12m) after the completion of IMRT; this corresponds to the two highest scores on the four-point Likert scale. As we were primarily interested in moderate-to-severe xerostomia induced by RT itself, patients with moderate-to-severe xerostomia at baseline were excluded from further analysis1,8,10,16,22.
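The endpoint definition and the baseline exclusion described above can be sketched in code. This is a minimal illustration, assuming each patient is represented by a dictionary holding the dry mouth item score on the 0/33/66/100 scale at baseline and at follow-up; the field names and the three example patients are hypothetical.

```python
MODERATE_SCORE = 66  # "quite a lot"; a score of 100 ("a lot") counts as severe


def is_moderate_to_severe(score):
    """True for the two highest levels of the four-point scale."""
    return score >= MODERATE_SCORE


def build_endpoint(patients):
    """Return {patient id: 0 or 1} for one follow-up time point.

    Patients who already report moderate-to-severe xerostomia at baseline
    are excluded, so that the endpoint reflects toxicity induced by RT.
    """
    endpoint = {}
    for patient in patients:
        if is_moderate_to_severe(patient["baseline"]):
            continue  # excluded from further analysis
        endpoint[patient["id"]] = int(is_moderate_to_severe(patient["followup"]))
    return endpoint


# Hypothetical example records.
patients = [
    {"id": "A", "baseline": 0, "followup": 66},
    {"id": "B", "baseline": 66, "followup": 100},
    {"id": "C", "baseline": 33, "followup": 33},
]
xer3m = build_endpoint(patients)
```

In this example, patient B is dropped because of the baseline score, patient A counts as an event and patient C as a non-event.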

Statistical analysis

We aimed to develop a multivariable logistic regression NTCP model with LASSO to make valid predictions about the risk of moderate-to-severe patient-reported xerostomia using QoL datasets. The multivariable logistic regression analysis, with an extended bootstrapping technique, was used as described by El Naqa et al.17 and Beetz et al.1,8,16.

For each patient, predictive values were calculated for each set of predictive factors based on the multivariable logistic regression coefficients according to the following formula:

$$\mathrm{NTCP} = \frac{1}{1 + e^{-S}}, \qquad S = \beta_0 + \sum_{i=1}^{n} \beta_i x_i$$

in which n is the number of predictive factors in the built model; the variables x_i represent the different predictive factors; β_i are the corresponding regression coefficients; and β_0 is the intercept.

For each HNSCC patient, 17 candidate predictive factors were initially included in the variable selection procedure. The candidates included 15 clinical and two dosimetric factors. For each NPC patient, 15 candidate predictive factors were initially included in the variable selection procedure. The candidates included 13 clinical and two dosimetric factors. The dosimetric candidate factors were the mean dose given to the contralateral parotid gland (Dmean-c) and the ipsilateral parotid gland (Dmean-i) (Gy). We excluded Vx values, which were previously found to be highly correlated with each other10,16; Dmean-c and Dmean-i were the only two DVH parameters in this study. We used the LASSO process to select the optimal number of potential predictive factors for the NTCP predictive model. The LASSO was first proposed by Tibshirani in 199632; the details can be found in previous studies10,12,13. It uses the following equation to shrink the coefficients and select the predictive factors:

where d is the number of variables selected and t is the tuning parameter that controls the degree of penalty, which can be determined by cross-validation. Details can be found in previous studies10,12,33. However, in order to generalize the use of the models, a compact model can be generated by manually setting the value of t (which acts as a penalty). In this study, the goal was achieved when the number of selected predictive factors was no more than three and the AUC was ≥ 0.85. After selecting the predictive factors, the system performance can be checked by using the AUC, scaled Brier score, Nagelkerke R2, Omnibus and Hosmer-Lemeshow test1,2,8,16.
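The role of d and t described above can be written out explicitly. The block below is a sketch only, assuming the constrained form of the LASSO introduced by Tibshirani32 applied to the logistic log-likelihood; the notation used by the authors may differ, and the symbols N (number of patients), y_k (observed outcome of patient k) and p_k (predicted NTCP of patient k) are introduced here for illustration.

```latex
\hat{\beta} = \arg\min_{\beta}
  \left\{ -\sum_{k=1}^{N} \bigl[\, y_k \ln p_k + (1 - y_k) \ln (1 - p_k) \,\bigr] \right\}
\quad \text{subject to} \quad \sum_{j=1}^{d} \lvert \beta_j \rvert \le t,
\qquad p_k = \frac{1}{1 + e^{-S_k}}
```

In this form, a smaller t shrinks more coefficients exactly to zero, which is what allows a compact model with few predictive factors to be obtained by setting t manually.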

External validation was performed to answer the question of whether a predictive model developed among HNSCC patients is also valid among NPC patients treated with IMRT, and vice versa. System performance was checked by the same methods as above.

Univariate models based on the mean dose to the contralateral or the ipsilateral parotid gland alone, following the traditional approach, were also considered for convenience. The parameters for the univariate NTCP regression model are shown in Table 6. Statistical analyses were performed using SPSS 19.0 (SPSS, Chicago, IL, USA).