Modelling prognostic factors in advanced pancreatic cancer

Pancreatic cancer is the fifth most common cause of cancer death. Identification of defined patient groups based on a prognostic index may improve the prediction of survival and selection of therapy. Many prognostic factors have been identified often based on retrospective, underpowered studies with unclear analyses. Data from 653 patients were analysed. Continuous variables are often simplified assuming a linear relationship with log hazard or introducing a step function (dichotomising). Misspecification may lead to inappropriate conclusions but has not been previously investigated in pancreatic cancer studies. Models based on standard assumptions were compared with a novel approach using nonlinear fractional polynomial (FP) transformations. The model based on FP-transformed covariates was most appropriate and confirmed five previously reported prognostic factors: albumin, CA19-9, alkaline phosphatase, LDH and metastases, and identified three additional factors not previously reported: WBC, AST and BUN. The effects of CA19-9, alkaline phosphatase, AST and BUN may go unrecognised due to simplistic assumptions made in statistical modelling. We advocate a multivariable approach that uses information contained within continuous variables appropriately. The functional form of the relationship between continuous covariates and survival should always be assessed. Our model should aid individual patient risk stratification and the design and analysis of future trials in pancreatic cancer.

Pancreatic ductal adenocarcinoma is a common cause of cancer death and is difficult to treat because clinical presentation is often late, and the disease is resistant to conventional chemotherapy. Long-term survival remains poor with a 5-year survival rate of 0.4 -4% (Bramhall et al, 1995;Jemal et al, 2003). Multivariable prognostic models are important for grouping patients into risk sets for predicting survival and treating appropriately. There is currently no prognostic tool in routine use to identify subgroups of pancreatic cancer patients for selection and stratification of treatment and prediction of survival.
An important issue in prognostic factor studies is the nature of the relationship between the factor and survival (functional form). Continuous variables are often simplified at analysis by assuming a linear relationship with log-hazard or by introducing a step function through categorisation (frequently dichotomisation). If the linearity assumption is not correct, the final prognostic model could be misspecified. Misspecification of the functional form may lead to inappropriate conclusions but has not been previously investigated in pancreatic cancer studies. Many researchers avoid this problem by dichotomising, with a consequent loss of power. There is also the risk of important bias when the choice of cutoff is data-driven and the use of different cutoff points across multiple studies hinders direct comparisons.
The aim of this study was to evaluate potentially important baseline prognostic factors for survival in advanced pancreatic cancer using prospective data from two randomised controlled trials and a total of 653 patients (Bramhall et al, 2001(Bramhall et al, , 2002. The study investigated clinical, histological, biochemical and demographic variables. A multivariable approach was used accounting for the functional form of the relationship between continuous factors and survival. Models were developed either on the basis of standard assumptions of log linear or step functional relationships with survival, or a novel approach based on nonlinear relationships, using more complex fractional polynomial (FP) transformations: a flexible, parametric method for modelling nonlinear (1) Treatment (2) Laboratory CA242 (2) CA19-9 (2) Leukocytes (1) Gamma GT (1) Albumin (1) LDH (1) CRP (3) Iron (1) relationships (Royston and Altman, 1994;Altman and Lyman, 1998).

Data
Two international phase III British Biotech studies (BB128, Bramhall et al, 2001;BB193, Bramhall et al, 2002) randomised 414 and 239 patients with advanced pancreatic cancer, respectively: BB128 randomised patients between marimistat and gemcitabine; BB193 randomised patients between marimistat with gemcitabine and gemcitabine alone. The studies had similar eligibility criteria: histologically or cytologically unresectable pancreatic cancer, within 8 weeks of diagnosis or disease recurrence and Karnofsky performance status of X50% (BB128) or X60% (BB193). Previous therapy for metastatic or locally advanced disease was an exclusion criterion. The primary outcome measure in both studies was survival time calculated from the date of randomisation to the date of death from any cause. Randomisation was stratified by cancer stage (stage I/II, III or IV), Karnofsky performance status (50 -70%, 80 -100%), sex and study centre. The first stage of data reduction was considering only factors that were clinically relevant and available within an NHS outpatient clinic. Eighteen baseline clinical, histological, biochemical and demographic variables (including trial and randomised treatment group) were considered appropriate for analysis as possible prognostic factors (Table 2).

Statistical analysis
We followed a strategy aimed at maximising model performance and avoiding poorly fitted and overfitted regression models in the development (Harrell et al, 1996) and reporting (McShane et al, 2005) of multivariable prognostic models. Initial analysis was based on standard methodology comparing Kaplan -Meier survival estimates using the log-rank test and estimating univariate hazard ratios for levels of each factor. The hazard of death was assessed in the multivariable setting using Cox proportional hazards regression modelling with variable reduction by backward elimination. The proportional hazards assumption was investigated for each covariate using log cumulative hazard (Collett, 1994) and martingale residual plots, and incorporating a time-dependent covariate (X ¼ factor(LN (survival)ÀLN (mean survival))) and did not indicate any significant violation. Ten of the 18 possible prognostic factors were collected as continuous measurements. Continuous data were investigated by assessing three different assumptions of the underlying relationship between survival and predictor, they are: (a) a linear relationship between the predictor and log hazard, (b) a step functional relationship using dichotomised covariates (laboratory measures based on central laboratory reference ranges) and (c) a nonlinear relationships based on either a simple log or more complex nonlinear FP transformation (Royston and Altman, 1994). For the third model, the functional form of each variable was assessed univariately comparing the Akaike's Information Criterion (AIC) (Collett, 1994) of a model based on the simple log transformation with the AIC of a model based on the best fitting FP transformation. First-and second-degree FP transformations (Meier-Hirmer et al, 2003) were considered using a selection level of 0.05 for input of variables based on power values of the polynomial ranging (À2, À1, À0.5, 0 (log), 0.5, 1, 2, 3). The best FP for each predictor was selected if it resulted in a significantly better fit (significantly smaller AIC) than the log transformation. The most appropriate (log or FP) transformation, if any, was applied to each variable and all variables were considered multivariately using Cox proportional hazards regression based on a backward selection method using a nominal significance level of 0.05 for elimination and including trial, sex, cancer stage (stratification factors at randomisation) and randomised treatment in each model.
The majority of variables had p5% missing values (Table 2). Tumour stage, CA19-9 and WBC had 5 -10% missing values, and lymph node status was missing for 24% of patients. Metastases or lymph node status was considered in the analysis as dummy variables using 'negative' as a reference level. Primary analysis was based on complete cases and a secondary analysis used multiple imputation to investigate the possible influence of variables with larger amounts of missing data (Rubin, 1987;Schafer, 1997) and provided valid inferential alternative results.
Model fit was assessed comparing AIC statistics, deviance residuals and Kaplan -Meier survival statistics for four predictive groups. The four predictive groups were based on quartiles of linear predictor scores, assessed comparing median survival estimates and hazard ratios. The bootstrap resampling approach described by Harrell et al (1996) was applied to assess the extent of overfitting in the final model, using 200 bootstrap resamples. This approach repeats the model selection methods used in the original model development in a series of bootstrapped resamples, freezing the derived model and applying to the original sample. Model optimism (overfitting) is described by the difference in the rank correlation coefficient relating predicted and observed survival times between the model derived in the bootstrap resample and that from the frozen model applied to the original sample averaged over 200 resamples. This provides an honest estimate of internal validity penalised for overfitting (Harrell et al, 1996).
Analyses were carried out using SAS and R using a two-sided significance level of 0.05 throughout.

Patient characteristics
A total of 653 patients were randomised. The eighteen clinically appropriate factors for analysis are presented in Table 2 and appear balanced across the two studies. On average, patients in the two trials were randomised 20 and 15 days after diagnosis and started treatment the day following randomisation. The average age of patients was 63 years (range 29 -89), 368 (56%) were male, 439 (68%) had cancer stage IV disease, 436 (67%) presenting with metastases and 251 (39%) had lymph node involvement.

Survival
The majority of patients (612, 94%) had died by the time of analysis with a median follow-up time of 21 months for the 41 patients still alive (Table 2, Figure 1). The median survival estimate for the group is 4.7 months (95% CI: 4.2, 5.1) with 12-month survival estimate of 17% ( Figure 1). Hazard functions estimated for 1-monthly time intervals to 18 months from trial entry were similar for both trials and reasonably constant over time. No significant survival benefit for marimastat was identified in the BB128 trial (P ¼ 0.19) when compared with gemcitabine (Bramhall et al, 2001). Similarly, no significant survival benefit was seen for a combination of gemcitabine and marimistat when compared with gemcitabine alone in the BB193 trial (P ¼ 0.95) (Bramhall et al, 2002).

Multivariate analyses
Three Cox proportional hazards regression models were developed (Table 4) using 556 patients (520 deaths) with complete data (excluding patients with missing data) based on the assumption of (a) a linear relationship between continuous covariates and log hazard, (b) a step functional dichotomisation of continuous covariates and (c) a nonlinear transformation of continuous covariates. All three models included trial, sex, cancer stage (stratification factors at randomisation) and randomised treatment group.
Nonlinear transformations were appropriate for two variables, LDH and CA19-9, and the estimated log hazard ratio functions are shown graphically in Figures 2 and 3. The second-degree FP function for CA19-9 estimates increasing risk up to an approximate CA19-9 value of 14 000 and then decreases with increasing CA19-9. The log function for LDH estimates increasing risk for increasing values of LDH.

Model comparison
In all three models, albumin, LDH and WBC were highly statistically significant and influential prognostic factors. Metastases were also an important variable but its parameter estimate and overall significance were reduced in the 'transformed' model when continuous covariates were included in a more appropriate format. In both the 'linear' and 'transformed' models, alkaline phosphatase was also a highly significant and influential prognostic factor. CA19-9 was also a highly significant and influential prognostic factor in both the 'categorical' and 'transformed' models. This variation is largely explained by the nonlinear relation of CA19-9 to survival (Figure 2), which could explain why it was considered important when dichotomised but not when included as linear. When considered as a transformed seconddegree FP, its significance was much greater. Bilirubin was selected as a highly significant factor in the 'categorical' model but was not included in either the 'linear' or 'transformed' models. AST and BUN were only selected as prognostic in the 'transformed' model.

Model performance
The AIC statistic was smallest for the 'transformed' model (Table 4, Model 3), indicating a better fit to the data. Deviance residuals for this model were plotted against the linear predictor and were randomly scattered centred around a residual value of zero ranging between À3.86 and 3.33, which suggests the data have not been mis-modelled.
When assessing model validity, the R 2 measure of model fit was estimated as 0.30. The bootstrap resampled estimate of R 2 of 0.26 described model optimism (overfitting) under 5% and gives an improved estimate of model accuracy.
Multiple imputation allowed all 653 patients to be included in the modelling process and confirmed all the variables included in the 'transformed' model with increased significance for metastases (P ¼ 0.001), and the model also included nodal status (P ¼ 0.016) that had been excluded from all models prior to imputation, suggesting a strong link with other variables already in the model.

DISCUSSION
Large, prospective, phase III randomised controlled trials aim to provide robust statistical evidence for new treatment combinations. Stratification is important to control for known important variability in the data. Generally, patients with pancreatic cancer are not clinically separated into prognostic groups, with the exception of surgical status, before consideration for treatment. This study investigated potentially important baseline prognostic factors for survival as possible stratification variables for randomisation and analysis. Data from 653 patients included in two international randomised controlled trials in advanced pancreatic cancer (Bramhall et al, 2001(Bramhall et al, , 2002 were analysed investigating multiple clinical, histological, biochemical and demographic variables in the form of both binary and continuous measurements. Valid statistical analyses are necessary to make best use of the data and optimise clinical results. As such, a multivariable approach was used to account for the functional form of the relationship between continuous prognostic variable factors and survival. Misspecification of functional form may lead to inappropriate conclusions but has not been previously investigated in pancreatic cancer studies. Continuous variables are often simplified by assuming a linear relationship between predictor and log hazard, that is the log risk increases or decreases   linearly as the value of the factor increases, which may not be appropriate. Dichotomisation of continuous data is common but is problematic and unnecessary. As the variability in outcome within groups is ignored by categorisation, the variability between groups may be significantly underestimated as patients close to the cut point are analysed as being very different rather than being very    similar, resulting in a serious reduction of statistical power to detect relationships between predictors and outcome, residual confounding and serious bias (Altman and Royston, 2006;Royston et al, 2006a). Regression using FPs of continuous covariates has been used in data from breast cancer (Sauerbrei et al, 1999) and metastatic renal carcinoma (Royston et al, 2006b) trials. Our study supported these in showing that this approach allowed important additional prognostic information to be extracted with less sophisticated approaches missed. FPs provide a flexible, parametric approach for modelling nonlinear relationships, making full use of the information available within each variable and as such can provide a clearer insight into the nature of the underlying relationship (Royston and Altman, 1994;Altman and Lyman, 1998).
We developed three prognostic models, two based on standard assumptions of log linear or step functional relationships with survival and a novel approach based on nonlinear relationships using more complex FP transformations. The model based on transformed covariates (Table 4, Model 3) was the best-fitting model, better utilising the information within nonlinear covariates. This model confirmed five previously reported prognostic factors, namely albumin, CA19-9, alkaline phosphatase, LDH and metastases; and also identified three additional possible prognostic factors not previously reported: WBC, AST and BUN. Nonlinear transformations were appropriate for two variables indicating strong nonlinear effects on survival: CA19-9 as a second-degree FP and LDH under a log transformation. Importantly, the effect of CA19-9 was not apparent in the 'linear' model, the effect of alkaline phosphatase was not apparent in the 'categorical' model and the effects of AST and BUN were not apparent in either the 'linear' or 'categorical' models, indicating how the significant effect of these variables may go unrecognised due to simplistic assumptions made in statistical modelling.
Shrinkage represents the degree to which a plot of predicted and observed values is flattened from the 451 line attributable to overfitting. Overfitting leads to inflated estimates of model fit and is a potentially important source of bias in prognostic models. Overfitting may be minimised through sensible model selection, which for survival models implies avoiding attempting to fit models with more than 1 candidate variable (degree of freedom) for each 10 events of interest (e.g., death) included in the analysis. The degree to which overfitting is present in the fitted model may be estimated either directly through validation in an external data set or through a bootstrap process. In practise, it is rare for an external data set to be available, and if data are scarce, it becomes attractive to use all available data to derive the prognostic model. Thus, bootstrap resampling approaches may become the model validation methods of choice. In our model, bootstrap resampling methods (Harrell et al, 1996) suggested minimal optimism. As shrinking estimators will not increase the real discrimination of the model, and the degree of overfitting estimated for the model is minimal, rescaling the model estimates appears neither helpful nor necessary.
A model based on multiple imputation methods (Rubin, 1987;Schafer, 1997) to control for missing covariate data selected an additional variable nodal status as prognostic (P ¼ 0.016), which had been excluded from all models prior to imputation. The true importance of this variable requires further investigation, suggesting a strong link with other variables already in the model. All prognostic models ideally require external validation to determine the generality across different data sets, and our results may be seen as provisional until replicated on independent data. Performance status and tumour size at randomisation are well-documented factors (Table 1) but unfortunately were not available in this data set and should be included in any external validation.
This research was based on data from two large, phase III randomised controlled trials representative of patients with advanced pancreatic cancer with a high event rate, long followup and an overall 1 year survival rate of 17% (Bramhall et al, 2001(Bramhall et al, , 2002. Analyses were based on a multivariable approach and utilised the information contained within continuous variables appropriately. The functional form of the relationship between continuous covariates and survival should always be assessed when investigating potential prognostic value. Models were based on information readily available in clinic and once validated should have the ability to aid decision-making by identifying patients with borderline disease for surgery and patients for inclusion into clinical trials or off-study treatment, especially since