Prognostic value of performance status assessed by patients themselves, nurses, and oncologists in advanced non-small cell lung cancer

Accuracy in the assessment of performance status by oncologists has not been well evaluated. We investigated possible discrepancies in the assessment of performance status among patients, nurses, and oncologists, and evaluated the prognostic importance of each assessment. Two hundred and six inpatients with inoperable, advanced non-small cell lung cancer were investigated prospectively. Weighted Kappa statistics for inter-observer agreement were 0.53 between oncologists and patients and 0.63 between oncologists and nurses. There was a significant difference among the assessments by the three groups (P < 0.001). Oncologists gave the healthiest performance status assessment, nurses an intermediate assessment, and patients the poorest. When included separately in the Cox model, the assessment by each group was significantly correlated with survival. However, the assessment by the patients themselves failed to distinguish survival of patients with performance status 1 and 2. Among the three models including patient-, nurse-, and oncologist-assessed PS, that including oncologist-assessed PS best fitted to the observed survival data. These results showed that the assessment by the patients themselves is different from those by the nurses and the oncologists and provided additional support for the use of the assessment by oncologists in clinical oncology. © 2001 Cancer Research Campaign http://www.bjcancer.com

Conflicting results have been reported by several studies on interobserver agreement in grading PS between the patients themselves and the oncologists. A previous study, for example, suggested that the PS assessment by the physicians specialized in clinical oncology was healthier than that by the patients (Loprinzi et al, 1994). However, the result might be biased because the patients had already been assessed by the physicians as chemotherapy-tolerable before participating in simultaneous chemotherapy trials. In contrast, another small pilot study reported that the patients rated themselves as healthier than the oncologists did (Taylor et al, 1999). Since patients are supposed to know their own physical condition better than anyone, the assessment of PS by the patients themselves would be reasonable and might be more reliable for prediction and stratification of prognosis. Indeed, recent studies on quality of life have suggested that the information from patient-completed questionnaires could be a prognostic factor independent of PS assessed by the physicians (Coates et al, 1992; Cella et al, 1997). Thus, if there are non-chance discrepancies in the assessments of PS among patients and external observers, the assessment of PS by health professionals should be rationalized by an advantage for prediction and stratification of patients' survival.
We, therefore, planned to investigate whether there was any systematic difference among the assessment of PS by patients themselves, nurses, and oncologists. To elucidate which of the three assessments would be the most useful prognostic factor, we prospectively examined PS of patients with recently diagnosed non-small cell lung cancer and evaluated the prognostic value of each assessment.

Patients
Between January 1995 and November 1998, consecutive inpatients with pathologically confirmed non-small cell lung cancer were prospectively accrued at Japanese Red Cross Nagoya First Hospital, a university-affiliated educational hospital with almost 1000 beds. To be eligible for this study, patients had to have stage III or IV disease at the initial staging procedure described below. Of 209 patients who were eligible, 3 declined to enter the study because they were not interested in participating. This study had at least 85% statistical power to detect a 15% difference in the proportions of patients assessed as ECOG PS 0 to 1 (estimated proportions, 55% for patients themselves, 70% for oncologists) at the 5% significance level. This study was approved by the Institutional Review Board, and participants gave written informed consent before they entered the study.
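The power statement above can be checked approximately with a short calculation. The following is a minimal sketch only: it assumes a two-sided normal-approximation test for two independent proportions with 206 assessments per group, whereas the study design is paired and the authors' actual power method is not stated. The function name and its parameters are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    Assumption: the two groups are treated as independent samples of equal
    size, which is a simplification of the paired design in the study.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    # Standard error under the null hypothesis (pooled proportion).
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_group)
    # Standard error under the alternative hypothesis.
    se_alt = sqrt(p1 * (1 - p1) / n_per_group + p2 * (1 - p2) / n_per_group)
    # The negligible contribution of the opposite tail is ignored.
    return z.cdf((abs(p1 - p2) - z_crit * se_null) / se_alt)


# Estimated proportions from the text: 55% (patients) and 70% (oncologists).
power = power_two_proportions(0.55, 0.70, n_per_group=206)
print(f"approximate power: {power:.2f}")
```

Under these assumptions the approximation lies above the 85% stated in the text; a calculation that accounts for pairing could give a different value.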

Performance status assessment
The patients were asked to complete a short questionnaire regarding their own assessment of ECOG PS (Oken et al, 1982) as soon as possible after the diagnosis of lung cancer (Table 1). The questionnaire was delivered to each patient by a ward clerk to be filled out by the patient at the bedside. When the patient was unable to fill out the questionnaire by himself/herself, the clerk read it aloud and wrote down the patient's answer given orally. The clerk did not provide any specific implication of the terms in the questionnaire that might cause bias in the final interpretation of each PS description. Simultaneously, the attending oncologist and the nurse working on the ward assessed the patient's ECOG PS in separate rooms of the ward, each completing a questionnaire that included the same description of PS as shown in Table 1. The completed questionnaires were sealed in envelopes by each respondent before being turned in to the responsible investigator of this study (MA). Thus, each observer was blinded to the ratings of the others. The nurses and oncologists, who were dedicated to the care of patients with lung cancer, received no special training for participation in this study because they were familiar with the assessment of PS in daily oncology practice. They were asked not to interview the patients solely for assessment of PS, but to use the information they had obtained during the patient's care. All the oncologists were members of the Japan Lung Cancer Society.

Staging procedures
Patients were staged according to the TNM system (Mountain, 1997). Patients underwent the following procedures on the initial presentation: medical history, physical examination, histologic or cytologic confirmation of non-small cell lung cancer, complete blood cell counts with differential, serum chemistry, chest radiograph and computed tomography (CT), abdominal CT and/or ultrasonography, brain CT, and radionuclide bone scan. Bone radiographs were used to confirm metastatic lesions suspected by bone scan.

Demographic and clinical factors
Besides PS assessments by patients, nurses, and oncologists, the following potential prognostic factors were evaluated: age, gender, histology, metastatic organs (brain, liver, bone, lung, and lymph node), disease stage, T factor, N factor, weight loss within 6 months, leukocyte count, neutrophil count, platelet count, haemoglobin, serum albumin, serum lactate dehydrogenase (LDH), and the treatment modality (chemoradiotherapy, chemotherapy, radiotherapy, or supportive care alone).

Statistical methods
The proportions of patients assessed as fully ambulatory (PS 0 or 1) by the patients themselves and the oncologists were compared by chi-square test. The inter-observer agreement in the assessment of PS was evaluated using simple and weighted Kappa statistics (Cohen, 1960, 1968). In order to identify the predictor of disagreement in the PS assessment between the patients themselves and the oncologists, a multivariate logistic regression analysis was performed (Breslow and Day, 1980). In this analysis, the following variables were investigated for association with the proportion of patients whose assessment agreed with that of the oncologists: age, gender, weight loss, disease stage, treatment modality, and attending oncologist. Friedman test and Mann-Whitney U-test were used to compare the assessments of PS by the three groups and to compare each of the three pairs of assessments, respectively. Because elderly patients are generally excluded from clinical trials of cancer treatment, we examined whether the results of these comparisons remained unchanged when patients over 75 years old were excluded from the study population. Survival was calculated from the day of pathological diagnosis until death. Patients alive were censored as of last known follow-up. Survival curves were calculated using the Kaplan-Meier method and compared using the log-rank test. Prognostic importance of the factors including PS assessed by the three groups was analyzed using the Cox regression model (Cox, 1972). Forward stepwise procedures were used to select factors that were included in the final model. A -2 log likelihood value for fitting a model with all the explanatory variables was calculated for individual Cox models which included patient-, nurse-, and oncologist-assessed PS separately, and was used to compare the performance of these three models. The likelihood ratio test was used to examine whether the fit of the model including oncologist-assessed PS, which proved to be the best performing model in the present study, could be further improved by additional inclusion of the PS scores by the patients or the nurses. Statistical analyses were performed with SAS ver. 6.12 software (SAS Institute Inc., Cary, NC).

Table 1 Patient questionnaire for the determination of their ECOG PS
Please circle the number of the phrase which characterizes you best at this time, according to your own judgement. Please do not ask advice from any others.
0 Fully active, able to carry on all pre-disease performance without restriction
1 Restricted in physically strenuous activity but ambulatory and able to carry out work of a light or sedentary nature, e.g., light house work, office work
2 Ambulatory and capable of all self-care but unable to carry out any work activities. Up and about more than 50% of waking hours
3 Capable of only limited self-care, confined to bed or chair more than 50% of waking hours
4 Completely disabled. Cannot carry on any self-care. Totally confined to bed or chair
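The simple and weighted Kappa statistics named in the statistical methods can be illustrated with a short sketch. It assumes the common formulation in terms of disagreement weights; the text does not state whether linear or quadratic weights were used, so both are offered as options. The function name and the example table are hypothetical and are not study data.

```python
def cohen_kappa(table, weights="none"):
    """Cohen's kappa for a square contingency table of counts.

    table[i][j] is the number of patients placed in category i by one
    rater and in category j by the other.
    weights: "none" (simple kappa), "linear", or "quadratic"; these are
    disagreement weights, so identical ratings always have weight 0.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def w(i, j):
        if weights == "none":
            return 0.0 if i == j else 1.0
        if weights == "linear":
            return float(abs(i - j))
        if weights == "quadratic":
            return float((i - j) ** 2)
        raise ValueError("weights must be 'none', 'linear' or 'quadratic'")

    cells = [(i, j) for i in range(k) for j in range(k)]
    # Weighted disagreement actually observed.
    observed = sum(w(i, j) * table[i][j] for i, j in cells) / n
    # Weighted disagreement expected by chance from the marginal totals.
    expected = sum(w(i, j) * row_tot[i] * col_tot[j] for i, j in cells) / n ** 2
    return 1.0 - observed / expected


# Hypothetical 3x3 table (rows: rater A, columns: rater B); not study data.
example = [[30, 10, 2],
           [8, 25, 9],
           [1, 7, 20]]
print(cohen_kappa(example), cohen_kappa(example, weights="linear"))
```

Weighted Kappa penalises a two-grade disagreement more than a one-grade disagreement, which is why it suits an ordinal scale such as PS 0 to 4.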

RESULTS
Characteristics of the 206 patients who entered the study are presented in Table 2. The age of the subjects was 67 ± 10 (mean ± SD) years. No patient was lost to follow-up, and 16 patients alive were censored. The median follow-up time in the 16 patients was 824 days (range 352-1296). Twelve oncologists and 45 nurses participated in the assessment of PS. The distributions of the assessments by the patients themselves and the health professionals are shown in Tables 3 and 4. The oncologists assessed 71% of patients as ECOG PS 0 or 1, while 59% of the patients rated themselves as ECOG PS 0 or 1 (P = 0.007). The simple and weighted Kappa statistics between the two assessments were 0.29 (95% confidence interval 0.20-0.39) and 0.53 (95% confidence interval 0.45-0.61). The two Kappa values were 0.45 (95% confidence interval 0.36-0.55) and 0.63 (95% confidence interval 0.54-0.71) between the nurses and the oncologists, and 0.38 (95% confidence interval 0.28-0.47) and 0.56 (95% confidence interval 0.48-0.64) between the patients and the nurses, respectively. Thus, the best agreement was observed between nurse- and oncologist-assessed PS irrespective of the Kappa statistics used. In multivariate logistic regression analysis, only gender was significantly correlated with the disagreement in the PS assessment between the patients themselves and the oncologists (P = 0.037); the level of agreement was lower in female patients (37.2%) than in male patients (54.0%). The proportion of patients whose assessment of PS was poorer than that by the oncologists was 51.2% in females and 31.3% in males (P = 0.015).
A systematic difference in the assessments of PS was observed among the three groups (P < 0.001); assessment by oncologists was the healthiest, that by nurses was intermediate, and that by patients themselves was the poorest. Assessments by oncologists and nurses were both significantly healthier than that by patients themselves (P < 0.001 and P = 0.003, respectively). On the other hand, there was no significant difference in the assessment between nurses and oncologists (P = 0.137). The difference among the three assessments remained significant when female and male patients were analysed separately (P < 0.001 and P = 0.004, respectively). The systematic difference remained significant when 23 patients over 75 years old were excluded (P < 0.001), with healthier assessments by oncologists and nurses than those by patients (P < 0.001 and P = 0.005, respectively).
In survival analysis, 4 patients with previous malignancy were excluded. Median survival time of the resultant 202 patients was 221 days. Survival curves according to PS graded by patients and oncologists are shown in Figure 1. In univariate analyses, PS assessments by patients, nurses, and oncologists were all significantly correlated with survival (P < 0.001 in all cases). However, survival curves of patients with PS 1 and 2 by their own assessment were superimposed until 600 days after diagnosis. When included separately in the Cox model, each of the three PS assessments was significantly correlated with survival (Table 5). Gender, weight loss, haemoglobin, disease stage, T factor, bone metastasis, and treatment modality were selected as covariates in every case. In addition, lymph node metastasis was selected as a covariate when nurse- or oncologist-assessed PS was incorporated into the model. Because the selected factors were almost the same across the three models, we used all these factors as covariates for which the adjustment was carried out.
The -2 log likelihood values with all the explanatory variables were 1494.4, 1502.2, and 1488.2 for the Cox models including patient-, nurse-, and oncologist-assessed PS, respectively. Since the degrees of freedom were the same among the three models, these results indicated that the Cox model including oncologist-assessed PS best fitted to the observed survival data.
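The comparison above can be restated in a few lines. The sketch below uses the three reported -2 log likelihood values and, as an illustration not taken from the study, expresses the ranking through the Akaike information criterion (AIC = -2 log L + 2k). Because all three models have the same number of parameters k, the penalty term is identical and the ranking depends only on -2 log L. The value passed as `n_params` is a hypothetical placeholder.

```python
# -2 log likelihood values reported in the text for the three Cox models.
neg2_loglik = {"patient": 1494.4, "nurse": 1502.2, "oncologist": 1488.2}


def aic(neg2ll, n_params):
    """Akaike information criterion: -2 log L + 2k."""
    return neg2ll + 2 * n_params


def rank_models(values, n_params):
    """Return model names ordered from best (lowest AIC) to worst."""
    return sorted(values, key=lambda name: aic(values[name], n_params))


# n_params is hypothetical; any common value yields the same ordering.
print(rank_models(neg2_loglik, n_params=12))
```

The ordering holds only because the degrees of freedom are equal; models with different numbers of parameters would need the penalty term or a formal test.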

British Journal of Cancer (2001) 85(11), 1634-1639 © 2001 Cancer Research Campaign

The fit of this oncologist-assessed PS model was not significantly improved by additional inclusion of categorical variables indicating the PS scores by the patients or the nurses (P = 0.374 and P = 0.339, respectively).
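The likelihood ratio test used here compares nested models: the drop in -2 log likelihood after adding variables is referred to a chi-square distribution whose degrees of freedom equal the number of added parameters. The sketch below is a minimal illustration of that procedure. The function names are illustrative, and the extended-model value and the degrees of freedom in the example are hypothetical, since the text reports only the resulting P values.

```python
from math import erfc, exp, gamma, sqrt


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 1:
        q, a = erfc(sqrt(half)), 0.5   # exact tail for df = 1
    else:
        q, a = exp(-half), 1.0         # exact tail for df = 2
    # Recurrence Q(a + 1, t) = Q(a, t) + t**a * exp(-t) / Gamma(a + 1).
    while a < df / 2:
        q += half ** a * exp(-half) / gamma(a + 1)
        a += 1
    return q


def likelihood_ratio_test(neg2ll_reduced, neg2ll_full, df_added):
    """Return (statistic, P value) for two nested models."""
    stat = neg2ll_reduced - neg2ll_full
    return stat, chi2_sf(stat, df_added)


# 1488.2 is the reported value for the oncologist-assessed PS model.
# 1484.0 and df_added=4 are hypothetical and serve only as an example.
stat, p_value = likelihood_ratio_test(1488.2, 1484.0, df_added=4)
print(f"LR statistic = {stat:.1f}, P = {p_value:.3f}")
```

A large P value, as reported in the text, means the added PS scores do not explain survival beyond what oncologist-assessed PS already captures.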

DISCUSSION
The present study demonstrated that the assessments of PS by health professionals, i.e., nurses and oncologists, were systematically different from that by the patients themselves. This finding may be relevant to previous reports that observer ratings of patients' quality of life were different from those of the patients themselves (Presant, 1984;Slevin et al, 1988;Osoba, 1994). Assessments of PS by nurses and oncologists were found to be healthier than that by the patients themselves. Furthermore, the agreement in the PS assessment between the two health professionals was better than that between the patients and the health professionals. These results may suggest that the tendency to rate healthier PS was not peculiar to the oncologists but also true of other health professionals. The difference in the PS assessment remained significant when the analyses were limited to younger patients who are considered potentially suitable for clinical trials. We, therefore, consider that the problem of discrepancy in the assessment of PS would be relevant in judging eligibility for clinical trials.
Investigations of the predictor variables of inter-observer disagreement are scarce in the PS assessment, although a similar analysis was conducted in the assessment of quality of life in cancer patients (Sneeuw et al, 1999). Interestingly, the results of the present study showed that female patients had a lower level of agreement in the PS assessment between the patients and the oncologists. The female patients reported poorer PS of their own compared with the male patients. Only two female oncologists assessed PS of 22 patients in our study, and this small number precluded detailed analysis of the gender-related difference. However, a similar difference was reported in other studies on quality of life in the general population (Brazier et al, 1992; Jenkinson et al, 1993; Hjermstad et al, 1998). In these studies, females reported poorer health status, including physical wellbeing, than males. In addition to the results in these 'reference populations', female patients with asthma were shown to report worse physical and social status than male patients (Osborne et al, 1998), which warrants further studies to investigate whether this gender-related difference is also observed in patients with other chronic diseases.
Oncologist-assessed PS has traditionally been used as a criterion in clinical trials, which is partly supported by an acceptable level of agreement among the assessments by oncologists, especially for patients in good physical condition (Conill et al, 1990; Roila et al, 1991; Sørensen et al, 1993). This is the first study investigating whether the assessment of PS by the oncologists is rationalized in terms of the prognostic value. Univariate and multivariate analyses demonstrated that the assessment of PS 0, 1, and 2 by oncologists successfully stratified survival, whereas that by the patients themselves failed to distinguish survival of patients with PS 1 and 2. This distinction is critical when stratifying patients for clinical trials in which only fully ambulatory patients should be the candidates (Bonomi et al, 1991; Ihde, 1992; Shepherd, 1994). Furthermore, the Cox model including oncologist-assessed PS best fitted the observed survival data, as indicated by the lowest -2 log likelihood value. These advantages for prediction and stratification of prognosis provided additional support for the use of the assessment of PS by oncologists in clinical trials for advanced non-small cell lung cancer. Because one reason for restricting the entry of patients with PS 2 onto clinical trials was their poor tolerability of treatment (Ruckdeschel et al, 1986), further work will be needed to determine whether oncologist-assessed PS is also more reliable in predicting treatment-related toxicities than patient-assessed PS.
The present study showed that the straightforward assessment of PS by the patients themselves is not a fruitful method, because the assessment by the patients failed to demonstrate a better prognostic value than that by the oncologists. Although Loprinzi et al suggested that patient-assessed PS could provide independent prognostic information in their study, the ability of the Cox model to explain the data did not change substantially whether PS assessed by patients themselves was included in or excluded from the model (Loprinzi et al, 1994). In addition, the poorest model in their study was the one that did not include oncologist-assessed PS. These results suggest that the prognostic importance of PS assessment by patients themselves does not exceed that by oncologists. This advantage in assessment by oncologists in the estimation of prognosis might be partly explained by the availability of information from observation and examination of the patient (Schag et al, 1984). Considering the problem of collinearity among the PS assessments, prognostic importance of the combination indices of these assessments may be worthy of exploration.
In conclusion, the assessment of PS by the patients themselves is different from the assessments by health professionals such as nurses and oncologists. Assessments by the oncologists and the nurses were significantly healthier than that by patients themselves. However, this does not imply that health professionals are optimistic and patients pessimistic, and it is not possible to say who is correct. The poorer self-assessment might reflect an unconscious appeal for comfort or care from patients toward health professionals. Rather, our results suggest that the assessments by health professionals predict patients' prognosis more precisely than that by the patients themselves. Our results reinforced the rationale for the traditional use of oncologist-assessed PS in clinical oncology, and supported treatment decisions based on oncologists' assessment of PS.