Introduction

Chronic kidney disease (CKD), with its high prevalence and mortality, has become an important public health problem both globally and in China [1]. Although continuous improvements in diagnosis and treatment have significantly prolonged the survival of patients with CKD in recent years, the accompanying psychological and social problems still affect patients’ health-related quality of life (HRQoL).

Several generic health utility assessment instruments are available for measuring HRQoL [2,3], and the EQ-5D-3L is the preferred instrument for evaluating utility in cost-utility analyses (CUAs) in many countries [4]. This instrument describes HRQoL using five dimensions (mobility, self-care, usual activities, pain/discomfort, and anxiety/depression), each with three response levels (no problems, some problems, and severe problems), resulting in 243 health states. Each EQ-5D-3L health state can be converted into a utility score using a country-specific scoring algorithm, namely, a value set. This utility score is preference-based and is anchored at 0 (death) and 1 (perfect health), with negative values representing health states worse than death [5]. Health utility is required to derive quality-adjusted life years (QALYs), the outcome measure used in CUAs, which are economic evaluations that inform priority-setting decisions in healthcare.
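The count of 243 health states and the role of utility in QALYs can be illustrated with a minimal sketch. The function names `all_health_states` and `qalys` are hypothetical, and the QALY calculation assumes the simplest undiscounted form (utility multiplied by years spent in that state); it is not the calculation used in any particular CUA.

```python
from itertools import product

# The five EQ-5D-3L dimensions, in the conventional order.
DIMENSIONS = (
    "mobility",
    "self-care",
    "usual activities",
    "pain/discomfort",
    "anxiety/depression",
)
LEVELS = (1, 2, 3)  # 1 = no problems, 2 = some problems, 3 = severe problems


def all_health_states():
    """Return every health state as a five-digit string such as '11223'."""
    return [
        "".join(str(level) for level in combo)
        for combo in product(LEVELS, repeat=len(DIMENSIONS))
    ]


def qalys(periods):
    """Sum utility x years over (utility, years) pairs, without discounting.

    Assumption: undiscounted QALYs; real CUAs usually apply a discount rate.
    """
    return sum(utility * years for utility, years in periods)


states = all_health_states()
print(len(states))            # 243, i.e. 3 ** 5
print(states[0], states[-1])  # 11111 33333
```

For example, two years at a utility of 0.8 followed by one year at 0.5 gives `qalys([(0.8, 2), (0.5, 1)])`, roughly 2.1 QALYs under these assumptions.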

Available in Chinese, the EQ-5D-3L has been widely used in Chinese healthcare contexts for over a decade, although domestic value sets became available only in 2014 and 2018 through the studies conducted by Liu et al. [6] and Zhuo et al. [7], respectively. Furthermore, previous comparisons of utility scores based on value sets for different countries have found substantially different results [8,9], which in turn lead to differences in QALY estimates and CUA results, and ultimately to different healthcare funding decisions. These issues become even more important for modelled CUAs, where survival and QALYs are extrapolated over long periods. Thus, the choice of value set may affect decision-making, and country-specific value sets should be used whenever possible.

Given these issues, the availability of two EQ-5D-3L value sets for China, and the lack of studies on this topic, it remains unclear which of the two value sets should be used among Chinese patients with CKD. Furthermore, prior to the establishment of these two value sets, related studies in China generally used value sets from other countries, including those from the United Kingdom (UK) and Japan. To our knowledge, no published study has compared different value sets for Chinese patients with CKD, and prior research is consistent with this assessment [10,11].

This study, therefore, had a three-fold aim: to provide reference data on which EQ-5D-3L value set should be used with Chinese patients with CKD; to assess differences in HRQoL by applying the Chinese (from 2014 and 2018), the UK, and the Japanese value sets; and to examine differences in utility scores for key preventive influencing factors.

Methods

This study used data collected through a cross-sectional, multicenter, survey-based study on the HRQoL of adult patients (age ≥ 18 years) with CKD. Participants were outpatients treated at eight hospitals in four large cities in China (Beijing, Shanghai, Chengdu, and Guangzhou) from November to December 2012. The participating hospitals were the main nephrology centers of each city, as follows: Peking University People’s Hospital and China-Japan Friendship Hospital in Beijing; Huashan Hospital and Shanghai Sixth People’s Hospital in Shanghai; Chengdu Military General Hospital and West China Hospital of Sichuan University in Chengdu; and Guangzhou First People’s Hospital and Guangdong General Hospital in Guangzhou. The Chinese version of the EQ-5D-3L was applied, data were collected through face-to-face interviews, and informed consent was obtained from patients before they were interviewed. The inclusion criteria were as follows: (i) patients who had been diagnosed with pre-dialysis CKD or had been on maintenance hemodialysis or peritoneal dialysis for at least three months, and who had resided locally for more than six months; (ii) patients capable of understanding the investigator’s questions and willing to complete the questionnaire. The study included 375 patients with CKD. Data from 2 patients who provided incomplete or non-standard answers to the EQ-5D-3L were excluded. Therefore, the final sample included 373 patients.

Utilities were calculated based on the two Chinese [6,7], the UK [4], and the Japanese [12] value sets, which are presented in Table 1. The health utility score is calculated as follows: Utility = 1 − (constant + sum of all coefficients × variable values). Specifically, when the utility is calculated based on the China 2014 value set, Utility = 1 − (0.039 + 0.099 × M2 + 0.246 × M3 + 0.105 × S2 + 0.208 × S3 + 0.074 × U2 + 0.193 × U3 + 0.092 × P2 + 0.236 × P3 + 0.086 × A2 + 0.205 × A3 + 0.022 × N3). In the formula, M2, S2, U2, P2, and A2 equal 1 if mobility, self-care, usual activities, pain/discomfort, and anxiety/depression, respectively, are at level 2, and 0 otherwise. M3, S3, U3, P3, and A3 equal 1 if the respective dimension is at level 3, and 0 otherwise. N3 equals 1 if at least one of the five dimensions is at level 3, and 0 otherwise. For example, when a patient’s health state was M3S3U3P2A2, that is, the patient reported “severe problem” in the mobility, self-care, and usual activities dimensions and “some problem” in the pain/discomfort and anxiety/depression dimensions, the health utility value was Utility = 1 − (0.039 + 0.246 + 0.208 + 0.193 + 0.092 + 0.086 + 0.022) = 0.114.
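The China 2014 scoring formula above can be sketched in a few lines of Python. The coefficients are those quoted in the text; the function name `utility_china_2014` and the five-digit string encoding of a health state (for example "33322" for M3S3U3P2A2) are illustrative choices. The sketch also assumes that the constant applies only when at least one dimension deviates from level 1, so that full health scores exactly 1, in line with the statement in the Results that patients with no problems had a utility score of 1.

```python
# Coefficients of the China 2014 value set as quoted in the text.
CONSTANT = 0.039  # assumed to apply to any state other than full health
N3_TERM = 0.022   # applied when at least one dimension is at level 3

# Decrements per dimension, in the order M, S, U, P, A.
DECREMENTS = [
    {2: 0.099, 3: 0.246},  # mobility
    {2: 0.105, 3: 0.208},  # self-care
    {2: 0.074, 3: 0.193},  # usual activities
    {2: 0.092, 3: 0.236},  # pain/discomfort
    {2: 0.086, 3: 0.205},  # anxiety/depression
]


def utility_china_2014(state):
    """Score a five-digit EQ-5D-3L state string such as '33322'."""
    levels = [int(ch) for ch in state]
    if len(levels) != 5 or any(level not in (1, 2, 3) for level in levels):
        raise ValueError("state must be five digits, each 1, 2, or 3")
    if all(level == 1 for level in levels):
        return 1.0  # full health (assumption: no constant is deducted)
    loss = CONSTANT
    for decrement, level in zip(DECREMENTS, levels):
        loss += decrement.get(level, 0.0)  # level 1 carries no decrement
    if 3 in levels:
        loss += N3_TERM
    return 1.0 - loss


print(round(utility_china_2014("33322"), 3))  # 0.114, the worked example
print(round(utility_china_2014("33333"), 3))  # -0.149, worse than death
```

The other three value sets could be scored in the same way by swapping in the coefficients from Table 1, which are not reproduced here.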

Table 1 Comparison of utility calculation methods based on the two value sets for China and those for the UK and Japan.

The Shapiro–Wilk test was used to examine whether the calculated utility scores were normally distributed, and the Friedman test and the Wilcoxon signed rank test were used to determine differences in utility scores derived from the four value sets. These tests examined whether using different value sets led to different utility scores and, consequently, whether using one value set over another could affect QALY estimates in CUAs. The minimal clinically important difference (MCID) was set at 0.05 based on the minimum time that could be traded in the original time trade-off experiments used to develop the EQ-5D-3L [8]. Furthermore, intra-class correlation coefficients (ICCs) and Bland–Altman plots were used to evaluate the consistency between utility scores from the four value sets; consistency was considered good if ICC > 0.70 [13]. Since the utility score of many patients was 1 (i.e., implying the existence of a ceiling effect) and utility scores of less than 1 were continuous, a Tobit regression model was used to analyze the influencing factors of the utility score. The independent variables in the model included CKD stage, age, sex, education level, city, insurance type, monthly income, dialysis duration, and primary renal disease. These variables are customary in previously published articles on this topic [1,10,11].
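To make the rationale for the Tobit model concrete, the following sketch writes out the standard log-likelihood of a Tobit model that is right-censored at the ceiling. It is only an illustration of the technique, not the Stata routine used in this study. The function names are hypothetical, the censoring limit of 1 is an assumption that matches the ceiling described above, and `mu` stands for linear predictors that are assumed to have been computed already.

```python
import math


def normal_logpdf(z):
    """Log density of the standard normal distribution at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def normal_logsf(z):
    """Log of the upper tail probability 1 - Phi(z) of the standard normal."""
    return math.log(0.5 * math.erfc(z / math.sqrt(2.0)))


def tobit_loglik(y, mu, sigma, upper=1.0):
    """Log-likelihood of a Tobit model right-censored at `upper`.

    y     : observed utility scores
    mu    : linear predictors x'beta, one per observation (assumed given)
    sigma : standard deviation of the latent error term
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    total = 0.0
    for yi, mi in zip(y, mu):
        if yi >= upper:
            # Censored observation: only P(latent score >= upper) is known.
            total += normal_logsf((upper - mi) / sigma)
        else:
            # Uncensored observation: ordinary normal density.
            total += normal_logpdf((yi - mi) / sigma) - math.log(sigma)
    return total


# Hypothetical toy data: one patient at the ceiling, one below it.
print(round(tobit_loglik([1.0, 0.5], [1.0, 0.5], sigma=1.0), 4))  # -1.6121
```

Patients at the ceiling contribute a tail probability rather than a density, which is why this model suits data in which many scores equal 1.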

All statistical analyses were conducted using Stata version 16.0, except for the Bland–Altman plot, which was drawn by MedCalc 20.1, and ICCs, which were calculated using SPSS version 25.0. Statistical significance was set to p < 0.05.

Ethics approval and consent to participate

Ethical approval for this study was obtained from Peking University Ethics Review Committee (IRB 00001052-17006) in China. All methods were performed in accordance with the relevant guidelines and regulations of the review board. All patients were approached for informed consent. Additionally, confidentiality was guaranteed.

Results

Descriptive analysis

The mean age of patients was 59.2 ± 15.7 years, and about two out of three patients (68.4%) were in the pre-dialysis stage. More than 70% of the patients had a lower education level (senior high school or below) and a lower monthly income (less than 5000 CNY). Table 2 presents the distribution of reported problems in each dimension among patients with CKD. In total, 202 (54.16%) patients had no problems in any dimension (i.e., a utility score of 1). The most frequently reported problems were pain/discomfort (32.17%; “some problem” and “severe problem” combined) and anxiety/depression (25.47%), and the least frequently reported was self-care (8.58%; “some problem” and “severe problem” combined).

Table 2 Distribution of limitation by each dimension among patients with chronic kidney disease.

Table 3 displays the descriptive statistics of utility scores calculated using the value sets for China, the UK, and Japan. According to Shapiro–Wilk tests of normality (p < 0.001), the utility scores based on the four value sets were not normally distributed.

Table 3 Descriptive statistics of utility scores based on the two value sets for China and those for the UK and Japan.

Comparison of utility scores based on the four value sets

Table 4 shows the results of comparing the utility scores based on the four value sets for China, the UK, and Japan. According to Friedman test results, the differences among utility scores based on the four value sets were statistically significant (p < 0.001), with Wilcoxon signed rank test results then showing that the China 2018 value set yielded significantly higher utility scores than did the other three (p < 0.001).

Table 4 Results of comparison and consistency analysis of utility scores based on the two value sets for China and those for the UK and Japan.

Consistency analysis of utility scores based on the four value sets

Table 4 also presents the consistency of utility scores based on the four value sets for China, the UK, and Japan. All ICCs were statistically significant (p < 0.001). The ICCs among the value sets for China 2014, the UK, and Japan were all greater than 0.9, indicating good consistency, whereas the ICCs between the China 2018 value set and each of the China 2014, UK, and Japan value sets were less than 0.7, indicating lower consistency. In addition, the mean differences between the China 2018 value set and the other three were greater than the MCID of 0.05, indicating that the differences were clinically important. The consistency of utility scores for each pair of value sets was also assessed using Bland–Altman plots (Fig. 1), which show that the limits of agreement were wide and that some points fell outside them. These results indicate that the four value sets were not interchangeable.

Figure 1 Bland–Altman plots of consistency between utility scores based on the two value sets for China and those for the UK and Japan.
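The quantities behind a Bland–Altman plot can be computed directly, as in the sketch below. It assumes the conventional limits of agreement (mean difference ± 1.96 standard deviations of the paired differences) and compares the mean difference with the MCID of 0.05 used in this study. The function names and the toy scores are hypothetical and are not data from the study, and MedCalc may apply additional options that are not reproduced here.

```python
from statistics import mean, stdev

MCID = 0.05  # minimal clinically important difference used in the study


def bland_altman(scores_a, scores_b):
    """Return (mean difference, lower limit, upper limit) for paired scores.

    Assumes the conventional 95% limits of agreement: bias +/- 1.96 * SD.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired scores must have the same length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def exceeds_mcid(bias, mcid=MCID):
    """True if the absolute mean difference is larger than the MCID."""
    return abs(bias) > mcid


# Hypothetical utility scores for four patients under two value sets.
set_a = [1.0, 0.8, 0.9, 0.6]
set_b = [0.9, 0.8, 0.7, 0.5]

bias, lower, upper = bland_altman(set_a, set_b)
print(round(bias, 3), round(lower, 3), round(upper, 3))
print(exceeds_mcid(bias))
```

Wide limits of agreement, or a mean difference above the MCID, both point toward two value sets not being interchangeable, which is the reasoning applied to Fig. 1 and Table 4.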

Influencing factors of utility scores based on four value sets

Table 5 shows that the influencing factors of utility scores of Chinese patients with CKD mainly included CKD stages, age, education level, city, and primary renal disease, and the findings were similar across all value sets. For example, the utility scores of patients with pre-dialysis CKD were higher than those of dialysis patients, while the utility scores of peritoneal dialysis patients were higher than those of hemodialysis patients. Furthermore, utility scores decreased with an increase in age and increased with a rise in education level, and the scores of patients in Guangzhou and Chengdu were higher and lower, respectively, than those of patients in Beijing. There were no statistically significant differences by sex, insurance type, monthly income, and dialysis duration.

Table 5 The influencing factors of utility scores based on the two value sets for China and those for the UK and Japan.

Discussion

To our knowledge, this was the first study to estimate the health utility of Chinese patients with CKD using both Chinese EQ-5D-3L value sets (i.e., from 2014 and 2018). We also compared the application of four value sets for estimating utility scores and explored the influencing factors of the estimated utility. The findings demonstrate statistically significant differences in utility scores among the Chinese, the UK, and the Japanese value sets. These differences can be explained by cultural dissimilarities across countries and by methodological differences between the underlying valuation studies [14,15], suggesting that the use of different value sets can lead to different utility scores and generate discrepant QALY gains and cost-utility results [16,17].

Our results showed that the Chinese 2018 value set yielded higher utility scores than did the other three value sets. Utility scores are converted from the EQ-5D-3L descriptive system by applying a formula that attaches values to each of the levels in each dimension, that is, a value set. These scores are calculated by deducting the appropriate weights from 1, which is the value for full health [18]. Specifically, Utility = 1 − (constant + sum of all coefficients × variable values). As can be seen from Table 1, the Chinese 2018 value set includes neither a constant term nor an N3 term, and its dimension coefficients are also the lowest. The Chinese 2014 value set has a smaller constant term and a smaller N3 term than the value sets for Japan and the UK. In our study, the most frequently reported problem among patients with CKD was “some problem” in the pain/discomfort dimension. The coefficients for this level show that pain/discomfort has a greater effect on utility in the UK value set than in the Chinese 2014 value set, and a similar effect in the Japanese value set. For these reasons, the Chinese value sets mostly yield higher utility scores than do the value sets from other countries.

Regarding the two Chinese value sets, the utility score based on the Chinese 2018 value set was higher than that based on the Chinese 2014 value set. This was probably because the former contains neither a constant term nor an N3 term (which indicates whether level 3, severe problem, occurs in at least one dimension). Another explanation is that the two Chinese value sets were established in studies with different populations: the Chinese 2014 value set was established with an urban sample, whereas the 2018 value set was established with a sample comprising both urban and rural participants [6,7].

Accordingly, when more than one value set is available and stakeholders must decide which to use, they could consider whether their value set of choice was established with a sample that is consistent with their target population. Specifically, when the target population is urban, the study should use the Chinese 2014 value set, which was established with an urban sample. When the target population comes from both urban and rural areas, the study should use the Chinese 2018 value set, which was established with a sample comprising both urban and rural participants. In addition, we think that it is necessary to establish a value set with a rural sample, because the difference between rural and urban China is large.

Regarding consistency, the ICCs in our study between the China 2018 value set and those for the UK and Japan were less than 0.7, indicating low consistency for a sample of Chinese patients with CKD; this finding was supported by the mean differences exceeding the MCID. The Bland–Altman plots further indicated that the four value sets were not interchangeable. The value sets for the UK and Japan were established much earlier than those for China, and the different eras of development could be one reason why the value sets are not interchangeable: with the development of the economy and of medical technology, health preferences may change over time. These findings are generally consistent with those of a previous study conducted in China [19], which compared the utility scores based on the Chinese 2014, the UK, and the Japanese value sets for the general population of Tibet and found that the three value sets had relatively good consistency but were not interchangeable. In prior research, the MCID has usually been set at either 0.074 or 0.05 [20]. We chose the latter because a small difference in utility scores has the potential to lead to large differences in health policy decision-making. In this type of decision-making, every minor difference is given due attention, even at the cost of potentially overestimating the differences between utility scores. Indeed, our findings demonstrate, from a different perspective than that in prior research, that value sets for different countries yield very divergent results and that they are not interchangeable.

The current study also shows that the utility scores of patients with pre-dialysis CKD were higher than those of patients on dialysis. There are two possible explanations for this finding: first, the health conditions of patients with pre-dialysis CKD may be better than those of patients on dialysis; second, dialysis may bring health problems and inconveniences to the daily lives of patients. We also noticed that the utility scores of patients on peritoneal dialysis were higher than those of patients on hemodialysis. This may be because patients on peritoneal dialysis can receive treatment at home and at any time, so their daily lives may be less affected by the treatment. These results are consistent with the prior literature [21,22] and underpin the need for stakeholders to consider different methods to delay CKD progression and further improve patients’ quality of life.

This research further shows that utility scores decreased with increasing age, which is consistent with previous studies [23]. In general, older adults are likely to be less healthy and to have more complicated health conditions (e.g., comorbid chronic diseases) than younger adults. Our results also show that education level positively influences utility scores, which may be because patients with a higher education level have better awareness of CKD and more access to social support. These findings are similar to those of a prior study on HRQoL [23]. Our findings suggest that stakeholders could endeavor to increase awareness of CKD, for example by designing and implementing health education programs for patients with CKD who have low education levels.

Regarding theoretical and practical implications, first, this research was the first to use both EQ-5D-3L value sets for China (i.e., from 2014 and 2018) to estimate health utility scores of Chinese patients with CKD. Our evidence can, therefore, be referred to when calculating QALYs and conducting CUAs in the Chinese context. Second, we compared utility score differences among four value sets (i.e., China 2014 and 2018, the UK, and Japan), providing reference data for future researchers when choosing the most suitable value set for their sample. Third, we analyzed the influencing factors of utility scores using a Tobit regression model, making our evidence more comprehensive and explanatory.

This study also has its limitations. In particular, it used convenience sampling in only four representative large cities in China, so the findings should be generalized with caution. In addition, although we controlled for as many relevant variables as possible, other unobserved characteristics may have led to the health utility differences we observed between countries. Furthermore, the EQ-5D-3L data used in this study were collected in 2012 and may no longer be applicable to present-day patients with CKD in China; however, we do not consider this to be a limitation of the present analysis, as we were interested in comparing results between countries and not in exploring whether the data collected then would be relevant today. In fact, because the Chinese value sets were not yet available in 2012, comparing findings from the four value sets was impossible at that time; the ability to make this comparison now may be considered a strength of the study.

Conclusions

This research provides a benchmark for the health utility of Chinese patients with CKD measured by the EQ-5D-3L, delivering utility score data that can be useful for future economic evaluation studies. It also shows that the value sets for China, the UK, and Japan were not interchangeable for calculating utility scores of Chinese patients with CKD, suggesting that, in Chinese contexts, stakeholders could use the value sets for China rather than those for other countries. As for the choice between the two Chinese value sets, we recommend that researchers consider whether their value set of choice was established with a sample that is consistent with their target population. Future health policy-making in China could focus more on devising methods to delay CKD progression and on delivering proper care for older adults and patients with low education levels in order to improve their HRQoL.