Comparison of validation and application on various cardiovascular disease mortality risk prediction models in Chinese rural population

This research aims to assess the application of different cardiovascular disease (CVD) mortality risk prediction models in a Chinese rural population. Data were collected from a 6-year follow-up survey in a rural area of Henan Province, China. A total of 10338 participants aged 40 to 65 years were included. The baseline study was conducted between 2007 and 2008, and participants were followed up from 2013 to 2014. Seven models were assessed and recalibrated: general Framingham risk score (general-FRS), simplified-FRS, Systematic Coronary Risk Evaluation for high-risk regions (SCORE-high), SCORE-low, Chinese ischemic CVD (CN-ICVD), Pooled Cohort Risk Equation for white (PCE-white) and for African-American (PCE-AA). Model performance was evaluated by C-statistics and the modified Nam-D'Agostino test. During follow-up, 168 CVD deaths occurred. All seven models showed moderate C-statistics ranging from 0.727 to 0.744. Following recalibration, general-FRS, simplified-FRS, CN-ICVD, PCE-white and PCE-AA had improved C-statistics of 0.776, 0.795, 0.793, 0.779, and 0.776 for men and 0.756, 0.753, 0.755, 0.758 and 0.760 for women, respectively. The calibration χ2 values of the general-FRS, simplified-FRS, SCORE-high, CN-ICVD and PCE-AA models for men, and of the general-FRS, CN-ICVD and PCE-white models for women, were statistically acceptable, indicating that these models predict CVD mortality risk more accurately than the others and could be recommended in the Chinese rural population.

The aim of this study is to compare and validate the performance of different CVD risk prediction models in a Chinese rural population. Seven models were assessed to examine whether existing models are suited to the Chinese setting.

Results
Baseline Characteristics. The demographic and clinical characteristics of the participants at baseline examination are presented in Table 1. Levels of cardiovascular risk factors such as total cholesterol (TC), high density lipoprotein cholesterol (HDL-c), fasting glucose, systolic blood pressure (SBP) and body mass index (BMI) were higher in women than in men. Smoking, however, was prevalent in men and uncommon in women. The prevalence of hypertension and diabetes was higher in women than in men.
During the 6-year follow-up, 417 deaths from all causes and 168 deaths due to CVD occurred in the cohort over 60942 person-years of follow-up. The 6-year cardiovascular and all-cause mortality rates were 1.6% and 4.0%, respectively. There were 80 cardiovascular deaths over 23406 person-years of follow-up in men and 88 over 37536 person-years in women. The average 6-year risk of CVD death was 1.71% for men and 1.17% for women.
Cardiovascular risk stratification and mortality distribution. All seven original models were assessed; however, their discrimination and calibration were poor (results not shown). After the FRS, CN-ICVD and PCE models were recalibrated with mean values from the present Chinese study, and the SCORE models were adjusted for diabetes status, the distributions of 10-year CVD death risk categories predicted by the seven models (general-FRS, simplified-FRS, SCORE-low, SCORE-high, CN-ICVD, PCE-white and PCE-AA) are presented in Fig. 1.
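The crude follow-up figures reported above (deaths, participants and person-years) can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the inputs are the counts stated in this article, and the per-1000-person-year rates are an additional summary that the article itself does not report.

```python
def cumulative_mortality_pct(deaths: int, participants: int) -> float:
    """Crude cumulative mortality over follow-up, as a percentage."""
    return 100.0 * deaths / participants


def rate_per_1000_person_years(deaths: int, person_years: float) -> float:
    """Crude mortality rate per 1000 person-years of follow-up."""
    return 1000.0 * deaths / person_years


# Counts taken from the text: 10338 participants, 168 CVD deaths,
# 417 all-cause deaths, and sex-specific person-years.
cvd_pct = cumulative_mortality_pct(168, 10338)
all_cause_pct = cumulative_mortality_pct(417, 10338)
men_rate = rate_per_1000_person_years(80, 23406)
women_rate = rate_per_1000_person_years(88, 37536)

print(f"CVD mortality: {cvd_pct:.1f}%, all-cause mortality: {all_cause_pct:.1f}%")
print(f"CVD deaths per 1000 person-years: men {men_rate:.2f}, women {women_rate:.2f}")
```

The two percentages round to the 1.6% and 4.0% quoted in the text, and the sex-specific person-years sum to the stated cohort total of 60942.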
As the PCE-white and PCE-AA models categorized the 10-year CVD death risk into four sets instead of three as the other five models did, their risk distributions differed from those of the other models. The FRS, SCORE, and CN-ICVD models indicated similar risk stratification trends in women. Despite the high prevalence of cardiovascular risk factors in the population, the SCORE-low and CN-ICVD models assigned more than 90% of the subjects to the low-risk group, and in the SCORE-high and both FRS models the proportion at low risk was more than 60%. In men, however, the risk models produced quite different risk categorizations.
Comparison of CVD risk prediction models. The CVD mortality risk sets of the PCE models were compared with those of the FRS, SCORE and CN-ICVD models (see Supplementary Table S1, Supplementary Table S2 and Supplementary Table S3). The PCE-white model showed better agreement than the PCE-AA model. The CN-ICVD model had better agreement in the low-risk set than in the high-risk set. Agreement for CVD mortality risk categorization and correlation of scores between the FRS models and SCORE models were good for both sexes, and slightly better for men. There was hardly any misclassification between the extremes of risk categories in these models (see Supplementary Table S4). Correlation between the CN-ICVD model and all other models was poor.
Model performance. The receiver operating characteristic (ROC) curves of the different CVD risk prediction models are shown in Fig. 2(A and B), with distinct line types. The area under the ROC curve (AUC) was used as an indicator of the predictive accuracy of the models. The AUCs of the seven CVD death prediction models showed moderately good discrimination for cardiovascular mortality (see Table 2). In men, the AUCs ranged from 0.  Figure S1). In men, the SCORE-low and PCE-AA models underestimated the CVD risk, whereas the PCE-white model overestimated it. In women, the SCORE-low, PCE-AA and SCORE-high models underestimated the CVD risk. Statistically, calibration of the SCORE-high model was acceptable for men by the modified Nam-D'Agostino test (χ2 = 5.109, P = 0.276), as was that of the PCE-white model for women (χ2 = 2.310, P = 0.679).
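The two kinds of statistic reported here, a C-statistic for discrimination and a grouped χ2 for calibration, can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from this study: outcomes are treated as simple 0/1 indicators without censoring (the study uses Kaplan-Meier estimates of observed events), the χ2 is a plain Hosmer-Lemeshow-type sum rather than the exact modified Nam-D'Agostino statistic, a small group is merged with the group just below it (or just above it if it is the first), and all function names are hypothetical. The grouping follows the Statistical Analysis description of starting from deciles and collapsing groups until each holds at least two events.

```python
from typing import List, Sequence, Tuple


def c_statistic(risks: Sequence[float], events: Sequence[int]) -> float:
    """Share of (death, survivor) pairs in which the death had the higher
    predicted risk; ties count one half. Needs at least one of each."""
    cases = [r for r, e in zip(risks, events) if e == 1]
    controls = [r for r, e in zip(risks, events) if e == 0]
    concordant = 0.0
    for rc in cases:
        for rn in controls:
            if rc > rn:
                concordant += 1.0
            elif rc == rn:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


def collapsed_groups(risks: Sequence[float], events: Sequence[int],
                     n_groups: int = 10, min_events: int = 2) -> List[List[int]]:
    """Split subjects into risk-ordered groups, then merge any group with
    fewer than min_events events into an adjacent group."""
    order = sorted(range(len(risks)), key=lambda i: risks[i])
    size = len(order) / n_groups
    groups = [order[round(g * size):round((g + 1) * size)] for g in range(n_groups)]
    groups = [g for g in groups if g]  # drop empty groups in tiny samples

    def n_events(group: List[int]) -> int:
        return sum(events[i] for i in group)

    while len(groups) > 1:
        small = [k for k, g in enumerate(groups) if n_events(g) < min_events]
        if not small:
            break
        k = small[0]
        j = k + 1 if k == 0 else k - 1  # assumed merge rule, see lead-in
        lo, hi = min(j, k), max(j, k)
        groups[lo:hi + 1] = [groups[lo] + groups[hi]]
    return groups


def calibration_chi2(risks: Sequence[float], events: Sequence[int],
                     n_groups: int = 10, min_events: int = 2) -> Tuple[float, int]:
    """Grouped chi-square comparing observed and mean predicted risk.
    Assumes every predicted risk lies strictly between 0 and 1."""
    groups = collapsed_groups(risks, events, n_groups, min_events)
    chi2 = 0.0
    for g in groups:
        n = len(g)
        observed = sum(events[i] for i in g) / n
        expected = sum(risks[i] for i in g) / n
        chi2 += n * (observed - expected) ** 2 / (expected * (1.0 - expected))
    return chi2, len(groups)


# Toy data (hypothetical, not from the study).
toy_risks = [0.05 * (i + 1) for i in range(10)]
toy_events = [1, 1, 0, 0, 1, 1, 0, 1, 1, 1]
toy_groups = collapsed_groups(toy_risks, toy_events, n_groups=5)
toy_chi2, toy_n_groups = calibration_chi2(toy_risks, toy_events, n_groups=5)
```

In the toy data, five starting groups collapse to three because two of them contain fewer than two events. A real analysis would replace the 0/1 observed proportion with a Kaplan-Meier estimate per group.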

Coefficients Recalibration.
Recalibration was conducted in five models (the FRS, CN-ICVD and PCE models) by using the coefficients and mean values of risk factors recalibrated from the current population (Table 3 and Table 4). Discrimination and calibration of these models were evaluated for 5-year CVD mortality (Table 5). ROC curves of the different recalibrated CVD risk prediction models were shown in  Supplementary Table S7, Supplementary Table S8 and Supplementary Figure S2).

Discussion
The present study evaluated the ability of seven cardiovascular mortality risk models to predict CVD mortality risk in the Chinese rural population. Predicted 10-year mortality risk distribution demonstrated that the general-FRS, simplified-FRS, SCORE-low, SCORE-high, PCE-white and PCE-AA models, while CN-ICVD model can stratify CV risk in Chinese rural population. Recalibration improved model performance. All seven models performed well in discrimination. Following recalibration, the general-FRS, simplified-FRS, SCORE-high,
Risk prediction models are essential and cost-effective for the prevention of CVDs, especially in limited-resource settings such as rural regions of China. The CN-ICVD model incorrectly categorized most people into the low CVD mortality risk group in the present study. This would leave high-risk individuals unidentified, resulting in higher rates of under-treatment and more subsequent complications. The findings of this study confirm that, before these prediction models are applied to clinical practice or guidelines, their performance has to be assessed in the population of interest, as not all risk prediction models can identify high-risk subjects. Recalibrating the coefficients of the models in the targeted population could improve predictive accuracy. The general-FRS model showed good discrimination in an Australian population 7 , a Spanish population 8 and a Tehran population 9 , in which the AUCs were higher than those found in this study. However, in a Malaysian hospital-based study, the general-FRS had a C-statistic of 0.63 10 , which may be because the study subjects were actual patients from a clinic rather than a general community population, so their overall CVD risk profile was already high even before CVD events occurred. The simplified-FRS model also discriminated well in different studies 11,12 , which was similar to the findings of this study.
General-FRS and simplified-FRS were both assessed and no difference was observed between them, indicating that the simplified-FRS model, which requires no laboratory parameters, can be used more widely in low-income regions. SCORE models performed similarly in the European countries in which they were developed 13,14 . However, the SCORE-high model performed poorly in Norway, a high cardiovascular risk country 15 , suggesting that validation of a recommended model before application is essential for every country. Previously reported calibration statistics indicated that the SCORE-high model was well calibrated in Asian males 12 , but not in females, consistent with our study. The CVD death rate was quite low in this population, yet the SCORE-high model performed well in men. This may be because the observation period was 6 years instead of the 10 years used in the original study. In addition, CVD mortality would be higher over a 10-year follow-up, which needs further research to validate. The poor calibration of the SCORE models in women may result from underestimation of women's cardiovascular mortality risk, because Chinese women have shown high cardiovascular mortality 16 . The performance of the PCE models was evaluated in the Korean Heart Study population, showing AUCs of 0.727 (PCE-white) and 0.725 (PCE-AA) for men; the corresponding AUCs for women were 0.738 and 0.739 17 . A study in a Malaysian population, in contrast, showed moderate discrimination with an AUC of 0.63 and good calibration with χ2 = 12.6 (P = 0.12) 18 . In this study, both PCE models had good discrimination, and the PCE-white model was well calibrated for women according to the calibration χ2.
Despite the mainly Caucasian ethnicity in development of the cardiovascular risk prediction models (FRS, SCORE and PCE models) assessed in this study, following recalibration they were able to discriminate cardiovascular risk in Chinese rural population. This is most likely because these prediction models were developed from contemporary real population cohorts and contemporarily used in other countries with good discrimination. Finally, the recalibrated general-FRS, simplified-FRS, SCORE-high, CN-ICVD and PCE-AA model could be recommended for men, and general-FRS, CN-ICVD and PCE-white model for women to predict CVD mortality risk in Chinese rural population.
Perspectives. The early identification of high-risk individuals is a crucial strategy in primary prevention of cardiovascular diseases (CVD). Effective implementation of a strategy to identify these individuals in a clinical setting is reliant on the availability of appropriate CVD risk prediction models and guideline recommendations. Several well-known models for CVD mortality risk prediction have been developed and utilized in the USA and Europe, but might not be suitable for use in other regions or countries.
In this study, seven CVD mortality risk prediction models were assessed and recalibrated in a rural population. The general-FRS, simplified-FRS, SCORE-high, CN-ICVD and PCE-AA models for men, and the general-FRS, CN-ICVD and PCE-white models for women, predict CVD mortality risk in the Chinese rural population more accurately than the others and could be recommended in clinical practice or guidelines. These models might help rural health care practitioners and residents to predict the risk of CVD and improve preventive awareness of CVD. A subsequent study will focus on developing a CVD mortality risk prediction model for the Chinese rural population.

Strengths and Limitations. This study collected data from a relatively large population-based prospective cohort in Chinese rural regions. Although this is the first assessment of seven CVD death risk prediction models in a Chinese rural population, some limitations should be noted. First, the mortality information was obtained from the national death registration record. Since the record system was not perfect, some death events might have been missed, so deaths were confirmed with physicians of the local clinics. Second, data were collected on actual observed 6-year CVD mortality events, instead of the 10-year risk predicted in the seven original studies. Calibrations of the SCORE and PCE models were calculated by adjusting coefficients to predict 5-year CVD mortality risk, and we recalibrated the coefficients and baseline survival rate for 5 years. Third, the study was conducted in a single area of Henan Province; therefore, the results need to be validated in a larger population, preferably in a multicenter study. Despite these limitations, the findings are reasonably reliable and reflect real-world conditions.
Novelty and Significance. 1) What Is New? Different CVD mortality risk prediction models have been developed and widely validated in western countries. However, research validating these existing risk models in the Chinese population is limited, especially in rural regions. Our study assessed and recalibrated seven CVD mortality risk prediction models in a Chinese rural population. The main findings suggest that recalibration could improve both discrimination and calibration. Following recalibration, the general-FRS, simplified-FRS, SCORE-high, CN-ICVD and PCE-AA models for men, and the general-FRS, CN-ICVD and PCE-white models for women, predict CVD mortality risk more accurately than the others and could be recommended in the Chinese rural population (for use in clinical practice or guidelines). 2) What Is Relevant? The findings of this study might help rural health care practitioners and residents to predict the risk of CVD and improve preventive awareness of CVD. Moreover, the availability of an appropriate CVD mortality risk model is essential to reduce CVD deaths by identifying high-risk individuals, who can then be urged to change their lifestyle and receive treatment if necessary. Early identification of high-risk individuals also supports effective health education.

Conclusion
The study highlighted that it is crucial to assess and recalibrate cardiovascular mortality risk prediction models before applying them to clinical practice or guidelines, as not all risk prediction models can distinguish between high-risk and low-risk subjects. Following recalibration, the general-FRS, simplified-FRS, SCORE-high, CN-ICVD and PCE-AA models for men, and the general-FRS, CN-ICVD and PCE-white models for women, can be used to identify high cardiovascular risk in the Chinese rural population. Existing prediction models for CVD mortality should be applied with caution, and it is essential to assess and recalibrate the original models in the targeted population.

Methods
Study population and samples. This survey is a 6-year population-based prospective cohort study in rural areas of Henan Province, China. Details of the survey methods have been reported previously 19 . Briefly, the baseline survey was conducted from July to August of 2007 and of 2008, and data were collected by questionnaires, medical examinations and fasting blood samples. Subjects were permanent residents with no major disability or severe infectious diseases. The follow-up survey was completed in the same way from July to August of 2013 and July to October of 2014. There were 20194 participants aged 18 to 78 years in the original cohort. Participants with a baseline age outside the age range of interest were excluded from this study (4401 persons < 40 years; 3030 persons > 65 years). A further 1531 participants did not attend follow-up during the 6-year period. Of the remaining 11232 participants, 864 with cancer, chronic kidney disease or a prior history of CVD at baseline were excluded. Another 30 participants were excluded because of missing data needed to calculate the risk models. Eventually, a total of 10338 participants were eligible for analysis. This study was approved by the Medical Ethics Committee of Zhengzhou University. The methods were carried out in accordance with the relevant guidelines and regulations. All participants signed an informed-consent form.
Data Collection and Laboratory Measurements. Data were collected by specially trained physicians and public health workers who used standardized methods with stringent quality control. Information on demographic characteristics, family and individual disease history, diet and lifestyle was obtained by a standardized questionnaire.
Anthropometric data were also taken: height, weight, waist circumference and hip circumference were measured twice, and systolic blood pressure (SBP) and diastolic blood pressure (DBP) were recorded three times in the sitting position with a HEM-770A sphygmomanometer according to the American Heart Association's standardized protocol 20 . Blood specimens were collected after 8 to 10 hours of fasting in EDTA-K2 tubes for measurement of lipid profile and plasma glucose. Blood specimens were centrifuged at 4 °C and 3,000 rpm for 10 minutes, and the plasma was transferred and stored at −20 °C for biochemical analyses. Hypertension was defined as SBP ≥ 140 mm Hg and/or DBP ≥ 90 mm Hg, and/or a physician diagnosis of hypertension with current anti-hypertensive treatment, according to the 2010 Chinese guidelines for the management of hypertension 21 . Diabetes was defined as a fasting plasma glucose (FPG) ≥ 7.0 mmol/L and/or a physician diagnosis of diabetes 22 . Type 1 diabetes mellitus, gestational diabetes and other special types of diabetes were excluded.
Cardiovascular Disease outcomes. Cardiovascular mortality was the outcome of interest in this study.
Cardiovascular risk prediction models. Seven CVD mortality risk models were selected and validated in this survey: the Systematic Coronary Risk Evaluation (SCORE) models (SCORE-high and SCORE-low) 5 , the Framingham Risk Score (FRS) models (general-FRS and simplified-FRS) 4 , the American College of Cardiology (ACC) and American Heart Association (AHA) pooled cohort risk equation (PCE) models (PCE-white and PCE-AA) 6,23 , and a Chinese ischemic cardiovascular disease risk model (CN-ICVD) 24 . These models were included because they have similar risk factors and endpoints. All seven models predict the 10-year risk of CVD death. Furthermore, the risk factors needed to calculate the CVD mortality risk for these seven models were collected in this study. Both SCORE models were included because it was not known whether the low-risk or the high-risk model performs better in rural China.
As with the SCORE models, both PCE models were selected. Diabetes status is not considered in the original SCORE models; instead, it was recommended that the predicted risk be multiplied by two for men and by four for women with diabetes 5 . According to the current estimated effect of diabetes on CVD risk 25 , the predicted CVD death risk for individuals with diabetes was multiplied by three in men and by five in women in this study. The cardiovascular death risk and the classified risk category of each subject were calculated for all seven models. For the FRS, SCORE, and CN-ICVD models, cardiovascular risk was stratified into three categories: low, intermediate, and high. For both PCE models, however, CVD risk was stratified into four categories 6 : < 7.5%, 7.5-9.9%, 10.0-19.9% and ≥ 20%. Low CV risk was defined as a ten-year risk of < 10%, < 1% and < 5% for the FRS, SCORE and CN-ICVD models, respectively 4,5,24 . High risk was defined as ≥ 20% for both FRS models 4 , ≥ 10% for the CN-ICVD model 24 and ≥ 5% for both SCORE models 5 . All other values were assigned to the intermediate-risk group. Spearman's correlation coefficients were also calculated to assess the correlation between the rankings of each participant's absolute CV mortality risk. Agreement between different models was judged by the number of subjects misclassified between the extremes of risk categories, with lower numbers indicating better agreement.
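The diabetes adjustment and the category cut-offs described above amount to a small lookup, sketched here under stated assumptions. The thresholds and multipliers are those given in the text; the function names, the "male"/"female" labels, the representation of risk as a fraction, and the capping of adjusted risk at 1.0 are illustrative choices that the article does not specify.

```python
from typing import Dict, Tuple

# (low if risk < first value, high if risk >= second value), 10-year risk as a fraction.
THRESHOLDS: Dict[str, Tuple[float, float]] = {
    "FRS": (0.10, 0.20),
    "SCORE": (0.01, 0.05),
    "CN-ICVD": (0.05, 0.10),
}

# PCE models use four bands: <7.5%, 7.5-9.9%, 10.0-19.9%, >=20%.
PCE_CUTS = (0.075, 0.10, 0.20)
PCE_LABELS = ("<7.5%", "7.5-9.9%", "10.0-19.9%", ">=20%")


def adjust_score_for_diabetes(risk: float, sex: str, diabetic: bool) -> float:
    """Multiply a SCORE risk by 3 (men) or 5 (women) for people with diabetes.
    Capping at 1.0 is an assumption added here, not stated in the article."""
    if not diabetic:
        return risk
    factor = 3.0 if sex == "male" else 5.0
    return min(1.0, risk * factor)


def stratify(risk: float, family: str) -> str:
    """Map a predicted 10-year risk to the risk category of a model family."""
    if family == "PCE":
        for cut, label in zip(PCE_CUTS, PCE_LABELS):
            if risk < cut:
                return label
        return PCE_LABELS[-1]
    low_below, high_from = THRESHOLDS[family]
    if risk < low_below:
        return "low"
    if risk >= high_from:
        return "high"
    return "intermediate"


# Hypothetical example: a woman with diabetes and an unadjusted SCORE risk of 0.4%.
adjusted = adjust_score_for_diabetes(0.004, "female", True)
category = stratify(adjusted, "SCORE")
print(f"adjusted risk {adjusted:.3f} -> {category}")
```

The same 3% predicted risk is "low" under the FRS cut-offs but "intermediate" under the SCORE cut-offs, which is one reason the category distributions differ between models.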
Statistical Analysis. All statistical analyses were performed with the R software (version 3.2.3, https://www.R-project.org). Continuous variables were described by mean ± standard deviation (if normally distributed) or median (inter-quartile range) (if not normally distributed), while categorical data were reported as counts and percentages. The six-year predicted CVD mortality risk of each subject was calculated by the seven risk models, and then compared with the actual observed CVD deaths. Validity and predictive accuracy of the CVD risk models were assessed based on their discrimination and calibration. A 2-tailed P value < 0.05 was considered significant. The C-statistic was calculated to evaluate the discriminative power of the risk models. The C-statistic is also known as the area under the receiver operating characteristic (ROC) curve (AUC), and the ROC curve was plotted. Calibration was assessed statistically by the modified Nam-D'Agostino test 26,27 to determine whether the observed cardiovascular deaths differed significantly from the expected 28 . Kaplan-Meier analysis was used to obtain the number of observed CVD death events 1 , which was then compared with the predicted events in each group in calibration charts. Ideally, a well-calibrated model performs well under a variety of divisions into groups. We started with deciles (10 groups), but collapsed small deciles with their closest neighbors until all groups contained a predefined minimum number of events (at least 2 per group), following the prior study by Demler and colleagues 27 . Finally, all seven models were evaluated by collapsing into five groups for women. All models were evaluated by collapsing into five groups except the CN-ICVD model, which was assessed with four groups. Calibration was also assessed graphically by plotting the observed and expected mortality events, grouped into five or four groups of predicted probability, respectively.
CVD models Recalibration.
All seven original models were assessed, but their discrimination and calibration were poor (results not shown). In the adjusted FRS, CN-ICVD and PCE models, the coefficients were taken from the original models, but mean values from the current Chinese study were used for the risk factors and the mean incidence rates. As mean values of risk factors are not included in the original SCORE models, the SCORE models were adjusted by diabetes status (see Cardiovascular risk prediction models). Discrimination was calculated for all seven models and calibration was tested for four models: SCORE-high, SCORE-low, PCE-white and PCE-AA. For model calibration only, the 5-year risks of fatal CVD were estimated by replacing S0(10) with S0(5) based on a previous study 29 in the PCE models and by adjusting all relevant regression equations of Conroy et al. 5 in the SCORE models.
In theory, a more appropriate integrated event risk prediction model could be produced by adapting coefficients to correct for different background incidence rates in different cultures 30,31 . Thus, recalibration in this study was based on adjusting the coefficients and the mean value of each risk factor. The coefficients and mean values were re-estimated from the current Chinese population if the original model was expressed in the form of formula (1).
Finally, the FRS, CN-ICVD and PCE models were recalibrated and assessed with this approach. For the SCORE-high and SCORE-low models, the time parameter of the equations was modified following Conroy et al. 5 , and the diabetes status adjustment was applied.
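Formula (1) is not reproduced in this text. As a hedged illustration, the sketch below assumes the Cox-type form commonly used by Framingham-style and pooled cohort equations, risk = 1 − S0(t)^exp(Σβx − Σβx̄), and shows how swapping in local mean values and a local baseline survival changes the prediction. All coefficients, means and survival values are hypothetical placeholders, not estimates from this study, and the function name is illustrative.

```python
import math
from typing import Sequence


def cox_type_risk(x: Sequence[float], beta: Sequence[float],
                  mean_x: Sequence[float], s0: float) -> float:
    """Risk = 1 - S0(t) ** exp(sum(beta * x) - sum(beta * mean_x)).
    s0 is the baseline survival at the chosen horizon (e.g. 5 or 10 years)."""
    lp = sum(b * v for b, v in zip(beta, x))
    lp_mean = sum(b * m for b, m in zip(beta, mean_x))
    return 1.0 - s0 ** math.exp(lp - lp_mean)


# Hypothetical two-factor model: age (years) and SBP (mm Hg).
beta = [0.05, 0.02]

# "Original" setting: source-cohort means and 10-year baseline survival.
original_mean = [50.0, 125.0]
original_s0_10y = 0.95

# "Recalibrated" setting: local cohort means and 5-year baseline survival.
local_mean = [52.0, 130.0]
local_s0_5y = 0.99

person = [60.0, 150.0]
risk_original = cox_type_risk(person, beta, original_mean, original_s0_10y)
risk_recalibrated = cox_type_risk(person, beta, local_mean, local_s0_5y)
risk_at_local_mean = cox_type_risk(local_mean, beta, local_mean, local_s0_5y)

print(f"original: {risk_original:.3f}, recalibrated: {risk_recalibrated:.3f}")
```

A person whose risk factors equal the cohort means receives exactly the cohort's baseline risk, 1 − S0(t), which is why replacing the means and baseline survival anchors the model to the local population. Re-estimating the coefficients themselves, as this study also did, would additionally require refitting the model on the local data.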