To the Editor:

Zhu et al. [1] recently used Cox proportional hazards (CPH) analysis to evaluate the prognostic significance of various clinical and pathological factors in patients with anal squamous cell carcinoma. The authors conclude that HPV status, patient age, tumor stage, and lymph node involvement by tumor are each independently associated with the overall survival (OS) of these patients. It is likely, however, that these conclusions are inaccurate, as they are derived from a flawed statistical analysis.

The validity of OS analysis is inherently linked to the number of events (deaths) observed within a population rather than to the overall size of the population itself. In CPH analysis, statistical validity is tied to the events per variable (EPV) ratio, which is the ratio of events (deaths) observed in a population to the number of variables analyzed. Analyses with larger EPV ratios are more robust, while those with smaller EPV ratios are increasingly susceptible to bias. To minimize the risk of introducing error in CPH analysis, an EPV ratio greater than 20:1 is recommended [2]. Additionally, it is widely accepted that the minimum EPV ratio required to perform CPH analysis is 10:1, and that particularly high frequencies of type I error occur in CPH analyses with an EPV ratio of less than 5:1 [3,4,5,6,7].

Zhu et al. fail to specify the number of events (deaths) observed in their cohort. Instead, they report that 74 patients were “free” of cancer and 37 patients were “not free” of cancer [1]. If one assumes that the patients listed as “free” of cancer were alive at their most recent follow-up, and that every patient listed as “not free” of cancer had died, this leaves a maximum of 37 deaths (events) in the cohort. Given that the authors analyze eight variables in their CPH model (HPV status, patient age, gender, T stage, lymph node involvement, surgery, radiation, and chemotherapy), the EPV ratio could not have been greater than 37/8, or approximately 4.6:1. This is markedly lower than the widely recommended EPV ratio of 20:1 and the minimum required EPV ratio of 10:1. Moreover, Zhu et al. included both continuous and categorical variables in their CPH analysis. CPH models that include both continuous and categorical variables are more prone to error than CPH models containing either continuous or categorical variables alone. As such, a minimum EPV ratio of 40:1 has been recommended for CPH models that combine both types of variable [6].
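The arithmetic behind this upper bound can be checked with a short sketch. It is illustrative only: the function and constant names are hypothetical, the event count of 37 is the assumed maximum described above (not a figure reported by Zhu et al.), and the thresholds are simply those quoted in this letter.

```python
def events_per_variable(events: int, variables: int) -> float:
    """Return the EPV ratio: observed events divided by variables analyzed."""
    if variables <= 0:
        raise ValueError("number of variables must be positive")
    return events / variables


# Assumption: every patient listed as "not free" of cancer died, which gives
# the largest event count compatible with the published figures.
MAX_EVENTS = 37
# HPV status, age, gender, T stage, lymph node involvement,
# surgery, radiation, chemotherapy.
N_VARIABLES = 8

# EPV thresholds as quoted in the text of this letter.
THRESHOLDS = {
    "high type I error below": 5,
    "minimum required": 10,
    "recommended": 20,
    "recommended for mixed continuous/categorical models": 40,
}

epv = events_per_variable(MAX_EVENTS, N_VARIABLES)
print(f"Maximum possible EPV: {epv:.3f}:1")

for label, ratio in THRESHOLDS.items():
    # Events that would be needed to reach this ratio with eight variables.
    needed = ratio * N_VARIABLES
    status = "met" if epv >= ratio else "not met"
    print(f"{label} ({ratio}:1): {status}; would need {needed} events")
```

Under these assumptions the maximum EPV falls below every threshold listed; reaching even the 10:1 minimum with eight variables would require 80 deaths, more than twice the largest number the cohort could contain.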

The EPV ratio available to Zhu et al. for their CPH analysis falls considerably short of the thresholds required for an adequate CPH analysis. It is thus likely that their CPH results, and the conclusions derived from them, are biased. The incorrect application and interpretation of statistical analysis leads to the incorrect reporting of data, which may mislead the medical field [3,8,9,10]. To avoid these inaccuracies, it is imperative that researchers inform themselves of the correct applications and the limitations of the statistical tests they use. We sincerely hope that this article brings attention to the limitations of CPH analysis and encourages researchers to respect them.