
  • Technical Report

Competing risk analysis using R: an easy guide for clinicians

Abstract

Over the last decade, with the widespread use of quantitative analyses in medical research, close co-operation between statisticians and physicians has become essential, from experimental design through all phases of complex statistical analysis. At the same time, easy-to-use statistical packages allow clinicians to perform basic statistical analyses themselves. Since the software they most commonly use does not perform in-depth competing risk analysis, we recommend an add-on package for the R statistical software. We provide all the instructions for downloading it from the internet and illustrate how to use it to analyse a sample dataset of patients who underwent haematopoietic stem cell transplantation for acute leukaemia.


References

  1. Klein JP, Moeschberger ML. Survival Analysis: Techniques for Censored and Truncated Data, 2nd edn. Springer: New York, 2003, 536pp.

  2. Pintilie M. Competing Risks: A Practical Perspective. John Wiley & Sons: New York, 2006, 240pp.

  3. Klein JP, Rizzo JD, Zhang MJ, Keiding N. Statistical methods for the analysis and presentation of the results of bone marrow transplants. Part I: unadjusted analysis. Bone Marrow Transplant 2001; 28: 909–915.

  4. Satagopan JM, Ben-Porat L, Berwick M, Robson M, Kutler D, Auerbach AD. A note on competing risks in survival data analysis. Br J Cancer 2004; 91: 1229–1235.

  5. Gray RJ. A class of K-sample tests for comparing the cumulative incidence of a competing risk. Ann Stat 1988; 16: 1141–1154.

  6. R Development Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing 2006, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.

  7. Dalgaard P. Introductory Statistics with R. Springer: New York, 2002, 267pp.

  8. Iacus SM, Masarotto G. Laboratorio Di Statistica Con R. McGraw-Hill: Milano, 2002, 384pp.

  9. Venables WN, Ripley BD. Modern Applied Statistics with S, 4th edn. Springer: New York, 2002, 495pp.

  10. Choudhury JB. Non-parametric confidence interval estimation for competing risks analysis: application to contraceptive data. Stat Med 2002; 21: 1129–1144.

  11. Fine JP, Gray RJ. A proportional hazards model for the subdistribution of a competing risk. J Am Stat Assoc 1999; 94: 496–509.

  12. Marubini E, Valsecchi MG. Analysing Survival Data from Clinical Trials and Observational Studies. Wiley: New York, 1995, 430pp.

  13. Pepe MS, Mori M. Kaplan–Meier, marginal or conditional probability curves in summarizing competing risks failure time data? Stat Med 1993; 12: 737–751.

  14. Lin DY. Non-parametric inference for cumulative incidence functions in competing risks studies. Stat Med 1997; 16: 901–910.

Acknowledgements

We thank Dr Geraldine Anne Boyd, University of Perugia, for editing this paper.

Author information

Corresponding author

Correspondence to F Aversa.

Appendices

Appendix A

Competing risk analysis studies failure probabilities when each individual may fail from one of several causes, called competing events. The cumulative incidence function (CIF) is defined as the probability of failing from cause r (r = 1, …, k, where k is the number of causes of failure) up to a certain time point t. Formally, it may be written as

$$I_r(t) = \Pr(T \le t,\ \text{cause} = r) = \int_0^t \lambda_r(u)\, S(u)\, du,$$

where $\lambda_r(t)$ is the cause-specific hazard rate and $S(t) = \Pr(T \ge t)$ is the survival function. The non-parametric maximum likelihood estimate (MLE) of the cause-specific CIF is computed as follows:

$$\hat{I}_r(t) = \sum_{j:\, t_j \le t} \frac{d_{rj}}{n_j}\, \hat{S}(t_{j-1}),$$

where $d_{rj}$ is the number of failures at time $t_j$ from cause r, $n_j$ is the number of individuals at risk at time $t_j$, and $\hat{S}(t_{j-1})$ is the Kaplan–Meier estimate of the overall survival function just before $t_j$ (with $\hat{S}(t_0) = 1$). Note that $\sum_{r=1}^{k} \hat{I}_r(t) = 1 - \hat{S}(t)$, that is, the sum of the cumulative incidences from all causes equals 1 minus the Kaplan–Meier estimate of overall survival.
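The estimator described above can be illustrated with a short sketch. It is written in plain Python purely as a language-neutral illustration and is not the add-on R package recommended in this guide. The function name `cumulative_incidence`, the coding of censored observations as cause 0, and the toy data are all assumptions made for this example.

```python
def cumulative_incidence(times, causes):
    """Non-parametric cumulative incidence estimate for each cause.

    times  : observed follow-up times
    causes : 0 for a censored observation, 1..k for the cause of failure
             (this coding is an assumption of the sketch)
    Returns (event_times, cif, km): cif[r] and km are lists aligned with
    event_times; km is the Kaplan-Meier estimate of overall survival.
    """
    data = sorted(zip(times, causes))
    cause_ids = sorted({c for c in causes if c != 0})
    running = {r: 0.0 for r in cause_ids}   # current value of each CIF
    cif = {r: [] for r in cause_ids}
    event_times, km = [], []
    surv_prev = 1.0                         # Kaplan-Meier value before the first event
    n_at_risk = len(data)

    i = 0
    while i < len(data):
        t = data[i][0]
        tied = []                           # causes of all observations at time t
        while i < len(data) and data[i][0] == t:
            tied.append(data[i][1])
            i += 1
        d_total = sum(1 for c in tied if c != 0)   # failures from any cause at t
        if d_total > 0:
            for r in cause_ids:
                # increment: (d_rj / n_j) * S(t_{j-1})
                running[r] += tied.count(r) / n_at_risk * surv_prev
            surv_prev *= 1.0 - d_total / n_at_risk  # Kaplan-Meier update
            event_times.append(t)
            km.append(surv_prev)
            for r in cause_ids:
                cif[r].append(running[r])
        n_at_risk -= len(tied)              # failures and censorings leave the risk set
    return event_times, cif, km


# Toy data (hypothetical): two competing causes, 0 = censored.
times = [1, 2, 2, 3, 4, 5, 6, 7]
causes = [1, 0, 2, 1, 0, 2, 1, 0]
event_times, cif, km = cumulative_incidence(times, causes)
for j, t in enumerate(event_times):
    print(t, round(cif[1][j], 4), round(cif[2][j], 4), round(km[j], 4))
```

At every event time the estimated incidences in this sketch add up to one minus the Kaplan–Meier estimate, which gives a quick consistency check on any implementation.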

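Confidence intervals can be derived (ref. 10) using the ln(−ln) transformation, so the (1−α)100% confidence interval for the cumulative incidence function at time t for cause r is given by

$$\hat{I}_r(t)^{\exp\left\{ \pm\, z_{\alpha/2}\, \hat{\sigma}_r(t) \,/\, \left[ \hat{I}_r(t)\, \ln \hat{I}_r(t) \right] \right\}},$$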

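where $z_{\alpha/2}$ is the upper α/2 percentile of the standard normal distribution, and $\hat{\sigma}_r(t)$ is the square root of the estimated variance of $\hat{I}_r(t)$. This can be calculated as follows (see Marubini and Valsecchi, p 341, eq. 10.12; ref. 12):

$$\widehat{\operatorname{Var}}\left[\hat{I}_r(t)\right] = \sum_{j:\, t_j \le t} \left[\hat{I}_r(t) - \hat{I}_r(t_j)\right]^2 \frac{d_j}{n_j (n_j - d_j)} + \sum_{j:\, t_j \le t} \hat{S}(t_{j-1})^2\, \frac{(n_j - d_{rj})\, d_{rj}}{n_j^3} - 2 \sum_{j:\, t_j \le t} \left[\hat{I}_r(t) - \hat{I}_r(t_j)\right] \hat{S}(t_{j-1})\, \frac{d_{rj}}{n_j^2},$$

where $d_j = \sum_{r=1}^{k} d_{rj}$.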
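As an illustration of the ln(−ln) interval, the sketch below back-transforms a point estimate and its standard error into lower and upper bounds of the form Î^exp(± z σ / (Î ln Î)). It is plain Python rather than R and is not taken from the article. The function name `cif_confidence_interval` and the example values are hypothetical, and the standard error is assumed to have been computed already.

```python
import math
from statistics import NormalDist


def cif_confidence_interval(cif, se, level=0.95):
    """Confidence interval for a cumulative incidence via the ln(-ln) transform.

    cif   : point estimate of the cumulative incidence, strictly between 0 and 1
    se    : estimated standard error of cif (assumed to be supplied by the caller)
    level : confidence level, for example 0.95
    """
    if not 0.0 < cif < 1.0:
        raise ValueError("cif must lie strictly between 0 and 1")
    if se < 0.0:
        raise ValueError("se must be non-negative")
    # upper alpha/2 percentile of the standard normal distribution
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    # z * se / (cif * ln cif); this is <= 0 because ln(cif) < 0
    scale = z * se / (cif * math.log(cif))
    lower = cif ** math.exp(-scale)   # exponent > 1 pulls the estimate down
    upper = cif ** math.exp(scale)    # exponent < 1 pushes the estimate up
    return lower, upper


# Hypothetical values: estimate 0.30 with standard error 0.05.
lower, upper = cif_confidence_interval(0.30, 0.05)
print(lower, upper)
```

Unlike a symmetric interval built directly on the estimate, the transformed bounds always stay inside (0, 1), which is the usual motivation for this choice.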

Finally, cause-specific CIFs in different groups can be compared using one of the tests proposed by, among others, Gray (ref. 5), Pepe and Mori (ref. 13), and Lin (ref. 14).

Appendix B

Cite this article

Scrucca, L., Santucci, A. & Aversa, F. Competing risk analysis using R: an easy guide for clinicians. Bone Marrow Transplant 40, 381–387 (2007). https://doi.org/10.1038/sj.bmt.1705727
