The increasing availability of large healthcare databases is fueling an intense debate on whether real-world data should play a role in the assessment of the benefit–risk of medical treatments. In many observational studies, for example, statin users were found to have a substantially lower risk of cancer, a benefit not seen in meta-analyses of randomized trials. Although such discrepancies are often attributed to a lack of randomization in the observational studies, they might be explained by flaws that can be avoided by explicitly emulating a target trial (the randomized trial that would answer the question of interest). Using the electronic health records of 733,804 UK adults, we emulated a target trial of statins and cancer and compared our estimates with those obtained using previously applied analytic approaches. Over the 10-yr follow-up, 28,408 individuals developed cancer. Under the target trial approach, estimated observational analogs of intention-to-treat and per-protocol 10-yr cancer-free survival differences were −0.5% (95% confidence interval (CI) −1.0%, 0.0%) and −0.3% (95% CI −1.5%, 0.5%), respectively. By contrast, previous analytic approaches yielded estimates that appeared to be strongly protective. Our findings highlight the importance of explicitly emulating a target trial to reduce bias in the effect estimates derived from observational analyses.
This study is based in part on data from the Clinical Practice Research Datalink obtained under license from the UK Medicines and Healthcare Products Regulatory Agency. The data are provided by patients and collected by the UK National Health Service (NHS) as part of their care and support. The interpretation and conclusions contained in this study are those of the authors alone. Because electronic health records are classified as sensitive data by the UK Data Protection Act, information governance restrictions (to protect patient confidentiality) prevent data sharing via public deposition. Data are available with approval through the individual constituent entities controlling access to the data. Specifically, the primary care data can be requested via application to the Clinical Practice Research Datalink (https://www.cprd.com).
Access to the computer code used in this research is available by request to the corresponding author.
This research was partly supported by NIH grant P01 CA134294. B.A.D. is supported by an ASISA Fellowship.
The authors declare no competing interests.
Peer review information Jennifer Sargent was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Estimated hazard ratios for cancer diagnosis comparing statin therapy with no statin therapy, stratified by age, sex and coronary heart disease status, CALIBER, 1999–2016.
Sensitivity analysis with a 60-d, rather than 30-d, maximum gap between successive prescriptions. Estimated hazard ratios for cancer diagnosis comparing statin therapy with no statin therapy, CALIBER, 1999–2016.
Sensitivity analysis additionally adjusting for physical activity, alcohol consumption, family history of cancer, practice region, influenza vaccination in the past year, and cancer screening in the past year. Estimated hazard ratios for cancer diagnosis comparing statin therapy with no statin therapy, CALIBER, 1999–2016.
Sensitivity analysis adjusting for ever-diagnosis (i.e., having ever received a diagnosis) with cardiovascular disease and diabetes by carrying forward indicators. Estimated hazard ratios for cancer diagnosis comparing statin therapy with no statin therapy, CALIBER, 1999–2016.
Sensitivity analysis truncating weights at their 99.5th percentile. Estimated hazard ratios for cancer diagnosis comparing statin therapy with no statin therapy, CALIBER, 1999–2016.
Sensitivity analysis additionally applying weights for censoring due to loss to follow-up. Estimated hazard ratios for cancer diagnosis comparing statin therapy with no statin therapy, CALIBER, 1999–2016.
Estimated hazard ratios and 95% confidence intervals for total cancer diagnosis and type 2 diabetes diagnosis comparing statin therapy with no statin therapy, when emulating a target trial and when replicating the approach of previous observational analyses, CALIBER, 1999–2016.
Covariates used in the primary and sensitivity analyses when emulating a target trial of statin therapy and cancer risk, CALIBER, 1999–2016.
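Two of the sensitivity analyses captioned above, the maximum gap allowed between successive prescriptions (30 d versus 60 d) and the truncation of weights at their 99.5th percentile, can be illustrated with a minimal sketch. This is not the authors' code (which is available only by request); the function names, the choice to measure gaps between prescription dates rather than between the end of one supply and the start of the next, and the toy inputs are all assumptions made for illustration.

```python
from datetime import date

import numpy as np


def truncate_weights(weights, upper_percentile=99.5):
    """Cap weights at the given upper percentile (hypothetical helper).

    Values above the percentile are set equal to it; all others are unchanged.
    """
    w = np.asarray(weights, dtype=float)
    cap = np.percentile(w, upper_percentile)
    return np.minimum(w, cap)


def end_of_continuous_use(prescription_dates, max_gap_days=30):
    """Return the last prescription date before the first gap that
    exceeds max_gap_days (hypothetical helper).

    Assumption: the gap is measured between prescription dates, ignoring
    the days of supply of each prescription.
    """
    dates = sorted(prescription_dates)
    last = dates[0]
    for nxt in dates[1:]:
        if (nxt - last).days > max_gap_days:
            break  # treatment is considered discontinued at this gap
        last = nxt
    return last


# Toy data (invented): one extreme weight and one long prescription gap.
toy_weights = [1.0] * 199 + [50.0]
toy_dates = [date(2010, 1, 1), date(2010, 1, 25),
             date(2010, 3, 10), date(2010, 4, 1)]

truncated = truncate_weights(toy_weights)
end_30 = end_of_continuous_use(toy_dates, max_gap_days=30)
end_60 = end_of_continuous_use(toy_dates, max_gap_days=60)
```

With these toy inputs, the 44-day gap between the second and third prescriptions ends continuous use under the 30-d rule but not under the 60-d rule, which is the kind of definitional choice the sensitivity analysis probes.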
Cite this article
Dickerman, B.A., García-Albéniz, X., Logan, R.W. et al. Avoidable flaws in observational analyses: an application to statins and cancer. Nat Med 25, 1601–1606 (2019) doi:10.1038/s41591-019-0597-x