Abstract
The increasing availability of large healthcare databases is fueling an intense debate on whether real-world data should play a role in the assessment of the benefit–risk of medical treatments. In many observational studies, for example, statin users were found to have a substantially lower risk of cancer than was observed in meta-analyses of randomized trials. Although such discrepancies are often attributed to a lack of randomization in the observational studies, they might instead be explained by flaws that can be avoided by explicitly emulating a target trial (the randomized trial that would answer the question of interest). Using the electronic health records of 733,804 UK adults, we emulated a target trial of statins and cancer and compared our estimates with those obtained using previously applied analytic approaches. Over the 10-yr follow-up, 28,408 individuals developed cancer. Under the target trial approach, estimated observational analogs of intention-to-treat and per-protocol 10-yr cancer-free survival differences were −0.5% (95% confidence interval (CI) −1.0%, 0.0%) and −0.3% (95% CI −1.5%, 0.5%), respectively. By contrast, previous analytic approaches yielded estimates that appeared to be strongly protective. Our findings highlight the importance of explicitly emulating a target trial to reduce bias in the effect estimates derived from observational analyses.
Data availability
This study is based in part on data from the Clinical Practice Research Datalink obtained under license from the UK Medicines and Healthcare Products Regulatory Agency. The data are provided by patients and collected by the UK National Health Service (NHS) as part of their care and support. The interpretation and conclusions contained in this study are those of the authors alone. Because electronic health records are classified as sensitive data by the UK Data Protection Act, information governance restrictions (to protect patient confidentiality) prevent data sharing via public deposition. Data are available with approval through the individual constituent entities controlling access to the data. Specifically, the primary care data can be requested via application to the Clinical Practice Research Datalink (https://www.cprd.com).
Code availability
Access to the computer code used in this research is available by request to the corresponding author.
Acknowledgements
This research was partly supported by NIH grant P01 CA134294. B.A.D. is supported by an ASISA Fellowship.
Author information
Contributions
B.A.D., X.G.-A., S.D. and M.A.H. conceived the overall study. B.A.D. analyzed the data. All authors contributed to the design and analyses. R.W.L. provided key input in processing data from the database. All authors contributed to the interpretation of the results. B.A.D. and M.A.H. drafted the manuscript, which was reviewed, revised and approved by all authors.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Jennifer Sargent was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1
Estimated hazard ratios for cancer diagnosis comparing statin therapy with no statin therapy, stratified by age, sex and coronary heart disease status, CALIBER, 1999–2016.
Extended Data Fig. 2
Sensitivity analysis with a 60-d, rather than 30-d, maximum gap between successive prescriptions. Estimated hazard ratios for cancer diagnosis comparing statin therapy with no statin therapy, CALIBER, 1999–2016.
Extended Data Fig. 3
Sensitivity analysis additionally adjusting for physical activity, alcohol consumption, family history of cancer, practice region, influenza vaccination in the past year, and cancer screening in the past year. Estimated hazard ratios for cancer diagnosis comparing statin therapy with no statin therapy, CALIBER, 1999–2016.
Extended Data Fig. 4
Sensitivity analysis adjusting for ever-diagnosis (i.e., having ever received a diagnosis) with cardiovascular disease and diabetes by carrying forward indicators. Estimated hazard ratios for cancer diagnosis comparing statin therapy with no statin therapy, CALIBER, 1999–2016.
Extended Data Fig. 5
Sensitivity analysis truncating weights at their 99.5th percentile. Estimated hazard ratios for cancer diagnosis comparing statin therapy with no statin therapy, CALIBER, 1999–2016.
Extended Data Fig. 6
Sensitivity analysis additionally applying weights for censoring due to loss to follow-up. Estimated hazard ratios for cancer diagnosis comparing statin therapy with no statin therapy, CALIBER, 1999–2016.
Extended Data Fig. 7
Estimated hazard ratios and 95% confidence intervals for total cancer diagnosis and type 2 diabetes diagnosis comparing statin therapy with no statin therapy, when emulating a target trial and when replicating the approach of previous observational analyses, CALIBER, 1999–2016.
Extended Data Fig. 8
Covariates used in the primary and sensitivity analyses when emulating a target trial of statin therapy and cancer risk, CALIBER, 1999–2016.
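The sensitivity analysis in Extended Data Fig. 5 truncates weights at their 99.5th percentile. As a rough illustration of what such truncation does, the sketch below caps every weight above the 99.5th percentile at that percentile value. It is not the authors' code, which is available only on request: the function names `percentile` and `truncate_weights`, the linear-interpolation percentile rule, and the toy weights are all assumptions made for this example, and the caption does not say how the percentile was computed.

```python
def percentile(values, q):
    """Return the q-th percentile (q in [0, 100]) using linear interpolation.

    Hypothetical helper; the interpolation rule is an assumption.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must be non-empty")
    pos = (len(ordered) - 1) * q / 100.0   # fractional index into the sorted list
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac


def truncate_weights(weights, q=99.5):
    """Cap each weight at the q-th percentile of all weights (upper tail only)."""
    cap = percentile(weights, q)
    return [min(w, cap) for w in weights]


if __name__ == "__main__":
    # Toy weights (not study data): 199 ordinary weights and one extreme weight.
    weights = [1.0] * 199 + [50.0]
    truncated = truncate_weights(weights)
    print("largest weight before:", max(weights))
    print("largest weight after: ", max(truncated))
```

Truncation of this kind limits the influence of a few individuals with very large weights, trading a small amount of bias for lower variance, which is why it is usually reported as a sensitivity analysis rather than as the primary analysis.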
Cite this article
Dickerman, B.A., García-Albéniz, X., Logan, R.W. et al. Avoidable flaws in observational analyses: an application to statins and cancer. Nat Med 25, 1601–1606 (2019). https://doi.org/10.1038/s41591-019-0597-x
DOI: https://doi.org/10.1038/s41591-019-0597-x