Cancer patient survival can be parametrized to improve trial precision and reveal time-dependent therapeutic effects

Individual participant data (IPD) from oncology clinical trials are invaluable for identifying factors that influence trial success and failure, improving trial design and interpretation, and comparing pre-clinical studies to clinical outcomes. However, the IPD used to generate published survival curves are not generally publicly available. We impute survival IPD from ~500 arms of Phase 3 oncology trials (representing ~220,000 events) and find that they are well fit by a two-parameter Weibull distribution. Use of Weibull functions with overall survival significantly increases the precision of small arms typical of early-phase trials: analysis of a 50-patient trial arm using parametric forms is as precise as traditional, non-parametric analysis of a 90-patient arm. We also show that frequent deviations from the Cox proportional hazards assumption, particularly in trials of immune checkpoint inhibitors, arise from time-dependent therapeutic effects. Trial duration therefore has an underappreciated impact on the likelihood of success.
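As an illustration of what fitting a two-parameter Weibull distribution to right-censored survival data involves, the following is a minimal Python sketch. It is not the authors' Mathematica code. It assumes the common parametrization S(t) = exp(-(t/scale)^shape) and estimates both parameters by maximum likelihood; the function names `fit_weibull` and `simulate_arm`, and the simulated 50-patient arm, are hypothetical and used only for illustration.

```python
import math
import random


def fit_weibull(times, events, lo=1e-3, hi=50.0, tol=1e-9):
    """Maximum-likelihood fit of S(t) = exp(-(t / scale) ** shape).

    times  : positive follow-up times (event time or censoring time)
    events : 1 if the event was observed at that time, 0 if right-censored
    Returns (scale, shape). Assumes the shape lies between lo and hi.
    """
    d = sum(events)
    if d == 0:
        raise ValueError("at least one observed event is required")
    t_max = max(times)
    s = [t / t_max for t in times]  # rescale so s ** k cannot overflow
    log_s = [math.log(x) for x in s]
    mean_log_event = sum(l for l, e in zip(log_s, events) if e) / d

    def score(k):
        # Profile score for the shape k; the scale is solved out analytically.
        w = [x ** k for x in s]
        weighted = sum(wi * li for wi, li in zip(w, log_s)) / sum(w)
        return weighted - 1.0 / k - mean_log_event

    # score(k) is increasing in k, so bisection locates its root.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) < 0:
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    scale = t_max * (sum(x ** shape for x in s) / d) ** (1.0 / shape)
    return scale, shape


def simulate_arm(n, scale, shape, cutoff, rng):
    """Hypothetical trial arm: Weibull event times, censored at `cutoff`."""
    raw = [rng.weibullvariate(scale, shape) for _ in range(n)]
    times = [min(t, cutoff) for t in raw]
    events = [1 if t <= cutoff else 0 for t in raw]
    return times, events


# Example: a simulated 50-patient arm followed for 30 time units.
times, events = simulate_arm(50, 20.0, 1.5, 30.0, random.Random(1))
scale_hat, shape_hat = fit_weibull(times, events)
print(f"scale = {scale_hat:.2f}, shape = {shape_hat:.2f}")
```

Under this parametrization the hazard is (shape/scale)(t/scale)^(shape-1), so the hazard ratio between two arms is constant over time only when their shape parameters are equal. This is one way a parametric description can make departures from proportional hazards explicit.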


All data generated or analyzed during this study are included in this published article (and its supplementary information files). Data are also available through the website https://cancertrials.io./ and Synapse (ID: syn25813713).
The original data set consisted of 153 unique trials in breast, colorectal, lung, and prostate cancer, in the metastatic and non-metastatic settings, from 2014 to 2016 that met the search criteria. For additional information on study selection, data extraction, and reconstruction procedures, see: No new clinical trials were performed as part of the study. Sample size information for specific studies is described in the original trial publications (list in Supplementary Data 1).
Trials were removed from the original data set if the imputed patient data were inconsistent with the associated trial publication (e.g., if the number of patients in the published at-risk table differed from the number in the imputed data). The quality of the data imputation was confirmed quantitatively, by calculating the hazard ratio for the imputed data and comparing it to the corresponding trial's reported hazard ratio, and qualitatively, by overlaying the Kaplan-Meier curve generated from the imputed data on the published curve. Trials with a hazard ratio difference greater than 0.1, or with perceptible visual differences, were removed from the final data set and not analyzed further.
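The exclusion rules above can be sketched as a simple filter. This is an illustrative Python sketch, not the authors' implementation: the record type `TrialCheck`, its field names, and the example values are hypothetical. The hazard ratios are assumed to have been computed beforehand, and the visual comparison of Kaplan-Meier curves is represented by a manually assigned flag.

```python
from dataclasses import dataclass

HR_TOLERANCE = 0.1  # maximum allowed |imputed HR - reported HR|, from the text


@dataclass
class TrialCheck:
    """Hypothetical per-trial record of imputation quality checks."""
    trial_id: str
    reported_hr: float       # hazard ratio stated in the trial publication
    imputed_hr: float        # hazard ratio recomputed from the imputed IPD
    n_published: int         # patients in the published at-risk table
    n_imputed: int           # patients in the imputed data
    curves_match: bool       # manual judgment from overlaying the KM curves


def passes_quality_control(check: TrialCheck) -> bool:
    """Return True if the trial satisfies all three criteria."""
    counts_agree = check.n_published == check.n_imputed
    hr_agrees = abs(check.imputed_hr - check.reported_hr) <= HR_TOLERANCE
    return counts_agree and hr_agrees and check.curves_match


# Made-up example records, for illustration only.
checks = [
    TrialCheck("trial_A", 0.75, 0.78, 400, 400, True),   # kept
    TrialCheck("trial_B", 0.60, 0.75, 350, 350, True),   # HR differs by 0.15
    TrialCheck("trial_C", 0.90, 0.91, 500, 497, True),   # patient counts differ
    TrialCheck("trial_D", 0.82, 0.80, 300, 300, False),  # visible curve mismatch
]
retained = [c.trial_id for c in checks if passes_quality_control(c)]
print(retained)
```

Whether the hazard ratio difference is taken on the raw or log scale is not stated in the text; the sketch uses the raw difference as an assumption.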
All code was re-executed in preparation for manuscript submission, and the reproducibility of the results was confirmed. Each piece of code is provided in a folder containing a Mathematica Notebook (.nb), all data required by the code, and the corresponding code output (Supplementary Data 2). With the source data kept in the same folder as the code, the Mathematica Notebook can be executed in Wolfram Mathematica by selecting "Evaluate Notebook" from the "Evaluation" menu.
Randomization was not relevant to our study as the work consisted of a re-analysis of existing clinical trial data. No new clinical trials were performed as part of the study. Randomization information for specific studies is described in the original trial publications (list in Supplementary Data 1).
Blinding was not relevant to our study as the work consisted of a re-analysis of existing clinical trial data. No new clinical trials were performed as part of the study. Blinding information for specific studies is described in the original trial publications (list in Supplementary Data 1).