Introduction

Dynamic Treatment Regimens (DTRs), also known as adaptive treatment strategies or adaptive interventions, are a set of sequential decision rules, each one corresponding to a key decision point in the patient’s history [1]. Each rule establishes the treatment for the patient among the available treatment options according to the information collected until then.

The DTR represents a formalisation of the multi-stage and dynamic decision process followed by clinicians in their everyday clinical practice. The final aim of the decision process is to tailor the treatment to the patients’ characteristics and clinical history, which is the key concept of precision medicine. In this sense, identifying the optimal DTR would be a way to put evidence-based precision medicine into practice, especially in chronic disease management [2], which is one of the most suitable clinical settings for DTRs. In particular, cancer research is a promising field of application for sequential multiple assignment randomised trial (SMART) designs, since cancer is a chronic disease that requires treatment at multiple stages, according to each patient’s characteristics and clinical status [3].

However, providing evidence-based DTRs poses relevant methodological challenges to study design and DTRs’ effect estimation. The types of study commonly used for testing and comparing DTRs include observational studies, one-time randomised trials that randomise patients only once to the whole DTR, and sequential multiple assignment randomized trials (SMARTs) [2]. SMART designs randomise patients at each decision point considering information collected on the patient so far. They are of growing interest in the scientific community, but their use is not well-established yet.

The main difficulty in implementing trials to study DTRs is that there are still several open questions about sample size calculation and identification of the most appropriate method for data analysis [4]. In oncology, as in many other chronic conditions, patients commonly receive a frontline treatment followed by subsequent treatments adaptively chosen by the clinician. Consequently, the patient’s survival depends not only on the frontline treatment but also on the entire treatment strategy. However, the literature still seems to be dominated by trials investigating a single line or stage of the patient’s clinical history while ignoring previous or subsequent therapies, which can lead to misleading results [5].

The present work aims to investigate the state-of-the-art of SMART designs in oncology, focusing on the statistical methods used for the sample size computation and data analysis within cancer clinical trials on solid tumours.

Methods

A systematic review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [6].

Information sources and search strategy

The bibliographic search was performed on PubMed, Embase and CENTRAL (Cochrane Trial Registry), without date of publication restrictions. The search string is reported in Table S1 (Supplementary Material).

Eligibility criteria and selection process

Published protocols or results of SMART designs and registrations of SMART designs in clinical trial registries were considered eligible. To be included, the SMART design should be applied to solid tumour research, without restrictions on the intervention type.

The criterion to identify SMART designs was the presence of ≥2 stages in which patients were re-randomised to subsequent treatments according to a set of pre-specified decision rules based on patients’ characteristics and treatment history [7].

The study selection was carried out using the COVIDENCE software [8]. The title/abstract and full-text screening was performed by two independent reviewers (GL and EP). A third independent reviewer (ES) resolved disagreements.

Conference proceedings, book chapters, systematic reviews and meta-analyses were excluded, but they were checked for eligible papers. Only papers in the English language were considered.

Data extraction

Information on three domains of interest was considered, i.e. study characteristics, study design and study analysis. Study characteristics included publication year, setting, funding, trial registration (if any), the definition of the study design as SMART, and if the study presented a reanalysis of the original study data. The study design information included the number of stages, the type of intervention administered at each stage, the decision rules employed, the study objectives and endpoints and sample size reporting and calculation information. The study analysis domain included the methods used for data analysis and if specific data analysis techniques were used to account for the adaptive treatment resulting from the multiple sequential assignments. For protocols, such information was extracted from the statistical analysis plan.

A restricted subset of items was employed to extract data from trials’ registration records, so that a minimum dataset was available for all trials’ registrations included in the review. The items were selected because the level of detail reported varied slightly across trial registries. In most cases, the statistical analysis plan was missing.

Study characteristics were reported for descriptive purposes. Study design information was chosen according to the key SMART designs’ components, e.g. the number of stages, decision rules. Finally, information on study analysis techniques allowed for answering the main research question of the present work, i.e. to describe the statistical methods used within SMART designs to identify potential discrepancies between the available methodological approaches in the statistical literature and the procedures applied.

The data extraction tool was based on an Excel file.

Risk of bias assessment

The Cochrane risk-of-bias tool for randomised trials (RoB 2) [9] was used to assess the risk of bias in the included studies (articles and protocols). For studies included twice in the review [10,11,12,13], the assessment of the risk of bias was performed only once.

Results

Search results

The search of the bibliographic databases retrieved 14,586 records (Fig. 1). The last search was performed on 9 September 2021.

Fig. 1: Study flow-chart. PRISMA flow-chart showing the study selection process.

After duplicate removal, title/abstract screening was performed, resulting in 823 records which underwent full-text screening. After the full-text screening, 33 records were included in the present systematic review. Fifteen were reports of trials’ results [10,11,12,13,14,15,16,17,18,19,20,21,22,23,24], four were trials’ protocols [25,26,27,28] and fourteen were trials’ registrations.

Among the included records, there was a match between three trials’ protocols [26,27,28] and the corresponding trials’ registrations, and between two reports of trials’ results [16, 22] and the corresponding trials’ registrations.

Among trials’ results reports and protocols, nine were published in oncology journals [11, 12, 14, 15, 18, 19, 22,23,24], three in experimental and research medicine journals [25,26,27], two in internal and general medicine journals [10, 16]. The other five records were published in specialised journals in other areas of medicine, including clinical neurology [17], respiratory system [20] and peripheral vascular diseases [21]. One study was published in a nursing journal [28], and only one study was published in a statistics & probability journal [13].

All studies raised some concerns at the risk-of-bias assessment (Table S2, Supplementary Material), except for that of Marshall et al. [21].

Trials’ results

Fifteen studies presenting trials’ results were included in the present work. Table S3, Supplementary material, presents the detailed characteristics of the studies included.

Eight trials were located in the EU and five in North America. Thirteen out of fifteen were multicenter, and about half (8 out of 15) received public or private funding. The first study was published in 1992. Six trials were published between 2010 and 2021, three between 2000 and 2009 and another six between 1990 and 1999.

Two studies [13, 22] presented a reanalysis of previously published data. For Wang et al. [13], the study presenting the first analysis of the data was also included in this review [12], whereas for Petracci et al. [22], only the report of the reanalysis was included.

Furthermore, two studies presented the same trial’s short- and long-term results [10, 11].

All the trials tested chemo/radio/hormone therapy in conditions including lung cancer, neuroblastoma, glioblastoma, pancreatic cancer, breast cancer, prostate cancer, colorectal cancer and recurrent venous thrombosis in patients with solid tumours.

Interestingly, only one study out of the fifteen included was reported to have a SMART design [13]. All the trials were characterised by a two-stage design (see Table 1 for the detailed study design). The decision rule was most frequently based on the response to first-stage treatment. Most of the studies aimed either to compare first- and second-stage treatments separately or to compare treatments at only one of the two stages. The exceptions were Petracci et al. [22], Thall et al. [12] and Wang et al. [13], whose authors explicitly declared in the manuscript that the study’s objective was to identify the best treatment regimen resulting from the multiple assignments.

Table 1 Study design information on reports of trial’s results and protocols.

Eight studies did not report sample size calculation. Those reporting sample size calculations did not take into account the multiple assignments in the sample size estimation. Generally, either the sample size was provided for each stage, or the study was powered on one of the two stages and the sample size was inflated according to the expected proportion of subjects entering the second randomisation. Of note, Tummarello et al. [24] declared that the number of people entering the second randomisation was too small to allow groups’ comparison. The trials by Marshall et al. [21] and Bianchi et al. [14] were closed prematurely because of the low recruitment rate.
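The inflation approach described above can be illustrated with a minimal sketch. It assumes a two-sided comparison of two proportions at the second stage using the usual normal-approximation formula; the function names, response rates, power and the proportion of patients reaching the second randomisation are hypothetical and are not taken from any of the included trials.

```python
import math
from statistics import NormalDist


def n_per_arm_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-arm sample size for a two-sided comparison of two proportions
    (normal approximation with unpooled variance)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


def inflate_for_second_stage(n_second_stage_total, prop_rerandomised):
    """Patients to enrol at stage 1 so that, on average, n_second_stage_total
    of them reach the second randomisation."""
    if not 0 < prop_rerandomised <= 1:
        raise ValueError("prop_rerandomised must be in (0, 1]")
    return math.ceil(n_second_stage_total / prop_rerandomised)


# Hypothetical inputs: 30% vs 50% response at stage 2,
# 60% of enrolled patients expected to be re-randomised.
n_arm = n_per_arm_two_proportions(0.30, 0.50)
n_stage2 = 2 * n_arm
n_enrol = inflate_for_second_stage(n_stage2, 0.60)
print(n_arm, n_stage2, n_enrol)
```

A calculation of this kind powers a single stage-specific comparison only; it does not address the comparison of the treatment regimens embedded in the design.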

Regarding the data analysis (see Table 2 for study details), the approaches most frequently used were the Kaplan–Meier method and the Cox Proportional Hazard model, since most trials considered a time-to-event endpoint (overall or progression-free survival). Matthay et al. [10, 11] used such analysis approaches to compare the treatment regimens resulting from the two-stage randomisation among subjects entering the second randomisation. In all other trials, separate analyses of first- and second-stage treatments were carried out, except for Petracci et al. [22], Thall et al. [12] and Wang et al. [13], which were interested in identifying the best treatment regimen. These authors adopted three different analysis strategies to estimate the treatment effect, taking into account patients’ baseline characteristics and outcome history throughout the trial. The analysis of Petracci et al. [22] involved the estimation of Inverse Probability of Censoring Weighting (IPCW) to account for selection bias resulting from patients’ selection in the second stage. Thall et al. [12] used a conditional logistic regression approach. Wang et al. [13] proposed the estimation of Inverse Probability of Treatment Weighting (IPTW) for the reanalysis of Thall et al. [12].
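To make the idea of weighting concrete, the following is a minimal sketch of an inverse-probability-weighted mean outcome for one embedded regimen. It is not the estimator used by any of the cited authors. It assumes a hypothetical two-stage design in which responders to the first-stage treatment continue it and only non-responders are re-randomised, known randomisation probabilities, and a fully observed (uncensored) outcome; the record layout and function name are invented for illustration.

```python
def iptw_regimen_mean(records, a1, a2, p_a1=0.5, p_a2=0.5):
    """Weighted (normalised) mean outcome under the regimen
    'start with a1; if no response, switch to a2'.

    records: iterable of dicts with keys
      'a1' (first-stage arm), 'responder' (bool),
      'a2' (second-stage arm, None for responders), 'y' (outcome).
    p_a1, p_a2: assumed known randomisation probabilities.
    """
    num = den = 0.0
    for r in records:
        if r["a1"] != a1:
            continue  # not consistent with the regimen
        if r["responder"]:
            weight = 1.0 / p_a1  # randomised once
        elif r["a2"] == a2:
            weight = 1.0 / (p_a1 * p_a2)  # randomised twice
        else:
            continue  # non-responder assigned to another arm
        num += weight * r["y"]
        den += weight
    if den == 0:
        raise ValueError("no patients consistent with this regimen")
    return num / den


# Toy data, purely illustrative.
toy_records = [
    {"a1": "A", "responder": True, "a2": None, "y": 1.0},
    {"a1": "A", "responder": False, "a2": "C", "y": 0.0},
    {"a1": "A", "responder": False, "a2": "D", "y": 1.0},
    {"a1": "B", "responder": True, "a2": None, "y": 0.0},
]
mean_A_then_C = iptw_regimen_mean(toy_records, "A", "C")
mean_A_then_D = iptw_regimen_mean(toy_records, "A", "D")
```

Non-responders receive a larger weight because each of them also stands in for the non-responders who were randomised to the other second-stage arm. With time-to-event endpoints, as in most of the included trials, censoring would need to be handled in addition.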

Table 2 Study analysis information on reports of trial’s results and protocols.

Trials’ protocols

The review included four trial protocols [25,26,27,28]. They were all located in the USA and published after 2009. Three [26,27,28] out of four corresponded to trials’ registrations included in the present review. All the protocols received funding, and two were multicenter (see Table S3 for the characteristics of the trials’ protocols).

None of the protocols aimed at testing chemo/radiotherapy for cancer treatment. Two tested interventions to reduce cancer symptoms in patients with different types of solid tumours [28] and breast cancer [27]. One tested pharmacological treatment for depression in melanoma patients undergoing IFN-alpha therapy [25]. Finally, the protocol by Fu et al. [26] aimed at lung cancer prevention through a smoking cessation programme.

In all four protocols included, the design was defined to be SMART. They were all based on two stages, and the decision rule was based on the response to the first-stage treatment (Table 1). All the protocols were declared to be aimed at identifying the optimal treatment strategy. However, it is worth pointing out that only one protocol presented the identification of the optimal treatment strategy as the study’s primary objective [25].

All protocols reported the sample size calculation, but the power analysis was based only on one of the two stages in three out of four records. Only Auyeung et al. [25] proposed an approach accounting for the two-stage design. In addition, Fu et al. [26], in the last trial’s update published within the trial registration, declared that a sample size reassessment was done to account for the low enrolment rate.

For data analysis (Table 2), all protocols planned to use traditional statistical tests and regression-based analyses to compare first- and second-stage treatments separately. Furthermore, Sikorsii et al. [28], Kelleher et al. [27] and Auyeung et al. [25] proposed three different analysis approaches to identify the optimal treatment strategy. Auyeung et al. [25] proposed using marginal mean models to estimate the mean outcome for each regimen. Sikorsii et al. [28] declared that the optimal intervention sequence will be identified through the Q-learning algorithm, including two Q functions considering patients’ and their caregivers’ baseline characteristics and history through the two stages. Kelleher et al. [27] also planned the use of the Q-learning algorithm, together with value search estimation. No technical details about the estimation of the models were provided.
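Since the protocols give no technical details, the following is only a generic sketch of how two-stage Q-learning proceeds by backward induction, not a reconstruction of the planned analyses. It assumes binary covariates and treatments, so that each Q-function can be estimated by simple cell means (a saturated model) rather than by a regression model, and an outcome for which larger values are better. All variable names and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def cell_means(rows, key, value):
    """Saturated 'regression': mean of value(row) within each cell key(row)."""
    cells = defaultdict(list)
    for row in rows:
        cells[key(row)].append(value(row))
    return {k: mean(v) for k, v in cells.items()}


def greedy_rule(q):
    """For each history, pick the action with the largest estimated Q-value.
    Keys of q are tuples whose last element is the action."""
    rule = {}
    for key, value in q.items():
        history, action = key[:-1], key[-1]
        if history not in rule or value > q[history + (rule[history],)]:
            rule[history] = action
    return rule


def q_learning_two_stage(rows):
    # Stage 2: Q2(history, a2) = mean outcome, history = (x1, a1, x2).
    q2 = cell_means(
        rows,
        lambda r: (r["x1"], r["a1"], r["x2"], r["a2"]),
        lambda r: r["y"],
    )
    rule2 = greedy_rule(q2)

    # Pseudo-outcome: predicted outcome if stage 2 were chosen optimally.
    def pseudo_outcome(r):
        h2 = (r["x1"], r["a1"], r["x2"])
        return q2[h2 + (rule2[h2],)]

    # Stage 1: Q1(x1, a1) = mean pseudo-outcome.
    q1 = cell_means(rows, lambda r: (r["x1"], r["a1"]), pseudo_outcome)
    rule1 = greedy_rule(q1)
    return rule1, rule2


# Toy data (x1 = baseline covariate, x2 = response to stage 1).
fields = ("x1", "a1", "x2", "a2", "y")
toy = [dict(zip(fields, t)) for t in [
    (0, 0, 1, 0, 5.0), (0, 0, 1, 1, 3.0),
    (0, 0, 0, 0, 1.0), (0, 0, 0, 1, 4.0),
    (0, 1, 1, 0, 6.0), (0, 1, 1, 1, 2.0),
    (0, 1, 0, 0, 2.0), (0, 1, 0, 1, 7.0),
]]
rule1, rule2 = q_learning_two_stage(toy)
```

The estimated rules map each observed history to the recommended treatment at that stage. In practice, regression models would replace the cell means when covariates are continuous or numerous.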

Trials’ registrations

Fourteen trials’ registrations were included in the review. Five referred to already included trials’ results [16, 22] and protocols [26,27,28]. All registrations were made after 2008. Twelve were retrieved from clinicaltrials.gov, one from australianclinicaltrials.gov and one from the Clinical Trials Peruvian Registry. Half of the studies were located in North America.

Only five out of 14 trials (36%) were aimed at testing cancer chemo/radio treatments on overall survival or disease-free survival of patients with pancreatic cancer (3 registrations), colorectal cancer (1 registration) and neuroblastoma (1 registration). Six trials tested treatments for cancer and cancer treatment symptoms, such as fatigue, pain, sensory symptoms, depression, anxiety and quality of life, in patients with breast cancer or solid tumours and their caregivers. One trial’s registration was aimed at improving the management of cardiovascular comorbidities in cancer patients, and another one at testing interventions for COVID-19 prevention and treatment in cancer patients. Finally, one trial was aimed at cancer prevention (lung cancer), through a programme for smoking cessation, corresponding to the registration of the trial protocol published by Fu et al. [26].

Interestingly, all but five registrations referred to the study design as a SMART one. All designs were two-stage based, except for two studies. One included three stages, but only one decision-rule-based randomisation was specified (from the second to the third stage), while in the other trial, the number of stages depended on the patients’ COVID-19 status (no exposure, exposure to COVID-19, moderate or severe COVID-19 infection).

No information is reported regarding sample size calculation and data analysis because the statistical analysis plan was not available in almost all trials’ registrations.

Detailed characteristics of each one of the trials’ registrations included in the review are reported in Table S4, Supplementary Material.

Discussion

One of the most relevant findings of the present systematic review is the low number of studies retrieved. Such a low number of records suggests that the use of SMART designs in oncology is still limited, even though the advent of SMART designs offers new opportunities to develop evidence-based personalised treatment regimens, especially in cancer research [7]. This could be related to the fact that SMART designs pose relevant methodological challenges to sample size and treatment effect estimation, and that there is still limited dissemination and perhaps understanding of the methods in the SMART research area. Unsurprisingly, most of the studies employed traditional techniques for study powering and analysis, considering each stage separately instead of comparing the DTRs embedded in the trial, perhaps because of the lack of formal guidelines for designing and reporting trials employing SMART methodologies.

Noticeably, the study design was defined as SMART in only one out of fifteen trial reports included, which may be one of the main reasons why most trials considered each stage separately from the other. This finding could be related to the fact that the formal introduction of SMART design is relatively new, even though the use of multiple randomisations according to pre-specified decision rules dates back to before the 2000s. Consequently, even though such trials, especially the oldest ones, were not labelled as SMART, they presented all the characteristics to be defined as SMART. By contrast, all study protocols and all but five trials’ registrations defined the study design as SMART. This difference among the record types could be related to the timing of publication: trial protocols and trials’ registrations were published within the last fifteen years, while six of the fifteen trials’ reports were published before 2000.

It is worth pointing out that, although one of the primary goals of SMART designs is to identify the optimal DTR, only a few records included in the review considered determining the best treatment sequence as the study’s primary objective. This is reflected in the approaches employed for sizing the study. Undoubtedly, power analyses for SMART designs present relevant challenges because of the correlation structure between the embedded DTRs [29]. Several approaches have been proposed in recent years to address such issues [29,30,31], without definitive solutions. Nevertheless, if the primary aim of a SMART study is to identify the best DTR, the study should be sized to detect the optimal DTR. Instead, most of the included trials reporting the power analysis used traditional methods for sample size estimation, since they did not consider the detection of the best DTR as the primary study objective. Generally, they either estimated the sample size on only one of the trial’s stages and inflated it according to the expected proportion of subjects entering the second randomisation, or estimated the sample size for each stage separately. The present finding is consistent with the conclusions of a recent review in the field [32].

Regarding data analysis, most of the trials’ reports made separate analyses for each trial stage using traditional statistical methods, such as regression-based models, without considering patients’ history throughout the study. This finding is consistent with the fact that most of the records included did not define the study design as SMART and did not identify the evaluation of the optimal DTR among the study objectives. Focusing on the few reports aimed at identifying the best treatment regimen, two reports of trials’ results [13, 22], both reanalyses of previously published data, tried to account for each subject’s history through the trial in treatment effect estimation using propensity weighting estimation, while Thall et al. used a conditional regression approach [12]. On the other hand, two study protocols proposed the use of the Q-learning algorithm to identify the most promising DTR, an approach that has been suggested to be promising for the analysis of data collected using SMART designs [33]. Furthermore, it is interesting to point out that, even though SMART strategies are well known to suffer from the multiple comparison problem, since the number of DTRs embedded in the trial is often large, most of the trials included did not account for such a problem.
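The scale of the multiple comparison problem can be illustrated with a short sketch. It assumes a hypothetical two-stage design with two first-stage treatments, in which responders continue and non-responders are re-randomised between two options. The Bonferroni correction is used here only as a simple, conservative example of an adjustment; it is not a method reported by the included trials.

```python
from itertools import combinations, product


def embedded_regimens(stage1, stage2_responders, stage2_nonresponders):
    """Enumerate the regimens embedded in a two-stage design as tuples
    (first-stage arm, option for responders, option for non-responders)."""
    return list(product(stage1, stage2_responders, stage2_nonresponders))


# Hypothetical design: arms A/B, responders continue,
# non-responders re-randomised to C or D.
regimens = embedded_regimens(["A", "B"], ["continue"], ["C", "D"])
pairwise = list(combinations(regimens, 2))

overall_alpha = 0.05
bonferroni_alpha = overall_alpha / len(pairwise)
print(len(regimens), len(pairwise), round(bonferroni_alpha, 4))
```

Even this small design embeds four regimens and six pairwise comparisons, and the count grows multiplicatively with the number of options at each stage.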

Finally, it is noteworthy that the focus of SMART designs in solid tumour research has changed over time. The first published trials employing sequential multiple assignments aimed at testing chemo/radio/hormone therapy for solid tumours. In recent years, by contrast, trials have focused increasingly on cancer symptoms and the side effects of cancer treatments, such as fatigue, depression, anxiety and pain. SMART designs are particularly suitable for assessing complex or long-term interventions for chronic conditions that require management to adapt to patients’ needs, as is the case for cancer-related symptoms.

Regarding study limitations, the search strategy is the main one. When the study was done, no index terms referring to SMART designs were available in any of the thesauri of the bibliographic databases searched. Moreover, as clearly emerged from the systematic review, the term SMART was often not employed by the authors, even though the study design satisfied the criteria for being SMART. We tried to overcome such limitations by including all possible synonyms of the critical aspects of a SMART design in a well-defined research field, that of solid tumours. However, we cannot rule out that relevant reports were missed by the search. Another study limitation is that most of the trials’ registrations did not report details on the statistical analysis plan. It follows that they contributed to the review only with general information on trial characteristics. However, it would be interesting to update the review to check whether reports of these registrations will be published and whether the employed methods are consistent with those used in practice.

Conclusions

The present systematic review showed that the use of SMART designs in solid tumour research is still limited; however, the interest in such designs is growing, as shown by the increasing number of protocols and trial registrations in recent years. The present work also clearly showed that, although the primary aim of SMART designs is to identify the optimal treatment regimen resulting from the multiple assignments, most of the trials included did not consider the identification of the best DTR as their primary objective. Consequently, they did not employ ad hoc methods for powering and analysing the trial to determine the best DTR; powering and analysing each study’s stage separately is still the approach most widely used. Such aspects could be related to the fact that SMART designs are relatively new.

The present results highlight that greater efforts should be devoted to developing formal guidelines for the conduct and reporting of SMART designs. A thorough literature review of methodological papers presenting and discussing statistical approaches for SMART designs would represent the basis for formal guidelines in the field. Building on such a review, the development of standard guidelines would benefit from the involvement of a panel of experts, e.g. using the Delphi methodology, to improve the use of such designs in cancer trials.