Main

Mental disorders are highly prevalent and show high comorbidity between them. Disorders across different diagnostic categories share commonalities and underlying processes on cognitive, neuropsychological and genetic levels1,2,3. Transdiagnostic cognitive behavioural therapy (TD-CBT) as an umbrella term encompasses different treatment approaches to tackle comorbidity4. In unified transdiagnostic treatments, patients with different disorders receive the same ‘broadband’ treatment that targets shared commonalities between these disorders. Examples of this approach include the unified protocol (UP) for emotional disorders5, the anxiety treatment protocol6 or transdiagnostic behaviour therapy7. Typically, unified transdiagnostic interventions apply the same selection and sequence of modules to all patients, independent of their characteristics. In tailored interventions, patients receive a treatment that is personalized to them. Different approaches to tailoring exist, from tailoring unified transdiagnostic treatments by personalizing the sequence of modules based on baseline characteristics to using idiographic case formulation to aggregate methods across different treatment packages. The key difference between unified transdiagnostic treatments and tailored interventions lies in their scope and focus. Unified transdiagnostic treatments have been specifically developed to target comorbidity by addressing shared mechanisms across different disorders in a comprehensive manner that is applicable to a range of patients. Tailored interventions, on the other hand, focus on addressing the specific needs and characteristics of individuals. Thus, unified transdiagnostic treatments offer a broad, overarching approach that can be applied to many disorders, while tailored interventions provide a more individualized treatment approach that considers the unique aspects of each person’s condition.

TD-CBT is a highly relevant approach to addressing treatment gaps and disseminating evidence-based treatments. Unified transdiagnostic approaches are especially promising in health care systems with notable treatment gaps, for example, by offering one transdiagnostic approach for emotional disorders instead of several disorder-specific approaches. In addressing a broader range of psychopathology, unified TD-CBT specifically may provide a more comprehensive treatment for patients, facilitate clinical training8 and lower treatment costs by reducing time invested by patients and therapists9,10,11. It can also be flexibly adapted to various treatment settings, ranging from individual and group face-to-face formats, to scalable internet-based self-help formats12,13.

Several reviews and meta-analyses have investigated TD treatments. However, these previous meta-analyses differed in the settings they investigated (group, individual or internet-based), the target population (anxiety, anxiety and depression or emotional disorders) and the breadth of the transdiagnostic definition they applied (unified TD approaches, tailored interventions or specific treatment protocols). One meta-analysis14 focused on face-to-face treatments in anxiety disorders and another15 compared TD treatments to disorder-specific treatments. Two meta-analyses16,17 investigated unified TD and tailored interventions in the internet-based setting and another18 focused on TD interventions in group format. Others, such as ref. 19, focused on a specific TD treatment, the UP5. The most recent meta-analysis20 aggregated findings from their large database on treatments for depression (https://www.metapsy.org/) that had a transdiagnostic stance. However, the authors did not focus exclusively on CBT and did not include transdiagnostic treatments targeting anxiety or other emotional disorders, although most transdiagnostic treatments are aimed at anxiety disorders. Two studies21,22 are the most comprehensive meta-analyses on TD-CBT for emotional disorders to date, including all settings as well as a focus on anxiety and depression. However, the search conducted by ref. 21 ended in 2013 and, given the novelty of TD-CBT at that time, they could only include four randomized controlled trials (RCTs) in their meta-analysis. On the other hand, ref. 22 only included clinician-guided internet-based interventions and did not restrict their study selection to RCTs. Thus, self-guided internet-based treatments without clinician support were not included in their review, although this is a scalable format with particularly strong potential to reach larger populations.

Overall, unified TD treatments seem to produce large pre- to posttreatment effects in different settings. However, several questions need to be addressed: the comparability of unified TD-CBT to disorder-specific treatments, as there is conflicting meta-analytic evidence15,22, the comparability across different settings and the long-term effects. The surge in research activity on TD protocols in recent years warrants an updated comprehensive review and meta-analysis that aggregates findings across different unified TD protocols and settings. The current review and meta-analysis expands previous reviews and meta-analyses by investigating unified TD-CBT for emotional disorders in individual and group face-to-face settings, including internet-based interventions with and without clinician guidance.

We focused on treatments based on CBT principles and unified approaches (excluding tailored treatments), to (1) update results on short- and long-term efficacy of TD-CBT and (2) compare effects of transdiagnostic protocols to different types of control conditions, including waitlist control, treatment-as-usual (TAU), disorder-specific CBT (DS-CBT) and other active interventions. To summarize: in adult patients with emotional disorders (population), what is the effect of TD-CBT (intervention) on anxiety and depression (outcome) compared with waitlist, TAU, DS-CBT and other active interventions (comparison) at posttreatment and follow-ups?

Results

Included studies

Figure 1 shows the preferred reporting items for systematic reviews and meta-analyses (PRISMA) flowchart of the literature search and screening procedure. By systematic search and screening, we identified 56 eligible RCTs, including 6,916 individuals, published between 2005 and 2023. No preprints could be included in the final selection. Half the included studies were published after 2019 and RCTs with internet-based treatment format date from 2010 and later. Table 1 summarizes characteristics of the individual studies.

Fig. 1: PRISMA flowchart of the literature search and screening procedure.

Three studies could not be included in the meta-analysis because either no self-report of anxiety or depression was available29 or no data were available27,28. For one study106, treatment effects at 12 months follow-up were reported in a separate publication118 which was not included in the final number of studies as this reflects the number of RCTs identified. However, we included the follow-up values in our meta-analysis.

Table 1 RCTs investigating TD-CBT for emotional disorders in individual, group and internet-based format

Most of the studies were conducted in Europe (n = 16), the United States (n = 13) and Australia (n = 11). We could also include several RCTs from Iran (n = 9) and other countries. Samples investigated ranged in size from 19 to 1,004 participants (median = 94.5) and comprised mainly females (M = 66%, s.d. = 19%), with a median age of 37 years. Most frequent among included diagnoses were generalized anxiety disorder (GAD; 79%), social anxiety disorder (SAD; 70%) and major depressive disorder (MDD; 55%). Most studies investigated the UP (n = 23) or similar treatments. Supplementary Table 1 gives an overview of the TD-CBT protocols we included in our review. The mean number of sessions was 11.19 (s.d. = 4.32), with a range of 4–20 sessions. TD-CBT was most frequently compared to a waitlist condition (n = 25), followed by TAU (n = 18), other active treatments (n = 15) and DS-CBT (n = 8). Some RCTs included multiple comparison groups. Treatments were mainly carried out in a group setting (n = 21), followed by individual formats (n = 18) and internet-based approaches (n = 17). Self-report questionnaires on symptoms of anxiety and depression were included in nearly all studies, except for five (one included no self-reports, two assessed only anxiety and another two only depression). Most RCTs used multiple questionnaires. The Beck anxiety inventory (BAI) (k = 17) (ref. 23) and generalized anxiety disorder screener (GAD-7) (k = 14) (ref. 24) were most commonly used to measure anxiety and the Beck depression inventory (BDI-II) (k = 21) (ref. 25) and patient health questionnaire (PHQ-9) (k = 19) (ref. 26) to assess depression. Around 75% of RCTs included at least one follow-up assessment (typically at 3 or 6 months), allowing for the investigation of longer-term effectiveness. Sixteen studies included a second follow-up (mostly at 6 or 12 months) and eight reassessed participants for a third time (mostly at 24 months).
Attrition rates at posttreatment in TD-CBT samples were similar compared to control condition samples but differed by treatment format (individual setting: M = 12%, s.d. = 13%; group setting: M = 21%, s.d. = 14%; internet-based setting: M = 20%, s.d. = 10%).

For the meta-analytic calculations, we excluded refs. 27,28 because the data were not available and ref. 29 because no self-report of anxiety or depression was available, which resulted in a total N = 6,705 individuals for the meta-analytic calculations.

Risk of bias assessment

Agreement between the two independent raters in coding the risk of bias criteria was strong (M = 90.31%, s.d. = 7.82%, range 73.58–100%). Instances of disagreement mainly reflect differing levels of a rating; for example, ‘yes’ versus ‘probably yes’ but not the general direction. All ratings and the code for analysis of percentage agreement can be found on the Open Science Framework repository (see Data availability and Code availability). Figure 2 provides an overview of the risk of bias assessment for the five domains rated (see the section on ‘Study quality assessment’) for the individual studies. In Supplementary Fig. 1, we also provide a summary plot, depicting the percentage of studies showing low/high risk of bias or some concerns in each domain. We found that, overall, the risk of bias assessment of most of the included studies showed some concerns and no study was free from any risk of bias. Although there were hardly any concerns about bias in the randomization process, all studies showed some concerns for blinding of therapists, as they needed to be aware of the protocol they were providing, and assessors because we only included self-report outcomes. While intention-to-treat analyses were conducted in most studies, few reported comprehensive tests of potential bias in results due to missing outcome data, raising some concerns. Finally, although trial registrations were available for almost all included RCTs, hardly any studies provided an a priori specified analysis plan.
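As an illustration of how percentage agreement between two raters can be computed for one study, the following is a minimal Python sketch. It is not the code deposited on the Open Science Framework; the function name and the example ratings are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of criteria on which two raters gave the identical rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must rate the same non-empty set of criteria.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical ratings for four signalling questions of one study.
rater_1 = ["yes", "probably yes", "no", "no information"]
rater_2 = ["yes", "yes", "no", "no information"]
agreement = percent_agreement(rater_1, rater_2)
```

Note that this strict definition counts 'yes' versus 'probably yes' as a disagreement, which matches the kind of disagreement described above.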

Fig. 2: Risk of bias assessment.

Traffic-light plot of the domain-level judgements. Risk of bias was assessed across five domains for each study included in the meta-analysis using the revised Cochrane risk-of-bias tool (RoB 2.0). The combination of assessments in the five domains results in an overall risk of bias rating.

Meta-analysis

Controlled effect sizes

Tables 2 and 3 show controlled effect sizes as well as measures of heterogeneity (Q statistic and I2) for depression and anxiety outcomes for individual, group or internet-based settings, comparing TD-CBT to DS-CBT, TAU, waitlist and other treatments and for posttreatment as well as follow-ups. In addition, effect sizes and confidence intervals (CI) comparing TD-CBT to control for all three settings are displayed in the forest plots in Figs. 3 and 4 (posttreatment). Forest plots for the follow-up assessments are included in Supplementary Figs. 2–9.

Table 2 Between-group effect sizes of depressive and anxiety symptoms for transdiagnostic treatments compared to control groups at posttreatment
Table 3 Between-group effect sizes of depressive and anxiety symptoms for transdiagnostic treatments compared to control groups at follow-up
Fig. 3: Forest plots of controlled effect sizes (posttreatment) for depression.

Studies are clustered according to the setting in which they investigated TD-CBT. One study88 compared TD-CBT to ACT and BA. We used a random-effects (RE) model to estimate pooled effects. n denotes the number of studies included. For each study, the black square represents the effect size (standardized mean difference, SMD) and the horizontal bars represent the 95% CI. The overall estimated effect size (Hedges’ g) is depicted by the diamond with the dotted bars representing its 95% CI.

Fig. 4: Forest plots of controlled effect sizes (posttreatment) for anxiety.

Studies are clustered according to the setting in which they investigated TD-CBT. One study88 compared TD-CBT to ACT and BA. We used an RE model to estimate pooled effects. n denotes the number of studies included. For each study, the black square represents the effect size (SMD) and the horizontal bars represent the 95% CI. The overall estimated effect size (Hedges’ g) is depicted by the diamond with the dotted bars representing its 95% CI.

Across settings, TD-CBT produced significantly stronger symptom reduction in depression (g = 0.74, 95% CI = 0.57–0.92, P < 0.001) and anxiety (g = 0.77, 95% CI = 0.56–0.97, P < 0.001) than controls at posttreatment. TD-CBT showed superiority to waitlist for depression (g = 1.23, 95% CI = 0.80–1.66, P < 0.001) and anxiety (g = 1.24, 95% CI = 0.82–1.67, P < 0.001) and to TAU for depression (g = 0.90, 95% CI = 0.66–1.14, P < 0.001) and anxiety outcomes (g = 0.98, 95% CI = 0.63–1.33, P < 0.001) with large effects. We found no statistically significant difference between TD-CBT and DS-CBT in alleviating depressive (g = 0.09, 95% CI = −0.07–0.25, P = 0.269) and anxiety symptoms (g = 0.09, 95% CI = −0.01–0.20, P = 0.091). The comparison between TD-CBT and DS-CBT was corroborated by additional Bayesian analyses. A description of the statistical procedure for the Bayesian analyses as well as forest plots for the original model and sensitivity analyses can be found in Supplementary Figs. 22–27. Estimated effect sizes confirmed the frequentist findings for depression (g = 0.09, 95% CI = −0.12–0.27) and anxiety (g = 0.09, 95% CI = −0.04–0.24). In comparison to other active control groups (including bona fide treatments), TD-CBT was more effective for depression (g = 0.27, 95% CI = 0.13–0.42, P < 0.001) with small effects but not for anxiety (g = 0.14, 95% CI = −0.04–0.31, P = 0.128). TD-CBT was superior to controls at 3 months follow-up (depression g = 0.55, 95% CI = 0.30–0.80, P < 0.001; anxiety g = 0.48, 95% CI = 0.18–0.79, P = 0.002), at 6 months follow-up (depression g = 0.20, 95% CI = 0.10–0.30, P < 0.001; anxiety g = 0.23, 95% CI = 0.11–0.36, P < 0.001) and at 12 months follow-up (depression g = 0.24, 95% CI = 0.13–0.35, P < 0.001; anxiety g = 0.22, 95% CI = 0.12–0.32, P < 0.001) but not at 24 months follow-up (depression g = 0.20, 95% CI = −0.05–0.46, P = 0.111; anxiety g = 0.14, 95% CI = −0.02–0.31, P = 0.092).
Overall, we found high and significant heterogeneity amongst studies (against all controls at posttreatment: I2 = 88.29% for depression and I2 = 91.72% for anxiety), which remained high after isolating treatment format. Results for sensitivity analyses are provided in Supplementary Tables 5–8. When removing outliers (Supplementary Tables 5 and 6) heterogeneity was reduced with comparable effects.

Uncontrolled effect sizes

Uncontrolled effect sizes and their CI as well as measures of heterogeneity (Q statistic and I2) for anxiety and depression outcomes for all three settings, from pre- to postassessment and at follow-ups, are reported in Supplementary Table 4 and Supplementary Figs. 10–19.

Publication bias

Statistical analyses indicated asymmetry of funnel plots for controlled effects (Kendall’s tau = 0.36–0.38, P < 0.001; Egger’s test Z = 6.89–7.45, P < 0.001). Funnel plots, with and without trim and fill method, for the controlled effects for anxiety and depression (posttreatment) are provided in Supplementary Figs. 20 and 21.

Discussion

TD-CBT for emotional disorders attracted increased attention and considerable research activity in recent years. This is reflected in the large number of 56 RCTs with 6,916 patients included in our comprehensive review on individual, group and internet-based formats.

Overall, TD-CBT was effective in both the short and long terms. Most studies compared TD-CBT to waitlist-control conditions and yielded large effect sizes in line with previous benchmarks30. Our review and meta-analysis also included active control groups. We found that TD-CBT produced large effects in comparison to TAU which comprised very heterogeneous setups, ranging from low-key treatments to clinician-tailored personalized interventions (for example, ref. 31). In comparison to other active treatments, such as behavioural activation or CBT for perfectionism, TD-CBT had a stronger impact on depression, with small effects but not on anxiety. Of special interest is how TD-CBT compares to gold-standard DS-CBT (for example, evidence-based manualized individual therapy in one trial32 or group treatments in another trial33). Overall, TD-CBT produced comparable effects to DS-CBT, with no significant differences emerging between both approaches. The comparability of TD-CBT to DS-CBT was investigated in previous meta-analyses with mixed findings: ref. 22 found TD-CBT to produce comparable effects to DS-CBT for anxiety outcomes but to surpass DS-CBT in its efficacy for depression outcomes. While ref. 15 found TD-CBT to produce significantly greater effects than DS-CBT, their results also suggested that these differences may not be clinically significant. With our meta-analysis including more studies that directly compare TD-CBT and DS-CBT in RCTs, our findings provide further evidence of the comparability of TD-CBT and DS-CBT.

We have also investigated effects beyond the immediate end of treatment. While uncontrolled effects should be interpreted cautiously as they cannot be causally interpreted, we did find that the effects of TD-CBT remained stable over time, based on follow-up assessments at 3, 6 and 12 months. Five studies (four of them stemming from the same research group and investigating internet-based interventions) also included a long-term follow-up up to 24 months. For this long-term follow-up, we found large effects over time for TD-CBT on anxiety and depression outcomes (standardized mean change (dSMC) = 1.47–1.75), with no significant differences between TD-CBT and DS-CBT. While more research by independent groups is warranted, this comparability underlines the potential of TD-CBT and strengthens the argument for a broadly applicable, transdiagnostic approach, with high scalability and reach.

Concerning the different settings we have investigated, we found comparable effects between all three settings, individual, group and internet-based. There was a strong uptake of TD protocols in group and internet-based settings. Applying TD-CBT as a group treatment may be beneficial in health care systems with limited resources. On top of groups saving therapeutic resources, TD-CBT groups specifically may be a more feasible approach to delivering evidence-based care than offering disorder-specific groups. The comparable effects of the individual and internet-based setting strengthen the evidence for the comparability of both settings12,34. Delivering TD protocols online or in conjunction with in-person sessions (‘blended care’) may even boost the potential of TD-CBT to reduce the treatment gap for those in need: TD internet-based interventions are not only highly effective, they also address barriers to treatment access and can reach underserved communities, such as those living in geographically remote areas (for example, refs. 35,36) or those with limited mobility, for example, due to chronic physical conditions37.

Our study is not without limitations. We included anxiety, obsessive compulsive, depressive as well as adjustment disorders as primary diagnoses in our review, with most studies investigating TD-CBT for GAD, SAD and MDD. We can neither draw conclusions about the efficacy of TD-CBT for individual diagnoses nor judge its efficacy for other diagnoses which, depending on the definition, are also counted among the emotional disorders (for example, somatic symptom disorders, post-traumatic stress disorder or borderline personality disorder). TD-CBT was investigated in different continents and countries, speaking to its dissemination potential. However, investigations from South America and Africa were under-represented.

The risk of bias assessment revealed possible sources of bias especially in terms of blinding of assessors, patients and therapists, which can be expected given our focus on self-report measures and psychotherapy trials. However, we also found concerns in terms of selective reporting which highlighted that more open science practices in psychotherapy research are warranted, from preregistered analysis plans to open data sharing. This would facilitate replications by independent research groups which are needed to explore the generalizability of our findings and preclude allegiance effects which some of our included studies may be at risk of38,39,40. The implementation of such practices may also help to counteract publication bias. We found exceptionally high heterogeneity of effects. Overall heterogeneity decreased when taking treatment format into account and removing outliers. Future research should investigate whether other clinical or methodological factors such as mechanisms targeted in the TD-CBT protocol, treatment dose, patient or study characteristics might have an impact. An individual participant data meta-analysis would be a key next step in this regard. It also remains unclear if there are any contraindications for TD-CBT, since symptom deterioration, comorbidity and dropout were not systematically examined. Moreover, the clinical relevance of symptom improvement is yet to be investigated and outcome measures beyond symptoms of depression and anxiety, such as quality of life or level of functioning, should be explored. We chose to exclusively study adult populations for our investigation of TD-CBT due to differences in developmental adaptations of treatments, classifications of emotional disorders, outcome measures and treatment efficacy between child/adolescent and adult populations. We focused our review on unified ‘broadband’ TD-CBT that aims at changing mechanisms shared between disorders.
With the surge of research on personalized interventions41, it may be a fruitful next step to investigate the merit of personalizing unified TD-CBT interventions as well. TD-CBT promises to facilitate training and clinical decision-making, rendering training and treatment less costly. A first study investigated the cost-effectiveness of TD-CBT and found that it may be a cost-effective alternative to TAU42,43. However, more research on whether the proposed advantages, for example, in terms of training times and cost-effectiveness, generally hold true is needed.

Our analyses provide evidence that TD-CBT in face-to-face individual, group and internet-based formats is efficacious in reducing symptoms of anxiety and depression. Evidence from trials on internet-based TD-CBT revealed large and stable long-term effects. Taken together, these findings further strengthen the transdiagnostic approach to the treatment of emotional disorders across settings.

Methods

This study is based exclusively on published literature, therefore no ethics approval was required. In conducting and reporting this review and meta-analysis, we followed the Cochrane Handbook for systematic reviews44 and the updated PRISMA45. The protocol was registered with PROSPERO on 27 September 2019 (registration no. CRD42019141512).

Search strategy

A systematic literature search was conducted on PubMed and MEDLINE, PsycINFO, Google Scholar, medRxiv (including bioRxiv) and OSF Preprints up to 16 June 2023. Deviating from our preregistration, our search also covered preprint servers to capture the most recent findings. We built a search string by combining the concepts ‘transdiagnostic’, ‘CBT’, ‘emotional disorder’ and ‘RCT’ using the AND Boolean operator. Each concept included terms connected with the OR Boolean operator. The concept ‘CBT’ also covered terms describing the treatment setting (for example, ‘internet-based intervention’). We searched for relevant medical subject headings (MeSH), used by the United States National Library of Medicine to index articles in PubMed and MEDLINE (for example, ‘cognitive behavioural therapy’, ‘anxiety’, ‘depression’). In addition, we included terms commonly used in the relevant literature (for example, ‘unified’). The resulting string was then slightly adapted according to the search options of the different databases (see Supplementary Table 2 for the complete search strings). We included additional studies if they were identified through reference lists and met our inclusion criteria. We used Zotero (v.6.0.23) and Google Sheets.

Inclusion criteria

We included studies published between January 2000 and June 2023.

Population

Studies were included if the treatment was delivered to treatment-seeking adults. Deviating from our preregistration, we did not apply an upper age limit of 65 years, if the study was not solely targeted at older adults and the mean age of the study population was comparable to other studies with adult populations. We opted for this change to provide a more comprehensive review. Participants had at least one clinician-established diagnosis of an emotional disorder. We included SAD, panic disorder, agoraphobia, GAD, obsessive compulsive disorder, unipolar depressive disorders and adjustment disorders as treatment targets.

Interventions

We selected studies that investigated TD-CBT in an individual, group or internet-based setting (with or without clinician guidance). This included established unified comprehensive TD protocols that were specifically developed to target underlying processes or comorbidity such as the UP5, false safety behaviour elimination therapy46, emotion regulation therapy47, affect regulation training48 or transdiagnostic behaviour therapy7. We also included protocols that contained CBT components that modified dysfunctional cognitions and behavioural patterns across diagnostic groups, for example, cognitive restructuring. Following this definition, we included ‘common elements approaches’49 if they were presented in a unified protocol, that is, a combination of effective components across disorders. As our focus was not on third wave or experiential approaches within CBT, we excluded standalone mindfulness-based treatment approaches50,51, metacognitive therapy/training52 and acceptance and commitment therapy53,54. We also excluded protocols targeting transdiagnostic phenomena that cannot be considered shared mechanisms between disorders, such as protocols focusing on self-worth or loneliness.

Comparison groups

We included studies that compared TD-CBT to a control group, including (1) waitlist-control condition (that is, delayed treatment), (2) TAU, (3) DS-CBT and (4) other active psychological interventions. TAU included all treatments that the original study defined as ‘usual care’, ‘standard care’ or ‘care as usual’55. Other active psychological interventions included interventions that are based on a psychological rationale but are neither considered TAU nor diagnosis-specific treatments, for example, behavioural activation.

Outcomes

Included studies applied a continuous self-report measure of anxiety and/or depression severity at pre- and posttreatment and (if available) at follow-up.

Study design

RCTs were included.

Exclusion of studies

We excluded studies if (1) the treatment was not based on CBT principles (for example, psychodynamic or process-experiential interventions) or (2) they investigated a modularized or tailored treatment, as we did not consider this in line with the concept of unified TD-CBT. Supplementary Table 3 provides an overview of reasons for exclusions for all excluded studies that were full-text screened.

Study selection and data extraction

Two reviewers independently screened search results based on title and abstract, evaluated potentially eligible publications through full-text read, selected studies matching the inclusion criteria and extracted data for the meta-analysis. Selection results were compared and any disagreements about eligibility were resolved through discussion and in consultation with the project leaders. Interrater agreement was reached for 95% of the reviewed studies. If not reported in the publication, data were requested directly from study authors. We contacted 35 authors and sent up to two follow-up emails in case of no response; 69% of the authors sent us the requested data. We extracted means and standard deviations of self-reported anxiety and/or depression at all available time points corresponding to pre- and posttreatment as well as follow-up. We grouped follow-up time points on the basis of the most frequent reassessments in the included studies. Most of the studies reassessed participants at exactly 3 months (n = 22 studies), 6 months (n = 16 studies), 12 months (n = 10 studies) or 24 months (n = 4 studies). Only one study had a shorter follow-up than 3 months, two studies had a follow-up between 3 and 6 months, four studies between 6 and 12 months and one study between 12 and 24 months. For the few studies with follow-ups falling in between those four measurement points, we allocated the data to the time point to which they were closest.
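The allocation of in-between follow-ups to the closest of the four measurement points can be sketched as follows. This is an illustrative Python sketch rather than the procedure's original code; the function name is hypothetical, and the handling of exact ties (assigning them to the earlier time point) is an assumption, as the text does not specify it.

```python
# The four standard follow-up time points (in months) named in the text.
FOLLOW_UP_BINS = (3, 6, 12, 24)


def allocate_follow_up(months):
    """Return the standard follow-up time point closest to the reported one."""
    # min() returns the first minimum it encounters, so an exact tie
    # (for example, 9 months) goes to the earlier time point. This tie
    # rule is an assumption, not something stated in the source.
    return min(FOLLOW_UP_BINS, key=lambda b: abs(b - months))


# Hypothetical reported follow-up times (months) mapped to their bins.
allocated = {m: allocate_follow_up(m) for m in (2, 4, 5, 10, 20)}
```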

As many studies reported more than one outcome measure for anxiety or depression, we used the primary outcome measure defined by the study authors or, if this was not available, the measure most commonly used across studies in our final sample. Other variables extracted were control group (waitlist/TAU/DS-CBT/other) and treatment setting (individual/group/internet-based). Studies were grouped for synthesis by type of control group and treatment setting.

Statistical analyses

All analyses were conducted in R (v.4.3.1), using the metafor56 (v.4.2-0), meta57 (v.6.5-0) and dmetar packages58 (v.0.1.0).

We calculated controlled effect sizes for the difference between the transdiagnostic treatment and the control conditions in main outcomes (depression and anxiety) at posttreatment (relative efficacy), using the bias-corrected Hedges’ g and the 95% CI59. These were calculated by subtracting the mean posttreatment score of the transdiagnostic condition from the mean posttreatment score of the control condition and dividing the difference by the pooled standard deviation of both conditions. Values of 0.2, 0.5 and 0.8 of Hedges’ g represent a small, moderate and large effect size, respectively60.
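To make the calculation concrete, the following Python sketch computes a bias-corrected Hedges’ g with a 95% CI from posttreatment summary statistics. It is a textbook-style illustration, not the authors' R code: the function name, the example numbers, the common approximation J = 1 − 3/(4·df − 1) for the small-sample correction and the large-sample variance formula are all assumptions.

```python
import math


def hedges_g(m_td, sd_td, n_td, m_ctrl, sd_ctrl, n_ctrl):
    """Bias-corrected standardized mean difference (control minus TD-CBT)."""
    df = n_td + n_ctrl - 2
    # Pooled standard deviation of both conditions.
    sd_pooled = math.sqrt(((n_td - 1) * sd_td**2 + (n_ctrl - 1) * sd_ctrl**2) / df)
    # Positive values mean lower symptom scores in the TD-CBT condition.
    d = (m_ctrl - m_td) / sd_pooled
    # Approximate small-sample correction factor (assumed form).
    j = 1 - 3 / (4 * df - 1)
    g = j * d
    # Large-sample variance of d, scaled by the correction factor.
    var_g = j**2 * ((n_td + n_ctrl) / (n_td * n_ctrl) + d**2 / (2 * (n_td + n_ctrl)))
    se = math.sqrt(var_g)
    return g, (g - 1.96 * se, g + 1.96 * se)


# Hypothetical trial: posttreatment means 10 (TD-CBT) and 14 (control).
g, ci = hedges_g(m_td=10, sd_td=5, n_td=50, m_ctrl=14, sd_ctrl=5, n_ctrl=50)
```

With these hypothetical inputs the uncorrected difference is 0.8 pooled standard deviations, and the correction shrinks it slightly.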

Building on previous work14,22, we expected considerable variability and thus used a random-effects model61 to account for heterogeneity of included studies62. We tested heterogeneity of effect sizes with the Q statistic, the I2 statistic and by visual inspection of forest plots. A P value of the Q statistic below 0.05 indicates heterogeneity63. I2 ranges from 0 to 100%, with 25% representing low, 50% moderate and 75% high heterogeneity64. We addressed heterogeneous effect sizes by conducting subgroup analyses for the three different treatment formats (individual, group or internet-based), if at least three studies per subgroup were available. Additionally, we investigated whether excluding outliers impacted effect sizes and heterogeneity. In line with previous meta-analyses, outliers were defined as studies whose 95% CI did not overlap with the 95% CI of the overall effect size20.
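The pooling and heterogeneity steps described above can be illustrated with a short Python sketch. It uses the DerSimonian-Laird estimator of the between-study variance because it has a closed form; this choice is an assumption made for illustration, and the estimator used in the original R analyses may differ. Function names and example data are hypothetical, and the P value of Q is omitted to keep the sketch free of external dependencies.

```python
import math


def random_effects(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random-effects model."""
    k = len(effects)
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    # I2: share of total variability attributed to heterogeneity (percent).
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {"pooled": pooled, "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
            "q": q, "tau2": tau2, "i2": i2}


def find_outliers(effects, variances, pooled_ci):
    """Indices of studies whose 95% CI does not overlap the pooled 95% CI."""
    outliers = []
    for i, (y, v) in enumerate(zip(effects, variances)):
        lo, hi = y - 1.96 * math.sqrt(v), y + 1.96 * math.sqrt(v)
        if hi < pooled_ci[0] or lo > pooled_ci[1]:
            outliers.append(i)
    return outliers


# Hypothetical effect sizes and sampling variances for three studies.
result = random_effects([0.1, 0.2, 3.0], [0.01, 0.01, 0.01])
```

Rerunning `random_effects` after dropping the indices returned by `find_outliers` corresponds to the outlier sensitivity analysis described in the text.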

We calculated uncontrolled effect sizes from pre- to posttreatment (absolute efficacy) for main outcomes (depression and anxiety) and from pretreatment to follow-up assessment. If reported, we used the intention-to-treat data from the studies for these analyses. As recommended by ref. 56, we estimated the uncontrolled effect sizes using dSMC and their respective 95% CI. Raw score standardization with heteroscedastic population variances at baseline (pretreatment) and posttreatment/follow-up was applied for more reliable estimates65,66. The effect sizes dSMC were determined using the means and standard deviations (s.d.) at each time point and the retest correlation between these time points. Values of 0.2, 0.5 and 0.8 for dSMC represent a small, moderate and large effect size, respectively60. If the correlation was not reported in the included studies, retest correlations were calculated from the original study data. If these data were not available, a default value of 0.5 was set61. In addition, we performed sensitivity analyses using 0.3 and 0.7 as retest correlations.
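The role of the retest correlation can be illustrated with a minimal Python sketch of a standardized mean change using raw score standardization (the change is divided by the pretreatment standard deviation). The variance expression below is one common large-sample approximation that allows different pre and post variances; it is stated here as an assumption, and the exact computation in the original R analyses may differ. The function name and example values are hypothetical.

```python
import math


def smc_raw(m_pre, sd_pre, m_post, sd_post, n, r=0.5):
    """Standardized mean change (pre minus post), standardized by the pre s.d."""
    df = n - 1
    # Approximate small-sample correction factor (assumed form).
    j = 1 - 3 / (4 * df - 1)
    # Positive values indicate symptom reduction from pre to post.
    d = j * (m_pre - m_post) / sd_pre
    # Assumed large-sample variance allowing unequal pre and post variances.
    var = (sd_pre**2 + sd_post**2 - 2 * r * sd_pre * sd_post) / (sd_pre**2 * n) + d**2 / (2 * n)
    se = math.sqrt(var)
    return d, (d - 1.96 * se, d + 1.96 * se)


# Sensitivity analysis over the retest correlations named in the text,
# using hypothetical summary statistics for a single treatment arm.
sensitivity = {r: smc_raw(20, 5, 12, 6, 40, r=r) for r in (0.3, 0.5, 0.7)}
```

In this formulation the retest correlation changes only the width of the CI, not the point estimate, which is why varying it between 0.3 and 0.7 is a natural sensitivity check.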

We assessed publication bias by inspecting the funnel plot on the depression and anxiety outcome measures as well as calculating rank correlations and Egger’s tests. Additionally, we applied the Trim and Fill procedure67.
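As an illustration of a regression-based test for funnel plot asymmetry, the sketch below implements the classical Egger regression: each standardized effect (effect divided by its standard error) is regressed on precision (one over the standard error), and an intercept far from zero suggests small-study effects. This ordinary least squares variant is an assumption chosen for its simplicity; the Z statistics reported in the Results were presumably obtained from a meta-regression variant of the test in R. The function name and data are hypothetical.

```python
import math


def egger_regression(effects, ses):
    """Classical Egger regression; returns (intercept, slope, se_intercept)."""
    z = [y / s for y, s in zip(effects, ses)]  # standardized effects
    x = [1 / s for s in ses]                   # precision
    k = len(z)
    mx, mz = sum(x) / k, sum(z) / k
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z)) / sxx
    intercept = mz - slope * mx
    # Residual variance with k - 2 degrees of freedom.
    s2 = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z)) / (k - 2)
    se_intercept = math.sqrt(s2 * (1 / k + mx**2 / sxx))
    return intercept, slope, se_intercept


# Hypothetical asymmetric data: less precise studies report larger effects.
ses = [0.05, 0.10, 0.20, 0.30, 0.40]
effects = [0.2 + 2 * s for s in ses]
intercept, slope, se_intercept = egger_regression(effects, ses)
```

Dividing the intercept by its standard error gives a t statistic with k − 2 degrees of freedom; in the constructed example above the fit is exact, so only the intercept itself is informative.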

Study quality assessment

As an updated version of the Cochrane risk-of-bias tool had become available since we registered this review on PROSPERO, deviating from our preregistration, we evaluated the risk of bias of the studies by using the revised Cochrane risk-of-bias tool (RoB 2.0)68. We assessed the risk as ‘low’, ‘some concerns’ or ‘high’ in the following five domains: (1) bias of the randomization process; (2) bias of deviations from intended interventions; (3) bias of missing outcome data; (4) bias in measurement of the outcome; and (5) bias in selection of the reported results. Each domain is made up of several criteria. For example, the first question in domain (1) asks ‘Was the allocation sequence random?’ which, after consulting the respective manuscript, is answered as ‘yes’, ‘probably yes’, ‘probably no’, ‘no’ or ‘no information’. The RoB 2.0 provides examples and decision trees that clearly specify which combinations of ratings across questions within a domain result in the risk of bias of that domain being rated as ‘low’, ‘some concerns’ or ‘high’. In domain (1), a ‘high’ risk of bias would be noted if differences between intervention groups were evident at baseline, suggesting a problem with the randomization, even if no risk of bias was indicated in the evaluation of all other criteria. Two reviewers independently rated each study for bias. Final assessments were cross-checked and disagreements were resolved through discussions between the reviewers. We created the visualization of the risk of bias assessment with the shiny app robvis69.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.