Introduction

The global prevalence of overweight and obesity among children and adolescents aged 5–19 years has quadrupled over the past four decades [1]. Although effective prevention and treatment interventions exist [2], widespread access to evidence-based interventions remains a significant challenge. Furthermore, inadequate insurance coverage and reimbursement for childhood obesity treatment prevents children and families from obtaining affordable care [3], which may exacerbate global obesity disparities among disadvantaged youths [4].

Pediatric weight-management interventions that incorporate digital “eHealth” or mobile “mHealth” technologies are promising low-cost solutions for increasing access to care, as they can often be accessed anywhere, at any time. Furthermore, eHealth/mHealth technologies may bolster engagement through novel “kid-friendly” programming (e.g., exergaming, virtual reality), and may prevent weight gain among youth at risk for overweight/obesity or enhance the durability of weight change through intervention personalization (e.g., momentary feedback, booster texts). Preliminary meta-analytic findings support the acceptability and feasibility of eHealth/mHealth technologies as both stand-alone and adjunctive interventions for pediatric obesity [5, 6]. However, the heterogeneity of eHealth/mHealth technologies included in prior studies precludes conclusions about whether intervention efficacy varies by type of technology and target population (e.g., parent- vs. child-facing) [7, 8]. Evaluating these factors is crucial to understanding the extent to which technological interventions should be considered in the development of effective, affordable, integrated care models for pediatric obesity. Furthermore, given the constant advancement of technology, the application of emerging eHealth/mHealth technologies to pediatric obesity prevention and treatment warrants an updated investigation. Therefore, we synthesized the recent literature on technology-based interventions for the prevention and treatment of overweight/obesity in youth.

Method

Literature search and selection of studies

The review methodology was preregistered on PROSPERO (registration number: CRD42020150683). A systematic literature search was conducted following PRISMA guidelines [9] (see Appendix A for the PRISMA checklist). A medical librarian (LHY) searched the literature for records including the concepts of obesity prevention and treatment, technology-based interventions, children, and randomized controlled trials. The librarian created search strategies using a combination of keywords and controlled vocabulary (see Appendix B for fully reproducible search strategies) in Ovid Medline 1946-, Embase.com 1947-, Scopus 1823-, the Cochrane Central Register of Controlled Trials (CENTRAL), the Cumulative Index to Nursing and Allied Health Literature (CINAHL) 1937-, PsycINFO 1927-, and ClinicalTrials.gov 1997-. All search strategies were completed in September 2019 and yielded a total of 5036 results. After de-duplication based on previously published guidance [10], 2436 duplicate records were deleted, leaving 2600 unique citations in the project library. Searches were updated on April 25, 2020 by rerunning all searches and removing duplicates against the original EndNote library. An additional 806 unique records were found, resulting in a total of 3406 records screened for inclusion.

Inclusion and exclusion criteria

Identified studies were screened for eligibility against the following inclusion criteria: (1) pediatric population (1–18 years old); (2) use of technology in an intervention targeting prevention or treatment of overweight and/or obesity (the intervention had to include technology but did not have to be delivered solely through technology; e.g., text messaging, phone calls, telehealth, mobile applications (apps), email, machine learning adaptive interventions); (3) primary or secondary outcome of relative weight or adiposity (e.g., BMIz, body fat percentage, fat mass index); (4) randomized controlled trial (RCT); and (5) published after January 2014 (previous 5 years). This cutoff was chosen because recent reviews ended their searches with literature published in 2014 [6, 11, 12] or focused exclusively on parent-delivered [8], self-monitoring [5], or mobile technology [13] interventions for pediatric weight management. Furthermore, the aim of the present review was to advance the current state of evidence by focusing on recent and current technology interventions. Phone calls were included as technology in this review because this approach to delivering telemedicine is often included in mHealth reviews [14, 15] and represents one way that technology reduces barriers to treatment by remotely delivering intervention components through mobile devices.

Studies were ineligible if they met any of the following exclusion criteria: (1) target population exclusively infants or adults (i.e., <1 or >18 years old); (2) lack of original data (e.g., trial protocol, review, commentary, secondary analysis of included study, or conceptual study) and/or lack of evaluation of participant outcomes (e.g., software, hardware, or computing/engineering proof of concepts); (3) primary purpose of the technology was for assessment (e.g., ecological momentary assessment) as opposed to intervention; (4) intervention primarily targeted chronic condition other than weight management (e.g., diabetes, chronic pain, psychiatric disorders); (5) intervention targeted children with a specific chronic condition (e.g., children with developmental delays, children with diabetes); (6) no reporting of weight outcome; or (7) not an RCT.

If the primary purpose of the technology was intervention rather than assessment, and the technology was not the main intervention element, the technology still needed to be an essential component of the intervention. Therefore, in the first stage of abstract screening, records were examined for technology use. If the technology component was so ancillary to the intervention that it did not warrant mention in the abstract, we judged that the trial did not warrant the label of “eHealth/mHealth intervention” or inclusion in this review. Importantly, when there was ambiguity about technology use, we were conservative in excluding articles during abstract screening and included records in the full-text review if it was unclear whether technology was used and whether it was used for assessment or intervention (or both).

Data extraction and synthesis

Search results received from the medical librarian (LHY) were uploaded into Covidence systematic review software (Veritas Health Innovation, Melbourne, Australia) and screened by two authors (LAF and ACG) for inclusion/exclusion. Duplicates were removed prior to title and abstract screening. Disagreements were resolved through consensus or with a third reviewer (EFC). Year of publication, country, study design, participant characteristics (e.g., age, sex), intervention description, and main outcomes related to weight were abstracted using Microsoft Excel, version 1908. Studies were classified as prevention or treatment interventions given the differences in expected outcomes for participants in these trials (i.e., anticipated decreases in adiposity for treatment interventions versus anticipated maintenance or stability of adiposity for prevention interventions). Treatment interventions were defined as interventions targeting weight loss among youth with overweight or obesity. Prevention interventions could include youth across the weight spectrum and did not target decreases in adiposity for the entire sample of participants.

Quality assessment

The quality assessment was completed in Covidence systematic review software (Veritas Health Innovation, Melbourne, Australia). Studies were independently rated by two authors (LAF and ACG) following the criteria established for studies included in Cochrane reviews [16]. Risk of bias was assessed for the following: sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessors, incomplete outcome data, selective outcome reporting, and other sources of bias. Studies were rated on each of the seven criteria as low risk of bias, high risk of bias, or unclear. Raters discussed any disagreements and reached consensus. Following established quality threshold recommendations [17], studies were judged overall based on the number of unclear and high risk of bias judgments. A study was judged to have an overall high risk of bias if more than one criterion was rated as high risk or more than four criteria were rated as unclear. Studies were judged to have some concerns regarding bias if they had a high risk of bias for one of the seven criteria and at least one criterion judged as “unclear.” Previously outlined algorithms for reaching risk of bias judgments were applied when, for example, measures were taken to mitigate the risk arising from the inability to mask participants and personnel to treatment condition, which is often not possible in behavioral interventions [17].
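
For illustration only, the two explicit thresholds described above can be expressed as a simple decision function. The R sketch below assumes a hypothetical vector of the seven per-criterion ratings and is not the authors' implementation; the full algorithm in [17] includes additional considerations (e.g., for unmasked behavioral interventions).

```r
# Simplified sketch (not the authors' code) of the overall risk-of-bias rule,
# applied to a hypothetical vector of seven per-criterion ratings.
classify_overall_rob <- function(ratings) {
  n_high    <- sum(ratings == "high")
  n_unclear <- sum(ratings == "unclear")
  if (n_high > 1 || n_unclear > 4) {
    "high risk of bias"
  } else if (n_high == 1 && n_unclear >= 1) {
    "some concerns"
  } else {
    "low risk of bias"
  }
}

# Example: one unmasked (high risk) criterion plus one unclear criterion
classify_overall_rob(c("low", "low", "high", "unclear", "low", "low", "low"))
#> "some concerns"
```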

Meta-analysis

Given the inherent differences in expected outcome change for prevention and treatment trials, two separate analyses were conducted. Means and standard deviations, or effect estimates and standard errors, were extracted from all included articles (see Appendix C for tables of the data that were extracted or provided by authors to calculate effect sizes). When data were not reported, authors were contacted (n = 5), of whom two provided unpublished raw means and standard deviations for the present analyses [18, 19]. The studies of authors who did not provide data were excluded from the meta-analysis but included in the narrative review [20,21,22]. Two studies [23, 24] reported medians and interquartile ranges, which were converted to means and standard deviations using previously published equations [25].
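
For orientation, conversions of this kind are commonly implemented with approximations of the following form; this is shown only as a sketch, and the exact equations are given in the cited reference [25]:

\[ \bar{x} \approx \frac{q_1 + m + q_3}{3}, \qquad s \approx \frac{q_3 - q_1}{1.35}, \]

where \(m\) is the reported median and \(q_1\) and \(q_3\) are the first and third quartiles bounding the interquartile range.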

Effects were then converted to a standardized effect size to allow comparison across studies. Effect sizes (Cohen’s d) were calculated by subtracting the mean score of the intervention group (Mi) from the mean score of the comparator group (Mc) and dividing the result by the pooled standard deviation of both groups. This was done at post-test unless baseline differences on outcome variables existed despite randomization, in which case Cohen’s d was calculated as the difference between the standardized pre- and post-change scores of the two groups. Using previously published methods [26], 95% confidence intervals (CIs) were calculated for the effect size of each study. Effect size conventions proposed by Cohen were used (i.e., d = 0.20, d = 0.50, and d = 0.80 indicate small, medium, and large effects, respectively) [27].
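
For reference, the standardized effect size and its confidence interval described above take the following general form; the pooled standard deviation and a common large-sample standard error are shown here as a sketch, and the review followed the cited methods [26]:

\[ d = \frac{M_c - M_i}{SD_{pooled}}, \qquad SD_{pooled} = \sqrt{\frac{(n_i - 1)\,SD_i^2 + (n_c - 1)\,SD_c^2}{n_i + n_c - 2}}, \]

\[ 95\%\ \mathrm{CI} = d \pm 1.96 \times SE(d), \qquad SE(d) \approx \sqrt{\frac{n_i + n_c}{n_i n_c} + \frac{d^2}{2(n_i + n_c)}}, \]

where \(n_i\) and \(n_c\) are the intervention and comparator sample sizes.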

We used a random effects model with inverse variance weighting [28] to estimate a pooled mean effect size with a 95% CI, given the diversity of studies and populations. The presence of heterogeneity was examined with the I2 statistic and τ2. I2 is a percentage indicating the proportion of total variability in a set of effect sizes that is due to true heterogeneity (i.e., between-study variability). A value of 0% indicates an absence of heterogeneity, and larger values indicate increasing levels of heterogeneity (i.e., 25%, 50%, and 75% can be considered low, moderate, and high levels of heterogeneity, respectively) [29]. The DerSimonian-Laird estimator [30] was used to calculate τ2, and the Jackson method [31] was used to calculate the 95% CI for τ2 as a measure of between-study variance, with values closer to 0 suggesting less heterogeneity [32].
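
For reference, these heterogeneity statistics are conventionally defined as follows (standard definitions shown for orientation, not the authors' exact implementation):

\[ I^2 = \max\!\left(0,\ \frac{Q - (k - 1)}{Q}\right) \times 100\%, \qquad \hat{\tau}^2_{\mathrm{DL}} = \max\!\left(0,\ \frac{Q - (k - 1)}{\sum_i w_i - \sum_i w_i^2 / \sum_i w_i}\right), \]

where \(Q\) is Cochran's heterogeneity statistic, \(k\) is the number of studies, and \(w_i = 1/v_i\) are the inverse-variance weights.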

Meta-regressions examined whether intervention effects differed based on: (1) comparator type (i.e., active comparator vs. waitlist control); (2) technology role (i.e., adjunct to treatment vs. mostly [e.g., one initial in-person session] or solely technology-delivered); (3) technology use (i.e., provider-delivered telehealth, such as phone calls or video chats, vs. other technologies that do not require a trained provider to deliver the content, such as web programs, apps, texts, video/exergames, sensors, e-mails, and social media); (4) delivery target (i.e., parent-, child-, or both parent- and child-delivered interventions); (5) trial type (i.e., pilot trials or trials with n < 100 vs. trials with n ≥ 100); (6) mean participant age; and (7) intervention duration in months.
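
For illustration, moderator analyses of this kind can be specified with the meta package used for the analyses described below. The R sketch assumes a hypothetical data frame with one row per trial, precomputed effect sizes and standard errors, and illustrative moderator columns; it is not the authors' code.

```r
# Illustrative sketch (not the authors' code): meta-regression with meta::metareg
# on hypothetical per-trial effect sizes, standard errors, and moderators.
library(meta)

set.seed(42)
trials <- data.frame(
  study      = paste("Trial", 1:20),
  d          = rnorm(20, 0, 0.2),               # placeholder effect sizes
  se         = runif(20, 0.05, 0.25),           # placeholder standard errors
  comparator = sample(c("active", "waitlist"), 20, replace = TRUE),
  mean_age   = runif(20, 3, 17),                # years
  duration   = runif(20, 1, 24)                 # months
)

m <- metagen(TE = d, seTE = se, studlab = study, data = trials,
             sm = "SMD", method.tau = "DL")

metareg(m, ~ comparator)   # categorical moderator (comparator type)
metareg(m, ~ mean_age)     # continuous moderator (mean participant age)
metareg(m, ~ duration)     # continuous moderator (intervention duration)
```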

For three-arm RCTs, the true control (e.g., waitlist control or minimal contact) was used as the comparison group, with two exceptions: the two non-technology-based comparison conditions were combined (e.g., [33]), unless the active intervention also contained the technology component (e.g., [34]), in which case the two technology-based conditions were combined. All analyses were conducted in R version 4.0.2 [35], using the meta package [36] for analyses, the metaviz package [37] for data visualization, and the dmetar package for meta-regressions.
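
As a minimal sketch of this pooling workflow (illustrative placeholder data and column names, not the review's extracted values), standardized mean differences can be pooled from group means and standard deviations with inverse-variance weighting and the DerSimonian-Laird estimator as follows:

```r
# Minimal sketch (placeholder data): pooling standardized mean differences
# with inverse-variance weighting and the DerSimonian-Laird estimator.
library(meta)

set.seed(42)
post <- data.frame(
  study  = paste("Trial", 1:10),
  n.e    = c(45, 60, 120, 30, 80, 55, 200, 40, 90, 70),
  mean.e = rnorm(10, 0.00, 0.10),   # placeholder post-intervention means (e.g., BMIz)
  sd.e   = runif(10, 0.8, 1.2),
  n.c    = c(44, 58, 118, 32, 79, 57, 195, 41, 88, 73),
  mean.c = rnorm(10, 0.05, 0.10),
  sd.c   = runif(10, 0.8, 1.2)
)

m <- metacont(n.e = n.e, mean.e = mean.e, sd.e = sd.e,
              n.c = n.c, mean.c = mean.c, sd.c = sd.c,
              studlab = study, data = post,
              sm = "SMD", method.smd = "Cohen", method.tau = "DL")

summary(m)   # pooled d with 95% CI, I^2, and tau^2
forest(m)    # forest plot of study-level and pooled effects
```

In the review itself, the metaviz package was used for visualization; forest() from the meta package is shown here only to keep the sketch self-contained.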

Results

Study selection

Qualitative synthesis

Ninety-one full-text articles were reviewed for inclusion; 55 articles [18,19,20,21,22,23,24, 33, 34, 38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83] representing 54 unique RCTs met the inclusion criteria for the qualitative synthesis (Fig. 1). Although two articles reported on the same RCT, one provided outcomes at post-intervention [49] and the other at long-term follow-up [48]; therefore, both articles were included in the review. Ten additional studies were identified from the reference lists of the included articles; however, all 10 were excluded for the following reasons: not an RCT (n = 4), no weight outcome (n = 3), no technology component (n = 2), and no full text (n = 1).

Fig. 1: PRISMA flow chart of study inclusion process.

Quantitative synthesis

Overall, data were available for 52 of 54 unique RCTs (n = 32 treatment trials and n = 20 prevention trials) at post-intervention and 14 RCTs (n = 6 treatment trials and n = 8 prevention trials) at long-term follow-up. Meta-regression models were only computed for post-intervention outcomes given the small number of prevention and treatment RCTs with data at long-term follow-up.

Study characteristics

Most studies (n = 27) were conducted with community samples, 15 with a clinical sample (e.g., patients at an outpatient clinic for obesity treatment), and 13 with a school sample. Twenty-two interventions focused on prevention of overweight and obesity and the other 33 articles reported on treatment interventions. Twenty-seven studies were conducted in the United States, 13 studies were from western or central Europe, four were from southeast Asia, four were from Australia, two from New Zealand, two from South America, one from the Middle East, one from eastern Europe, and one from Canada. Target participants ranged from 1.5 to 18 years of age. Studies typically targeted a specific age group, i.e., young children (<5 years) or infants (n = 9), children (<13 years; n = 27), or adolescents (n = 18); however, one study targeted both children and adolescents aged 4 to 17 years [65]. Although all studies were RCTs, 25 were cluster RCTs, five identified as pilot RCTs, and one was a crossover RCT. Seventeen studies did not include parents, 17 studies included parents in the intervention but they were not the primary targets, 13 studies targeted both the parent and the child, and the remaining eight studies primarily targeted parents. Table 1 displays study characteristics for all studies included in the review.

Table 1 Study characteristics of included studies.

Study quality

Twenty-nine of the 55 articles (53%) were judged as having an overall low risk of bias, defined as having fewer than three criteria judged as “unclear.” Thirteen articles (24%) were judged as having some concerns overall, based on a priori judgment algorithms [17]. Seven articles (13%) were judged as having some concerns that were nonetheless not likely to significantly bias the studies’ results. These articles did not mask participants and personnel to treatment condition; however, because it is often not possible to blind participants and personnel in behavioral interventions, this criterion was judged to be less likely to significantly bias results when other measures were used to reduce the risk of differential behaviors by patients and healthcare providers [17]. Finally, six articles (11%) were judged as having an overall high risk of bias. Information on the quality assessment ratings for all articles is included in Table 2.

Table 2 Quality assessment of included studies.

Type of intervention

Study interventions varied immensely in treatment modality and content. The interventions included mobile phone app interventions [42, 48, 49], text-based interventions [19, 60, 61], home-delivered interventions with technology adjunct components [34, 63], group sessions or interactive classes with phone calls [53, 58], group-based exergames [78], exergaming in addition to family-based behavioral treatment [82], and active video games as replacements for non-active video games [75], among others.

Although 21 studies did not report that the intervention was informed by any theoretical basis, most studies (n = 34, 62%) described at least one theory or evidence-based approach that informed the development of the intervention, and many (n = 15, 27%) reported several frameworks. The most common theories reported were social cognitive theory (n = 21), motivational interviewing (n = 7), cognitive-behavioral therapy (n = 4), and self-determination theory (n = 6).

The type of technology that each intervention utilized was also variable, and 11 interventions used more than one type of technology. Of the interventions that utilized only one type of technology (n = 44), six used text messaging, six involved exergaming, 10 relied on phone calls for remote delivery of the intervention, one used a wearable sensor, six used a mobile app, one used email, 10 used a web platform, one used a computerized decision tree for tailored intervention, two used video games, and one used video chat to deliver the intervention. Across all studies, the technologies used included the internet (n = 15), phone calls (n = 14), text messaging (n = 12), mobile apps (n = 8), exergames (n = 7), wearable sensors (n = 5), emails (n = 3), video chat (n = 3), a computerized decision tree (n = 1), and social media (n = 1).

We classified the technology function of each intervention into three categories: solely technology-delivered, mostly technology-delivered, and technology used as an adjunct to the intervention. Using technology adjunctively refers to complementing in-person treatment with digital technology components. An intervention was classified as solely technology-delivered when it did not rely on any in-person elements, and as mostly technology-delivered when it involved one or two in-person sessions followed by a completely technology-delivered intervention. This classification distinguishes different ways of employing technology in an intervention: with regular in-person elements (i.e., adjunctive technology), with no in-person elements (i.e., solely technology-based), or with minimal initial in-person elements (i.e., mostly technology-based). Twenty-four studies tested interventions delivered solely through technology, 24 studies used technology as an adjunct to an in-person-delivered intervention, and seven studies were mostly delivered through technology but involved one or two in-person counseling sessions.

Intervention length and dosage

Intervention length ranged from 1 to 24 months, with an average duration of 6.5 (SD = 4.5) months. Dosage varied from structured contact (e.g., online modules for 60 min once per week, biweekly phone calls with a behavioral coach, three text messages per day) to self-paced use (i.e., engagement with the intervention content depended on the participant).

Comparison and control conditions

Although all included studies had comparison groups, the comparators varied greatly depending on the scope of the research question. Many studies (41%; n = 22) included an active control comparison condition without a technology component, ranging from an educational pamphlet to in-person treatment (e.g., family-based treatment, cognitive-behavioral therapy). Fifteen studies (27%) included a no-contact control condition (i.e., waitlist), 10 studies included a usual care comparison condition, and 13 studies had an active intervention comparison with a technology component. Six studies (11%) had three intervention arms [24, 33, 34, 41, 55, 80]; most (n = 5) included a usual care or no-contact comparison condition along with an active intervention comparison, which involved technology components for some (n = 4).

Follow-up

Seventeen studies (31%) measured outcomes at an additional time point after the post-intervention time point. Studies varied in length of long-term follow-up, from 2 to 18 months, with an average of 8.6 (SD = 4.6) months.

Intervention efficacy

Prevention of overweight or obesity

Narrative

Out of 22 prevention studies representing 21 unique RCTs, six articles (five unique prevention RCTs; 24% of unique prevention RCTs) found significant intervention effects at post-intervention. Three of the RCTs that found significant intervention effects were solely technology-delivered interventions [44, 48, 49, 71], compared to no-contact waitlist controls [44] and active comparators without technology components [49, 71]. The other two RCTs that reported significant intervention effects used technology components as an adjunct to treatment [34, 50], compared to an active control with technology [34] and a waitlist control [50]. Two of the significant studies reported on the same RCT [48, 49].

Sixteen of the 22 prevention studies (representing 21 unique RCTs) did not find significant differences between intervention and comparison conditions on adiposity or weight outcomes at post-intervention (76% of unique prevention RCTs). Four prevention studies that reported overall null findings did report significant intervention effects in subgroups: two studies reported intervention effects among children with overweight [73] or obesity [59] at baseline, another reported greater reductions in weight status among girls at post-intervention and long-term follow-up [74], and the last reported that attendance at intervention classes predicted slower increases in BMIz [58].

Of the nine prevention studies that included an additional follow-up measure post-intervention, eight trials reported no significant differences in adiposity measures between intervention and comparison conditions at follow-up; one study reported significant effects of the waitlist condition on BMIz at long-term follow-up, which was 6 months post-intervention [75]. Of the four unique RCTs that reported significant short-term intervention effects, only one of these reported long-term follow-up outcomes, finding no sustained intervention effects at long-term follow-up [48].

Meta-analysis

The random effects model with inverse variance weighting was used to calculate the pooled mean effect size for n = 20 prevention trials (Fig. 2). The estimated mean effect size was 0.004 (95% CI = −0.078, 0.086), which was not significantly different from zero (p = 0.930), where negative effect sizes represent greater effects of the treatment condition on outcomes compared to the comparator/control condition. Heterogeneity of the effect sizes at post-intervention was moderate (I2 = 42.6%; 95% CI = 2.4%, 66.2%), confirmed by a significant test of heterogeneity, Q (19) = 33.08, p = 0.024. The estimated between-studies variance suggests some heterogeneity among the true effects (τ2 = 0.01, 95% CI = 0.00, 0.07).

Fig. 2: Forest plot of effect sizes of included prevention randomized controlled trials at post-intervention.

Negative effect sizes represent greater effects of the treatment condition on outcomes compared to the comparator/control condition.

The estimated mean effect size for n = 8 prevention trials at long-term follow-up was not significantly different from zero (d = 0.063, 95% CI = ‒0.019, 0.145, Fig. 3). Heterogeneity of the effect sizes at follow-up was low (I2 = 7.0%, 95% CI = 0.0%, 69.9%), confirmed by a nonsignificant test of heterogeneity (Q (7) = 7.53, p = 0.376). The estimated τ2 indicated low heterogeneity between studies (τ2 = 0.00, 95% CI = 0.00, 0.04).

Fig. 3: Forest plot of effect sizes of included prevention randomized controlled trials at follow-up.

Negative effect sizes represent greater effects of the treatment condition on outcomes compared to the comparator/control condition.

Meta-regressions at post-intervention

Meta-regression models with random effects demonstrated no significant differences in study effect size by comparator type (n = 11 active comparator, n = 9 waitlist control; p = 0.735), technology role (n = 9 adjunct to treatment, n = 11 mostly or solely technology-delivered; p = 0.294), technology use (n = 4 provider-delivered telehealth, n = 16 non-provider-delivered technology; p = 0.367), delivery target (n = 12 child-delivered, n = 3 parent-delivered, n = 5 both parent- and child-delivered; p = 0.697), study type (n = 7 pilot trials and/or n < 100, n = 13 non-pilot trials; p = 0.516), age (p = 0.818), or intervention duration (p = 0.151).

Post-hoc analyses

Publication bias was assessed using a funnel plot (Fig. 4) and Egger’s test, which did not indicate funnel plot asymmetry (intercept = −0.19, 95% CI = −1.61, 1.23, p = 0.800). This suggests that the scatter in the funnel plot may be due to sampling variation and does not suggest publication bias [84]. Influence analyses involved examination of outliers, Baujat plots and diagnostics, influence diagnostics, and a sensitivity analysis. Outlier analyses at post-intervention suggested that excluding one study, Rerksuppaphol et al. [71], would reduce the heterogeneity of effect sizes from 42.6% to 2.3%, and the test of heterogeneity would no longer be significant (p > 0.05). This study was also identified as having the largest contribution to the heterogeneity of study effect sizes (14.0) in the Baujat plot and diagnostics. The pooled mean effect of the n = 19 RCTs excluding this study was still not significantly different from zero (d = 0.027, 95% CI = −0.032, 0.086, p = 0.364). Meta-regressions with this study excluded produced the same results (all ps > 0.05). Finally, a sensitivity analysis excluding the seven prevention RCTs that identified as pilot studies or had samples smaller than 100 demonstrated a pooled mean effect size of 0.015 (95% CI = −0.075, 0.105), which was not significantly different from zero (p = 0.745). Meta-analytic results for prevention RCTs are presented with all studies retained in analyses, given that the pooled mean effect remains non-significant whether potential outliers are included or excluded.
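
For illustration, publication-bias and influence checks of this kind can be run with the meta package named in the Methods; the R sketch below uses placeholder data rather than the review's extracted effect sizes and is not the authors' code.

```r
# Illustrative sketch (placeholder data): funnel plot, Egger's test, Baujat plot,
# and leave-one-out influence analysis with the meta package.
library(meta)

set.seed(42)
prev <- data.frame(study = paste("Trial", 1:20),
                   d     = rnorm(20, 0, 0.15),
                   se    = runif(20, 0.05, 0.25))

m <- metagen(TE = d, seTE = se, studlab = study, data = prev,
             sm = "SMD", method.tau = "DL")

funnel(m)                              # funnel plot of effect sizes against standard errors
metabias(m, method.bias = "linreg")    # Egger's regression test for funnel plot asymmetry
baujat(m)                              # each study's contribution to overall heterogeneity
metainf(m)                             # leave-one-out re-estimation of the pooled effect
```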

Fig. 4: Funnel plot of included prevention randomized controlled trials.

Numbers correspond to the following studies: (1) Coknaz et al. (2019, Turkey, n = 106). (2) DaSilva et al. (2019, Brazil, n = 895). (3) Delisle Nystrom et al. (2018, Sweden, n = 263). (4) Faith et al. (2019, USA, n = 28). (5) Fulkerson et al. (2015, USA, n = 160). (6) Gao et al. (2019, USA, n = 32). (7) Gutierrez-Martinez et al. (2018, Colombia, n = 120). (8) Haines et al. (2018, Canada, n = 44). (9) Hammersley et al. (2019, Australia, n = 86). (10) Hull et al. (2018, USA, n = 277). (11) Kennedy et al. (2018, Australia, n = 607). (12) Love-Osborne et al. (2014, USA, n = 165). (13) Lubans et al. (2016, Australia, n = 361). (14) Maddison et al. (2014, New Zealand, n = 251). (15) Nollen et al. (2014, USA, n = 51). (16) Rerksuppaphol et al. (2017, Thailand, n = 217). (17) Sherwood et al. (2015, USA, n = 60). (18) Sherwood et al. (2019, USA, n = 421). (19) Simons et al. (2015, The Netherlands, n = 260). (20) Smith et al. (2014, Australia, n = 361).

Treatment of overweight or obesity

Narrative

Six of the 33 treatment studies, all representing unique RCTs (21%), found significant differences in weight loss outcomes for the treatment group compared to the comparison group at post-intervention. Four of the efficacious interventions were mostly or solely delivered through technology [38, 42, 77, 80], and the remaining two used technology adjunctively [55, 82]. Four of the six RCTs with significant results involved active comparators, while two had no-contact control comparison groups.

Twenty-seven of the 33 treatment studies (79%) did not find significant differences between treatment and comparison conditions on weight outcomes at post-intervention. Two of these 27 studies (8%) reported significant improvement in measures of adiposity (e.g., percent body fat, BMIz) in both the intervention and active comparator groups at post-intervention, with one intervention delivered solely through technology [41] and the other using technology adjunctively [81]. Despite finding no differences between conditions, five studies, all with active comparators, reported significant improvements in measures of adiposity among those in the intervention groups [19, 51, 60, 66, 70]. Two treatment studies that reported overall null findings did report significant intervention effects in subgroups: one trial reported significant intervention effects among treatment adherers compared to a no-treatment control [78], and another reported significant intervention effects among boys, but not girls, compared to usual care [69].

Eight treatment studies reported weight outcomes at an additional follow-up after the intervention, including one cross-over trial and one study that assessed only the intervention group at long-term follow-up [43, 51]. Both of these studies, which used technology adjunctively and included active comparators, reported maintenance of weight status at follow-up for the intervention group, despite no significant differences from the control group at post-intervention. Two of the eight studies reported significant treatment effects at post-intervention vs. the comparator; these treatment effects were sustained at 6-month [38] and 10-month [55] follow-up, respectively. The first RCT [38] delivered the intervention mostly through technology compared to a waitlist, and the other used technology adjunctively compared to both an active comparator and a waitlist.

Meta-analysis

A random effects model with inverse variance weighting was used to calculate the pooled mean effect size for n = 32 treatment trials (Fig. 5). The estimated mean effect size was small (d = ‒0.133, 95% CI = ‒0.199, ‒0.067) but significantly different from zero (p < 0.001). Heterogeneity of the effect sizes at post-intervention was low (I2 = 15.3%, 95% CI = 0.0%, 45.2%), confirmed by a nonsignificant test of heterogeneity (Q (31) = 36.58, p = 0.226). The estimated τ2 indicated low heterogeneity between studies (τ2 = 0.00, 95% CI = 0.00, 0.05).

Fig. 5: Forest plot of effect sizes of included treatment randomized controlled trials at post-intervention.

Negative effect sizes represent greater effects of the treatment condition on outcomes compared to the comparator/control condition.

The estimated mean effect size for n = 6 treatment trials at long-term follow-up was also significantly different from zero (d = ‒0.352, 95% CI = ‒0.521, ‒0.184) (Fig. 6). Heterogeneity of the effect sizes at follow-up was low (I2 = 0.0%, 95% CI = 0.0%, 51.1%), confirmed by a non-significant test of heterogeneity (Q (5) = 2.60, p = 0.762). The estimated τ2 indicated low heterogeneity between studies (τ2 = 0.00, 95% CI = 0.00, 0.12).

Fig. 6: Forest plot of effect sizes of included treatment randomized controlled trials at follow-up.

Negative effect sizes represent greater effects of the treatment condition on outcomes compared to the comparator/control condition.

Meta-regressions at post-intervention

There were no significant effects of comparator type (n = 17 active comparator, n = 15 waitlist control; p = 0.121), technology role (n = 14 adjunct to treatment, n = 18 mostly or solely technology-delivered; p = 0.555), or technology use (n = 19 provider-delivered telehealth, n = 9 non-provider-delivered technology, n = 4 both; p = 0.362).

Random effects models demonstrated a significant effect of study type (point estimate = −0.17, 95% CI = −0.31, −0.03, QM (1) = 6.00, p = 0.014), where effects were larger for pilot trials (reference group) and/or RCTs with n < 100 (n = 20). Shorter intervention duration was related to greater intervention effects (point estimate = 0.01, 95% CI = 0.00, 0.02, QM (1) = 4.26, p = 0.040). Greater child age was related to greater effects of the intervention (point estimate = −0.03, 95% CI = −0.05, −0.01, QM (1) = 6.23, p = 0.012). Finally, there was a significant difference of effect size across the three categories of delivery: child-delivered (n = 21), parent-delivered (n = 3), or both parent- and child-delivered (n = 8), QM (2) = 8.81, p = 0.012. Specifically, study effect sizes favored the technology arm for child-delivered interventions compared to parent-delivered interventions (point estimate = 0.40, 95% CI = 0.13, 0.66, p = 0.003).

Post-hoc analyses

Publication bias was assessed using a funnel plot (Fig. 7) and Egger’s test, which did not suggest the presence of funnel plot asymmetry or publication bias (intercept = −0.59, 95% CI = −1.25, 0.06, p = 0.086) [84]. Influence analyses involved examination of outliers, Baujat plots and diagnostics, and influence diagnostics. Outlier analyses suggested that excluding two studies, Armstrong et al. [23] and Garza et al. [55], would reduce the heterogeneity of effect sizes from 15.3% to 0.0% and result in a non-significant test of heterogeneity (p > 0.05). Both studies were also identified as contributing sizably to the heterogeneity of study effect sizes (11.48 and 5.13, respectively) in the Baujat plots and diagnostics, with influence analyses identifying Armstrong et al. as the largest contributor to the heterogeneity of effect sizes, although excluding either study would not result in substantial changes in I2 or the estimated mean effect size. The pooled mean effect of the n = 30 RCTs excluding these two studies was still significantly different from zero (d = −0.128, 95% CI = −0.182, −0.074, p < 0.001). Finally, a sensitivity analysis excluding the 20 treatment RCTs that identified as pilot studies or had samples smaller than 100 demonstrated a pooled mean effect size of −0.084 (95% CI = −0.168, −0.001), which still significantly favored the technology treatment arm (p = 0.048) but was smaller in magnitude. Meta-analytic results for treatment RCTs are presented with all studies retained in analyses, given the initially low heterogeneity of effect sizes and that the pooled mean effect remains significant whether potential outliers are included or excluded.

Fig. 7: Funnel plot of included treatment randomized controlled trials.

Numbers correspond to the following studies: (21) Abraham et al. (2015, Hong Kong, n = 48). (22) Ahmad et al. (2018, Malaysia, n = 134). (23) Armstrong et al. (2018, USA, n = 101). (24) Bagherniya et al. (2018, Iran, n = 172). (25) Banos et al. (2019, Spain, n = 47). (26) Baranowski et al. (2019, USA, n = 200). (27) Bohlin et al. (2017, Sweden, n = 37). (28) Bruno et al. (2018, Spain, n = 52). (29) Chen et al. (2019, USA, n = 40). (30) Christison et al. (2016, USA, n = 80). (31) Currie et al. (2018, USA, n = 64). (32) Davis et al. (2016, USA, n = 103). (33) Fleischman et al. (2016, USA, n = 40). (34) Foley et al. (2014, New Zealand, n = 322). (35) Garza et al. (2019, USA, n = 71). (36) Gerards et al. (2015, Netherlands, n = 86). (37) Jensen et al. (2019, USA, n = 47). (38) Kulendran et al. (2016, United Kingdom, n = 27). (39) Mameli et al. (2018, Italy, n = 30). (40) Markert et al. (2014, Germany, n = 303). (41) Moschonis et al. (2019, Greece, n = 65). (42) Nawi et al. (2015, Malaysia, n = 97). (43) Norman et al. (2016, USA, n = 106). (44) Pfeiffer et al. (2019, USA, n = 1519). (45) Rifas-Shiman et al. (2017, USA, n = 441). (46) Staiano et al. (2017, USA, n = 41). (47) Staiano et al. (2018, USA, n = 46). (48) Sze et al. (2015, USA, n = 40). (49) Taveras et al. (2015, USA, n = 549). (50) Taveras et al. (2017, USA, n = 721). (51) Trost et al. (2014, USA, n = 75). (52) Wald et al. (2018, USA, n = 73).

Discussion

Technology use in pediatric weight management interventions is burgeoning, as evidenced by the identification of 54 unique RCTs published in the last 6 years alone. Studies included in this review utilized web-based platforms, smartphone apps, emails, telemedicine (phone calls or videochats), exergames, video games, text messages, and computerized decision tools, with ~25% of the studies employing multiple types of technology throughout the intervention. Twenty-two of these articles targeted prevention of overweight or obesity among pediatric populations across the weight spectrum, and 33 of the articles reported outcomes from treatment interventions. Most (89%) of the interventions were of high or moderate quality. A substantial proportion (62%) of the interventions were informed by at least one theory or evidence-based approach.

Prevention interventions

Meta-analyses of the 20 prevention trials did not find a significant mean effect of prevention interventions on weight outcomes at post-intervention. Heterogeneity of the true effects was moderate, but did not appear to be associated with the inclusion of pilot studies or studies with small samples according to sensitivity analyses [85]. The five prevention RCTs with significant post-intervention effects were compared to both active (n = 3) and no-contact (n = 2) comparators, involved exergames, an app, a web-based platform, e-mail, and a wearable sensor, and used technology to deliver the intervention (n = 3) or as an adjunctive component to the intervention (n = 2).

There was also not a significant mean effect of prevention interventions at long-term follow-up. Overall, the heterogeneity of comparators, technology type, study design, and intervention design preclude conclusions regarding the efficacy of digital prevention interventions for the maintenance of pediatric weight outcomes. Null prevention effects may provide important insights regarding the non-inferiority of mHealth/eHealth solutions compared to in-person interventions; however, more research is needed to understand the variability in effects.

Treatment interventions

Overall, the vast majority (76%) of treatment studies did not find significant differences between the comparator and intervention conditions on child weight outcomes. However, meta-analyses of the 32 treatment trials showed a small, albeit significant, effect of the mHealth/eHealth interventions on post-intervention weight outcomes. These results extend the findings from previous reviews, which reported low or no technology-based treatment effects on weight outcomes but significant treatment effects on weight-related behaviors (e.g., physical activity, diet) [12, 13, 86,87,88]. The low heterogeneity of effect sizes at post-intervention suggests the small treatment effects are similar across studies. Gold-standard in-person family-based behavioral interventions demonstrate efficacy for treating childhood obesity [89], with moderate to large effect sizes [90]. Only four studies in the present review compared gold-standard interventions to gold-standard plus technology interventions [40, 43, 47, 82]; however, one study used technology in both treatment arms [47], and only one study found enhanced effects in the technology condition [82]. More research is needed to compare effects of gold-standard treatments to those supplemented by or delivered solely through technology.

Significant effects of the digital treatment interventions were also found at long-term follow-up; however, only two of the six studies reported differences between the comparator and treatment conditions on weight outcomes at long-term follow-up. Although heterogeneity of the effect sizes was low, the small number of RCTs included in the long-term meta-analysis (n = 6) suggests that more research is needed to draw conclusions regarding the long-term efficacy of digital treatment interventions on child weight outcomes.

Treatment effects on weight outcomes were significantly greater for trials that were shorter in duration and for trials that had small sample sizes and/or were pilot trials. It is unsurprising that pilot trials were associated with greater treatment effects, as the largest effects often come from smaller trials [85]; importantly, the small treatment effects of technology-based interventions remained significant in sensitivity analyses that removed these pilot trials. Previous meta-analyses of pediatric weight interventions [91] and mHealth weight interventions [92] have not found associations between duration and treatment efficacy; however, adult weight loss interventions of longer duration have been shown to be more likely to report no intervention effects [93]. Although counterintuitive, the negative association between intervention duration and treatment effect could reflect the inconclusive long-term efficacy of digital interventions observed in this study, as well as documented issues with engagement and long-term adherence to digital interventions [94].

In addition, digital intervention effects varied by delivery target, such that treatment interventions delivered to the child showed greater effects of the technology arm on child outcomes than interventions delivered only to the parent. Relatedly, greater treatment effects were associated with greater mean child age; these findings coincide, as interventions targeting older children are more likely to be delivered to the child rather than the parent. Previous meta-analytic findings on the effectiveness of parent/caregiver-delivered eHealth/mHealth interventions are mixed. A review of eight parent-focused eHealth interventions for pediatric obesity found no significant effect of the interventions on child weight-related outcomes [8]; however, results from a review of the effect of mHealth interventions on youth health outcomes suggest that treatment effects are stronger when parents/caregivers are included in the intervention [86]. Traditional (i.e., non-technology-based) childhood obesity interventions that are delivered to the parent have been shown to be effective for child weight outcomes [95], and parent-only interventions may be as efficacious as interventions delivered to both the parent and child [89, 96]. It is unclear why parent-only studies in this review did not outperform their comparators. The relatively small number of treatment studies in this subgroup (n = 3) highlights the need for more work before conclusions can be drawn regarding the efficacy of parent-delivered technology interventions for treating childhood obesity.

Other subgroup analyses

Meta-regression analyses did not suggest that the effect of treatment or prevention interventions on weight outcomes differed by technology use (i.e., provider-delivered telehealth, non-provider-delivered technology, or both). Two studies in this review did not attempt to separately evaluate technology from the intervention; these studies manipulated intervention content or dosage between study conditions but employed technology as a method of intervention delivery in both conditions [34, 60]. Notably, durations and dosages (contact frequency) of the interventions were highly variable. Study designs made it difficult to separately evaluate the effect of the technology from other intervention components. This is particularly relevant when studies are comparing no-contact control with a multi-component, technology-assisted intervention, where it is unclear whether the intervention, separate from technology-use, is driving the effect. Furthermore, the content of the intervention varied substantially across the studies, although most targeted energy-balance behaviors and were informed by theory or evidence-based practice. Elucidating the impact of intervention content on outcomes is an important future direction for mHealth/eHealth research.

In addition, meta-regressions suggest that the effects of prevention and treatment studies are not different when technology is used as an adjunct or enhancement to the intervention as compared to when the intervention is primarily delivered through technology. Both solely technology-based interventions and interventions with technology-based adjunctive components were represented among prevention and treatment studies that reported positive effects on adiposity measures. However, given the variability in study designs, comparators, and interventions, further research is needed to evaluate the role of these factors on outcomes.

Other subgroup analyses suggested that the effectiveness of prevention and treatment studies did not differ when the intervention was compared to an active comparator condition (e.g., in-person treatment; attentional control) or a no-contact control condition (e.g., waitlist, usual care). These findings may provide important context in which to interpret the results; however, as more studies emerge, it will be valuable to explore potential differences in intervention effects for studies with different types of “active” comparators such as comparing effects between RCTs with active educational/attentional controls versus active, evidence-based intervention comparators. Many studies tested the efficacy of the technology-based intervention by comparing outcomes to those in non-active control conditions [18, 38, 46, 50, 52, 59, 62, 65, 75, 77, 78]. Among those studies that found positive effects, it is unclear whether that is driven by the type of technology; the sample population (e.g., clinical, adolescent); the intervention content, duration, dose; or a complex interplay of several or all these components. As research continues in this area, researchers should work to elucidate the factors associated with optimal outcomes, including utilizing novel methodologies to optimize mHealth/eHealth interventions [97].

Other studies in this review attempted to isolate the technology component, and several studies demonstrated the non-inferiority of technology-delivered interventions to other treatment modalities [68, 79, 82], such as print-based materials compared to web-based materials [41], telephone-delivered coaching compared to videochats [47], and in-person treatment compared to telephone-delivered treatment [22]. Technology-assisted interventions may provide important benefits that promote valuable treatment outcomes such as increased access to treatment, decreased reliance on personnel and consumable resources, and acceptability and feasibility to the families [6]. Therefore, the small treatment effects and null prevention effects may provide important insights into the non-inferiority of mHealth/eHealth solutions compared to in-person or more resource-intensive interventions.

Challenges and future directions

Engagement and adherence

Engagement and adherence have been shown to be important for weight loss in mHealth/eHealth interventions [98, 99]; however, studies do not consistently measure and report adherence data, and when reported, adherence rates are often suboptimal [75] and decline over time [68, 83]. To elucidate the potential of technology, measures of engagement such as how often a participant logs onto the study website or when a study text message is read could enable stronger conclusions about technology efficacy. Research is needed to understand how to increase motivation to engage in technology-delivered interventions, particularly for self-guided interventions [100]. Identification of technology-based strategies to increase and sustain engagement in treatment is an important future direction. Researchers should strive to creatively integrate features of gamification, entertainment, relaxation, and social connectedness into digital interventions to promote engagement. Finally, greater attention to targeted and patient-centered designs can inform engagement approaches.

Long-term efficacy

Results highlight the limited long-term efficacy of technology-based interventions for pediatric weight management. Of the 19 RCTs in this review that included a follow-up assessment, only four (21%) reported significant intervention effects at long-term follow-up, most (n = 3) of which used technology adjunctively and included an active comparator. Future research should focus on identifying strategies to extend the benefits of treatment beyond the intervention period, including risk factors for behavioral “relapse” and mechanisms that facilitate long-term treatment response. Greater attention to the environmental, social, cultural, and psychological context in which interventions are delivered can also provide critical insights [101]. Active (e.g., ecological momentary assessment) and passive digital data collection (e.g., wearable sensors, global positioning systems in mobile devices) combined with advanced analytic approaches (e.g., machine learning), can enhance the prediction of important microtemporal mechanisms that may partially explain the limited long-term efficacy of interventions. Ultimately, these discoveries could inform the tailoring of interventions that provide personalized, adaptive approaches to support youth in long-term weight maintenance [102].

Recruitment and attrition

Numerous studies in this review reported challenges with recruiting adequate sample sizes [24, 79] and were therefore underpowered to detect intervention effects. Research is needed to understand how to address the challenges involved in recruiting pediatric populations [103]. Attrition is also a noteworthy challenge highlighted in this review [18, 23, 43, 83], with some studies reporting attrition as high as 70% [45]. This challenge is not exclusive to mHealth/eHealth interventions and was reported across study conditions. Prior meta-analytic findings from four pediatric obesity mHealth interventions suggest that mHealth interventions may reduce drop-out rates [13]. Future research should examine correlates of drop-out, including non-response.

This review also suggests the relative nascency of other novel technologies in pediatric weight management interventions. For example, there is increasing evidence for the potential of virtual and augmented reality in changing health behaviors [104] and preliminary evidence for improving children’s health outcomes [105]. With the increasing accessibility and popularity of these types of media, this medium may be ripe for translation of treatment. Similarly, interventions that are informed by machine learning algorithms that tailor aspects of the intervention to an individual based on the variability of measured factors of the individual over time may prove to be important next steps or useful tools for future interventions [106]. Only one article in this review used a computerized algorithm to inform treatment decisions [66], although a recent review suggests that computer decision support tools may be useful in the management of pediatric obesity [107].

Strengths and limitations

The rigorous methodology following PRISMA guidelines and the collaboration with a medical librarian to inform the search strategy are strengths of this study. The trials in this review all employed a randomized controlled design, which allowed for examination of intervention efficacy on weight outcomes across studies. Included studies were of moderate quality overall. We were able to synthesize findings across studies through subgroup analyses examining type of technology, delivery target, comparator condition, and technology role (i.e., whether technology is useful as an adjunct to more traditional treatment or whether interventions delivered solely through technology are superior), allowing for a more nuanced understanding of the various mHealth/eHealth intervention designs. Research should further explore these subgroups.

Technology use in weight management interventions can greatly increase the scalability of treatment, both in terms of increased reach of interventions and reduced cost of delivery. Only evaluating the effect of an intervention on weight outcomes may not adequately convey the value of technology-assisted solutions in terms of their increased scalability as well as their effect on other health-related outcomes [108]. Indeed, previous research has shown that technology-based interventions are highly feasible and acceptable among pediatric populations and their families [6] and significantly impact weight-related health behaviors, self-monitoring behavior, and psychosocial functioning [5, 6, 12]. An assessment of the psychological and behavioral impact of technology-based interventions was beyond the scope of this review; however, future evaluations of treatment efficacy should consider these important outcomes. Cost-benefit analyses would also enable greater understanding of the utility of technology-based solutions in pediatric weight management.

Due to the relatively low number of studies that reported long-term outcomes, we were unable to examine moderators of the long-term efficacy of technology-assisted interventions for pediatric weight management. Heterogeneity in intervention content, dose of treatment, and study design including comparator type prevented conclusions regarding the efficacy of these components.

As the COVID-19 pandemic catalyzes health care systems and ongoing interventions to shift rapidly from in-person appointments to telemedicine and digital tools [109], evaluation of the non-inferiority of digital solutions and identification of specific disadvantages of one modality versus another may be particularly important inquiries. Moreover, results should be interpreted with a lens toward scalability, cost-benefit, and population impact: small effects may still provide meaningful population-level impact with fewer resources than traditional in-person treatment. An average reduction of 0.07 BMIz units (compared to 0.04 BMIz units in the comparator arms) was observed among the 20 treatment RCTs in this review that measured BMIz as an outcome; this may not meet clinically significant thresholds, yet may still offer lifetime medical cost savings [110] and temper the projected impact of the COVID-19 pandemic on child weight gain [111]. The ubiquity of technology provides a unique opportunity to increase access to care in ways that may be less invasive, burdensome, or costly; further, technological solutions can reach populations that may not otherwise have access to necessary evidence-based treatment.

Conclusion

This review suggests that mHealth/eHealth interventions for pediatric obesity are viable solutions that may be effective at promoting short- and long-term decreases in adiposity outcomes, but evidence is inconclusive regarding the efficacy for prevention of pediatric overweight or obesity. Research should utilize novel study designs [97] and harness technology in innovative ways to address challenges in pediatric weight management interventions, such as adherence, engagement, attrition, and long-term efficacy. More research is needed to determine the comparative effectiveness of technology-based interventions to efficacious gold-standard interventions and elucidate the potential for mHealth/eHealth to increase scalability and reduce costs while also providing significant public health impact.