Dose-response-relationship of stabilisation exercises in patients with chronic non-specific low back pain: a systematic review with meta-regression

Stabilisation exercise (SE) is an evidence-based treatment for chronic non-specific low back pain (LBP). However, the optimal dose-response-relationship for maximal treatment success is still unknown. The purpose of this study was to systematically review the dose-response-relationship of stabilisation exercises on pain and disability in patients with chronic non-specific LBP. A systematic review with meta-regression was conducted (PubMed, Web of Knowledge, Cochrane). Eligibility criteria were RCTs on patients with chronic non-specific LBP, written in English or German, that adopted a longitudinal core-specific/stabilising/motor control exercise intervention and reported at least one outcome for pain intensity and/or disability. Meta-regressions (dependent variable = effect sizes (Cohen's d) of the interventions for pain and for disability; independent variables = training characteristics (duration, frequency, time per session)), controlled for (low) study quality (PEDro) and (low) sample sizes (n), were conducted to reveal the optimal dose required for therapy success. From the 3,415 studies initially identified, 50 studies (n = 2,786 LBP patients) were included; n = 1,239 patients received SE. Training duration was 7.0 ± 3.3 weeks, training frequency was 3.1 ± 1.8 sessions per week, and the mean training time was 44.6 ± 18.0 min per session. The meta-regressions' mean effect size was d = 1.80 (pain) and d = 1.70 (disability). Total R² was 0.445 and 0.17. Moderate quality evidence (R² = 0.231) revealed that a training duration of 20 to 30 min per session elicited the largest effect (both in pain and disability, logarithmic association). Low quality evidence (R² = 0.125) revealed that training 3 to 5 times per week led to the largest effect of SE in patients with chronic non-specific LBP (inverted U-shaped association).
In patients with non-specific chronic LBP, stabilization exercise with a training frequency of 3 to 5 times per week (Grade C) and a training time of 20 to 30 min per session (Grade A) elicited the largest effect on pain and disability.

Data extraction. The common effect estimators for pain intensity and disability were retrieved from each study. The intervention group baseline-to-post effect sizes (Cohen's d) were calculated as the change in mean values from baseline to post-intervention assessment divided by the baseline standard deviation for the respective scale. All data of interest were retrieved from the individual study data; for this purpose, a data extraction form designed for this review was used. Data on training dose and frequency were retrieved according to the TIDieR checklist. One researcher recorded all the pertinent data from the included articles and the other author independently reviewed the extracted data for relevance, accuracy and comprehensiveness. Disparities were resolved by consensus; if necessary, a third reviewer (N.N.) was consulted. Authors of included studies who had not reported sufficient details in the published manuscript were contacted by e-mail with a request for further data. The effect estimators for pain intensity and disability were calculated using either the visual analogue scale (VAS), the numeric rating scale (NRS) or the sum score inherent to the scale/assessment tool (0-10, 0-24 or 0-100), as the calculation of standardised mean differences is scale independent. For such data, only the direction (lower values mean less pain, less disability) was normalised. For scale-dependent calculations (inverse weighting, calculated as sample size divided by the squared standard deviation of the baseline-to-post difference), z-transformed (0-10) variables were used. Missing standard deviations for the differences were imputed according to the procedure described by Follmann et al. 14.
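The two calculations described in this paragraph can be sketched in a few lines of Python. This is a minimal illustration, not the authors' extraction form or syntax: the function names and the example numbers are hypothetical, and the sign convention (baseline minus post, so that an improvement is positive) is an assumption based on the direction normalisation mentioned above.

```python
def cohens_d_pre_post(mean_pre, mean_post, sd_pre):
    """Baseline-to-post effect size: mean change divided by the baseline SD."""
    if sd_pre <= 0:
        raise ValueError("baseline SD must be positive")
    # Assumed sign convention: lower scores mean less pain/disability,
    # so baseline minus post is positive when the patients improve.
    return (mean_pre - mean_post) / sd_pre


def inverse_weight(n, sd_change):
    """Weight as described in the text: sample size / squared SD of the change."""
    if sd_change <= 0:
        raise ValueError("SD of the change must be positive")
    return n / sd_change ** 2


# Hypothetical values on a 0-10 pain scale (not taken from any included trial)
d = cohens_d_pre_post(mean_pre=6.0, mean_post=3.5, sd_pre=2.0)
w = inverse_weight(n=25, sd_change=2.5)
print(d, w)
```

Because d is expressed in baseline standard deviation units, the same function applies to 0-10, 0-24 and 0-100 scales, whereas the weight depends on the scale and therefore needs the rescaled values mentioned in the text.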
Study quality assessment. The Physiotherapy Evidence Database (PEDro; 11 criteria) scale was used to assess the methodological quality of all trials included. The PEDro scale is a valid and reliable tool to rate the internal study validity and methodological quality of controlled studies 15. If available, the validated rating scores of the articles were taken directly from the PEDro database (website; 35 out of 46 articles). If not, both authors evaluated the articles; each criterion was rated as 1 (definitely yes) or 0 (unclear or no), and potential disagreements were discussed between the two authors and resolved. Overall, the scale ranges from 0 (high risk of bias) to 10 (low risk of bias), with a sum score of ≥ 6 representing the cut-off for sufficient study quality. As study quality was considered a potential explanatory factor for the effect size homogeneity, all studies, irrespective of their quality, were analysed.
Risk of bias within the studies. The two review authors (JM and DN) independently rated the risk of bias of the outcomes pain and disability in the included studies using the Cochrane Collaboration's Risk of Bias tool 2 16,17. Studies' outcomes were graded for risk of bias in each of the following domains: sequence generation, allocation concealment, blinding (participants, personnel, and outcome assessment), incomplete outcome data, selective outcome reporting and other sources of bias. For the outcomes, each item was rated as "high risk", "low risk" or "unclear risk" of bias. Again, any disagreements were discussed between the raters. If a decision could not be reached after discussion, a third reviewer (N.N.) was included to resolve the conflict. As the risk of bias was (indirectly, via the PEDro sum score) considered a potential explanatory factor for the effect size homogeneity, all studies, irrespective of their risk of bias, were analysed in the meta-regressions.

Risk of bias across the studies.
The risk of publication bias across all the studies was assessed using funnel plots/graphs 18.
Data processing and statistical analysis. Data were initially plotted using scatterplot diagrams. The type of association between each independent and dependent variable was determined visually. In the case of a linear association, data were processed as real values; if a curvilinear association was determined, data were re-calculated using logarithmic transformations (logarithmic associations) or Taylor series (U-shaped associations) to provide linearity for the regression calculation.
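The linearising step can be illustrated as follows. This is a sketch under stated assumptions: the text only names "logarithmic transformations" and "Taylor series", so the use of the natural logarithm and of a centred second-order (squared) term for U-shaped associations is an interpretation, and the function names are hypothetical.

```python
import math


def log_transform(values):
    """Natural-log transform of a predictor (e.g. minutes per session)."""
    return [math.log(v) for v in values]


def quadratic_terms(values):
    """Centre a predictor and return its linear and squared terms.

    Entering both terms into a linear regression allows an (inverted)
    U-shaped association to be fitted with an otherwise linear model.
    """
    mean_v = sum(values) / len(values)
    centred = [v - mean_v for v in values]
    squared = [c ** 2 for c in centred]
    return centred, squared


# Hypothetical training characteristics (illustration only)
minutes_per_session = [20, 30, 45, 60]
sessions_per_week = [1, 2, 3, 5]

log_minutes = log_transform(minutes_per_session)
freq_linear, freq_squared = quadratic_terms(sessions_per_week)
```

Centring before squaring is a design choice that reduces the correlation between the linear and the squared term; it does not change the fitted curve.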
Sensitivity meta-regressions for dose-response analyses and the impact of study quality were conducted as described in Niederer & Mueller (2020) 6. A syntax for SPSS (IBM SPSS 23; IBM, USA) was used (David B. Wilson; Meta-Analysis Modified Weighted Multiple Regression; MATRIX procedure Version 2005.05.23). Inverse variance weighted regression models with random intercepts (random effect model, fixed slopes model) were applied, with the pain intensity and disability effects (simple pre-post Cohen's ds) as dependent variables and the following independent variables: intervention duration [weeks, U-shaped], intervention frequency [number of trainings/week, U-shaped], session duration [minutes, logarithmised] and intervention total dose [minutes]. The sample size (SE group) and the study quality PEDro sum score [points, linear] were considered as co-factors. Homogeneity analysis (Q- and p-values) and meta-regression partial coefficients B (95% confidence intervals and p-values) were calculated. All statistical analyses were tested against a 5% alpha-error probability level.
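To make the idea of an inverse variance weighted regression concrete, the following Python sketch fits a single-predictor weighted least squares model and splits the weighted sum of squares into a model and a residual part. It is a simplified stand-in, not the SPSS macro used in the review: it has one predictor only, it omits the random-effects variance component, and reporting R² as Q_model / Q_total is one common convention assumed here. All data values are hypothetical.

```python
import math


def weighted_regression(x, y, w):
    """Weighted least squares for y = b0 + b1 * x with study weights w."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Weighted sums of squares: Q_total = Q_model + Q_residual
    q_total = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    q_model = slope * sxy
    return {
        "intercept": intercept,
        "slope": slope,
        "q_model": q_model,
        "q_residual": q_total - q_model,
        "r2": q_model / q_total,
    }


# Hypothetical studies: minutes per session, pre-post Cohen's d, weight
minutes = [20, 30, 45, 60]
effect_sizes = [1.9, 2.1, 1.6, 1.4]
weights = [4.0, 6.0, 5.0, 3.0]

result = weighted_regression([math.log(m) for m in minutes], effect_sizes, weights)
print(result)
```

In a full analysis, co-factors such as the PEDro sum score and the sample size would enter as additional predictors, which requires a multiple-regression solver rather than the closed-form expressions used here.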
Effect estimators' level of evidence. The quality of the evidence revealed by the meta-analyses was graded using the tool established by the GRADE working group 19 as "very low" (the estimate of effect is very uncertain), "low" (further research is likely to change the estimate), "moderate" (further research may change the estimate) or "high" (further research is very unlikely to change the estimate of effect) (plus interim values). The grading starts with the type of evidence (RCT = high, observational = low, all other study types = very low) and is decreased or increased based on study limitations, inconsistencies, uncertainty about directness, imprecise data, reporting bias (decreasing items), or strong associations, dose-response findings, and confounder plausibility (increasing items) 19.
Recommendations were derived using a clinical guideline development tool 20. Overall, four key factors were applied to determine the strength of the recommendations: (1) the balance between desirable and undesirable effects (larger differences between desirable and undesirable effects lead to stronger recommendations), (2) the quality of the available evidence, (3) values and preferences (higher variations lead to weaker recommendations) and (4) costs (higher costs lead to weaker recommendations). More comprehensive details can be found in 21.

Results
Study selection. The database search was completed in March 2020. Figure 1 shows the study selection process.
Study characteristics and individual studies' results. Fifty (50) studies were included in the qualitative and in the quantitative analyses. Study characteristics and the main results are displayed in Table 2. For each of the studies included, methodological aspects, participants' characteristics and key results are presented. Overall, 2,786 participants, of whom n = 1,239 were stabilisation exercise participants, were included in the analysis. All included studies adopted a randomised controlled design (RCT). The main inclusion criterion was (chronic) non-specific low back pain lasting ≥ 4 weeks 22, ≥ 6 weeks 23, ≥ 7 weeks 24, ≥ 8 weeks 25-27, ≥ 12 weeks 28-55 or ≥ 24 weeks 56-58, or a ≥ 2 year history 59, whilst in 11 studies 60-70 this information was not presented. The baseline pain and the effect sizes (Cohen's d, stabilisation exercise group only) for pain and disability are presented in Table 3.
Study quality and risk of bias within studies. Both the study quality and risk of bias ratings are presented in Table 2. The overall study quality ranged from 3/10 to 9/10 points, with a mean of 5.7 ± 1.4 points on the PEDro scale.
More detailed information on the meta-regressions is depicted in Fig. 2. The training period showed no systematic impact on the effect size for pain intensity (Fig. 2A). Training frequency showed an inverted U-shaped association with the effect size (13% variance explanation; Fig. 2B), and training duration (time per session) showed a logarithmic association with the pain effect size (23% variance explanation; Fig. 2C). Low quality evidence suggested that training 3 to 5 times per week leads to the largest effect of stabilisation exercise in chronic, non-specific low back pain patients. Quality of evidence was downgraded due to risk of bias (−1), imprecise data (wide confidence intervals, −1) and (some) uncertainty about directness (−1), and upgraded due to the dose-response-relationship (+1).
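The grading arithmetic for the training frequency finding can be traced with a short Python sketch. It assumes that each adjustment moves the rating by exactly one full level and ignores the interim values mentioned in the methods; the function name and the level list are illustrative, not part of the GRADE tool itself.

```python
LEVELS = ["very low", "low", "moderate", "high"]


def grade(start, adjustments):
    """Apply GRADE downgrades/upgrades to a starting level, clamped to the scale."""
    index = LEVELS.index(start) + sum(adjustments)
    index = max(0, min(len(LEVELS) - 1, index))
    return LEVELS[index]


# RCT evidence starts at "high"; adjustments as listed in the text:
# risk of bias (-1), imprecise data (-1), directness (-1), dose-response (+1)
frequency_rating = grade("high", [-1, -1, -1, +1])
print(frequency_rating)  # low
```

Starting at "high" with a net adjustment of two levels downwards gives "low", which matches the rating reported for training 3 to 5 times per week.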

Risk of bias across studies.
The risk of bias across studies (publication bias) is illustrated by the funnel plot in Fig. 3, which reveals an unclear, but rather low, risk of publication bias.

Discussion
This systematic review with meta-regression examined the dose-response-relationship of stabilisation exercise interventions in chronic, non-specific low back pain patients and, thus, derived recommendations for the stabilisation exercises' training characteristics in this special cohort.
Summary of main results. The main findings of the presented meta-regression are that: (1) moderate quality evidence indicates that a training duration of 20 to 30 min per session elicits the largest impact on the effect sizes on both pain and disability of core-specific stabilisation interventions in non-specific chronic low back pain patients, (2) low quality evidence suggests that training 3 to 5 times per week leads to the largest effect of core-specific stabilisation exercise in chronic, non-specific low back pain patients, with an inverted U-shaped association with the effect size, and (3) no systematic impact of the training period (duration of intervention in weeks) on the effect size for pain intensity was found.
Comparison with other evidence. Saragiotto […] sessions per week varied from 1 to 5. This is partly in line with the results of our meta-regressions. Nevertheless, a detailed analysis of the effect of training characteristics on pain reduction is missing in their systematic review 2. The current evidence only supports the use of general and stabilisation exercise (covering sensorimotor, stabilisation and/or core stability) in the therapy of chronic non-specific low back pain 2. Regarding the training period (weeks of intervention), our results showed no systematic impact on the effect size for pain intensity. Taking the current knowledge on the effects of and adaptation to sensorimotor training into account, a duration of about six weeks seems to be both feasible and effective. This is in accordance with our quantitative results (mean duration of 7.0 ± 3.3 weeks). However, future research is required to define evidence-based recommendations on this aspect. Low quality evidence supports an inverted U-shaped association of the training frequency (sessions per week) with the effect size for the improvement of pain and disability in chronic, non-specific low back pain patients. The overall relationship between (the amount of) physical activity and low back pain is considered to be U-shaped. This means that both the absence of exercise and extremely high levels of physical activity (elite sports) may increase the risk of developing (low) back pain. In contrast, a "normal" (medium) level of physical activity shows the lowest risk and, therefore, appears to be protective 2-4,8,9. Our finding of a dose of 3 to 5 sessions per week is consistent with this.
In addition, moderate quality evidence indicates that a training duration of 20 to 30 min per session elicits the largest impact on the effect sizes on pain and disability; this may correspond to the patients' essential need to achieve pain reduction with the minimum effort (time). Nevertheless, this is partly in contrast to the result of van Tulder et al. 4, who concluded that exercise interventions with a high dosage (> 20 h) have the highest effect. Van Tulder et al. 4 do not point out how this dosage should be applied (duration, frequency). Our findings suggest that it may be more effective to reach this dosage with a high-frequency, short-bout type of intervention. One of the main reasons for failed treatment success in exercise therapy is the low adherence of patients to their scheduled therapy 4. Lack of time and long journey times to the therapy centre are commonly cited barriers to regularly participating in therapy sessions 72. Therefore, patients and physiotherapists are constantly searching for the minimum dose that is still effective. Based on our results, we can recommend exercising for more than 2 sessions per week with a minimum of 20 to 30 min per session. Nevertheless, there is still a need for future research on the minimal dosage in the context of stabilisation exercise interventions for chronic, non-specific low back pain patients.
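The relation between a total dosage and its distribution over sessions is simple arithmetic, sketched here in Python. The schedule values are hypothetical and chosen from within the ranges discussed above; the calculation is an illustration, not a recommendation derived from the meta-regressions.

```python
def total_dose_hours(weeks, sessions_per_week, minutes_per_session):
    """Total training dose in hours for a given schedule."""
    return weeks * sessions_per_week * minutes_per_session / 60.0


def weeks_to_reach(target_hours, sessions_per_week, minutes_per_session):
    """Number of weeks needed to accumulate a target dose in hours."""
    return target_hours * 60.0 / (sessions_per_week * minutes_per_session)


# Hypothetical high-frequency, short-bout schedule: 4 sessions/week, 25 min each
weeks_needed = weeks_to_reach(target_hours=20, sessions_per_week=4, minutes_per_session=25)
print(weeks_needed)  # 12.0
```

With this hypothetical schedule, a total dosage of 20 h is reached after 12 weeks, which is longer than the mean intervention duration of the included studies.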
Practical relevance and recommendations. The training-dose and effect-response relationship between core-specific stabilisation exercise interventions and pain reduction or disability improvement in chronic, non-specific low back pain patients is of great interest to policy makers, health insurers and clinicians, as well as to the persons affected. This review provided (low to moderate quality) evidence that a core-specific stabilisation intervention of 3 to 5 times per week, 20 to 30 min per session, has a positive effect on pain reduction and improvement of disability in low back pain patients. In conclusion, we suggest the following graded recommendations:
Grade A recommendation: At the group level, stabilisation exercise is likely to be most effective to treat non-specific low back pain when it is scheduled with a time per session of 20-30 min.
Grade C recommendation: At the group level, stabilisation exercise to treat non-specific low back pain is potentially most helpful when it is scheduled three to five times a week.
Future study. The evidence on more detailed training specifics (training intensity: number of exercises per session, repetitions per exercise, sets per exercise, rest after exercise, etc.) remains unclear. Furthermore, the minimal clinically relevant dosage of core-specific stabilisation interventions in chronic, non-specific low back pain patients remains unclear; this may define a future area of low back pain research, given the societal burden of a consistently high low back pain prevalence across the lifespan.

Limitations
Limitations at the study and outcome levels. A common limitation in exercise trials is the limited possibility to blind the participants. This limitation is increased by the self-reported assessment of pain and pain-related function.
Limitations at the review level. We only screened the databases PubMed (Medline), Web of Knowledge and the Cochrane Library. Considering the topic of our review, almost all manuscripts of interest should be found therein 73-75. However, expanding the search to further databases, such as EMBASE, PEDro, CINAHL, AMED and CENTRAL, might have led to slightly more hits. An advantage of meta-regressions is, inter alia, that the interventional effect sizes are compared to each other to find a dose-response-relationship; the effect sizes are thus expressed relative to each other. The estimates found are valid for the isolated intervention group effect comparisons given by the meta-regression. The mean effects are, by the nature of the meta-regression, absolute and not in comparison to a control/comparator. The mean effect sizes (refer to the study description and meta-regressions) are thus not directly comparable to those found in meta-analyses where the effects are calculated in comparison to a control/comparator group.
The funnel plot analysis revealed an unclear, but rather low, risk of publication bias within our review. The findings of our (retrospective) meta-regression should be confirmed prospectively, ideally by adopting a prospective meta-analysis.
Sensitivity of the interventions' name. The interventions of the studies included in our meta-analysis are defined as stabilisation exercise. Motor control exercises are classically defined as core-specific dynamic stabilisation exercises with an a priori education on deep trunk muscle activation and/or the control of deep muscle activation during exercising. We only included studies with dynamic/exercise parts. When solely stabilisation exercises without pre-conditioning are performed, they are often called "coordination", "stabilisation" 5, "sensorimotor" 76 or even "motor control" 2 exercise. As described above, the term "motor control exercise" may be slightly too sensitive for the interventions included in our review. In contrast, the terms "sensorimotor", "coordination" and "stabilisation" training/exercise may be too general. Consequently, we name the intervention "stabilisation exercise" to highlight that the stabilisation/active/dynamic parts of the originally described "motor control exercise" theorem are adopted. Nevertheless, the intervention could also be called "motor control stabilisation exercise" or "sensorimotor exercise".
Table 5. Outcomes of the sensitivity meta-regressions. For each single analysis, effect sizes, number of included effect sizes, homogeneity, the regression coefficient B, its confidence interval (CI) and the corresponding p-value are displayed. Legend: LL, lower level; UL, upper level.

Conclusions
A training frequency of 3 to 5 times per week (low quality evidence) with a training duration of 20 to 30 min (moderate quality evidence) per session causes the largest impact on the effect sizes (both in pain and disability) of stabilisation exercise in low back pain patients. However, the training period showed no systematic impact on the effect size for pain intensity. Future work is required to enhance the quality of the evidence of our findings, possibly focussing on the definition of a minimum dosage.