Parkinson’s disease patients benefit from bicycling - a systematic review and meta-analysis

Many Parkinson’s disease (PD) patients are able to ride a bicycle despite being severely compromised by gait disturbances up to freezing of gait. This review [PROSPERO CRD42019137386] aimed to find out which PD-related symptoms improve from bicycling, and which type of bicycling exercise would be most beneficial. Following a systematic database literature search, peer-reviewed randomized controlled trials (RCT) and non-randomized trials (NRCT) investigating the interventional effects of bicycling on PD patients were included. A quality analysis addressing reporting, design and possible bias of the studies, as well as a publication bias test, was performed. Out of 202 references, 22 eligible studies with 505 patients were analysed. An inverse variance-based analysis revealed that primary measures, defined as motor outcomes, benefitted from bicycling significantly more than cognitive measures. Additionally, secondary measures of balance, walking speed and capacity, and the PDQ-39 ratings improved with bicycling. The interventions varied in durations, intensities and target cadences. In conclusion, bicycling is particularly beneficial for the motor performance of PD patients, improving crucial features of gait. Furthermore, our findings suggest that bicycling improves the overall quality-of-life of PD patients.


INTRODUCTION
Goal-directed physical exercise and general physical activity have been demonstrated to alleviate both motor and cognitive symptoms in Parkinson's disease (PD) in addition to the standard pharmaceutical and surgical treatments 1,2. Among the diversity of physical exercise forms, the need to develop goal-based rehabilitation has been highlighted 1, as thus far there has not been sufficient knowledge to target or customize adjuvant forms of exercise to patients' individual needs 3. Therefore, a more thorough understanding of the efficacy of different forms of exercise such as bicycling is important, as new forms and technologies around exercise therapies are being increasingly established [4][5][6].
Exercise-based training in particular can be targeted to enhance functional mobility by utilizing enhanced strength, endurance, balance and flexibility to support efficient performance of specific tasks 3. While there is no conclusive evidence that exercise would terminate disease progression, it can be considered disease-modifying when underlying pathological or pathophysiological disease processes are delayed and accompanied by improvement in clinical signs and symptoms 3. In addition to studies with patients, work on animal models indicates that physical activity can have neuroprotective effects on the brain by enhancing neuroplasticity and reinforcing structural and morphological changes, leading to the attenuation of age-related cognitive decline 7.
In 2010 it was reported that some individuals diagnosed with PD, while exhibiting severe freezing of gait (FOG), were nevertheless able to ride a bicycle easily 8,9. Since then, the preserved ability to ride a bicycle in patients otherwise severely limited by the symptoms of PD has been shown to positively influence cardiovascular fitness, motor skills, overall coping, feeling of independence, social inclusion and cognitive skills [10][11][12][13].
To investigate the current state of the art, we conducted a meta-analysis with a review of the literature on bicycling as an adjuvant form of exercise for PD. The goal of this review is to provide a characterization of the status of bicycling exercise regimens for PD patients and to identify features that need further research. In reference to the Patient, Intervention, Comparison, Outcome framework (PICO) 14, this review aims to quantify whether PD patients (P) benefit from a bicycling intervention (I) compared with pre- and post-measures of the same population, or compared with the outcomes of an alternative exercise intervention or standard treatment given to another PD group (C). The relevant outcomes are measured as changes in physical and cognitive measures, and in quality-of-life (O).
the intervention was only imagined, or only performed using virtual reality without an actual ergometer. No limitations were set with respect to the control treatment, as long as the treatment did not involve bicycling.
Reviews and meta-analyses were excluded. Furthermore, studies were excluded if their primary outcomes measured neurophysiological or metabolic activity. While randomized controlled trials (RCT) have been the gold standard of empirical testing 15, and the preferred study design to be included in a meta-analysis, non-randomized trials (NRCT) were also included here. The inclusion of NRCT studies was considered justified as they can be the main source of evidence for several intended effects of interventions 16. Including NRCT studies is increasingly common and encouraged, in particular among non-pharmacological studies investigating the effectiveness of therapeutic interventions [16][17][18][19].

Search strategy
The PubMed database for biomedical literature was searched for studies published between January 2010 and February 2020. This time frame was chosen because the observation of preserved bicycling ability in a PD patient with severe FOG was first reported in 2010 8. The keywords 'bicycl*', 'cycling; bik*' and 'Parkinson, bicycl*' as a MeSH term were used.

Study selection and data extraction
The study screening was done independently by two reviewers, M.T. and B.U.W., using the Covidence systematic review software 20. First, the imported articles were screened based on title and abstract, then based on the full text. Any conflicts on inclusion were resolved by the third reviewer, S.S.D. Upon inclusion, the qualitative and quantitative information about each study was extracted into three different tables: Intervention characteristics: Bicycle type, cadence (the number of pedal revolutions in a minute, usually measured as revolutions per minute, RPM), treatment session duration, overall treatment duration, exercise intensity in heart rate and perceived exertion.

Meta-analysis
Quality analysis. A versatile checklist developed for evaluating primary research papers, the Standard Quality Assessment Criteria tool (QualSyst), was used to estimate the quality of the included studies 21. The QualSyst tool was deemed appropriate as it was developed in particular for meta-analyses including both RCT and NRCT studies, addressing the overall quality of the studies with a 14-item checklist concerning the internal validity of the studies, possible bias, as well as the quality of the reporting. The reviewers M.T. and B.U.W. assessed the included studies independently, answering each checklist question with Yes, Partial, No or Not applicable. This enabled the computation of a score through the QualSyst tool, assessing the overall quality of a study. The quality assessment was not used to set cut-offs for study inclusion; rather, it was used as additional information about the overall quality of the included studies. The final score was the average of the scores from the two independent reviewers. A two-sample F-test for the variance of the scores, as well as a paired t-test for a difference in the ratings given by the reviewers, were conducted.
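The scoring step described above can be illustrated with a minimal Python sketch. The point values (Yes = 2, Partial = 1, No = 0, with Not applicable items removed from the denominator) follow the published QualSyst manual as we understand it and are not stated in this text; the ratings shown are hypothetical and do not correspond to any included study.

```python
from statistics import mean

# Assumed point values per answer; "na" items are dropped from the denominator.
POINTS = {"yes": 2, "partial": 1, "no": 0}

def qualsyst_score(answers):
    """Summary score = points obtained / maximum points over applicable items."""
    applicable = [a.lower() for a in answers if a.lower() != "na"]
    if not applicable:
        raise ValueError("no applicable checklist items")
    return sum(POINTS[a] for a in applicable) / (2 * len(applicable))

def study_score(reviewer_a, reviewer_b):
    """Final score of a study: average of the two independent reviewers."""
    return mean([qualsyst_score(reviewer_a), qualsyst_score(reviewer_b)])

# Hypothetical ratings of one study on the 14-item checklist.
reviewer_a = ["yes"] * 10 + ["partial"] * 2 + ["no", "na"]
reviewer_b = ["yes"] * 9 + ["partial"] * 3 + ["no", "na"]

score_a = qualsyst_score(reviewer_a)
score_b = qualsyst_score(reviewer_b)
final_score = study_score(reviewer_a, reviewer_b)
print(round(score_a, 2), round(score_b, 2), round(final_score, 2))
```

A score of 1.0 is the best possible rating, which is reached only when every applicable item is answered with Yes.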
Publication bias. To address whether the included literature might have been subject to publication bias, the small sample bias method was applied for the primary outcome measures to test for the presence of a possible bias 22 . The method is based on the assumption that studies with high effect sizes will most probably get published, while studies with low effect sizes will not 23 . The risk of non-significant and small effect sizes is particularly high for studies with small sample sizes. This would mean that the sample of the included studies could show a lack of small studies featuring very small or negative effect sizes, while still including small studies with larger effect sizes and stronger statistical significances 24 .
Effect size of the treatment. The analysis of the treatment efficacy was based on the generic inverse-variance method, which uses the effect size and a weighted measure of variance for each study to calculate the pooled effect size describing the overall effect. The weight given to each study is the inverse of its variance. Here, the effect sizes were based on continuous outcome data, and they were pooled using a random-effects model 24, assuming that not all studies come from the same population. The Hedges' bias-corrected standardized mean difference (g) was used to calculate the effect size. To calculate the between-study variance, tau², the Sidik-Jonkman method (SJ) was chosen.
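The pooling described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code used for the analysis, which was run with R packages: the variance formula for g is the common large-sample approximation for two independent groups (a pre/post design would additionally need the pre-post correlation), the Sidik-Jonkman estimator is written in its simple non-iterative form, and all function names and input values are hypothetical.

```python
import math

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Bias-corrected standardized mean difference and its variance
    for two independent groups (large-sample approximation)."""
    df = n1 + n2 - 2
    sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    d = (m1 - m2) / sd_pooled
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    return j * d, j**2 * var_d

def tau2_sidik_jonkman(effects, variances):
    """Non-iterative Sidik-Jonkman estimate of the between-study variance."""
    k = len(effects)
    mean_unweighted = sum(effects) / k
    tau2_0 = sum((y - mean_unweighted) ** 2 for y in effects) / k
    if tau2_0 == 0:
        return 0.0  # identical effects: no between-study variance
    q = [v / tau2_0 + 1 for v in variances]
    mu_q = sum(y / qi for y, qi in zip(effects, q)) / sum(1 / qi for qi in q)
    return sum((y - mu_q) ** 2 / qi for y, qi in zip(effects, q)) / (k - 1)

def pool_random_effects(effects, variances):
    """Inverse-variance pooling with weights 1 / (v_i + tau^2)."""
    tau2 = tau2_sidik_jonkman(effects, variances)
    weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return pooled, se, tau2

# Hypothetical post-treatment summary data (treatment vs. control).
g, var_g = hedges_g(10.0, 2.0, 20, 9.0, 2.0, 20)
# Hypothetical per-study effect sizes and variances.
pooled, se, tau2 = pool_random_effects([0.1, 0.5, 0.9], [0.04, 0.04, 0.04])
print(round(g, 3), round(pooled, 3), round(tau2, 3))
```

Adding tau² to each study's variance flattens the weights, so small studies count relatively more than they would under a fixed-effect model.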
The analysis was conducted with R 25 and RStudio 26, using the meta 22 and metafor 27 packages, which were developed for meta-analyses. For the (R)CT studies, the individual effect size and the variance were calculated for the post-measures of the treatment and the control group. Pseudo-randomized studies, and studies that applied the same inclusion criteria for the treatment and control group of PD patients, were included in the (R)CT group; thus, the R in the acronym RCT is in parentheses. The rest of the studies consisted of two types: some compared PD patients with healthy participants, and some applied a repeated design, comparing PD patients before and after treatment. As the included studies already contained repeated-measures designs with only PD patients, in which the individual effect size would be calculated for the pre- and post-treatment measures, the outcomes of the rest of the studies were grouped together and analysed in the same manner, and are thus called repeated trials (RT). This means that the outcomes of the healthy participants were excluded, and only the individual effect sizes of the PD patients' pre- and post-treatment measures were compared. Ideally, an analysis comparing healthy participants and PD patients would be based on the comparison of difference measures of both groups. Nevertheless, this approach was not deemed feasible, as a reliable calculation of the variance would have required access to the individual data of each participant.
In case of studies with multiple measuring time points, the earliest and the latest time points were chosen. Also, in the case of studies which had more than two treatment options, cycling was contrasted, if available, with no-treatment or with standard care. In cases where the required information was not reported explicitly enough, the authors were contacted.
Primary and secondary measures. In the initial analysis, all included studies were grouped, and the effect size of the primary measures was tested. A primary measure was defined as it was stated in the corresponding original paper. If there were multiple primary measures or if a primary measure was not named explicitly, the measure that was best aligned with the rest of the outcome measures of the meta-analysis was chosen. In addition to the primary measures, secondary outcomes were also analysed. First, outcome measures that could be defined as functional were tested for their effect size. An outcome was defined as 'functional' if it could be considered to enable general movement and mobility of the body. Thereafter, the secondary outcomes were investigated in more detail, as it was investigated whether bicycling influenced four outcomes of
Sub-level analysis. Four sub-level analyses were conducted, firstly to investigate whether the results from the primary measures differed by design ((R)CT and RT), and secondly by outcome type (motor and cognitive). Furthermore, it was tested whether the results depended on cadence (high and low) and treatment duration (immediate vs. long-term effect). The 'design' level was applied to investigate whether (R)CT and RT studies differ in their effect sizes. The distinction between motor and cognitive outcome measures was applied to test whether either of the outcome types would gain a larger benefit from a bicycling intervention. For the sake of clarity, the term 'motor' is used here, even though it includes several types of physical parameters. Cadence and treatment duration were applied as sub-levels to test whether certain treatment-specific features would indicate better outcomes. Cadence was categorized as low when it was ≤60 RPM, and as high if it was ≥61 RPM. An effect was considered 'immediate' if the treatment was performed only once, and 'long-term' if there was more than one treatment session. All sub-level tests were performed on the primary outcomes.
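The two treatment-specific sub-levels can be expressed as simple grouping rules. The following Python sketch only restates the thresholds given above; the function names are hypothetical, and treating an unreported cadence as missing (so that the study drops out of the cadence comparison) is an assumption about how such cases would be handled.

```python
def cadence_level(rpm):
    """Classify a target cadence: 'low' if <= 60 RPM, 'high' if >= 61 RPM.

    Cadences are assumed to be whole numbers; None marks a study that did
    not report cadence and cannot be assigned to either level.
    """
    if rpm is None:
        return None
    return "low" if rpm <= 60 else "high"

def effect_duration(n_sessions):
    """'immediate' for a single session, 'long-term' for more than one."""
    if n_sessions < 1:
        raise ValueError("a study needs at least one treatment session")
    return "immediate" if n_sessions == 1 else "long-term"

# Hypothetical studies: (target cadence in RPM, number of sessions).
studies = [(60, 1), (80, 24), (None, 12)]
labels = [(cadence_level(rpm), effect_duration(n)) for rpm, n in studies]
print(labels)
```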
Measures of heterogeneity. Due to the different designs and patient groups, as well as the various types of combined primary measures, clinical and statistical heterogeneity could be expected in the pooled effect size 30. In addition to the 95% confidence intervals (CI), the I² statistic was inspected, as it gives a percentage estimate of the variability not caused by sampling error, with an approximate rule for interpreting the result as small (25%), medium (50%) or large (75%) heterogeneity 31. In addition, to investigate between-study heterogeneity, measures were taken to inspect the effect size contribution of individual studies. First, confidence interval (CI)-based outliers were detected using the meta package in R 22. Second, to find out whether the results would change by not including some studies, a sensitivity analysis based on the Leave-One-Out method was chosen. This method reports several measures of between-study heterogeneity to test how much the results would change if each study was left out, one at a time. Finally, studies that were outside the average confidence interval or contributed a particularly high influence were removed from the final pooling of the respective effect.
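To make the I² statistic and the Leave-One-Out idea concrete, the following Python sketch computes Cochran's Q and I² from inverse-variance weights and then recomputes both with each study removed in turn. It is a simplified illustration, not the R routine used in the analysis (which reports further influence measures); the function names and the data are hypothetical.

```python
def cochran_q_i2(effects, variances):
    """Cochran's Q and I² (in percent) from fixed-effect inverse-variance weights."""
    weights = [1 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2

def leave_one_out(effects, variances):
    """Recompute Q and I² with each study omitted once."""
    results = []
    for i in range(len(effects)):
        rest_e = effects[:i] + effects[i + 1:]
        rest_v = variances[:i] + variances[i + 1:]
        q, i2 = cochran_q_i2(rest_e, rest_v)
        results.append((i, q, i2))
    return results

# Hypothetical effect sizes: the last study is a clear outlier.
effects = [0.2, 0.3, 0.25, 1.5]
variances = [0.04, 0.04, 0.04, 0.04]

q_all, i2_all = cochran_q_i2(effects, variances)
loo = leave_one_out(effects, variances)
most_influential = min(loo, key=lambda row: row[2])[0]
print(round(i2_all, 1), most_influential)
```

In this toy example, omitting the outlying study brings I² down to zero, which is the pattern that would flag a study as a main contributor to heterogeneity.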

Study characteristics
The Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) diagram represents the literature search (Fig. 1) 32. See Table 1 below, and Supplementary Tables 5 and 6 for further details on each study.

Bicycling intervention characteristics
The treatment duration varied between 1 and 12 weeks with an average of 5.
Table notes: The column 'PD medication ON/OFF' indicates whether the participants were instructed to keep their medication as prescribed, or whether withdrawal was requested. The column 'Testing time point' indicates how many hours prior to testing the last PD medication was taken. The last column, 'LEDD', specifies the dose of the medication taken, measured as the levodopa equivalent daily dose in milligrams. ** Dose reported, but due to variability in the given range, LEDD cannot be calculated reliably. NA means that the specific data were not available.
cadence (revolutions per minute, rpm), one reported it to be on a 'comfortable' level, while the remaining 18 reported the cadence. The cadence was binned in groups of 10 rpm, starting at 40-50 rpm and going up to 80-90 rpm. Nine out of the 18 studies reporting the cadence aimed at 70-80 rpm or at 80-90 rpm. An assisted bicycling intervention was reported by 41% of the studies, while 59% reported the intervention to have been non-assisted.

Quality analysis
The F-test for the equality of the variances of the quality scores of the two reviewers revealed no significant difference in variance, F(1, 21) = 1.2, p = 0.34. The subsequent paired t-test likewise confirmed that the quality scores did not differ significantly, t(42) = 1.0, p = 0.33. The given scores ranged from 0.64 to 0.96, the best possible score being 1.0. The average score across all studies was 0.81 with a standard deviation of 0.08, and a median of 0.8. For further details on the quality analysis, see Supplementary Tables 1 and 2.

Publication bias
No significant publication bias was detected in the included studies, as indicated by Egger's statistical test of the intercept for funnel plot asymmetry (Intercept = 1.88; CI [−0.27, 4.04], t = 1.71, p = 0.10). In the case of publication bias, it could be expected that there are asymmetrically distributed small studies located low on the y-axis (high SE) and high on the x-axis (high ES) (Fig. 2).
Fig. 2 (funnel plot). Studies with a smaller sample size are expected to have a higher standard error, and are therefore expected to be located at the lower end of the y-axis, while the studies with more participants are expected to show a lower standard error, and thus be at the upper end of the y-axis. SE = Standard error; Hedges' g = the Hedges' bias-corrected standardized mean difference, which was used to calculate the effect size.
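For readers unfamiliar with the intercept test, the following Python sketch shows the classical form of Egger's regression: each study's standardized effect (g divided by its SE) is regressed on its precision (1/SE), and an intercept that differs from zero points to funnel plot asymmetry. This is a plain least-squares illustration with hypothetical data; it is not the R routine used for the reported result, and it omits the p-value, which would come from a t-distribution with k − 2 degrees of freedom.

```python
import math

def egger_regression(effects, std_errors):
    """Regress g/SE on 1/SE; return (intercept, slope, SE of intercept, t)."""
    x = [1 / se for se in std_errors]                    # precision
    y = [g / se for g, se in zip(effects, std_errors)]   # standardized effect
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = residual_ss / (n - 2)
    se_intercept = math.sqrt(s2 * (1 / n + mean_x ** 2 / sxx))
    t = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, slope, se_intercept, t

# Hypothetical data in which smaller studies (larger SE) show larger effects.
std_errors = [0.1, 0.2, 0.25, 0.4, 0.5]
effects = [0.2 + 2 * se for se in std_errors]

intercept, slope, se_intercept, t = egger_regression(effects, std_errors)
print(round(intercept, 3), round(slope, 3))
```

In this constructed example the effect grows with the standard error, so the intercept is clearly positive; in a symmetric funnel it would lie close to zero.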

Primary measures
contributing to high heterogeneity and effect size. The latter was also marked by the Leave-One-Out sensitivity method as a contributor to a high I². Thus, three studies were removed from the final effect size pooling of the primary measures. After this procedure, the overall effect size is smaller, but there is also no substantial heterogeneity in the results (k = 19; SMD = 0.35; 95% CI [0.21, 0.48], t = 5.47, p < 0.001, I² = 0.0%) (Fig. 3). Five studies were excluded as they did not report cadence.

Secondary measures
The forest plots A-E below (Fig. 4) depict the significant secondary outcome measures. The non-significant results of the secondary measures are presented in Supplementary Tables 3 and 4.

DISCUSSION
The present work highlights the beneficial effects of bicycling for patients suffering from Parkinson's disease. Outcomes measuring motor parameters improved more from bicycling intervention when compared to the outcomes assessing cognitive performance. Also, when outcomes were grouped based on functionality across primary and secondary measures, a medium-sized improvement was demonstrated. We cannot address whether bicycling is best applied as a goal-oriented form of exercise, or whether it is beneficial also in the form of general physical activity. Nevertheless, it was indicated that interventions that are implemented more than once lead to better outcomes, thus demonstrating that longer-term regimens should be preferred over one-time sessions aiming at immediate effects when designing bicycling interventions. Overall, it is clear that bicycling improves motor outcomes in PD, and perhaps to a lesser extent, cognitive outcomes. The effect size (SMD 0.43) based on the total score of PDQ-39 is an encouraging indication about the benefits of bicycling going beyond solely physical improvement, thus benefitting coping in daily life, and the self-rated overall quality-of-life. The PDQ-39 questionnaire assesses the overall self-reported quality-of-life of PD patients and it consists of 8 dimensions in a wide variety of measures related to difficulties in daily living (mobility, activities of daily living, emotional well-being, stigma, social support, cognition, communication and bodily discomfort) 36 . As the outcome measure 'quality-of-life' of this meta-analysis was not significant, it would be beneficial to address the PDQ-39 subscales in order to understand in which aspects of well-being and life-quality the improvement takes place. 
The difference between the PDQ-39 and the quality-of-life measure could possibly be explained by the latter including too wide a variety of aspects, as they range from depression and disabilities to general daily living and well-being. For further details on the measures included in the outcome quality-of-life, please see Supplementary Table 6.
There is previous evidence that moderate- to high-intensity physical exercise is well tolerated by PD patients, leading to better outcomes than low-intensity training 37,38. In this meta-analysis it was addressed whether the primary outcome measures indicated a difference in the effect size depending on whether the cadence was high or low, but no difference in the outcomes was found. However, not all studies reported the cadence, and cadence alone is not a sufficient measure of training intensity. Other measures of intensity, such as heart rate and rate of perceived exertion, were reported rather variably: either not at all, in different units, or they were merely monitored. Thus, drawing further conclusions about the role of intensity is not feasible based on the data at hand. Although some of the studies included here already systematically varied intensity and other exercise-programme-related parameters, more comparative studies are needed to better understand the customizability of bicycling. This is an important notion, as bicycling has the potential of catering to both high- and low-intensity exercise while allowing the customization of the intensity of skeletomuscular activation, and overall mobility, by varying the ratio of cadence and resistance.
Fig. 3 Primary outcome measures. The forest plot demonstrates how each study contributes to the effect size of the primary outcomes. The overall effect is marked with the diamond symbol on the x-axis of the effect size scale. N = Number of patients; SMD = standardized mean difference; SD = standard deviation; g = the Hedges' bias-corrected standardized mean difference, which was used to calculate the effect size; CI = 95% confidence interval; weight = the weight based on the inverse of the variance given to each study; I² = percentage of variability.
Recent meta-analyses have reported that FOG can benefit from physiotherapy and from physical exercise in general 2,39. Because FOG was not an outcome measure in the studies included here, this meta-analysis cannot provide any information about the influence of bicycling on FOG. Thus, it would be crucial to further investigate the possible benefits of bicycling, in particular for patients suffering from freezing, to better customize a possible bicycling intervention.
Furthermore, many exercise protocols in the reviewed studies implemented a recumbent or a stationary bicycle, meaning that the patients' ability to balance was not challenged as much as it would be on a regular bicycle. Nevertheless, the results demonstrate that balance improved as an outcome of the applied interventions. Thus, when developing technically more advanced forms of bicycling exercises, it might be worth aiming at regimens where balancing is challenged similarly as on a regular bicycle, as this could be expected to benefit the balancing outcome even more. Moreover, it has been reported that balance training reduces fear of falling 40, which is known to be one of the most disabling symptoms in PD. Thus, an improved balance as an outcome of bicycling could be expected to alleviate other life-limiting challenges of PD patients as well 41,42.
Patients' own motivation and possible barriers to exercising are major factors in the success of physical exercise and overall activity 43. Safety and patient preference have been suggested to be decisive factors in whether a clinician promotes treadmill or cycling exercise to a patient 44. From the included studies, no conclusions can be drawn about the subjective ratings of the exercise itself. Thus, for a successful clinical practice there clearly is a need to assess patients' own judgement of the exercise programme, as well as its overall suitability in terms of practical implementation into their daily life. Furthermore, for future studies it would be beneficial to assess any differences in targeting exercise to early, middle or later disease stages 39,44.
The main concern when including NRCT studies is that the baseline measurements of the different groups are not equal due to a lack of randomization in the treatment allocation, or due to differences in experimental designs, thus possibly leading to biased results 45. To observe and minimize any possible bias, several methodological precautions were taken. Firstly, a thorough and versatile assessment of quality, designed to also include NRCT studies, was applied. Furthermore, a random-effects model was chosen over a fixed-effects model to account for the possibly heterogeneous patient population. Also, various measures of heterogeneity and sensitivity were applied to identify any studies contributing to a large heterogeneity. Lastly, the primary outcomes were inspected on the sub-levels of study design to test whether the design led to differences in the found effect sizes.
The QualSyst quality assessment criteria were applied to evaluate the quality of reporting, the internal validity of the included studies, and the certainty of the findings of individual studies. Since the F-test and the subsequent t-test comparing the assessments of the two reviewers were non-significant, it can be concluded that the reviewers agreed sufficiently well on the assessed items. Furthermore, on average the reviewed studies received good ratings. The between-study heterogeneity assessment and the sensitivity analysis, with the subsequent removal of identified studies and their respective outcome measures from the pooling of the effect size, are considered an indication of the certainty of the results. In conclusion, the certainty of the overall results of the presented studies would mainly benefit from stronger study designs, as favouring RCTs would increase overall controllability. Furthermore, increasing sample size and the unification of certain measures as well as intervention protocols could increase the certainty and overall quality of the findings.
The present work demonstrates that bicycling can lead to versatile improvements, yet it also seems that the effects of bicycling are rather specific, and when it comes to a more detailed understanding, or to prescribing physical exercise regimens based on personalized needs and preferences, the current knowledge remains scattered. More studies are needed to directly address the potential benefit of bicycling on the most common, functionally and psychologically disabling symptoms such as falling and FOG 41,46. Overall, in order to understand in which situations bicycling is best applied over other forms of exercise, more scrutiny of the reporting and controlling of the intervention and of the outcomes is needed. This would be particularly important in order to define the optimal intensity and cadence of bicycling exercise, as well as to recognize the optimal stage of disease progression at which the training could be most beneficial. As the currently available pharmaceutical medication for PD only treats the symptoms, at best improving the daily coping of the patients while not terminating the disease progression 47, developing well-targeted adjuvant forms of physical exercise is crucial.

CONCLUSION
Taken together, this review provides evidence that bicycling is a versatile form of physical exercise for PD patients. Considering the clinical relevance of the findings, the results support the application of bicycling, in particular to improve gait-related parameters of balance, walking speed and overall walking capacity. Furthermore, based on the outcome measure PDQ-39, the benefits of bicycling go beyond physical improvement, resulting in an increased quality of daily living. In addition, the results indicate that the effects of bicycling are based on longer-term exercise rather than on immediate effects of single sessions. Therefore, bicycling is a meaningful way to improve the lives of patients suffering from Parkinson's disease.

DATA AVAILABILITY
All data analysed in this work are based on already published data, which are referenced and, where applicable, presented in the main manuscript and supplementary material. A pre-registration of the meta-analysis can be found in the International Prospective Register of Systematic Reviews (PROSPERO) with the number CRD42019137386.