Introduction

Freezing of gait (FOG) is an extremely debilitating problem that is common among people with Parkinson’s disease (PD)1,2. FOG is characterized by episodes in which walking cannot start or is interrupted, despite the effort to move forward2,3. This severely impairs function, independence, and quality of life1,4. FOG is considered the leading contributor to falls in PD, further contributing to its major negative impact4,5,6.

Multiple theories and models have been proposed to explain the causes of FOG2,7,8,9; however, the precise mechanisms that trigger FOG are not yet well understood. Many attempts at ameliorating FOG frequency and severity have also been made. These include anti-parkinsonian medications, cueing, exercise, and invasive and non-invasive brain stimulation3,10,11,12,13,14,15,16,17,18,19. In addition, interventions designed to ameliorate FOG severity by treating anxiety and depression or improving executive function and other cognitive domains have been investigated5,11,20,21,22,23. While FOG may partially resolve in response to these therapies, it generally persists and worsens over time, with minimal or no long-term resolution2,7,15,18,24. The sheer number of approaches and unsuccessful studies underscores both the clinical significance of FOG and the absence of an effective therapeutic approach.

To make a robust, clinically meaningful impact, we suggest, therefore, that an “upstream” approach to treating FOG may be needed: prevention. Treating FOG before it occurs has not yet been attempted, likely for several reasons. Prevention requires relatively long, prospective studies. Moreover, it is not yet clear how FOG could be prevented. Still, if modifiable, early signs can be identified, this could, perhaps, lead to the targeted application of FOG prevention among patients who have a greater risk of developing FOG. A growing number of observational studies have prospectively evaluated factors associated with FOG incidence, providing a timely opportunity to review the literature and to start to consider the possibility of preventing or lowering the risk of the development of FOG. Briefly, several factors have been suggested to be associated with FOG incidence. These include, for example, longer disease duration, increased depression or anxiety, higher levels of levodopa daily equivalent dose, reduced sleep quality, worse cognitive function, and the PIGD motor subtype25,26,27,28,29,30. Therefore, in this systematic review and meta-analysis of observational studies, we aimed to summarize and synthesize studies that report on the risk factors and potential early signs of the future development of FOG, among patients with PD who do not yet have FOG.

Results

Study selection

The search yielded 2240 records. After duplicate removal, 1068 distinct records were identified. Of these, 38 were eventually included in the systematic review, and 35 were included in the meta-analysis (Fig. 1). Table 1 summarizes the studies that examined clinical, motor, and non-motor potential predictors and Table 2 summarizes other predictors such as genetic and CSF parameters.

Fig. 1: Study selection and PRISMA Flow.

Study selection; PRISMA Flow Diagram describing the studies that were initially identified in the search and those that were included in the meta-analysis.

Table 1 Clinical, motor, and non-motor predictors of the development of FOG.
Table 2 Brain imaging and other modalities and their association with the future development of FOG.

Quality assessment

The overall quality score for each study is presented in Tables 1, 2 (rightmost column). Quality scores ranged between 4 and 8 stars with a median [quartiles] of 6.5 [6–8]. Of the 38 studies, 29 were rated as high quality (76.3%, see Supplementary Table 2 for the full quality assessment).

Systematic review

Most of the studies evaluated a large number of potential motor and non-motor predictors using different scales, procedures, and tests that assessed several disease features (e.g., measures reflecting disease severity, motor function, cognitive function, anxiety, sleep, depression, hyposmia, and anti-Parkinsonian medications) (Table 1). Of those, several factors were associated with the future development of FOG. For example, Zhao et al.29 (n = 350, 37.7% developed FOG) found that longer disease duration, higher levels of levodopa equivalent daily dose (LEDD), and a more depressive state at baseline were significantly associated with FOG incidence. A 100-mg higher LEDD at baseline was associated with a 44% increase in the ‘2-year risk’ of FOG. Not surprisingly, not all studies agreed. For example, in the prospective study by Chung, Yoo, Lee, et al.28 of 257 patients who were followed for at least 2 years, none of the predictors identified by Zhao et al.29 were reported as being associated with FOG incidence.

Imaging and other possible predictors were also explored (Table 2). White matter hyperintensities increased the risk of developing FOG more than 3-fold (HR: 3.29; CI: 1.79–6.05), even after controlling for age, sex, DAT uptake, and LEDD31. Lower DAT uptake values were significantly related to a higher FOG incidence32,33. A positive APOE ε4 allele was moderately associated with a higher risk of developing FOG34. Lower levels of CSF Aβ42 (inversely related to brain amyloid load) were associated with an increased risk of FOG in two studies35,36. Likely because different studies examined disparate possible risk factors, agreement across these studies was often lacking.

Meta-analysis

Thirty-five studies were included in the meta-analysis (8,973 participants). Among those, 2,175 people (24.2%) developed FOG by the end of follow-up (mean follow-up: 4.1 ± 2.7 years) (Table 3; note that variables that were reported in only two studies are presented in Supplementary Table 3). Non-modifiable predictors included older age at disease onset, longer disease duration, and lower DAT uptake (both in the caudate and putamen). Figure 2 summarizes the significant predictors. Figure 3 presents the forest plots of selected significant predictors of FOG.

Table 3 Main meta-analysis results presented by categories.
Fig. 2: Summary of significant risk factors for incident FOG according to the type of data available with effect size and 95% confidence interval of random effects models.

In each panel, the factors are organized in alphabetical order. (A) Standardized mean difference; (B) Relative risk; (C) Hazard ratio. A list of all of the abbreviations used can be found in Supplementary Table 1.

Fig. 3: Forest plots of selected significant predictors of the future development of FOG.

(A) represents one of the anxiety predictors; (B) represents one of the depression predictors; (C) represents the use of medication predictors; (D) represents one of the cognitive domain predictors.

Several modifiable PD features were also identified (Fig. 3). For example, the following predicted the future development of FOG: (1) higher baseline levels of depression and/or anxiety (Fig. 2A: GDS, HAMA, HAMD, STAI and Fig. 3A, B); (2) poorer baseline motor function, gait, and balance (Fig. 2A: Berg balance test, H&Y, MDS-UPDRS part 3, UPDRS part 3, PIGD score; Fig. 2B: H&Y level 3; Fig. 2C: MDS-UPDRS part 3); (3) higher LEDD at baseline (Fig. 3C); (4) use of levodopa and COMT inhibitors (Fig. 2B); (5) a worse cognitive state at baseline (Fig. 2A: MOCA, MMSE and Fig. 3D); and (6) non-motor features such as autonomic dysfunction, reduced sleep quality (measured by ESS and RBDSQ, Fig. 2A), olfactory deficiency (measured by UPSIT, Fig. 2A), and worse PDQ39 scores (Fig. 2A).

The results of the multivariable analysis were similar to those of the main analysis (Table 4). Even when simultaneously taking into account and adjusting for age at disease onset, disease duration, depression, cognitive function, motor severity, and LEDD levels, all of these factors were still significantly associated with FOG incidence. Most results were similar when MOCA was replaced by MMSE and when STAI was replaced by HAMA.

Table 4 Multivariable analysis results (vs. main analysis).

Heterogeneity

Overall, the majority of analyses had small to moderate heterogeneity (36/62, 58.1%). Considerable heterogeneity was found for 13 risk factors (20.9% of analyses). The leave-one-out (LOU) analysis was calculated for 12/13 analyses (‘hyposmia’ had only two studies). Out of the 12 LOU analyses, the heterogeneity of 7 features was reduced to small/moderate, 4 to substantial (levodopa use, LEDD continuous, and the PIGD and intermediate PD subtypes), and only 1 (PD duration) remained considerable. Overall, the effect sizes did not differ from those in the main analysis.

Publication bias

Out of 62 analyses, Egger’s intercept calculation was applicable to 32. Of those, only 3 analyses had a high risk of publication bias (i.e., Berg balance score, Putamen DAT, and H&Y [Level 1, RR]). The intercepts and p-values are presented in Supplementary Table 4.

Subgroup analysis

Four subgroup analyses were performed: (1) by study quality score (≥6; 26/35 studies, median [IQR] = 10 [9–10.25]), (2) by study design (prospective: 21/35 studies), (3) by follow-up period (≥2 years: 30/35 studies, median [IQR] = 3.5 [2–5]), and (4) by PD duration at baseline (<2 years: 19/35 studies, median [IQR] = 2 [0.7–6]). Several modifiable outcomes that were identified as statistically significant predictors in the main analyses were included in the subgroup analyses: motor and disease severity, LEDD, depression and anxiety, cognitive state, and quality of life. Generally, the results did not differ from those shown in the main analysis (Supplementary Table 5). However, MOCA, MMSE, and MDS-UPDRS part 3 (ON) lost their significance when assessing studies with a follow-up period of more than 2 years. In addition, levodopa use was no longer a significant risk factor in studies including patients with longer disease duration at baseline (over 2 years PD duration, RR: 1.17 [0.98–1.40]). Nonetheless, the risk for developing FOG in newly diagnosed patients using levodopa at baseline was greater compared to the main analysis (RR new PD: 2.3 [1.77–2.97], RR main analysis: 1.54 [1.06–2.25]).
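As a reading aid, the significance statements for the levodopa risk ratios can be checked from the reported intervals alone. The sketch below is not the authors' code. It assumes the reported intervals are 95% Wald intervals that are symmetric on the log scale, and the function name `wald_test_from_ratio_ci` is hypothetical.

```python
import math

def wald_test_from_ratio_ci(estimate, lower, upper, z_crit=1.96):
    """Back-calculate SE, z, and a two-sided p-value from a ratio and its 95% CI.

    Assumption: the interval is a Wald interval, symmetric on the log scale.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value under a standard normal
    return se, z, p

# Risk ratios for levodopa use as reported in the subgroup analysis text.
reported = {
    "PD duration over 2 years": (1.17, 0.98, 1.40),
    "newly diagnosed PD": (2.3, 1.77, 2.97),
    "main analysis": (1.54, 1.06, 2.25),
}

for label, (rr, lo, hi) in reported.items():
    se, z, p = wald_test_from_ratio_ci(rr, lo, hi)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{label}: RR={rr}, z={z:.2f}, p={p:.3f} ({verdict} at 0.05)")
```

An interval that includes 1 (as for the longer-duration subgroup) corresponds to a two-sided p-value above 0.05 under these assumptions, which matches the loss of significance described above.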

Discussion

The present systematic review and meta-analysis generate new insights into risk factors that are associated with the incidence of FOG in patients with PD. Multiple factors were identified that differed between people with PD who later developed FOG and those who did not, raising the possibility of prediction. Interestingly, some of these risk factors are modifiable, suggesting that perhaps even prevention can be considered. These findings carry important implications for the clinical management of PD patients, enhancing our understanding of the underlying pathological mechanisms contributing to the development of FOG, as well as offering potential avenues for predicting, delaying, and perhaps even preventing FOG.

The inter-relationships between FOG incidence, disease duration, and severity are complex and challenging to disentangle. The meta-analysis shed some light on these relationships. According to previous reports, disease duration and disease severity (e.g., scores on MDS-UPDRS part III) increase the risk of developing FOG1,37,38,39,40. Although less prevalent, FOG may also occur early in the disease1,36,37,38,39,40. In line with these reports, the meta-analysis found that an increase in disease duration was associated with an increased risk for FOG development, even after adjusting for motor and non-motor baseline symptoms and LEDD (Table 4), supporting the idea that FOG risk increases as disease duration and severity increase. However, in the subgroup analysis (Supplementary Table 5), we found that even in patients with less than 2 years of PD, a relative increase in disease severity (i.e., MDS-UPDRS part III), anxiety, depression, balance impairment, cognitive impairment, and higher levels of LEDD still predicted the development of FOG. This suggests that already among people with relatively short disease duration, certain factors exist that are associated with an increased risk of developing FOG in the future. Furthermore, in the multivariable analysis, the severity of those symptoms was independently associated with an increased risk of developing FOG (Table 4). Treatment with DBS was not specified as an exclusion criterion, but since we excluded patients with FOG at baseline, the included studies had very few advanced PD patients; the median disease duration at baseline (study entry) was only 1.8 years [IQR: 0.125–4.25 y]. Nevertheless, one should bear in mind that disease duration is not always a reliable proxy of disease severity, for example in the ‘malignant form’ of PD, which generally accounts for 9–16% of cases, or in atypical parkinsonism that was mistakenly diagnosed as PD (less than 0.1% of cases).

The findings of this review support the idea that the pathological processes contributing to the development of FOG may be present years before FOG becomes clinically evident, at least in some subpopulations. Consistent with this notion, in several of the reviewed studies that matched the subjects at baseline with respect to disease duration and the severity of motor symptoms, non-motor factors differed between individuals who later developed FOG and those who did not during 5–10 years of follow-up. The role of non-motor symptoms in FOG is not surprising. Indeed, affect, depression and anxiety have been shown to be increased in people with FOG, compared to people with PD who do not have FOG41,42. Extending that idea, the present analyses support the position that these alterations in mental health are not only related to, and may trigger, specific FOG episodes among people who experience FOG, but may also be in the causal pathway and lead to the future development of FOG.

The complexity of the relationship between disease severity and duration becomes more entangled when considering the debated contribution of anti-parkinsonian medications to the development of FOG43,44. Regarding levodopa usage, there are arguments both in support of and against its role in increasing the risk of FOG. The current analysis, like several past studies43,44,45, shows that patients prescribed levodopa had a higher risk of developing FOG, compared to those who were not (RR: 1.54). Some have explained that levodopa likely does not enhance nigral neurodegeneration and its potential to damage dopamine cells is minimal46. On the other hand, although LEDD might just be a reflection of disease severity and not a risk factor by itself, in the multivariable analysis, we found that a higher LEDD was associated with increased risk for FOG, even when adjusting for other potential confounders. Furthermore, Koehler and colleagues47 speculate that high-frequency oscillatory features of FOG are probably induced by levodopa. Recently, Jansen et al.48 introduced the idea that LEDD can exacerbate the risk of developing FOG. Strikingly, they found that FOG was 6 times more common in a levodopa-treated cohort than in a naive cohort. Conversely, Gilat et al.45 argue that FOG can manifest before levodopa intake (e.g., it is present in 5–17% of naïve patients). This discrepancy highlights the need for longitudinal studies to determine how the interplay between evolving pathology and dopaminergic therapy affects the development of FOG. Still, it is important to keep in mind that LEDD was associated with FOG development. To further investigate these relationships and dependencies, there is a need for future pre-specified studies. Since isolating the effects of medication from disease severity can be challenging, caution should be exercised when interpreting and acting on these findings. Moreover, there is a need for future work on the association between FOG and specific medications, e.g., benzodiazepines, SSRIs/SNRIs, in addition to levodopa.

In addition, the use of COMT inhibitors was associated with a significant increase in the risk of developing FOG, as evidenced by an RR of 2.58 (based on two studies). Furthermore, in line with the existing debate43, and despite what was found in previous randomized controlled studies5, the present analysis showed a non-significant result for the association between dopamine agonist therapy and FOG incidence. Moreover, while previous research has shown that MAO-B inhibitors are associated with a decreased risk of developing FOG5, the current analysis observed a non-significant increased risk. Similarly, the use of anticholinergic medications was not significantly associated with FOG incidence (see Supplementary material). These findings should be considered when prescribing PD medications.

Imaging studies have described multiple pathways and brain regions that are associated with FOG in cross-sectional studies32,49,50. Here, we found that alterations in the nigro-striatal pathway were related to the development of FOG33,49. This pathway, involving the connections between the substantia nigra and the striatum, plays a crucial role in motor control. However, the behavioral findings suggest that other neural circuits, possibly related to limbic and frontal networks, are also important. These circuits are associated with emotional processing and executive functions and are also involved in gait control and navigation. For instance, the current meta-analysis identified multiple non-motor features such as depression, anxiety, poor cognitive state, and sleep disturbances that were associated with an increased risk of developing FOG in the future. It is not yet clear from this analysis if all of these factors are markers that just predict future FOG development or if they also contribute and play a role in the causal chain. Nonetheless, as noted above, given that non-motor changes like greater depressive symptoms and anxiety have already been shown to be increased in people with PD who have freezing, compared to those without8,41,51, perhaps these factors also are causally involved in the future development of FOG, as postulated in a recent model that distinguished between FOG-triggers and longitudinal factors that set the stage for the future development of FOG8.

A recent study reported that mutations in the GBA gene were associated with increased FOG incidence52. This paper was not included in the present review and the meta-analysis because of its publication date. Nonetheless, it highlights the potential role of genetics in evaluating and understanding FOG risk. We also note that a positive APOE ε4 allele and lower levels of CSF Aβ42 were associated with an increased risk of FOG34,35,36. It was suggested that the allele directly impacts the development of α-synuclein pathology and regulates α-synuclein pathology independent of its established effects on Aβ and tau in mouse models. Thus, it is possible that APOE ε4 carriers have higher quantities of tau and α-synuclein pathology and more severe white matter hyperintensities than APOE ε4 non-carriers, which may affect the neural circuitry associated with FOG. Moreover, the APOE ε4 allele is thought to exert a harmful effect on glucose metabolism and microglial homeostasis. Considering previous findings, these changes may contribute to a faster progression of PD, including cognitive impairment, which was found to predict FOG. Nevertheless, the association of such changes due to the APOE ε4 allele with FOG remains unclear and needs confirmation and further investigation.

Another possibility to consider is that perhaps risk factors that are specific to FOG might be identified, rather than working with thresholds and the degree of symptoms that are common among people with PD (e.g., depression level). Developing a FOG-specific composite index that incorporates multiple risk factors, including depressive symptoms, anxiety, motor function, DAT in the caudate and putamen, other imaging findings, and perhaps genetics, for example, could be a valuable approach, analogous to the likelihood ratio used to identify prodromal PD. This index could provide a comprehensive assessment and help to stratify patients based on their individual risk levels. Initial attempts at developing such composite scores have shown promise, particularly when combining imaging with other measures36,53. Considering the low ability to differentiate posterior putamen, ventral putamen and caudate uptake, it would be interesting in the future to explore DAT uptake levels at the different regions of the striatum and how they might correlate with FOG. Additional, prospective observational studies are needed to validate and refine composite indices for FOG risk, both with and without the inclusion of relatively expensive tests like imaging or electroencephalography.

Early identification of people with PD with a relatively increased risk of developing FOG may inform and lead to early interventions to reduce FOG and its devastating consequences in a sub-group of patients with a high FOG risk. Hence, the present findings contribute to the emerging concept of precision medicine for the treatment of PD. Personalized medicine aims to tailor medical interventions to individual patients based on their specific characteristics, including genetic, environmental, and clinical factors54,55,56,57. By identifying unique risk factors, other associated factors, and underlying pathological processes associated with FOG development, clinicians can potentially adopt a precision medicine approach to the treatment of FOG. This could enable and inform a proactive stance in managing FOG, aiming to prevent or delay its emergence rather than addressing it reactively only after FOG becomes manifest.

One potential preventive approach that should be further explored could involve early treatment of non-motor risk factors such as anxiety, depression, and executive function deficits58,59,60,61, keeping lower dosages of daily levodopa when possible, and focusing on interventions to improve gait early in the disease course. Addressing these modifiable non-motor symptoms early on may help alleviate their impact on gait, and reduce the risk of developing FOG. Exercise and physical activity, as well as behavioral, psychological, and pharmacological interventions, can partially alleviate these symptoms15,43,61,62,63,64. Relatively long-term prospective intervention studies are required to evaluate these important possibilities. Nonetheless, in the meantime, aerobic and multimodal exercise, for example, have minimal negative consequences and multiple positive benefits65,66,67 - irrespective of FOG - and may merit consideration for altering the natural course of the development of FOG, even before randomized controlled trial evidence is obtained.

These possibilities emphasize the importance of a comprehensive assessment that includes both motor and non-motor symptoms in PD management. Nevertheless, many of the identified risk factors are common in PD and aging in general, making it crucial to define clear thresholds to identify when these factors become significant contributors to FOG development. For example, at what level does anxiety become a risk factor for FOG, or does that occur only if other factors are also present (as in the composite risk indices discussed above)? FOG poses a significant risk for falls and loss of independence in individuals with PD. Therefore, it is important to explore ways to delay or prevent its development. While there is no single “magic bullet” to fully eliminate the troublesome symptoms associated with FOG, a multi-modal therapeutic approach that addresses the identified risk factors - both motor and non-motor - may prove effective in reducing the risk of FOG development.

The extant literature and the present meta-analysis have several limitations. For example, three studies were excluded from the meta-analysis due to a lack of information that could be integrated with other publications or the absence of the numbers of freezers and non-freezers. Many of the studies recruited subjects at different time points along the natural history of the disease. Ideally, a natural history study starting at disease diagnosis, in untreated patients, or even earlier, could help to clarify the discrepancies between studies. Nonetheless, the subgroup analyses partly addressed this issue. The difference in how FOG was measured also may have contributed to inconsistencies across the studies. About half of the studies used self-report to identify FOG rather than using objective measures. Indeed, in a study among 9072 patients with PD, 51% had FOG based on the NFOG-Q, compared to only 23% based on the UPDRS40, highlighting the important role of the method of determining FOG. Similarly, the absence of a detailed description of the PD phenotype (tremor-predominant, PIGD, mild-motor, intermediate, and diffuse-malignant subtypes68) and of which classification system was used may have impacted the findings. Work on outcome measurement is needed to address these issues. The duration of follow-up also likely played a role in the differences among studies, though approximately 86% of studies had more than 2 years of follow-up, suggesting that long-term prediction is possible. Indeed, the results of the subgroup analysis according to the follow-up period generally did not differ from the results of the main analysis. Seventy-four percent of studies were of high quality, and the subgroup analysis based on study quality did not change the effect size compared to the main analysis, suggesting that study quality was generally sufficient. In addition, even after addressing the considerable heterogeneity using the ‘leave one out’ method, about 25% of the analyses had substantial to considerable heterogeneity while almost 75% did not. Furthermore, a high risk of publication bias was present in only 3/32 analyses. It is essential to keep in mind that the different subtypes of FOG were not considered (e.g., ON versus OFF medication freezing, akinetic vs. trembling type). Also, the limitations of observational studies, as compared to RCTs, should be considered along with the limited number of studies that included potentially intriguing, very strong risk factors (e.g., falls RR: 11.8; white matter hyperintensities HR: 3.29). Future studies are needed to address these important issues.

Despite these limitations, the current findings based on nearly 9000 subjects with an average follow-up of 4 years suggest that multiple risk factors are linked to the future development of FOG. The present findings and the multivariable analysis suggest that patients who have specific risk factors are more likely to develop FOG and that the risk of FOG incidence is not just a simple reflection of disease severity. While not all of the identified factors are modifiable, many apparently are, at least to some degree (Fig. 4). Therefore, in addition to efforts to ameliorate FOG in those patients who already experience this debilitating symptom, perhaps it is time to shift the paradigm and start to consider a personalized, preventive approach to treating FOG by intervening in patients who have an increased risk, before FOG develops and becomes clinically apparent.

Fig. 4: Summary of the significant predictors of FOG incidence, grouped with respect to those that are putatively modifiable and those that are not.

When possible, we recommend keeping levodopa and LEDD at minimal effective doses. When anti-parkinsonian medications are prescribed, the patient’s response should be carefully monitored. A list of all of the abbreviations used can be found in the Supplementary Table 1.

Methods

Search strategy, inclusion criteria, and data extraction

A systematic literature search was performed by a qualified librarian in Medline, EMBASE, CINAHL, and Web of Science databases through December 31, 2022, with no year or language limitations. (Additional search strategies are presented in Supplementary Material Methods). Briefly, studies were eligible for inclusion if (a) they used a cohort design (prospective or retrospective), (b) they included adult patients who had received a diagnosis of PD without FOG at baseline, (c) data on possible FOG predictors were measured at baseline, and (d) incident FOG was assessed at follow-up.

Published study-level data were extracted independently by two investigators (Y.B. and T.H.) and reviewed by a third investigator (S.S.). Disagreements were resolved via discussion. Reference lists of the included studies were explored for additional studies. No articles needed translation. The meta-analysis included studies that reported group-level data (FOG vs. no FOG) in the main text or in additional information (n = 35/38). FOG was generally defined as a “brief, episodic absence or marked reduction of forward progression of the feet despite the intention to walk”2. In the included studies, FOG was detected in several subjective ways. These included self-report via (1) a single question “Do you feel that your feet are glued to the floor?”; (2) the FOG-Q (item 3); or (3) the NFOG-Q questionnaire (“Did you experience Freezing episodes over the past month”), after showing patients a video with different kinds of freezing episodes. Objectively, FOG was identified via clinical observation (MDS-UPDRS item 3.11) or a neurological examination by a certified neurologist. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were followed. The definitions used in the original studies for the identification of FOG and all predictors were accepted. The meta-analysis was pre-registered at PROSPERO, study ID: CRD42022325489.

Study quality assessment

All included studies were assessed for methodological quality using the Newcastle-Ottawa Scale (NOS) for cohort studies69. This tool is based on a system of stars (*) awarded for each criterion that is fulfilled. Quality is assessed on the selection of the sample, including the representativeness of the exposed cohort, the selection of the non-exposed cohort, and ascertainment of exposure (maximum of 4 stars); the comparability of the cohorts on the basis of study design and analysis (maximum of 2 stars); and the assessment of the outcome (maximum of 3 stars). The maximum number of stars is 9; studies were graded as high quality for scores of 6 and up and as low quality for scores <6.
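The star-based grading rule described above can be written down compactly. The following is a minimal sketch, not the authors' scoring procedure. The class name `NosScore` and the example study are hypothetical, and only the domain maxima (4, 2, 3) and the cut-off of 6 stars are taken from the text.

```python
from dataclasses import dataclass

# Maximum stars per Newcastle-Ottawa Scale domain, as described in the text.
MAX_STARS = {"selection": 4, "comparability": 2, "outcome": 3}
HIGH_QUALITY_CUTOFF = 6  # scores of 6 and up are graded as high quality

@dataclass
class NosScore:
    """Stars awarded to one study in each NOS domain (hypothetical container)."""
    selection: int
    comparability: int
    outcome: int

    def total(self) -> int:
        # Validate each domain against its maximum before summing.
        for domain, cap in MAX_STARS.items():
            stars = getattr(self, domain)
            if not 0 <= stars <= cap:
                raise ValueError(f"{domain} must be between 0 and {cap}, got {stars}")
        return self.selection + self.comparability + self.outcome

    def grade(self) -> str:
        return "high" if self.total() >= HIGH_QUALITY_CUTOFF else "low"

# Hypothetical study, for illustration only.
example = NosScore(selection=3, comparability=2, outcome=2)
print(example.total(), example.grade())
```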

Statistical analysis

Possible predictors were compared between people who developed FOG and those who did not. When available, data on baseline characteristics were analyzed. The standardized mean difference (SMD) was calculated using mean and standard deviation (SD) for continuous variables, and the risk ratio (RR) was calculated using numbers and proportions for categorical variables (see Supplementary Table 1 for a list of abbreviations used). SMDs of 0.2, 0.5, and 0.8 are considered small, medium, and large, respectively70. In the absence of baseline information, the hazard ratio (HR) was calculated using a log transformation of the effect size and its 95% confidence interval (CI). Data were meta-analyzed when data from at least two different studies were available. When available, data were stratified by characteristics of the assessment (e.g., on/off medications; drug-naïve at baseline was considered as off medications). A random-effects meta-analysis model was applied to estimate the overall magnitude and statistical significance of an effect. The random-effects model was preferred to a fixed-effects model because the included studies had potential sources of heterogeneity arising from differences in cohorts and analytic methods. The confounding effects of several disease features that are known to be correlated with each other (e.g., age of disease onset, disease duration, motor [MDS-UPDRS part 3] and non-motor symptoms [depression, anxiety, cognition], and drug therapy) were assessed using a multivariable meta-analysis. Multi-collinearity was assessed using the “vif” function in the “metafor” package, which showed low values (<5) for all variables included in the multivariable analysis. Also, to avoid residual collinearity, the multivariable analysis included only one variable from each domain (e.g., only MOCA or MMSE, but not both in the same model).
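To make the effect-size definitions concrete, the sketch below computes an SMD, a log risk ratio, and a random-effects pooled estimate. It is an illustration in Python, not the R code used for the analysis. Several choices are assumptions that the text does not specify: the SMD is computed as Cohen's d with a pooled SD, the between-study variance uses the DerSimonian-Laird estimator, and all function names and study values are hypothetical.

```python
import math

def smd(mean_fog, sd_fog, n_fog, mean_no, sd_no, n_no):
    """Standardized mean difference (Cohen's d, pooled SD) and its approximate variance."""
    pooled_sd = math.sqrt(
        ((n_fog - 1) * sd_fog**2 + (n_no - 1) * sd_no**2) / (n_fog + n_no - 2)
    )
    d = (mean_fog - mean_no) / pooled_sd
    var = (n_fog + n_no) / (n_fog * n_no) + d**2 / (2 * (n_fog + n_no))
    return d, var

def log_rr(events_exp, n_exp, events_unexp, n_unexp):
    """Log risk ratio of incident FOG (exposed vs. unexposed) and its variance."""
    rr = (events_exp / n_exp) / (events_unexp / n_unexp)
    var = 1 / events_exp - 1 / n_exp + 1 / events_unexp - 1 / n_unexp
    return math.log(rr), var

def random_effects(effects, variances):
    """Pool effects with a DerSimonian-Laird random-effects model (assumed estimator).

    Returns (pooled estimate, lower 95% bound, upper 95% bound, tau squared).
    """
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se, tau2

# Hypothetical depression scores (mean, SD, n) for FOG vs. no-FOG groups in three studies.
studies = [(6.1, 3.0, 40, 4.9, 2.8, 110), (5.5, 2.5, 60, 4.8, 2.6, 150), (7.0, 3.2, 35, 5.0, 3.0, 90)]
pairs = [smd(*s) for s in studies]
pooled, lo, hi, tau2 = random_effects([d for d, _ in pairs], [v for _, v in pairs])
print(f"pooled SMD = {pooled:.2f} [{lo:.2f}, {hi:.2f}], tau^2 = {tau2:.3f}")
```

Risk ratios would be pooled on the log scale with the same `random_effects` function and then exponentiated for reporting.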

Variability between studies was evaluated using the I² statistic of heterogeneity. The magnitude of study heterogeneity was determined according to the I² level, where values of <25%, 25–50%, 50–75%, and >75% were considered small, moderate, substantial, and considerable, respectively71. When heterogeneity was deemed considerable, a sensitivity analysis using the ‘leave one out’ (LOU) method was performed: the study suspected as the cause of the high heterogeneity was identified, and the random effect size was calculated without it. Publication bias was assessed using visual inspection of funnel plots (data not shown) and Egger’s regression intercepts, when applicable. Publication bias was considered present when the p-value was less than 0.1. Four sub-group analyses were performed: (1) high (or low) quality studies (cut off: score 6), (2) prospective (or retrospective) design, (3) relatively long (or short) follow-up periods (cut off: 2 years), and (4) short (or longer) duration of PD at baseline (cut off: 2 years). Data analyses were performed utilizing R software version 4.1.1 using the ‘meta’, ‘metafor’ and ‘dmetar’ packages. There was no funding source for this study.
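The heterogeneity and bias checks can be illustrated in the same way. The sketch below is written in Python and is not the authors' R workflow. It makes three assumptions: I² is derived from Cochran's Q, the shared category boundaries (exactly 50% or 75%) are assigned to the lower category, and the influential study is taken to be the one whose removal lowers I² the most. Egger's intercept is computed by ordinary least squares, and the p-value (which needs a t distribution with k - 2 degrees of freedom) is deliberately omitted. All names and data are hypothetical.

```python
import math

def i_squared(effects, variances):
    """I² (in percent), derived from Cochran's Q with inverse-variance weights."""
    w = [1 / v for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

def classify(i2):
    """Map I² to the categories in the text; boundary handling is an assumption."""
    if i2 < 25:
        return "small"
    if i2 <= 50:
        return "moderate"
    if i2 <= 75:
        return "substantial"
    return "considerable"

def leave_one_out(effects, variances):
    """Return (omitted index, I² without that study) for every study."""
    results = []
    for i in range(len(effects)):
        rest_e = effects[:i] + effects[i + 1:]
        rest_v = variances[:i] + variances[i + 1:]
        results.append((i, i_squared(rest_e, rest_v)))
    return results

def egger_intercept(effects, variances):
    """Egger's regression: standardized effect on precision; returns (intercept, SE)."""
    x = [1 / math.sqrt(v) for v in variances]
    y = [e / math.sqrt(v) for e, v in zip(effects, variances)]
    n = len(x)
    if n < 3:
        raise ValueError("Egger's regression needs at least three studies")
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid_var = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    se = math.sqrt(resid_var * (1 / n + xbar**2 / sxx))
    return intercept, se

# Hypothetical effects: the fourth study is an outlier that drives the heterogeneity.
effects, variances = [0.20, 0.25, 0.30, 1.50], [0.01, 0.01, 0.01, 0.01]
print(classify(i_squared(effects, variances)))
suspect = min(leave_one_out(effects, variances), key=lambda r: r[1])
print("study to omit:", suspect[0], "I² without it:", round(suspect[1], 1))
```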