A blinded, controlled trial of objective measurement in Parkinson’s disease

Medical conditions with effective therapies are usually managed with objective measurement and therapeutic targets. Parkinson’s disease has effective therapies, but continuous objective measurement has only recently become available. This blinded, controlled study examined whether management of Parkinson’s disease was improved when clinical assessment and therapeutic decisions were aided by objective measurement. The primary endpoint was improvement in the Movement Disorder Society-Unified Parkinson’s Disease Rating Scale (MDS-UPDRS) Total Score. In one arm, objective measurement assisted doctors to alter therapy over successive visits until objective measurement scores were in target. Patients in the other arm were conventionally assessed and therapies were changed until judged optimal. There were 75 subjects in the objective measurement arm and 79 in the arm with conventional assessment and treatment. There were statistically significant improvements in the moderate clinically meaningful range in the MDS-UPDRS Total, III and IV scales in the arm using objective measurement, but not in the conventionally treated arm. These findings show that global motor and non-motor disability is improved when management of Parkinson’s disease is assisted by objective measurement.


INTRODUCTION
Although objective measurements are central to therapeutic decision making in most areas of medicine 1, this is not the case in Parkinson's disease (PD), where until recently, continuous objective measurement was lacking. Consequently, therapeutic decisions in PD depend on clinical expertise and the person with PD's (PwP) ability to report their symptoms. Bradykinesia is the main target of current PD therapy, so objective measurement should ideally measure the severity of bradykinesia, its variation with the timing of therapy, and the presence, timing and severity of dyskinesia. Modern sensors have now provided tools for continuous measurement of bradykinesia and dyskinesia 2, and in previous pilot studies we examined the contribution of one of these (the Parkinson's KinetiGraph or PKG, Global Kinetics Corporation™, Australia) to guiding therapy 3,4. These studies suggested that PwP gain benefit when PD management is assisted with objective measurement and provided power and inclusion criteria for a controlled study. The study reported here compares the management of PD by doctors using information provided by objective ambulatory measurement and conventional assessment, with management using conventional assessment alone. The primary outcome was the change in the Movement Disorder Society-Unified Parkinson's Disease Rating Scale (MDS-UPDRS) Total score and secondary outcomes were changes in the MDS-UPDRS III subscore, the Parkinson's Disease Questionnaire 39 (PDQ39) score and the Severity of predominantly Non-dopaminergic Symptoms in Parkinson's Disease (SENS PD) scale. A successful outcome would be important not only for the usual care of PwP but also as a step towards telemedicine and tele-monitoring of PwP.

RESULTS
Two hundred and eighty-two participants registered an interest in the study and 200 of these met eligibility criteria and were enrolled in the study (see Fig. 1 and "Methods" for full details of study structure including inclusion and exclusion criteria).
Comparison of the PKG+ and PKG− arms at entry
There were 97 subjects in the PKG+ arm (assessment of PD made using clinical evaluation as well as with access to PKG information) and 103 in the PKG− arm (assessment of PD made using only clinical evaluation). Forty-six were withdrawn during the study: 7 for protocol violations, 11 referred for device-assisted therapy and the remainder were participant-initiated withdrawals for social, personal or non-PD medical reasons. All 46 withdrawn cases were enrolled and had clinical scales and a PKG (which was judged as "out of target" in all cases) prior to the first visit. Nine of the participant-initiated withdrawals occurred before the first visit: the remainder withdrew prior to the final visit, so no final clinical scales or PKG were available from these cases. As the study's analyses depended on measuring changes from first to final visit, all subject-initiated withdrawals and cases referred for device-assisted therapy were excluded from the data set because clinical scales from the final visit could not be obtained. The demographics and scores on clinical scales of PwP who completed the study were similar in the two arms (Table 1). The demographics of enrolled subjects are shown in Supplementary Table 1: there was no difference between enrolled and analysed subjects in any parameter in either arm (statistics not shown). As well, there were no differences in the MDS-UPDRS Total and III scores between subjects attending the clinics or between rural and city subjects (one-way ANOVA, data not shown), with the exception that the scores in rural subjects in the PKG− arm tended to be more severe.
Change in clinical scales and PKG from first to last visit in the PKG+ and PKG− arms
The differences between clinical scores and PKG values at entry and exit of each arm were examined (Table 2). In the PKG+ arm, the MDS-UPDRS Total scores (the primary outcome) improved by 8.5 points (95% CI 3.4-14, P = 0.001). In terms of secondary endpoints, the MDS-UPDRS III improved significantly by 6.4 points (95% CI 3.6-9.2, P < 0.001), but neither the PDQ39 (4.7, 95% CI −0.2 to 9.5, P = 0.07) nor the SENS PD (1.2, 95% CI 0.1-3.6, P = 0.13) reached statistical significance. The changes in the means of both MDS-UPDRS scores were of moderate clinical importance 5 (14% and 18%, respectively) and the improvement in PDQ39 scores was 17%. In contrast, the changes in MDS-UPDRS Total, III, PDQ39 and SENS PD scores in the PKG− arm were not statistically significant (8%, 7% and 13%, respectively). As might be expected, PwP with the highest MDS-UPDRS III scores at the first visit tended to have the greatest improvement at the final visit (Fig. 2). The average change in scores when the first MDS-UPDRS III was > 35 was in the range of large clinically important differences in the PKG+ arm but not in the PKG− arm.
Decision to treat compared to PKG findings in the PKG+ and PKG− arms
Doctors in the PKG+ arm were directed to treat according to the PKG report, unless there were clear clinical grounds to act otherwise. Thus, the difference in the concordance between the PKG report and the doctors' actions in the two arms is an indication of the PKG influence on clinical decisions. Furthermore, instances where doctors in the PKG+ arm did not follow the PKG report are cases where doctors decided that the PKG gave an incomplete clinical picture. To examine this, cases were sorted according to whether their PKG was reported as being in or out of target (Fig. 1). In the PKG− arm, 52% of the 23% of cases reported as "in target" were treated (almost all for bradykinesia) whereas only 17% of the 24% of cases in the PKG+ arm that were "in target" were treated (mostly for dyskinesia not involving the upper limb). The clinical scores of "in target" cases in the two arms were similar and remained unchanged 3 months later (data not shown). Of PwP whose PKGs were reported as "out of target", 12% were not treated in the PKG+ arm, whereas 30% were not treated in the PKG− arm.
Outcomes in subjects who were "out of target" at the first visit
The differences from first to final visit in MDS-UPDRS Total, MDS-UPDRS III, PDQ39 and SENS PD scores of subjects who were "out of target" on the first visit in the PKG+ arm were all significant (Table 4) with moderate clinically meaningful differences in the means 5. None of these scales changed significantly from first to final visit in the PKG− arm. The differences between the primary and secondary endpoint scores at the first and last visit in subjects who were out of target (Table 3) or who were bradykinetic at first visit (Table 3) were estimated for each individual and these differences in the PKG+ arm were compared with those in the PKG− arm (Table 3).
In those who were out of target, there was a significant difference between the PKG+ and PKG− arm for MDS-UPDRS Total (7.3, 95% CI 3.4-11.2, P = 0.002) and III (5.3, 95% CI 2.4-8.3, P = 0.0004), but not for PDQ39 and SENS PD (although these approached significance in people who were bradykinetic at onset: P = 0.07 and P = 0.05 respectively).
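The between-arm comparison above contrasts each subject's first-to-final change in one arm with the corresponding change in the other arm. The following is a minimal sketch of that kind of calculation, not the study's analysis code. The per-subject values are invented for illustration, the function name `mean_change_difference` is hypothetical, and a normal approximation is swapped in for the T-test so that only the Python standard library is needed.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def mean_change_difference(changes_a, changes_b, confidence=0.95):
    """Difference in mean change (a minus b) with an approximate CI.

    Uses the unpooled (Welch) standard error and a normal critical value.
    This is an approximation to a t-test, reasonable only for larger groups.
    """
    diff = mean(changes_a) - mean(changes_b)
    se = sqrt(variance(changes_a) / len(changes_a)
              + variance(changes_b) / len(changes_b))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(diff) / se))
    return diff, (diff - z * se, diff + z * se), p_two_sided


# Invented per-subject improvements (first minus final visit score).
# These are NOT data from the study.
pkg_plus = [12, 9, 15, 7, 11, 14, 8, 10]
pkg_minus = [3, 5, 1, 6, 2, 4, 0, 3]

diff, ci, p = mean_change_difference(pkg_plus, pkg_minus)
print(f"difference in mean change = {diff:.2f}, "
      f"95% CI {ci[0]:.2f} to {ci[1]:.2f}, P = {p:.4f}")
```

With groups as small as in this toy example, a t-distribution critical value would give a wider interval than the normal value used here.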
Outcomes in subjects who were "out of target" due to bradykinesia at the first visit
Subjects in each arm were further segmented into those in whom the PKG report at the first visit (see "Methods, PKG reporting, targets and interpretation") indicated that the problem was one of either bradykinesia (Category 1, 2, 3, 5: see "Methods") or dyskinesia (Category 4, 6, 7: see "Methods"). In cases reported as being out of target because of bradykinesia (Fig. 3), the MDS-UPDRS III, IV and Total and PDQ39 all improved significantly in the PKG+ arm (n = 40) but not in the PKG− arm (n = 48). The greatest changes between first and final MDS-UPDRS III and MDS-UPDRS Total scores in the two arms were in PwP reported as being above bradykinesia target all the time and having dose-related variation ("Out of target" classification 2 on the PKG report). The differences between scores from the first and final visit of each arm were then compared (Table 3, Bradykinesia at 1st visit. PKG+ (1st-last visit) vs PKG− (1st-last visit)). The difference between the mean change in MDS-UPDRS Total in the PKG+ and PKG− arm was 7.9 (95% CI 3.5-12.3, P = 0.003, T-test) and for MDS-UPDRS III was 6.0 (95% CI 2.5-9.5, P = 0.0008, T-test). PDQ39 and SENS PD both approached significance (P = 0.065 and P = 0.052, respectively). There were too few cases where treatment changes were directed at only dyskinesia (n = 9 in PKG+ and n = 10 in PKG− arms) to examine the effect of using the PKG on management of dyskinesia. When the PKG scores of all PwP in the PKG+ arm were considered, only the Active mBKS (the PKG score for bradykinesia; see "Methods" for definitions of PKG scores) and percent time over target (PTOT) were significantly changed, but no PKG parameters were significantly different in the PKG− arm (Table 2).
When only those PwP whose initial PKG scores were out of target were considered, all the PKGs' bradykinesia scores (mBKS, AmBKS and PTOT) were reduced significantly in the PKG+ arm but not the PKG− arm (data not shown). In PwP treated for high bradykinesia scores (Fig. 3), there was marked change in the mBKS, AmBKS (not shown) and PTOT. As discussed below, the PKG scores were more sensitive when focussed on changes in bradykinesia.
All subgroup analyses described above were planned prior to the commencement of the study.
Levodopa equivalent daily dose (LEDD) scores
Although the use of medications was not an endpoint in this study, it is of interest to know whether the differences in outcomes were achieved through differences in medication use. There were large increases in LEDD in both arms (Table 2) [...] both arms, indicating that improvement in bradykinesia could be achieved without an increase in dyskinesia. In the PKG+ arm, D2 agonists were changed in 33% (reduced in 3%) of PwP compared to 18% (reduced in 8%) in the PKG− arm (P = 0.0001, Fisher's exact test), with the median change in LEDD being 75 (interquartile range (IQR): 60-150) in the PKG+ arm and 38 (IQR: −163 to 60) in the PKG− arm. The average reduction in levodopa dose interval was ~40 min in the PKG+ arm compared with 20 min in the PKG− arm (P = 0.03, SEM = 18.1 min, 95% CI = 32 min). There was a non-significant trend to increase the number of daily doses in the PKG+ arm. As there were improvements in MDS-UPDRS III and IV in the PKG+ arm (Table 2) using similar LEDD, more targeted use of dopaminergic therapies, reflected in changes in the D2 agonist dose and levodopa dose interval, is presumably responsible.
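Fisher's exact test, used above to compare medication changes between the arms, can be illustrated on a 2x2 table. The sketch below is a generic standard-library implementation of the two-sided test, not the study's analysis. The function name `fisher_exact_two_sided` is hypothetical and the counts are a textbook-style toy table, not counts from this study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # The small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Toy table (NOT study data): rows are arms, columns are
# "therapy changed" / "therapy not changed".
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"P = {p_value:.4f}")  # 34/70, about 0.4857
```

The exact test is a natural choice when some cell counts are small, which is when the χ 2 approximation becomes unreliable.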

Variability in doctors
Although the effect of the PKG on doctors' decisions was not an endpoint in this study, it is of interest to know whether individual doctors may have biased the findings. The effect of the PKG on doctors' decision making was examined by comparing the difference in the mean from the first and last visit of all PwP assessed by an individual doctor with the 95% CI of the whole arm (Fig. 4a, b). Both the confidence intervals of the PKG+ arm and the standard errors of individual doctors in the PKG+ arm were smaller than in the PKG− arm, implying that PKG information resulted in greater consistency in decisions. The distribution of each doctor's scores lay within the confidence intervals, but the outliers were distributed equally above and below the confidence interval, indicating that undue influence on results by one doctor was unlikely. However, it does lead to the speculation that the two doctors lying above the confidence interval were greater compliers with the PKG instructions, whereas the two that lay below the confidence interval were least influenced, and indeed their scores lay close to the centre of the PKG− 95% CIs. Examination of their documented decisions suggests that this was the case.
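The comparison of each doctor's mean change with the 95% CI of the whole arm can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code. The function name `doctors_vs_arm_ci`, the doctor labels and the change values are all invented, and a normal critical value is used for the arm-level interval.

```python
from collections import defaultdict
from math import sqrt
from statistics import NormalDist, mean, stdev


def doctors_vs_arm_ci(records, confidence=0.95):
    """records: (doctor_id, change) pairs for one arm.

    Returns the arm-level CI of the mean change and, for each doctor, the
    mean change and whether it lies 'below', 'within' or 'above' that CI.
    """
    records = list(records)
    changes = [change for _, change in records]
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(changes) / sqrt(len(changes))
    low, high = mean(changes) - half_width, mean(changes) + half_width

    by_doctor = defaultdict(list)
    for doctor, change in records:
        by_doctor[doctor].append(change)

    positions = {}
    for doctor, values in by_doctor.items():
        m = mean(values)
        if m < low:
            positions[doctor] = (m, "below")
        elif m > high:
            positions[doctor] = (m, "above")
        else:
            positions[doctor] = (m, "within")
    return (low, high), positions


# Invented first-to-final improvements per subject, labelled by doctor.
toy_arm = [("A", 10), ("A", 12), ("B", 6), ("B", 7), ("C", 1), ("C", 2)]
arm_ci, positions = doctors_vs_arm_ci(toy_arm)
for doctor, (m, where) in sorted(positions.items()):
    print(doctor, m, where)
```

Roughly equal numbers of doctors above and below the interval, as described in the text, would argue against a single doctor driving the result.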
Although the median number of visits (2) was the same in both arms, there were more visits in the PKG+ arm (P = 0.01, χ 2 test). When only PwP whose PKG was reported as out of target were considered, the median number of visits was one higher in the PKG+ arm (median = 3, IQR = 3) compared to the PKG− arm (median = 2, IQR = 2) and this difference was significant (P = 0.0009, χ 2 test).
Subjects referred for device-assisted therapies were excluded from the study because the device-assisted therapies would be unlikely to be implemented and optimised in the time of the study and so any effect of objective therapy would not be captured in the analyses. In general, these subjects' MDS-UPDRS IV scores (mean 9.2 ± 4.5 SD) and PKG scores (median dyskinesia score (mDKS): mean 8.7 ± 7.6 SD; percent time in dyskinesia (PTD): mean 20.4 ± 13.6 SD, upper limit > 16.2; bradykinesia (PTOT): mean 12.7 ± 9.1 SD, upper limit > 12.4) were higher than those of the study population.

DISCUSSION
The main findings of this study were that therapeutic decisions supported by objective measurement resulted in reduced bradykinesia (MDS-UPDRS III), motor complications (MDS-UPDRS IV) and improved global motor and non-motor disability as measured by the MDS-UPDRS Total (the primary endpoint). The changes in these scores in the PKG+ arm compare favourably with clinical trials of therapy for PD: among pharmacological agents, only levodopa had a similar effect on size of UPDRS III 6,7 and UPDRS total scores 7,8 (see Table 5 of ref. 5 ) and 6 months after deep brain stimulation, the UPDRS III in the "on" state (as was done in the current study) improved by 4.0 (±10) MDS-UPDRS points 9 compared to 6.4 (±1.4) in this study (Table 2). While the change in PDQ39 score in the PKG+ arm just failed to reach statistical significance (P = 0.07), the mean change in this study (4.7) compares favourably with that of the best change (~2.2) from levodopa 7 and the 17% change in PDQ39 in this study also compares favourably with that seen in DBS (24%) 9 . Retrospective power analyses indicate that all the endpoints would have been significant if 100 subjects in each arm had completed the study. However, limited resources and logistics closed the study after 2 years.
The improvements in clinical scores in the PKG+ arm were achieved with increments in LEDD that were similar to those used in the PKG− arm, probably related to greater use of D2 agonists and shorter intervals between doses of levodopa. This suggests that the benefit in the PKG+ arm may have led to more strategic deployment of dopaminergic agents to target complications indicated in the PKG. It is noteworthy that the MDS-UPDRS IV and dyskinesia scores did not deteriorate in either arm with increased dose, which counters the expectation that increasing dopaminergic stimulation will necessarily lead to an increase in dyskinesia.
This study was based on the hypothesis that assessment and treatment using standard methods would not result in a significant improvement, because most PwP in Australia are cared for by a neurologist: indeed 44% of PwP in this study were usually treated by a Movement Disorder specialist. Doctors in the PKG− arm did not significantly improve participants' scores from what their more experienced counterparts in the community had already achieved, most likely because management was already at the best level of "standard of care". Even so, there were improvements in the MDS-UPDRS III and Total scores in the PKG− arm (2.6 and 4.3, respectively, Table 2) that just reached the threshold for minimally clinically important difference (2.5 and 4.3 as threshold for MDS-UPDRS III/Total respectively) 5. Fellows were chosen for this study to reduce the risk of bias: their clinical experience was more uniform than that of their experienced counterparts and, importantly, they were more likely to comply with the directives of the PKG. Training in neurology in Australia is quite uniform and the experience in PD management of doctors in the two arms was, if anything, slightly greater in the PKG− arm. It is unlikely therefore that a bias due to experience of the doctors would have systematically influenced the outcome. Furthermore, most doctors at the PKG− clinics saw the PKG as an unproven entity and there is no reason to suspect their motivation was other than to treat patients to the best of their capacity.
The inclusion criteria ensured that most subjects would have similar severity of disease and that there would be fewer subjects whose motor symptoms could not be treated. Table 1 indicates that there was no statistical difference between the arms and Fig. 2 indicates that PwP with the greatest bradykinesia provided the greatest opportunity to gain improvement in scores. All PwP wore a PKG logger prior to each visit and received the usual dose reminders, and they were blinded as to whether they attended a PKG+ or PKG− clinic. Furthermore, the proportion of PwP from city and regional areas was balanced in the two arms.
The written PKG report was provided to ensure that information recorded by the PKG was available regardless of the PKG+ doctor's ability to interpret the PKG. Reporters were blinded to which arm the PKG came from or any other knowledge of the PwP and their report was entirely based on information in the PKG. It was designed to support the information being sought from the PwP by clinical assessment, including severity of bradykinesia or dyskinesia, whether there were fluctuations, their timing with respect to dosing and whether the best response to a dose was satisfactory (i.e. in target). These questions are central to any therapeutic intervention. The "out of target" PKG classification was designed to highlight this information and the one-day training programme in treatment of PD which was provided to doctors in both arms discussed in detail the type of clinical response these classifications might require.
Clearly it was not possible to blind the doctors to the information in the PKG; however, PwP were blinded as to whether they were participants in the PKG+ or PKG− arm. Nor, for the reason outlined in the "Methods", was it possible to have PwP assessed with and without the PKG in one clinic. However, it would seem that PwP were similar in each arm ( Table 1), and that the mix of urban and regional participants in each clinic were of similar severity. Because PwP chose to attend the closest clinic and, having chosen to make clinics either PKG+ or PKG− to diminish PKG bias on doctors, the options for randomising subjects to treatment were removed. However, the two arms appear to be similar and the profile of subjects attending each clinic was similar although there was an insignificant trend for greater severity of disease in the PKG− arm, especially rural subjects. Furthermore, increased severity of disease at the first visit implied greater opportunity to make a significant difference providing objective measurement was available (Fig. 3): the difference between the two arms may have been more marked if the PKG+ arm had subjects with greater disease severity at first visit.
The pilot studies that guided the design of this project 4,10 reported improvement in non-motor scores. This was not found in this study, although the SENS PD scores did approach significance. These pilot studies led to the exclusion of people with low cognitive scores and aged >75 because the risk of non-motor symptoms that contraindicated changing dopaminergic therapy was much higher in this age group. However, it may have also excluded subjects who had therapeutically addressable non-motor symptoms. It is important to note that limits around age and cognitive scores were aimed at reducing the cost and logistics of the study by reducing the number of people who would be enrolled and subsequently be excluded because of referral for device-assisted therapy or contraindications to dopaminergic therapy. We estimate the inclusion criteria to cover approximately 2/3 of the PD population. The exclusion criteria should not be taken to mean that objective measurement is not relevant in excluded subjects: indeed, the results of this study suggest that all subjects in whom changing dopaminergic therapy could be considered warrant measurement. As nearly half the participants were usually managed by movement disorder specialists, it also suggests most PwP would benefit from measurement, regardless of the expertise of their treating clinician.
An individual's PKG scores relative to the target range drove the decision to treat in the PKG+ arm and the final MDS-UPDRS III score of 28 in this arm reflects treating to this target. Experts can opine on the appropriateness of a UPDRS III score of 28 as a target or whether a lower score would be desirable, but evidence from further studies should set the target. It is relevant that information such as that synthesised within the PKG report relates to the timing and pattern of how and when the BKS exceeds target, relative to medications (Fig. 2). This provides information about dose interval and dose size for levodopa and not simply a binary trigger of whether or not to intervene.
In Table 2, the Active Median Bradykinesia Score (AmBKS) and percent time out of target (PTOT) were the only PKG scores to change significantly in the PKG+ arm. However, in PwP treated for high bradykinesia scores (Fig. 2), changes in mBKS, AmBKS (not shown) and PTOT (Fig. 2) were marked. It is likely that the AmBKS and PTOT, both of which account for inactivity, are more sensitive than other PKG scores, such as mBKS, which do not have this correction. As well, mBKS falls when dyskinesia is present (and vice versa): thus, mBKS increases in individuals treated for dyskinesia, obscuring statistical change in a population where both excess dyskinesia and bradykinesia are being addressed. When only bradykinesia is the target of therapy (Fig. 2), a very clear fall in mBKS was apparent.
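The effect of the inactivity correction can be shown with a small worked example. The sketch below follows the definitions given in the Methods (mBKS as the median of all daytime BKS, AmBKS as the median of BKS < 42). The epoch values are invented and the function names `mbks` and `ambks` are hypothetical; this is not the PKG algorithm itself.

```python
from statistics import median

# AmBKS keeps only epochs with BKS below this value (per the Methods).
INACTIVITY_CUTOFF = 42


def mbks(bks_epochs):
    """Median of all BKS epochs supplied (daytime, logger worn, awake)."""
    return median(bks_epochs)


def ambks(bks_epochs, cutoff=INACTIVITY_CUTOFF):
    """Active mBKS: median of the epochs with BKS below the cutoff."""
    active = [b for b in bks_epochs if b < cutoff]
    return median(active) if active else None


# Invented 2-min epochs: six during activity and four high values of the
# kind produced by sitting still. These are NOT recorded data.
epochs = [20, 22, 24, 25, 27, 28, 60, 65, 70, 80]

print(mbks(epochs))   # 27.5: pulled upwards by the inactive epochs
print(ambks(epochs))  # 24.5: inactive epochs removed before the median
```

In this toy series the uncorrected median sits above the bradykinesia target of 26 while the corrected one sits below it, which illustrates why a score that removes inactivity may track treatment-related change more closely.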
In this study, assistance from objective measurement significantly improved therapeutic management of PwP. Considering other areas of medicine, this should not be a surprising finding and suggests objective measurement in routine clinical care and in clinical trials should be considered, including in the era of telemedicine. It raises the question as to whether novel therapies should be compared in populations whose conventional therapy has already been optimised using objective measurement.

METHODS
This blinded, controlled study was approved and overseen by the St Vincent's Hospital, Melbourne, Human Research & Ethics Committee (approval No. HREC/17/SVHM/298) and conducted between March 2018 and December 2019. Subjects provided written consent according to the Declaration of Helsinki. Study design is described below. This Trial was registered on ANZCTR (https://www.anzctr.org.au/Trial/Registration/TrialReview.aspx?id=373960), registration number ACTRN12618000197235.

Participating doctors and clinics
The aim of the study was to test whether outcomes for PwP were improved when doctors used information provided by the PKG. Ensuring that doctors were of similar expertise was therefore important, so 15 doctors in advanced neurology training (8 doctors) or recently made Fellows (7 doctors) of the Royal Australasian College of Physicians participated as the doctors in the study. Doctors at this level were chosen because experience in PD management was similar. The numbers having done a Movement Disorder Fellowship in the PKG+ arm was four compared to eight in the PKG− arm whereas the level of training in the PKG+ arm was three still in training and five recently made Fellows compared to four trainees and four Fellows in the PKG− arm. Experience in PD ranged from first year of exposure (trainees) to 2 years for some fellows but overall the experience in PD was higher in the PKG− arm. All doctors attended one day of training in the assessment and management of PD, emphasising the use of history to identify motor and non-motor features of PD, contraindications to and side effects of anti-Parkinson's medications, and recognition of candidates for device-assisted therapies. Doctors in the PKG+ arm received a further day of training in interpreting the PKG.
All doctors were supervised by a Movement Disorder specialist, thus ensuring that the trainee/fellow worked in a clinic that provided experience in the management of PD, including with device-assisted therapies. In the study, doctors worked independently of the supervisors, whose role was to be available to provide advice on specific management issues. Although participating doctors who were Fellows of the College could practice independently, access to a supervisor is a requirement for trainees. For consistency, therefore all participating doctors had supervisors.
A requirement for enrolment of PwP was that they had not been assessed and managed previously using the PKG. It was also desirable that doctors with experience using the PKG were in the PKG+ arm. The PKG is used almost exclusively in Movement Disorder clinics in Australia, and approximately half of the participating clinics (predominantly in Melbourne) use the PKG extensively and these became the PKG+ clinics. On the other hand, clinics where fellows had infrequent experience with the PKG became PKG− clinics. The study was conducted at 12 clinics (7 in major cities, 3 in regional centres (2 affiliated with a major city service)), equally divided between the PKG+ and PKG− arms. A median of 12 PwP (IQR = 6.5) attended each site, with each doctor seeing an average of 8 PwP. Three doctors worked at two different sites. Note that one doctor was in the PKG− arm in year 1 and the PKG+ arm in year 2.
Most PwP did not usually attend one of the participating clinics and were reluctant to travel across town so they were allocated to the most conveniently located clinic. This meant that randomisation was not possible. However, there were no differences in the MDS-UPDRS Total and III scores between subjects attending the clinics or between rural and city subjects (one-way ANOVA, data not shown), with the exception that the scores in rural subjects in the PKG− arm tended to be more severe. PwP were blinded as to whether they were attending a PKG+ or PKG− clinic and all subjects wore a PKG logger prior to each visit: note that PKGs were performed in each study arm with the only difference being that only the PKG+ doctor had knowledge of the information from the PKG at each visit.

Study structure
Recruitment was through patient advocacy organisations (state and national Parkinson's associations and Shake It Up Australia) and social media. Consent to be contacted and preliminary screening was provided on a study-specific website. After contact, 282 participants were further screened (Fig. 1) and reviewed for eligibility. Inclusion criteria required PwP to have (a) idiopathic PD of ≥4 years or taking ≥4 doses of levodopa/day (because PwP with early disease or fewer doses had a much greater likelihood of being in target 4); (b) aged 59-75 years (because most PwP under 59 who also met the above criterion were also likely to be candidates for DBS 4 and PwP aged over 75 had a high incidence of contraindications to increasing dopaminergic therapy 4); (c) ability to attend a study clinic and willingness to change medication according to the advice of the study doctor. Exclusion criteria included (a) treatment with, or under consideration for, device-assisted therapy; (b) Montreal Cognitive Assessment score ≤ 21 (the aim was not to exclude dementia but to reduce the numbers of PwP whose chances of having contraindications to increasing dopaminergic therapy were high, and a previous pilot study 4 showed that a score ≤21 was the cut-off to achieve this); (c) history of orthostatic hypotension or hallucinations or other symptoms that would prevent increases in PD medications; (d) having a previous PKG assessment in routine clinical care. At the screening visit, all consenting and eligible participants were assessed with clinical scales (Fig. 1), medications were recorded, and a PKG logger was provided. PwP were then allocated to the most conveniently located clinic and were blinded as to whether this was a PKG+ arm or PKG− arm: i.e. whether the treating doctor had access to the information from the PKG at each visit. Doctors in neither arm had access to scores from clinical scales, which were performed by the same certified assessor who was blinded to the doctor's assessments. This was to reduce missing data and assessor variation. MDS-UPDRS III was performed during the screening visit and at the last visit when subjects were in the "ON" state.
At the first consultation, PKG+ doctors assessed PwP using history, examination and PKG information to decide whether the PwP's motor features were in target (no further treatment required) or out of target (Fig.  1). In the latter case, a plan for changing treatment was provided and a PKG logger was worn prior to the next consultation 5 weeks later. The same assessment protocol was followed until the PKG data were in target: a maximum of five visits were permitted, inclusive of the first assessment. This was designated the "final or last" visit and PwP then exited the study and clinical scales and PKG were performed. PKG− doctors followed a similar protocol, except their assessment was entirely clinical and without access to PKG information. If the PwP refused to change therapy at the first visit they were excluded from the study, as willingness to change medications was part of the inclusion criteria. If there was refusal after several attempts to improve control, then this case was included in the study. Previous pilot studies 3,4 were used to determine the sample size and selection criteria. These calculations showed that power to achieve the primary endpoint MDS-UPDRS Total would be achieved with 75 in each arm although 100 in each arm was required to achieve power for PDQ39. The study was terminated when there were no more eligible participants volunteering in participating regions.
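The relationship between effect size, variability and the number of subjects needed in each arm can be sketched with the standard normal-approximation formula for comparing two means. The source does not state how its calculation was done or which effect size and standard deviation were taken from the pilot studies, so the function name `subjects_per_arm` and the numerical inputs below are assumptions for illustration only.

```python
from math import ceil
from statistics import NormalDist


def subjects_per_arm(delta, sd, alpha=0.05, power=0.80):
    """Subjects per arm needed to detect a mean difference `delta` between
    two equal-sized arms with common standard deviation `sd`.

    Normal approximation: n = 2 * ((z_(1-alpha/2) + z_power) * sd / delta)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


# Hypothetical inputs (NOT the study's values): a 5-point difference in a
# scale whose change scores have a standard deviation of 10 points.
print(subjects_per_arm(delta=5, sd=10))  # 63
```

Because the required number scales with the square of sd/delta, an endpoint with a smaller expected difference relative to its variability needs more subjects per arm than one with a larger relative difference, which is the general pattern behind different endpoints needing different sample sizes.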
The PKG system
The PKG system consists of a wrist-worn data logger, algorithms 11 that produce data points for bradykinesia (BKS) and dyskinesia (DKS) every 2 min over the 6 days the logger was worn, and a series of graphs and scores that synthesise these data into a clinically useful format known as the PKG report. The BKS and DKS are plotted against the time of day and the time when medications are due is also provided. The numerical output is described in detail in other publications 4,10-19. All outputs are derived from data recorded between 09:00 and 18:00 of the six recording days and those relevant to this study are summarised below. Where relevant, the upper limit obtained from age-matched, non-PD controls is shown in brackets.
mBKS: The median of all BKS from the 6 days while the logger was being worn and PwP was not asleep.
AmBKS (<23): Active mBKS. The median of BKS < 42 (which removes most inactivity 19 ).
PKG reporting, targets and interpretation
All PKGs were reported by one of the authors (K.E.K., M.K.H.) in a standard format, blinded to the study arm. The reporting was qualitative but referred to target ranges that separate "controlled" PD from "uncontrolled" PD, established by a consensus of a panel of four neurologists experienced in treating PD, then trialled in a previous study 4 and subsequently supported by expert panels 23,24 . The bradykinesia target was BKS < 26, corresponding to the shaded area in Fig. 5. Note that this target differs from the mBKS, which summarises this activity over 09:00-18:00: the target refers to whether the moving average bradykinesia score exceeds the target at a given point in the day. A BKS of 26 corresponds to a UPDRS III of approximately 30 (ref. 17). The dyskinesia target was a DKS < 7, corresponding to an Abnormal Involuntary Movement Score of ~9 (ref. 11). The study protocol did not require tremor to be treated if PKG bradykinesia scores were in target. The PKG report, along with the PKG, was available to the PKG+ doctors at the time of each study visit. PKGs were reported as being (a) in target; (b) out of target; (c) likely out of target but resolve potential artefact; (d) likely in target but resolve potential artefact. If the PKG was reported as "out of target", a further seven-point classification was provided, for the purpose of statistical sub-analyses (see Fig. 5 for reference).
1. Global bradykinesia. BKS > 26 at all times between 09:00 and 18:00, without dose-related variation: mBKS > 26.
2. Global bradykinesia and wearing off. BKS > 26 at all times between 09:00 and 18:00, but with dose-related variation. mBKS may be >26.
3. Bradykinesia only as wearing off. BKS > 26 at the time of one or more doses, for at least 30 min. Re-emergence of tremor is supporting evidence.
4. Peak dose dyskinesia alone. DKS > 7 at the time of one or more doses for at least 30 min, provided it is not artifactually elevated by walking or exercise.
5. Predominantly bradykinesia, but with peak dose dyskinesia.
6. Predominantly peak dose dyskinesia, but with bradykinesia.
7. Global dyskinesia. DKS > 7 at all times between 09:00 and 18:00, without dose-related variation. mDKS may be >7.
Protocol required doctors in the PKG+ arm to follow the PKG findings when deciding whether to change treatment, according to these criteria, unless:
• The doctor's clinical findings show that the PKG report is incorrect: for example, cervical dyskinesia, which would not be detected by the PKG, was observed and treated by the doctor.
• The doctor considered that a referral for device-assisted therapies was indicated because of the severity of motor complications and/or futility of further changes to oral medications.
• A contraindication has been identified, including a reasonable concern that it will be induced by a change in therapy.
• PwP declines further change. As consent was given to change dosage according to protocol, the doctor was required to attempt reasonable persuasion to agree to changes.
• Futility. Attempts at earlier visits to improve scores have failed.
• All five visits have been used.
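The summary scores and target thresholds described above can be illustrated with a minimal sketch. The thresholds (BKS 26, DKS 7, BKS < 42 for AmBKS), the 2-min epoch and the 30-min persistence come from the text; the function names are hypothetical, the input is assumed to be already restricted to waking epochs between 09:00 and 18:00, and the sketch ignores dose timing, the moving average and artefact review. It is not the algorithm used by the PKG system or by the reporting authors, whose reporting was qualitative.

```python
from statistics import median

BKS_TARGET = 26        # from the text: bradykinesia target is BKS < 26
DKS_TARGET = 7         # from the text: dyskinesia target is DKS < 7
INACTIVITY_BKS = 42    # from the text: AmBKS uses only BKS < 42
EPOCH_MINUTES = 2      # from the text: one data point every 2 min


def m_bks(bks):
    """Median of all BKS epochs (assumed already awake, 09:00-18:00)."""
    return median(bks)


def am_bks(bks):
    """Active mBKS: median of epochs with BKS below the inactivity cut-off."""
    return median([score for score in bks if score < INACTIVITY_BKS])


def longest_run_minutes(scores, threshold):
    """Duration in minutes of the longest consecutive run above threshold."""
    best = run = 0
    for score in scores:
        run = run + 1 if score > threshold else 0
        best = max(best, run)
    return best * EPOCH_MINUTES


def out_of_target(bks, dks, min_minutes=30):
    """Flag sustained excursions above target (simplified assumption)."""
    return {
        "bradykinesia": longest_run_minutes(bks, BKS_TARGET) >= min_minutes,
        "dyskinesia": longest_run_minutes(dks, DKS_TARGET) >= min_minutes,
    }


if __name__ == "__main__":
    # Illustrative values only, not study data: 16 epochs (32 min) above 26.
    bks = [20] * 30 + [30] * 16 + [22] * 30
    dks = [3] * len(bks)
    print(m_bks(bks), am_bks(bks), out_of_target(bks, dks))
```

A fuller treatment would align excursions with recorded dose times to separate wearing off from global bradykinesia, as the seven-point classification requires.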
In the PKG− arm, doctors were required to use standard clinical practice to assess whether treatment was adequate or whether further treatment was required, with the same caveats to changing therapy that applied to the PKG+ arm. In both arms, a review appointment was only for addressing treatable motor symptoms/scores: non-motor symptoms could be treated, but a further "in-study" visit should not be scheduled if the sole aim was to address a non-motor symptom.

Statistical analysis
Each variable was assessed with a two-tailed, heteroskedastic t-test. An exact probability value (P) is provided and significance was set at P < 0.05. The difference in the means is provided along with a 95% confidence interval (95% CI) for the difference of the means. Statistics are usually quoted as (difference in standard error of means, lower and upper 95% CI, P value). There were no missing data for the primary and secondary outcomes, with the exception of PDQ39, where two data points were missing from the PKG+ arm and three from the PKG− arm; these were handled by listwise deletion. It was planned prior to the study that the endpoints would be considered for (a) all subjects (in target and out of target) that completed the study (i.e. had first and final visit scores); (b) subjects whose scores were out of target at first visit; and (c) subjects whose scores indicated they were bradykinetic at first visit. Both comparisons of final scores between the two arms and comparisons of the differences between first and final scores were planned. The number of visits was assessed using a χ² test. The Mann-Whitney test was used when numbers were less than 35 and the distribution was not normal (Fig. 2).
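A two-tailed, heteroskedastic t-test is commonly known as Welch's test, and the following minimal sketch shows how the difference in means, its 95% CI and the P value can be obtained. It is an illustration using only the Python standard library, not the software used in the study; the helper names are hypothetical, the t distribution is evaluated by simple numerical integration, and the example values are invented for illustration.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """CDF by Simpson's rule on [0, |x|]; adequate for illustration."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def t_quantile(p, df):
    """Quantile for p > 0.5, found by bisection on t_cdf."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def welch_test(a, b, alpha=0.05):
    """Two-tailed unequal-variance t-test with CI for mean(a) - mean(b)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    diff = mean(a) - mean(b)
    se = math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    t = diff / se
    p = 2 * (1 - t_cdf(abs(t), df))
    half_width = t_quantile(1 - alpha / 2, df) * se
    return {"diff": diff, "ci": (diff - half_width, diff + half_width),
            "t": t, "df": df, "p": p}


if __name__ == "__main__":
    # Invented change scores for two arms; not data from this study.
    print(welch_test([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
```

In practice a statistics package would normally be used; the point of the sketch is that the degrees of freedom are estimated from the two sample variances rather than pooled.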

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.

The Y-axis shows the severity of DKS (increasing severity upward from the dark green line adjacent to the zero) and BKS (increasing severity downward from the dark blue line adjacent to the zero). The median BKS and DKS of non-PD subjects is shown with the arrow. The shaded green and blue zones show the target range. BKS in the shaded pink region are usually associated with inactivity. The vertical red lines are the times that reminders were delivered, and the diamonds show the time when consumption of medications was acknowledged. In this example the median BKS is outside of target at the time of the first dose. The response to each dose just reaches target and there is "wearing-OFF" after 2 h (1st dose) and ~3 h (2nd dose). This case was reported as having bradykinesia above target with "Bradykinesia only as Wearing Off" according to the 7-point classification described in the text.