Differential diagnosis of parkinsonism: a head-to-head comparison of FDG PET and MIBG scintigraphy

[18F]fluorodeoxyglucose (FDG) PET and [123I]metaiodobenzylguanidine (MIBG) scintigraphy may contribute to the differential diagnosis of neurodegenerative parkinsonism. To identify the superior method, we retrospectively evaluated 54 patients with suspected neurodegenerative parkinsonism, who were referred for FDG PET and MIBG scintigraphy. Two investigators visually assessed FDG PET scans using an ordinal 6-step score for disease-specific patterns of Lewy body diseases (LBD) or atypical parkinsonism (APS) and assigned the latter to the subgroups multiple system atrophy (MSA), progressive supranuclear palsy (PSP), or corticobasal syndrome. Regions-of-interest analysis on anterior planar MIBG images served to calculate the heart-to-mediastinum ratio. Movement disorder specialists blinded to imaging results established clinical follow-up diagnosis by means of guideline-derived case vignettes. Clinical follow-up (1.7 ± 2.3 years) revealed the following diagnoses: n = 19 LBD (n = 17 Parkinson’s disease [PD], n = 1 PD dementia, and n = 1 dementia with Lewy bodies), n = 31 APS (n = 28 MSA, n = 3 PSP), n = 3 non-neurodegenerative parkinsonism; n = 1 patient could not be diagnosed and was excluded. Receiver operating characteristic analyses for discriminating LBD vs. non-LBD revealed a larger area under the curve for FDG PET than for MIBG scintigraphy at statistical trend level for consensus rating (0.82 vs. 0.69, p = 0.06; significant for investigator #1: 0.83 vs. 0.69, p = 0.04). The analysis of PD vs. MSA showed a similar difference (0.82 vs. 0.69, p = 0.11; rater #1: 0.83 vs. 0.69, p = 0.07). Albeit the notable differences in diagnostic performance did not attain statistical significance, the authors consider this finding clinically relevant and suggest that FDG PET, which also allows for subgrouping of APS, should be preferred.


INTRODUCTION
Radionuclide or molecular imaging techniques support the differential diagnosis of parkinsonism 1,2 and have been incorporated into current international diagnostic criteria [3][4][5][6]. The recommendations refer to both [18F]fluorodeoxyglucose (FDG) PET and [123I]metaiodobenzylguanidine (MIBG) scintigraphy for the differential diagnosis in uncertain clinical cases of suspected neurodegenerative parkinsonism. The selection of a specific technique depends on the particular clinical presentation and question (see guidelines, e.g. pathological MIBG scintigraphy as a supportive criterion for Parkinson's disease [PD] or degeneration of the putamen on FDG PET as a feature of possible multiple system atrophy [MSA]).
FDG PET is used to image regional cerebral glucose metabolism as a marker of neuronal activity that may be altered through neurodegeneration and disease-specific network changes. The identification of cerebral metabolic patterns related to specific neurodegenerative diseases 7 allows Lewy body diseases (LBD; i.e. PD, PD dementia [PDD], and dementia with Lewy bodies [DLB]) to be separated from atypical parkinsonian syndromes (APS) with high diagnostic accuracy 8,9. Alternatively, the involvement of the autonomic nervous system can be assessed by imaging postganglionic cardiac innervation with MIBG and planar scintigraphy. Cardiac MIBG uptake is markedly reduced in patients with LBD when compared to healthy controls and patients with APS 10. Unlike with FDG PET, a further differentiation between the APS subgroups MSA, progressive supranuclear palsy (PSP), and corticobasal syndrome (CBS) is not possible with MIBG scintigraphy. However, the preferred method is unclear and a direct comparison of both methods is still lacking.
Against this background, the present multicenter study compares the performance of FDG PET and myocardial MIBG scintigraphy in the differential diagnosis of neurodegenerative parkinsonism in the same patient population. Although reduced myocardial innervation is a common finding in LBD in contrast to preserved innervation in non-LBD, most available data on MIBG scintigraphy addresses the differentiation between PD and MSA. Thus, the present study addresses both aforementioned distinctions.
patients who underwent dopamine transporter SPECT during the time course of clinical observation showed a pathological scan result, confirming neurodegeneration (clinical diagnosis of LBD in n = 12 and non-LBD in n = 11). There were no group differences in terms of sex, age at first imaging, and time of clinical observation between patients with LBD and non-LBD as well as between patients with PD and MSA. However, not unexpectedly, symptom duration at first imaging was longer in patients with LBD (vs. non-LBD; p = 0.001) and PD (vs. MSA; p < 0.001), respectively ( Optimal thresholds for delineating non-LBD from LBD and MSA from PD were >2.0 for FDG PET (LBD vs. non-LBD: J = 0.68; PD vs. MSA: J = 0.67) and >1.3 for MIBG scintigraphy (LBD vs. non-LBD: J = 0.44; PD vs. MSA: J = 0.40), leading to a comparable sensitivity of 94% (85-100%) and 96% (75-100%) for FDG PET and 91% (47-100%) and 93% (46-100%) for MIBG scintigraphy, respectively. In contrast, the specificity was notably higher for FDG PET (74% [53-95%] and 71% [47-94%], respectively) than for MIBG scintigraphy (53% [32-95%] and 47% [29-94%], respectively). Corresponding positive and negative predictive values and positive and negative likelihood ratios for the present patient population are listed in Table 2. These also suggest superiority of FDG PET over MIBG scintigraphy (we refrained from additional pairwise statistical comparisons given the preceding comparisons of ROC analyses). Typical findings for both modalities are shown in Fig. 3.
Using the aforementioned thresholds, five and two FDG PET ratings were classified respectively as false-positive (i.e. imaging diagnosis of non-LBD in case of clinical LBD) and false-negative (i.e. imaging diagnosis of LBD in case of clinical non-LBD). In the case of MIBG scintigraphy, there were nine false-positive and three false-negative ratings. Of note, in four false-positive cases and one false-negative case, the FDG PET and MIBG scintigraphy ratings were congruent.
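The reported sensitivity, specificity, and Youden's J can be recomputed from these error counts. The following Python sketch is illustrative only: it assumes group sizes of 34 non-LBD (positive) and 19 LBD (negative) cases, as implied by the clinical follow-up diagnoses, and the function name is hypothetical rather than part of the study's analysis.

```python
def diagnostic_metrics(n_pos, n_neg, false_pos, false_neg):
    """Return (sensitivity, specificity, Youden's J) from error counts."""
    true_pos = n_pos - false_neg   # non-LBD cases rated as non-LBD
    true_neg = n_neg - false_pos   # LBD cases rated as LBD
    sens = true_pos / n_pos
    spec = true_neg / n_neg
    return sens, spec, sens + spec - 1


# Assumed group sizes, taken from the follow-up diagnoses in the text.
N_NON_LBD = 34  # positive cases: 31 APS + 3 non-neurodegenerative
N_LBD = 19      # negative cases

fdg = diagnostic_metrics(N_NON_LBD, N_LBD, false_pos=5, false_neg=2)
mibg = diagnostic_metrics(N_NON_LBD, N_LBD, false_pos=9, false_neg=3)

for name, (sens, spec, j) in (("FDG PET", fdg), ("MIBG", mibg)):
    print(f"{name}: sensitivity {sens:.0%}, specificity {spec:.0%}, J = {j:.2f}")
```

Under these assumptions, the counts reproduce the values given for the LBD vs. non-LBD comparison (FDG PET: 94%, 74%, J = 0.68; MIBG scintigraphy: 91%, 53%, J = 0.44).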
Finally, within the group of patients with APS according to the clinical follow-up diagnosis (n = 31), FDG PET ratings indicated MSA (n = 26), PSP (n = 3), CBS (n = 1), or PD (n = 1). The FDG PET ratings agreed with the clinical follow-up diagnosis in 87% of cases (χ² = 12.37, p < 0.01). The exceptions were three patients with clinical MSA, whose FDG PET ratings indicated PD, PSP, and CBS, respectively, and one patient with clinical PSP, whose FDG PET rating indicated MSA.
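The agreement rate can be checked by reconstructing the cross-tabulation from the counts given above. The sketch below is a reader's reconstruction in Python, not a table from the study; the variable names are illustrative.

```python
# Rows: clinical follow-up diagnosis; columns: FDG PET rating.
# Cell counts are reconstructed from the numbers stated in the text.
crosstab = {
    "MSA": {"MSA": 25, "PSP": 1, "CBS": 1, "PD": 1},  # 28 clinical MSA
    "PSP": {"MSA": 1, "PSP": 2, "CBS": 0, "PD": 0},   # 3 clinical PSP
}

total = sum(sum(row.values()) for row in crosstab.values())
concordant = sum(row[clinical] for clinical, row in crosstab.items())
agreement = concordant / total

# Column totals should match the reported FDG PET ratings.
pet_totals = {
    rating: sum(row[rating] for row in crosstab.values())
    for rating in ("MSA", "PSP", "CBS", "PD")
}

print(f"{concordant}/{total} concordant ({agreement:.0%})")
print(pet_totals)
```

The reconstructed column totals (MSA 26, PSP 3, CBS 1, PD 1) are consistent with the reported ratings, and 27 of 31 concordant cases correspond to the stated 87%.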

DISCUSSION
The present study suggests that FDG PET provides higher diagnostic accuracy than MIBG scintigraphy for the differentiation between LBD and non-LBD as well as between PD and MSA using the blinded follow-up diagnosis as reference. The differences between both modalities did not reach statistical significance for the consensus rating (p = 0.06 and p = 0.11, respectively). When considering the individual ratings of the more experienced rater #1, FDG PET performed significantly better than MIBG scintigraphy for LBD vs. non-LBD (p = 0.04) and at trend level for PD vs. MSA (p = 0.07).
These results extend prior studies by providing a within-subject comparison of FDG PET and MIBG scintigraphy and match recent, independent meta-analyses suggesting modest superiority of FDG PET 7,10. However, the specificity of both modalities in the present study was lower than previously described: for delineating non-LBD from LBD, the calculated sensitivity of FDG PET (94%) was in the upper range of the expected sensitivity (91%, 95%-CI: 72-98%) based on a recent meta-analysis 7, whereas the specificity (74%) was at the lower limit of the expected range (91%; 95%-CI: 70-98%). MIBG scintigraphy discriminated both groups with 91% sensitivity and 53% specificity, which is comparably sensitive (83%; 95%-CI: 60-94%) but less specific (89%; 95%-CI: 82-95%) compared to the values reported by a recent meta-analysis 10 (note that figures from this reference were adjusted to meet the present definition of positive [non-LBD] and negative cases [LBD]). Aside from the actual selection of the diagnostic cut-off values, the difference between the aforementioned studies and the present study may also be related to the retrospective design of our study. We included only patients investigated with both FDG PET and MIBG scintigraphy for diagnostic purposes in clinical routine. In the participating tertiary reference centers, patients fulfilling this prerequisite are usually highly complex cases with clinically inconclusive findings. 11 . By calculating the average score for both raters, however, the consensus rating provided an 11-step ordinal scale, which almost approximates a continuous scale so that the effect is probably negligible.
Data are presented as mean value ± SD. Diseases without Lewy bodies: MSA, n = 28; progressive supranuclear palsy, n = 3; drug-induced parkinsonism, n = 2; normal pressure hydrocephalus, n = 1. LBD, Lewy body disease; non-LBD, disease without Lewy bodies; PD, Parkinson's disease; MSA, multiple system atrophy. *p = 0.001, **p < 0.001, two-sample t test.
Besides the more favorable diagnostic performance, an additional advantage of FDG PET over MIBG scintigraphy is the possibility to further differentiate between the APS 7. This also applies to the present study, where 87% of patients with APS were correctly classified as either MSA or PSP by FDG PET. Although not addressed by the present study, FDG PET may also be used to detect cortical involvement in PD and predict future cognitive decline [12][13][14]. This may be of considerable prognostic importance 15 and of particular clinical relevance, e.g., in the context of deep brain stimulation, where the risk of cognitive deterioration should be estimated before surgery 16,17.
MIBG scintigraphy has several limitations in clinical practice. First, its reliability is limited because patients with early PD (Hoehn & Yahr 1-2) show a normal MIBG uptake in more than 25% of cases 18 and up to 30% of patients with MSA may show an impaired cardiac innervation 19. In combination with neuropathological findings (i.e., alpha-synuclein related pathology in thoracic sympathetic ganglia in patients with advanced MSA 20), the latter suggests overlapping mechanisms of peripheral denervation in both diseases and thus a conceptual limitation of cardiac MIBG scintigraphy. Second, cardiac denervation is not specific for LBD but may also be present in patients with congestive heart failure, ischemic heart disease, or diabetic neuropathy 21,22. These conditions hamper proper differential diagnosis and need to be ruled out before MIBG scintigraphy, which might require additional diagnostic procedures and restricts the target population of MIBG scintigraphy, especially in elderly patients. Lastly, various classes of drugs interfere with MIBG through several mechanisms and can influence cardiac tracer uptake. Although standard dopaminergic treatment is not critical, several common cardiovascular agents, antipsychotics, and antidepressants interact with MIBG and need to be discontinued according to their half-life up to 4 weeks prior to imaging 23,24. In our sample, a diligent review of patient charts ruled out that patients had relevant comorbidities or received any interfering medication at the time of imaging. Therefore, we can exclude these confounders. However, if they are not considered carefully, the diagnostic accuracy of MIBG scintigraphy might be even lower than observed here.
Considering FDG PET, earlier studies indicated that its diagnostic accuracy is not relevantly affected by disease stage and common medications 7, while comorbidities with morphological brain alterations (e.g., gross atrophy, ischemic lesions) may complicate image interpretation. Furthermore, FDG PET might have one disadvantage compared with MIBG scintigraphy in certain clinical settings: if disease-specific metabolic patterns are identified by visual reads and not automated classification methods 7, the diagnostic accuracy of FDG PET may rely on the experience of the reader. Correct ROI placement on planar MIBG images, in turn, is easier to learn and makes image evaluation less susceptible to the reader's experience than is the case for FDG PET. This is also suggested by the present study, in which the independent rating of the more experienced investigator indicated statistically significant superiority of FDG PET, which was not the case for the less experienced investigator. However, the actual difference between both readers was small (see above), which we attribute to the auxiliary use of easy-to-interpret voxel-based statistical analyses. With the support of three-dimensional stereotactic surface projections or single-subject SPM analyses, non-expert investigators can achieve a diagnostic accuracy similar to that of experts 25,26.
The retrospective nature of the present study implies an inherent risk of bias. We addressed this by creating clinical vignettes containing comprehensive clinical data in line with current diagnostic criteria. These were filled in by the same expert at every center and then evaluated by two blinded movement disorder specialists in consensus. However, having received both imaging techniques, the patient population probably entails a selection bias towards complex cases, which probably leads to more conservative estimates of the diagnostic accuracies of the methods compared (see above). Of note, the order of both examinations was roughly balanced, which argues against a possible bias. Another limitation related to the retrospective design is the time gap between scans. However, the exclusion of six subjects with a time gap >3 months did not reveal a relevant effect on the study outcome (data not shown in detail). Another potential source of bias is the use of different collimators at each study center, which we accounted for by using a linear correction method 27,28. Separate ROC analyses for each center are in agreement with the overall results, suggesting that the use of different collimators did not have a major impact on the results (see Supplementary Table 2).
Given the limited accuracy of the clinical diagnosis of LBD and APS as illustrated by clinicopathological studies [29][30][31][32][33], the reliance on the clinical diagnosis represents another limitation. We tried to minimize the risk of incorrect clinical diagnosis by including all accessible patient information in the vignettes. However, the observation that a substantial fraction of incorrect FDG PET and MIBG scintigraphy ratings occurred concordantly (in particular, see false-positive cases in Fig. 1 with highly confident FDG PET ratings and high H/M ratios) may suggest that the clinical diagnosis was actually false-negative in these FDG PET and MIBG scintigraphy-positive cases. This underlines the general need for future prospective studies including the post mortem histopathological diagnosis as reference in the field of imaging in parkinsonian syndromes 7.
Taken together, in the present study involving a clinically challenging patient population, FDG PET and MIBG scintigraphy correctly classified approximately four out of five and two out of three patients, respectively. Although this notable difference did not attain statistical significance, we consider this finding clinically relevant and suggest that FDG PET, which also allows for subgrouping of APS, should be preferred.

METHODS
Patients
The present study represents a retrospective analysis of imaging data from two university hospitals. The patients came from three tertiary referral centers specialized in movement disorders. We screened the records of the Department of Nuclear Medicine, Medical Center - University of Freiburg and the Department of Nuclear Medicine, University Hospital Würzburg between 2012 and 2018 for patients who received both FDG PET and MIBG scintigraphy. Patients who underwent both imaging techniques for the differential diagnosis of LBD and APS and had a clinical follow-up ≥1 year were eligible for this study (inclusion criteria; in total n = 54; Freiburg n = 27; Würzburg n = 27). Incomplete clinical data, interfering medication at the time of MIBG scintigraphy, and/or corrupted image data served as exclusion criteria. Forty patients (74.1%) completed both scans within one week and forty-eight patients (88.9%) within three months. The mean time between both scans was 10 ± 34 weeks.
The clinical follow-up diagnosis served as reference standard. To standardize this process across all three referral sites, one movement disorder specialist (N.S.) blinded to FDG PET and MIBG scintigraphy reviewed all available information from the first to the last patient contact of each subject (i.e., time of clinical observation). The information included prior patient history and clinical follow-up (≥1 year), all physical and neurological examinations, clinical charts (incl. past and present medications), tilt-table test, Schellong test, urodynamic study, and imaging results other than FDG PET and MIBG scintigraphy (i.e. transcranial sonography, MRI, CT, and dopamine transporter SPECT), and was incorporated into a standardized clinical case vignette derived from the current diagnostic criteria of LBD and APS [3][4][5][6]34. In analogy to an earlier study 35, we developed the structure of the vignette for this study specifically, which contained
Fig. 3 Typical FDG PET and MIBG scintigraphy findings in individual patients with Parkinson's disease and multiple system atrophy. a Spatially normalized transaxial FDG PET slices at the level of cerebellum, basal ganglia, and dorsal frontoparietal cortex. Datasets were thresholded for optimal display. b Anterior view of planar MIBG scintigraphy. The patient with PD shows a typical relative hypermetabolism of the bilateral putamen and no cardiac MIBG uptake. The MSA patient is characterized by hypometabolism of the putamen (particularly on the left side), the bilateral cerebellum, and a preserved cardiac uptake on MIBG scintigraphy. PD Parkinson's disease, MSA multiple system atrophy.
PET emission data were acquired 30 min post-injection on the Biograph mCT64 and 50 min post-injection on the Philips Gemini TrueFlight in 3-dimensional mode for 10 min in accordance with current procedural guidelines for FDG brain imaging 36.
PET data were reconstructed iteratively and fully corrected for randoms, scatter, and photon attenuation using a low-dose CT for attenuation correction and vendor-specific reconstruction methods. We performed FDG PET data analysis with an in-house pipeline written in MATLAB (The MathWorks, Inc.) and employing statistical parametric mapping routines (SPM12, Wellcome Department of Cognitive Neurology, University College, London). We spatially normalized the scans to an FDG PET template image in Montreal Neurological Institute (MNI) brain space, followed by proportional scaling of voxel-wise FDG uptake to mean global brain parenchyma FDG uptake. Two board-certified nuclear medicine physicians (P.T.M. and C.L.), blinded to clinical information, then independently rated FDG PET scans of all 54 patients by using 30 transaxial slices (4.5 mm thickness) covering the entire brain. Slices were displayed in a standardized fashion (maximum adjusted for optimal display, minimum set to 5% of maximum, monochrome "hot metal" color scale). As supportive analyses, readers had access to three-dimensional stereotactic surface projections (3D SSP; Neurostat software) displaying the deviation of each individual's regional cerebral FDG uptake from age-matched healthy controls (color-coded Z score 0-7; decreases only) 37 and the results of single-subject SPM analyses showing significantly (p < 0.05, k > 50 voxels, no correction for multiple comparisons) hyper- and hypometabolic regions in individual scans compared to healthy controls of comparable ages examined with identical scanners (Gemini TrueFlight: 21 females/14 males, age: 77.6 ± 5.9 years; Biograph mCT: 5 females/5 males, age: 61.9 ± 14.4 years) 25. The raters interpreted FDG PET scans in two consecutive levels 9 based on previously published disease-specific patterns of regional cerebral glucose metabolism 7:
1. First level: readers classified each scan as indicative of LBD or APS using a 6-step score (0/1/2: definite/probable/possible LBD; 3/4/5: possible/probable/definite APS).
2. Second level: readers categorized APS-positive scans (scores 3-5 at first level) as being indicative of MSA, PSP, or CBS.
After independent evaluation of all scans, the raters averaged the scores (yielding an 11-step scale) and reached a consensus second-level diagnosis.
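Averaging two integer scores from the 6-step scale yields half-step values from 0 to 5, i.e. 11 distinct levels. The Python sketch below illustrates this; the function names are hypothetical, and the cut-off of >2.0 is the FDG PET threshold reported in the results, applied here purely for illustration.

```python
from itertools import product


def consensus_score(score_a, score_b):
    """Average two first-level ratings (each an integer from 0 to 5)."""
    for score in (score_a, score_b):
        if score not in range(6):
            raise ValueError("ratings must be integers from 0 to 5")
    return (score_a + score_b) / 2


def is_non_lbd(consensus, threshold=2.0):
    """Apply the reported cut-off: consensus scores > 2.0 indicate non-LBD."""
    return consensus > threshold


# Enumerate every rater combination to list the possible consensus values.
possible_scores = sorted(
    {consensus_score(a, b) for a, b in product(range(6), repeat=2)}
)
print(len(possible_scores), possible_scores)
```

For example, ratings of 2 (possible LBD) and 3 (possible APS) average to 2.5, which lies above the cut-off.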

MIBG scintigraphy
We reviewed patient charts to rule out medication possibly interfering with MIBG scintigraphy 23,38. MIBG studies were acquired on a dual-headed gamma camera (E.CAM [Siemens Healthineers, Erlangen, Germany], n = 10) and two dual-headed SPECT/CT systems (Symbia T2 [Siemens Healthineers, Erlangen, Germany], n = 27; Brightview XCT [Philips Medical Systems Inc. Cleveland, OH], n = 17). Systems were equipped with either a low-energy high-resolution collimator (LEHR; E.CAM and Brightview XCT; n = 27, 192 ± 27 MBq MIBG) or a medium-energy low-penetration collimator (MELP; Symbia T2; n = 27, 183 ± 7 MBq MIBG). Anterior and posterior planar images were obtained for 10 min at 240 min after injection. Two board-certified nuclear medicine physicians (C.L. and J.B., blinded to clinical information) evaluated MIBG uptake semi-quantitatively by calculating the delayed heart-to-mediastinum ratio (H/M) using the planar anterior images. They independently defined regions of interest (ROIs) of the heart and the mediastinum by manually adjusting a circular ROI to the left ventricle and a rectangular ROI to the upper mediastinum using PMOD Version 3.7 (PMOD Technologies Ltd., Zurich, Switzerland). The H/M of each patient was calculated by dividing the mean counts per pixel in the cardiac ROI by the mean counts per pixel in the mediastinal ROI. We then linearly converted the H/M ratios of the patients who had been investigated on the system with the MELP collimator (n = 27) into H/M ratios for LEHR collimators 27,28 and calculated the mean H/M ratio of both raters for each patient.
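The H/M calculation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the pixel counts are invented, the function names are hypothetical, and the slope and intercept of the collimator conversion are placeholders, since the actual coefficients of the cited linear correction method are not given in this text.

```python
from statistics import mean


def hm_ratio(heart_counts, mediastinum_counts):
    """Mean counts per pixel in the cardiac ROI over the mediastinal ROI."""
    return mean(heart_counts) / mean(mediastinum_counts)


def melp_to_lehr(hm_melp, slope, intercept):
    """Generic linear conversion of a MELP-derived H/M ratio.

    slope and intercept are placeholders (assumptions), not the
    coefficients of the correction method cited in the text.
    """
    return slope * hm_melp + intercept


# Invented pixel counts for two raters who drew their own ROIs.
rater_1 = hm_ratio([52, 48, 50, 50], [25, 25, 24, 26])
rater_2 = hm_ratio([54, 50, 52, 52], [26, 26, 25, 27])

# The per-patient value is the mean H/M ratio of both raters.
patient_hm = mean([rater_1, rater_2])
print(round(patient_hm, 2))
```

For a patient imaged with the MELP collimator, each rater's ratio would first be passed through the conversion before averaging.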

Statistical analysis
We used the statistics software R 3.3.3 [http://www.R-project.org/] for statistical analyses. Inter-rater agreement was evaluated with the intraclass-correlation coefficient (ICC) and Cohen's (weighted) kappa (κ) as applicable (R package 'psych', version 1.8.12). Between-group differences were assessed using two-sample t test, Mann-Whitney U-test, and Cohen's effect size d (R package 'effsize', version 0.7.4). Receiver operating characteristic (ROC) analyses (R package 'pROC', version 1.10.0) were employed to assess and compare the diagnostic performance of the two methods by the area under the ROC curve (AUC). We defined diseases other than LBD (non-LBD; i.e., APS in the vast majority of cases) and MSA, respectively as positive cases for calculating sensitivity and specificity. We selected the thresholds for interpreting FDG PET and MIBG scintigraphy in order to maximize Youden's J (J = sensitivity+specificity−1). The association between the clinical follow-up diagnosis and FDG PET ratings was assessed with Pearson's χ² test in the subgroup of APS patients.
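The threshold selection by Youden's J and the AUC can be illustrated with a short, dependency-free Python sketch. It does not reproduce the pROC package or the statistical comparison of AUCs used in this study; the scores and labels are invented toy data, and the function names are hypothetical.

```python
def youden_optimal_threshold(scores, is_positive):
    """Scan observed scores as cut-offs; a case is test-positive if score > cut-off.

    Returns (threshold, J, sensitivity, specificity) for the first cut-off
    that maximizes J = sensitivity + specificity - 1.
    """
    n_pos = sum(is_positive)
    n_neg = len(is_positive) - n_pos
    best = None
    for thr in sorted(set(scores)):
        tp = sum(1 for s, p in zip(scores, is_positive) if p and s > thr)
        fp = sum(1 for s, p in zip(scores, is_positive) if not p and s > thr)
        sens = tp / n_pos
        spec = 1 - fp / n_neg
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (thr, j, sens, spec)
    return best


def auc(scores, is_positive):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: higher scores indicate the positive class (e.g. non-LBD).
toy_scores = [0.5, 1.0, 2.0, 2.5, 3.5, 4.0, 4.5, 5.0]
toy_labels = [False, False, False, True, False, True, True, True]

best_threshold = youden_optimal_threshold(toy_scores, toy_labels)
toy_auc = auc(toy_scores, toy_labels)
print(best_threshold, toy_auc)
```

In this toy example two cut-offs share the maximal J, and the sketch keeps the lower one; how such ties are resolved is a design choice that should be stated when reporting thresholds.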

Ethical approval and patient consent
Cardiac MIBG scintigraphy and FDG PET scans were performed as part of the clinical work-up. All patients gave written informed consent to the diagnostic procedures. All procedures performed in humans were in accordance with the principles of the Declaration of Helsinki and its later amendments or comparable ethical standards. The retrospective analysis was approved by the local institutional review boards of the University Hospital Freiburg and the Julius-Maximilian-University Würzburg.

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.

DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.