Introduction

Parkinson’s disease (PD) is characterized by the progressive loss of dopaminergic neurons in the nigrostriatal pathway1,2. Patients meet the clinical criteria for PD when 60–70% of the neurons in the substantia nigra have degenerated and approximately 80% of the striatal dopamine content has been lost3,4. At present, the diagnosis of PD is based primarily on clinical features. Conventional neuroimaging modalities are of proven value in assessing Parkinsonian patients. 123I-ioflupane striatal dopamine transporter SPECT (DaTSCAN) has been validated as a tool for the differential diagnosis of PD and non-degenerative tremors, and 123I-metaiodobenzylguanidine SPECT and 18F-fluorodeoxyglucose PET have been used for the differential diagnosis of PD and atypical Parkinsonism. The very high cost and limited availability of these technologies prevent their widespread and systematic use in routine clinical practice. Moreover, these methods have not been demonstrated to have any prognostic value5.

Magnetic resonance imaging (MRI), whose modalities have diverse tissue-specific sensitivities, is useful for investigating brain degeneration and for differentiating between PD and other Parkinsonian syndromes. T2* relaxometry and susceptibility-weighted mapping can be used to quantify the iron load and nigral degeneration. Patients with PD have an abnormally low T2 to T2* ratio and, reciprocally, an abnormally high R2 to R2* ratio6. Resting-state functional connectivity maps describe functional networks; the main finding is that these networks have fewer connections in patients with PD7. However, because the diagnosis of PD is based on clinical examination, because not all of these MR sequences are available in daily routine, and because there is no consensus on the interpretation of images through dedicated software, none of these methods is currently used in clinical practice for diagnostic, therapeutic or prognostic purposes.

Structural imaging, using T1-weighted sequences, has also been investigated and different studies have reported changes in grey matter volume and cortical changes, mainly linked to disease-related cognitive decline8,9. The techniques used can be classified into regions-of-interest (ROI)-based approaches with manual labelling, automatic or semi-automatic segmentation, voxel-based whole-brain morphometric analysis using voxel-based morphometry (VBM) or tensor-based morphometry (TBM), and surface or shape-based approaches mainly for cortical thickness analysis. The main results of these studies are inconsistent, however, suggesting that this imaging technique is not sensitive enough and is of little diagnostic value10. Conversely, optimized neuromelanin-sensitive T1-weighted scans have revealed stage-dependent substantia nigra signal reduction in PD as a putative marker of neuromelanin loss11.

In the present study, we investigated post-processing of T1-weighted images using texture analysis, hypothesizing that the computed features in different key anatomical structures could be clinically meaningful in differentiating between PD patients and controls and could show a specific progressive pattern between two critical periods of the disease: diagnosis (early-stage) and severe L-dopa-related complications (late-stage). A number of studies of MRI sequences in patients with Alzheimer’s disease have already found that image texture analysis is more sensitive than conventional atrophy measurements, such as ROI volumetry and VBM12,13. In a longitudinal study of PD patients with a 2-year follow-up using T2-weighted images, Sikiö et al.14 described texture analysis as a quantitative method for detecting structural changes in brain MR images, and these changes were significantly related to clinical scores for PD severity. The Unified Parkinson's Disease Rating Scale (UPDRS) part I score, which evaluates cognition, behaviour and mood, correlated with texture features extracted from the area of the posterior corona radiata and from the substantia nigra. UPDRS-II, which scores the self-evaluation of the activities of daily living, correlated with texture features from the putamen, the dentate nucleus and the thalamus, while the UPDRS-III score, which evaluates the motor examination, correlated with features in the area of the basilar pons. Li et al.15 described the use of quantitative susceptibility-weighted and R2* MR maps to discriminate between patients and healthy controls. More recently, our group has described some texture features as markers of cognitive decline with a better sensitivity than ROI- and VBM-based techniques16.

Results

Volumetry and morphometry

There was no significant difference in the volume of any of the five brain regions between the two PD groups and the healthy controls. The results of these analyses are shown in Table 1. When compared using the VBM approach, the three groups exhibited significant differences only when the control group was compared with one of the PD patient groups (Fig. 1). These differences were mainly in the deep grey matter structures. No significant difference was found between the two PD patient groups.

Table 1 Region of interest-based volumes (expressed in cm³) and results of ANOVA for intergroup comparisons after normalization against total intracranial volume.
Figure 1

Voxel-based morphometry analysis results indicating regions where structural changes between the groups were found using the contrast grey matter in group 1 > grey matter in group 2. (a) Healthy group compared with the early-stage Parkinson’s disease group. (b) Healthy group compared with the late-stage Parkinson’s disease group. Images created with SPM12 (Wellcome Trust Centre for Neuroimaging, London, UK; http://fil.ion.ucl.ac.uk/spm/software/spm12/).

Texture features

The ANOVA analyses run on the texture features in the five brain structures revealed significant differences between the three groups, with different profiles. For group pair-wise comparisons, significance was considered at p < 0.02 after FDR correction.

For the substantia nigra, texture features were significantly different between the three groups, and pair-wise post-hoc tests showed that contrast, entropy, sum variance (sumV) and sum square (sumSQR) differed significantly between the groups. Figure 2 illustrates the distribution of the significantly different texture features. In the striatum, different texture features exhibited pair-wise differences between the three populations. For the putamen (Fig. 3), this was the case for the mean, sumV and correlation features, while for the caudate nucleus (Fig. 4), it was the case for sum average (sumA), entropy, sumSQR and sumV. For the thalamus, pair-wise differences were found for the standard deviation, sumSQR, sumV and entropy features (Fig. 5). Lastly, for the sub-thalamic nucleus, a structure that is particular because of its small volume, two texture features (entropy and sumSQR) showed pair-wise differences between the three groups.

Figure 2

Comparison of texture features in the substantia nigra between healthy controls (CTRL), early-stage PD and late-stage PD patients.

Figure 3

Comparison of texture features in the putamen between healthy controls (CTRL), early-stage PD and late-stage PD patients.

Figure 4

Comparison of texture features in the caudate nucleus between healthy controls (CTRL), early-stage PD and late-stage PD patients.

Figure 5

Comparison of texture features in the thalamus between healthy controls (CTRL), early-stage PD and late-stage PD patients.

Individual classification using texture features

The regression analysis using the LASSO method showed that three second-order texture features (entropy, sumSQR and sumV), computed in the five brain regions, were the best independent predictors for building a classification model. As shown in Figs. 2, 3, 4 and 5, these features were significantly different between the three populations, with a gradual progression from the healthy controls to the late-stage PD patients. The discriminatory power of these texture features between the two PD patient groups was analysed using ROC analysis. The results are summarized in Table 2.

Table 2 Receiver operating characteristic (ROC) analysis of the three texture features entropy, sum square (SumSQR) and sum variance (SumV) for discrimination of the two PD patient groups.

Texture feature and clinical scores

The correlation analyses with the clinical scores (MDS-UPDRS III and Hoehn–Yahr score in off-treatment conditions, and the MDS-UPDRS total score) focused on the same features. After FDR correction, entropy showed significant negative correlations with the three clinical scores (r = − 0.50, p < 0.0001, r = − 0.30, p = 0.01 and r = − 0.45, p = 0.001, respectively) in the substantia nigra. For the putamen, a correlation was found only for the Hoehn–Yahr score (r = − 0.30, p = 0.01). For the caudate nucleus, entropy was correlated with the MDS-UPDRS III (r = − 0.41, p = 0.0004) and MDS-UPDRS total (r = − 0.36, p = 0.002) while for the thalamus, it correlated significantly with the three scores (r = − 0.48, p = 0.01, r = − 0.28, p = 0.01 and r = − 0.44, p = 0.002, respectively).

For the feature sumV, significant negative correlations were found with the three clinical scores in the substantia nigra (r = − 0.38, p = 0.002, r = − 0.28, p = 0.01, r = − 0.37, p = 0.002, respectively), caudate nucleus (r = − 0.64, p < 0.0001, r = − 0.49, p < 0.0001, r = − 0.64, p < 0.0001, respectively) and thalamus (r = 0.67, p < 0.0001, r = − 0.39, p = 0.001, r = − 0.64, p < 0.0001, respectively). For the putamen, this feature correlated with the two MDS-UPDRS scores (r = − 0.35, p = 0.003 for MDS-UPDRS III and r = − 0.35, p = 0.002 for MDS-UPDRS total).

Finally, for sumSQR, significant correlations with the three clinical scores were found only for the substantia nigra (r = − 0.45, p < 0.0001, r = − 0.27, p = 0.003, r = − 0.40, p = 0.001, respectively). For the caudate nucleus, a significant correlation was found for the MDS-UPDRS III (r = − 0.26, p < 0.015). For the thalamus, correlations were found for the two MDS scores: (r = − 0.45, p < 0.0001 for the MDS-UPDRS III and r = − 0.39, p = 0.001 for the MDS-UPDRS total).

Discussion

This study investigated T1-weighted MR image texture features that could potentially be used as imaging markers in PD. This imaging sequence is standard in neuroimaging. The method was applied to three populations matched in terms of age and sex. The first population comprised healthy controls while the other two consisted of PD patients at two pivotal stages of the disease: diagnosis (early-PD), when a surrogate biomarker is needed, and late-stage PD with severe L-dopa-related complications, which represents the endpoint for the classical segmental motor handicap of PD. The texture features were computed in different grey matter structures considered as key structures in PD. In addition to the substantia nigra, the primary site of the disease, the putamen and caudate nucleus were also considered as they are directly downstream from the substantia nigra17. The thalamus was considered as the master relay station for brain structures, connecting the basal ganglia to the cortical regions traditionally associated with the disease. Finally, the sub-thalamic nucleus was investigated as it has an important role in the motor system and is of clinical interest as a target in deep brain stimulation. For each structure, 12 texture features from first- and second-order statistics were computed, enabling us to obtain a broad description of the grey matter variations inside each structure.

The structures considered are functionally different in terms of their involvement in PD and its progression at different stages. The substantia nigra and sub-thalamic nucleus have small volumes and may be less suitable for texture analysis. Nevertheless, our results show that some features differed significantly between the three populations. These results confirm the working hypothesis that texture features that quantify grey matter variations can be more sensitive than atrophy measurement methods such as ROI-based volumetry and morphometry. By comparison, in the ROI-based approach, although a global volume diminution can be observed from the healthy group to the late-stage PD group (Table 1), no statistically significant differences were found, while in the VBM approach, differences were observed only when the healthy group was considered against the PD groups (Fig. 1). This approach acts voxel by voxel and consequently can be considered as close to the texture analysis formalism.

The results were more significant in the substantia nigra than in the putamen and to a lesser extent in the caudate nucleus, thalamus and sub-thalamic nucleus. This is congruent with the classical pattern of degeneration predominant in the substantia nigra and putamen at the time of diagnosis and progression in these areas together with the progression of motor handicap18. The discriminatory powers of the texture features for the classification of PD patients were also examined using ROC analysis and showed that in most cases the AUC was > 0.5.

Our study also showed that texture features were correlated with classical motor handicap scores in PD. This result suggests that grey matter variations quantified by texture could be clinically meaningful.

By design, texture analysis seems to be less affected by the accuracy of the boundaries of the brain structures than volumetry methods. The most widely used segmentation techniques are still prone to low precision and systematic bias, whereas manual segmentation is time consuming and susceptible to inter-observer variability19. In a previous study20, we reported results on the stability and reproducibility of texture features in the hippocampus and entorhinal cortex for discriminating patients with cognitive impairment after a stroke. VBM approaches, which operate on the whole brain, do not rely on structure definition but may be hampered by differences in experimental design, such as the choice of multiple comparison correction technique, which can lead to differing results21,22. Furthermore, in PD, texture features were reported to have higher sensitivity for the detection of slight cognitive slowing16.

Our analysis showed that among the 12 texture features considered, three second-order features: entropy, sumV and sumSQR, consistently showed significant differences with a gradual decrease from the healthy control group to late-stage PD. These results are consistent with those of Li et al.15, where different texture features computed in the substantia nigra on quantitative susceptibility and R2* maps were able to differentiate PD patients from healthy volunteers. These authors identified entropy as one of the most discriminating features, with significantly lower values in the PD patient group. Entropy represents a measure of randomness in the MR signal distribution while sumSQR and sumV reflect signal variation. However, without an anatomo-histological validation study associating a biological signature to each feature, it will remain difficult to have a complete understanding of these texture features.

Inhomogeneities and inter-machine variability in T1-weighted signal distribution, which can hamper the use of texture analysis, were mitigated by normalizing the grey levels using the N3 algorithm. This processing allowed the standardization of grey level value distributions. Furthermore, the three texture features that appeared as potential markers of PD were obtained from second-order statistics. In contrast to first-order statistical features, which are computed directly from the signal values, these features are computed from the co-occurrence matrix, making them less sensitive to signal variations.

Finally, it is important to consider the clinical implications of the results of this study. Different studies have attempted to evaluate structural imaging as a tool for the early diagnosis of PD and for differentiating between PD and atypical Parkinsonism. However, despite these efforts, the reality is that there is not enough statistical separation of single-structure measurements between PD and non-PD subjects to be of clinical use10. The current trends suggest considering a pattern of atrophy across several structures23. The approach proposed here is in line with these trends; the combination of different features, computed in different structures of interest, is more suitable for machine learning and prediction models to make individualized patient decisions. A multi-parametric and multi-modality solution, involving different MR sequences (T1 and R2*) and/or SPECT images may enhance the predictive ability of the model24,25.

Methods

Study population

The PD population was enrolled consecutively from the movement disorders department of Lille university hospital, following the inclusion and non-inclusion criteria described below: early PD at the time of diagnosis and advanced PD at the time of motor fluctuations. In a second step, healthy controls were selected to match the age and sex ratio of the PD patients. The healthy controls were recruited among spouses and caregivers who did not have neurological pathologies or other serious conditions (progressive inflammatory or cancerous pathology). There were no secondary exclusions after inclusion for medical (claustrophobia, incidentaloma, etc.) or technical (artifacts, etc.) reasons.

Thirty-nine de novo diagnosed patients before any symptomatic treatment (mean disease duration 0.45 years; median Hoehn–Yahr stage 2, min = 0 and max = 3) and late-stage patients with severe motor fluctuations who were candidates for second-line treatments (i.e. apomorphine pump, deep brain stimulation, DUODOPA), with a mean disease duration of 8.45 years and a median Hoehn–Yahr stage of 2.5 (min = 1 and max = 5), were included. All patients met the Movement Disorders Society (MDS) clinical criteria for the diagnosis of PD26. Patients with severe cognitive impairment or dementia [Montreal Cognitive Assessment score < 22 and as defined by the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) criteria], patients with psychiatric disorders (psychosis, hallucinations, compulsive disorders, substance addiction, bipolar disorder, severe depression, according to the DSM-IV), as assessed in a semi-structured interview with a psychiatrist, and patients with severe brain atrophy or abnormal MRI results for any reason other than PD were excluded from the study.

The demographic description of the population and clinical characteristics of the patients are summarized in Table 3.

Table 3 Demographic and clinical characteristics of the three study populations.

The study was approved by the institutional review board (Comité de Protection des Personnes (CPP) Nord Ouest IV, Lille, France; study reference: 2013-A00193-42). All participants provided their written, informed consent before participation. The study complied strictly with the methods, guidelines and regulations described in the approved protocol.

Image acquisition and structures of interest

All patients and controls were scanned using 3 T MRI systems (PHILIPS Healthcare, Best, Netherlands) with an 8-channel sensitivity encoding (SENSE) head coil. High-resolution 3D T1-weighted images were acquired in the sagittal plane with 1 mm² isotropic pixel size, repetition time = 7.2 ms, echo time = 3.3 ms, flip angle = 9°, field of view = 240 × 256 mm², acquisition matrix = 256 × 256, slice thickness = 1 mm and 176 continuous slices.

A loss of nigral dopaminergic neurons is strongly correlated with the motor impairments that characterize PD27. However, regions of cell loss also include the caudate nucleus28, thalamus29 and putamen30. For this study, we considered the following deep grey matter structures: substantia nigra, putamen, thalamus, caudate nucleus and sub-thalamic nucleus. For the putamen, thalamus and caudate nucleus, the left and right parts were extracted from the images using FreeSurfer. The substantia nigra and sub-thalamic nucleus were segmented using in-house software that implements an atlas-based approach, using a substantia nigra atlas31 and a sub-thalamic nucleus atlas32. The results were checked visually and corrected as appropriate.

Volumetry and morphometry

Classical MRI approaches based on brain atrophy measurement were investigated to test for group differences. ROI-based volumetry and VBM were used. For the former, the bilateral volumes of each structure described above were estimated from the segmentation data. Normalization was done by dividing each individual volume by the intracranial volume (ICV)33. The ICV was estimated as part of the standard FreeSurfer pipeline using the method described in Buckner et al.34.
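The ICV normalization described above amounts to a simple ratio. The following is a minimal sketch only: the function names and the example volumes are hypothetical, and summing the left and right volumes to obtain the bilateral volume is an assumption, since the text does not state how the two sides were combined.

```python
def normalize_by_icv(volume_cm3, icv_cm3):
    """Return a structure volume expressed as a fraction of the ICV."""
    if icv_cm3 <= 0:
        raise ValueError("intracranial volume must be positive")
    return volume_cm3 / icv_cm3


def bilateral_normalized_volume(left_cm3, right_cm3, icv_cm3):
    """Assumption: the bilateral volume is the sum of left and right volumes."""
    return normalize_by_icv(left_cm3 + right_cm3, icv_cm3)


# Hypothetical values, for illustration only (not taken from the study).
fraction = bilateral_normalized_volume(4.0, 4.0, 1600.0)
print(f"normalized volume: {fraction:.4f}")
```

The normalized values, rather than the raw volumes, would then enter the group comparison.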

For the VBM method, images were processed using the SPM12 DARTEL toolbox (Wellcome Trust Centre for Neuroimaging, London, UK; http://fil.ion.ucl.ac.uk/spm/software/spm12/) with default settings35.

Texture features

Image texture can be described using different features. In this study, under the assumption that the neural loss induced by disease progression affects the signal distribution in the region of interest, texture was captured using first-order statistics and, in order to take the neighbourhood into account in grey-level variation, second-order statistics derived from co-occurrence matrices (Fig. 6). In total, twelve texture features were computed: four from first-order statistics and eight from second-order statistics, as used in our previous investigation16. The first-order parameters included mean grey level, standard deviation (SD) of grey levels, kurtosis (a measure of whether intensities are heavy-tailed or light-tailed relative to a normal distribution) and skewness (a measure of the lack of symmetry of the signal intensity distribution). The second-order features (also known as Haralick texture features) quantify the relationships between pairs of neighbouring voxels in the image. These features were derived from the grey level co-occurrence matrix (GLCM), in which the spatial relationship between two voxels is defined by a direction θ and a distance d. In this study, the GLCM was estimated by considering four directions (θ = 0°, 45°, 90° and 135°) and a distance d = 1. Using this matrix, the following features were computed: homogeneity, contrast, entropy, correlation, variance, sum average, sum variance and inverse difference moment (IDM). All the features and their computation are described in Table 4.
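As an illustration of the four first-order parameters, the sketch below computes them from a flat list of grey levels taken from one region of interest. It is not the in-house software used in the study: the function name is hypothetical, and the use of population (biased) moment estimators and of non-excess kurtosis (equal to 3 for a normal distribution) are assumptions.

```python
import math


def first_order_features(values):
    """Mean, SD, skewness and kurtosis of the grey levels in a region.

    Assumptions: population (biased) central moments; kurtosis is the
    non-excess form m4 / m2**2.
    """
    n = len(values)
    if n == 0:
        raise ValueError("region of interest is empty")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    kurtosis = m4 / m2 ** 2 if m2 > 0 else 0.0
    return {
        "mean": mean,
        "sd": math.sqrt(m2),
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


# Toy grey levels, for illustration only.
print(first_order_features([1, 2, 3, 4, 5]))
```

These statistics ignore where each voxel sits in the region, which is why the second-order features are needed to describe neighbourhood relationships.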

Figure 6

Texture feature extraction scheme. For each brain structure, represented as a region of interest segmented on the T1-weighted MR images, grey-level variation is captured in two ways: with first-order statistics, which do not take the voxel neighbourhood into account, and with second-order statistics, which are extracted after converting the grey levels into a co-occurrence matrix that encodes the neighbourhood.

Table 4 Texture features considered, together with their significance, equations and models.

For each brain structure and each feature, calculation was done for the right and left sides and then averaged.
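The co-occurrence step can be sketched as follows for a single 2D slice. This is a hedged illustration, not the study's implementation: the function names are hypothetical, the image is assumed to be already quantized to integer grey levels in [0, levels), the four directions are pooled into one symmetric matrix (averaging per-direction features would be an equally plausible choice), and sum variance is computed around the sum average, as most implementations do.

```python
import math

# (row, col) offsets for theta = 0, 45, 90 and 135 degrees at distance d = 1
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]


def glcm(image, levels, mask=None):
    """Symmetric, normalized GLCM pooled over the four directions."""
    rows, cols = len(image), len(image[0])
    counts = [[0.0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            if mask is not None and not mask[r][c]:
                continue
            for dr, dc in OFFSETS:
                r2, c2 = r + dr, c + dc
                if not (0 <= r2 < rows and 0 <= c2 < cols):
                    continue
                if mask is not None and not mask[r2][c2]:
                    continue
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0  # count the pair in both orders
                counts[j][i] += 1.0  # so that the matrix is symmetric
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("no valid voxel pairs in the region")
    return [[v / total for v in row] for row in counts]


def second_order_features(p):
    """Entropy, contrast, sum of squares, sum average and sum variance."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu = sum(i * v for i, _, v in cells)  # marginal mean grey level
    p_sum = [0.0] * (2 * n - 1)  # distribution of i + j
    for i, j, v in cells:
        p_sum[i + j] += v
    sum_avg = sum(k * v for k, v in enumerate(p_sum))
    return {
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "sumSQR": sum((i - mu) ** 2 * v for i, _, v in cells),
        "sumA": sum_avg,
        "sumV": sum((k - sum_avg) ** 2 * v for k, v in enumerate(p_sum)),
    }


def average_sides(left, right):
    """Average each feature over the left and right structures."""
    return {name: (left[name] + right[name]) / 2.0 for name in left}


# Toy 2 x 2 image with two grey levels, for illustration only.
print(second_order_features(glcm([[0, 0], [1, 1]], levels=2)))
```

Grey levels are indexed from 0 here, which shifts the sum average relative to a 1-based convention but leaves entropy, contrast, sum of squares and sum variance unchanged.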

Texture features may be affected by MR signal inhomogeneities and inter-machine variability in T1-weighted sequences, making their reproducibility questionable. In order to ensure reproducibility, the images were corrected for field bias and inhomogeneities using the nonparametric non-uniform intensity normalization algorithm (N3). This processing allowed standardization of grey level value distributions (FreeSurfer software package, version 6.0)36.

Statistical analysis

Texture features and ROI-based volumes were compared between the three groups using ANOVA with significance fixed at p < 0.05. If appropriate, t tests with false discovery rate (FDR) correction were then run for pair-wise comparisons. For the VBM, t tests were performed to identify differences in whole brain grey matter volume. Clusters were considered significant with a minimum size threshold of 10 voxels and after controlling for the family-wise error rate using FDR.
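The text does not state which FDR procedure was applied. Assuming the common Benjamini-Hochberg step-up procedure, the adjustment of a set of pair-wise p-values could be sketched as below; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: BH is the FDR procedure; the source only says "FDR".
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for three pair-wise comparisons.
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```

A comparison would then be declared significant when its adjusted p-value falls below the chosen threshold.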

Regression analysis using the least absolute shrinkage and selection operator (LASSO) method was run on the statistically significant texture features to select the best independent predictors. The LASSO analysis was run by considering the three groups. The individual classification capabilities in separating the two PD groups of the selected features were subsequently measured using area under curve (AUC) analysis.
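The AUC used to measure the discriminatory power of a single feature can be illustrated without any statistical package, using its equivalence with the Mann-Whitney statistic. This sketch covers only the AUC step (the LASSO selection is not reproduced here); the function name and the example scores are hypothetical.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as P(pos > neg) + 0.5 * P(pos == neg) over all pairs."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5  # ties count for half
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical feature values for two groups, for illustration only.
print(roc_auc([3, 4, 5], [1, 2, 3]))
```

Because the selected features decrease from controls to late-stage PD, the group treated as "positive" determines whether the AUC comes out above or below 0.5; swapping the two groups gives one minus the AUC.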

Correlations between the texture features and classical clinical motor scores (MDS-UPDRS III, Hoehn–Yahr score and MDS-UPDRS total) were investigated using Spearman’s correlation coefficient with significance fixed at p < 0.05 and corrected using FDR.
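Spearman's coefficient is the Pearson correlation of the ranks, which the following minimal sketch makes explicit. The function names and example values are hypothetical; tied values receive their average rank, and the p-value computation and FDR correction are left out.

```python
import math


def average_ranks(values):
    """Rank values from 1, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[pos]]):
            end += 1
        avg = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical clinical scores and feature values: as the score rises,
# the feature falls, so rho is negative.
print(spearman_rho([1, 2, 3, 4], [10, 9, 7, 2]))
```

A negative coefficient, as in this toy example, corresponds to a feature that decreases as the clinical score worsens.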

Texture features were computed using in-house software and the XLSTAT software plug-in (AddinSoft, www.xlstat.com) was used to perform all statistical tests. The code package as well as all the data used in this study are available for download.