Identification of cerebrospinal fluid biomarkers for parkinsonism using a proteomics approach

The aim of our study was to investigate cerebrospinal fluid (CSF) tryptic peptide profiles as potential diagnostic biomarkers for the discrimination of parkinsonian disorders. CSF samples were collected from individuals with parkinsonism, who had an uncertain diagnosis at the time of inclusion and who were followed for up to 12 years in a longitudinal study. We performed shotgun proteomics to identify tryptic peptides in CSF of patients with Parkinson’s disease (PD, n = 10), patients with multiple system atrophy (MSA, n = 5) and non-neurological controls (n = 10). We validated tryptic peptides with differential levels between PD and MSA using a newly developed selected reaction monitoring (SRM) assay in CSF of PD patients (n = 46), atypical parkinsonism patients (AP; MSA, n = 17; progressive supranuclear palsy, n = 8) and non-neurological controls (n = 39). We identified 191 tryptic peptides that differed significantly between PD and MSA, of which 34 met our criteria for SRM development. For 14/34 peptides we confirmed differences between PD and AP. These tryptic peptides discriminated PD from AP with moderate-to-high accuracy. Random forest modelling including tryptic peptides plus either clinical assessments or other CSF parameters (neurofilament light chain, phosphorylated tau protein) and age improved the discrimination of PD vs. AP. Our results show that the discovery of tryptic peptides by untargeted proteomics and subsequent validation by targeted proteomics is a suitable strategy to identify potential CSF biomarkers for PD versus AP. Furthermore, the tryptic peptides, and corresponding proteins, that we identified as differential biomarkers may increase our current knowledge about the disease-specific pathophysiological mechanisms of parkinsonism.


INTRODUCTION
There is currently no reliable objective test to discriminate Parkinson's disease (PD) during lifetime from the various forms of atypical parkinsonism (AP), which include multiple system atrophy (MSA), progressive supranuclear palsy (PSP), dementia with Lewy bodies (DLB), corticobasal syndrome (CBS), and vascular parkinsonism. Discrimination of these disorders based on the clinical presentation alone can often be puzzling, especially early in the disease course when symptoms overlap across the different parkinsonism conditions. The clinical diagnosis is based on the most recent specific criteria defined for each disease and includes clinical features, imaging, rate of disease progression, and response to dopaminergic medication [1][2][3][4][5][6] . However, many of these symptoms have not developed fully in early disease stages, explaining why the rate of misdiagnosis could be up to 20%, even in the hands of movement disorders experts 7 . Therefore, especially in early disease stages, reliable biomarkers are needed for accurate differentiation between PD and AP. Such a timely distinction is important, e.g., for patient counselling since the forms of AP usually have a faster disease progression than PD, with little or no clinical response to levodopa medication. Being able to reliably separate the different parkinsonian syndromes at the earliest possible stage is also critically important for research purposes, allowing the correct patients to be recruited into trials.
Cerebrospinal fluid (CSF) is a rich source for the identification of potential fluid biomarkers for neurodegenerative disorders due to its close proximity to the brain. The CSF composition may directly reflect pathological changes in the brain. Although several studies have identified potential biomarkers for parkinsonian syndromes, none have yet been implemented in clinical practice. So far, quantification of α-synuclein (α-syn) by real-time quaking-induced conversion (RT-QuIC) proved very useful to discriminate parkinsonian disorders with an underlying α-synucleinopathy, such as PD, DLB, and MSA, compared to other types of proteinopathies, such as the tauopathies PSP and CBS 8,9 . However, this assay could not discriminate PD from MSA or DLB. Quantification of neurofilament light chain (NfL) in either CSF or blood may discriminate PD from AP [10][11][12] , but additional biomarkers may help to increase the specificity to discriminate PD from AP.
The aim of this study was to identify proteins that could assist in the discrimination of PD from AP in relatively early stages of the disease, and to assess their diagnostic value in our cohorts. Such biomarkers may alert clinicians to a timely diagnosis of AP, which is rarer than PD. The identification of proteins in our study was based on tryptic peptide biomarkers, which are produced by enzymatic digestion of CSF proteins with trypsin and enable mass spectrometric analyses. We used non-targeted (shotgun) proteomics for the discovery of protein biomarkers and targeted (selected reaction monitoring; SRM) mass spectrometry (MS) for validation of our findings. We performed our discovery and validation experiments using patients from a unique longitudinal cohort followed for up to 12 years. Importantly, all participants had an uncertain diagnosis at the time of inclusion, thereby replicating the clinical challenge faced by clinicians to provide a correct diagnosis, i.e., at a phase in the disease process when many clinical symptoms overlap and where diagnostic biomarkers could be very useful.

RESULTS
CSF proteomic profiling
Using shotgun proteomics, 5,543 tryptic peptides were identified in the PD, MSA, and non-neurological control groups. Of these 5,543 peptides, 191 had significantly different levels (p-value < 0.05) between PD and MSA.

SRM assay development and validation
For further validation by SRM, we focused on differential tryptic peptides from the comparison of PD vs. MSA, and therefore 34 tryptic peptides were selected from the untargeted discovery study (Table 1) based on the criteria described in the "Methods" section. During method development, two heavy labelled peptides (FPPEETLK and DLGGFDEDAEPR) could not be robustly detected and were excluded. Therefore, our final SRM assay consisted of 32 tryptic peptides, representing 31 different proteins. The SRM assay was robustly validated and all parameters, such as intra- and inter-assay coefficient of variation (CV), sample stability during measurement, and digestion on different days, were within our acceptance criteria of a maximum of 20% variation between replicates (see Supplementary Table 1). Results were considered satisfactory and confirmed the stability of the sample preparation and the equipment across measurement days.
Tryptic peptide levels in CSF from PD, AP, and controls
For group comparisons, we considered MSA and PSP as one group (AP) because of the relatively low number of PSP cases (n = 8) in our study. Total protein concentration was higher in the AP group (mean = 579 mg/L) compared to PD (mean = 533 mg/L) and controls (mean = 426 mg/L, p < 0.001), due to high total protein levels in the PSP group (see Table 2). Age was positively correlated with the levels of 23/32 peptides in the PD group and of 18/32 peptides in the non-neurological control group, with correlation coefficients ranging from 0.3 to 0.6 (p < 0.05). Therefore, age was included as a covariate for group comparisons. Clinical assessment of disease severity (Unified Parkinson's Disease Rating Scale, UPDRS) positively correlated with 6/32 peptides in the AP group, with correlation coefficients ranging from 0.4 to 0.6 (p < 0.05). No significant correlation of other clinical parameters with the levels of any of the tryptic peptides in the PD or AP group was observed.
For 14/32 peptides we could confirm our findings from the discovery experiment and replicated the differences in these tryptic peptide levels between PD vs. MSA and PD vs. AP (Table 3 and Fig. 1). The remaining 18/32 peptides either did not differ between PD and AP or differed in the opposite direction to that seen in the shotgun experiment. All 14 differential tryptic peptides were present at lower CSF levels in AP compared to both PD and controls, with ratios of PD vs. AP ranging from 1.2 to 1.6. One of these 14 peptides (VLEYLNQEK) also had lower CSF levels in PD compared to controls, while the other 13 tryptic peptides had similar levels in PD and non-neurological controls. Among these 14 peptides, only 1 peptide (VGIPENAPIGTLLLR) had different levels in men (mean = 0.08) and women (mean = 0.11; p = 0.023) in the PD group, but not in other groups. The diagnostic accuracy of the 14 peptides to discriminate PD from AP, i.e., the AUC of the ROC, was moderately high and ranged from 0.60 to 0.76 (Table 3). The strongest potential biomarkers included tryptic peptides belonging to Protocadherin Fat 2, Amyloid-beta precursor protein, Protein O-linked-mannose beta-1,2-N-acetylglucosaminyltransferase 1, and Contactin-1.

Multi-parametric analysis
We investigated whether the 14 tryptic peptides, with or without other previously established protein biomarkers and clinical data, could improve the discrimination between PD and AP. Four decision tree models were generated by random forest modelling based upon four different datasets containing: (1) the 14 tryptic peptides which were differentially expressed in PD vs. AP; (2) the 14 tryptic peptides and previously identified biochemical markers such as NfL, α-syn, amyloid β42, total tau, phosphorylated tau, and RT-QuIC analysis of misfolded α-syn; (3) the 14 tryptic peptides, the above-mentioned biochemical markers and clinical assessments, such as UPDRS, ICARS, MMSE scores; (4) a combination of the previously identified biochemical markers (as in model 2) with clinical assessments (as in model 3). An overview of the biochemical markers and clinical parameters that were included, next to tryptic peptides, in the models for the discrimination of PD and AP is provided in Supplementary Table 2.

DISCUSSION
In this study, we used untargeted MS to identify tryptic peptides in CSF as potential biomarkers that could discriminate parkinsonian disorders, and performed an independent validation of our findings by targeted MS. To this end, we deliberately included only patients with clear signs of parkinsonism but an uncertain diagnosis at the time of inclusion and CSF collection, in whom a silver standard diagnosis was made 3−12 years later based on the rate of progression, response to treatment and possible development of red flags. This approach served to replicate the challenge that clinicians face in everyday clinical practice when a clinical diagnosis has to be established in movement disorders patients with an only partially developed clinical syndrome. Under such circumstances, having reliable diagnostic biomarkers would be very helpful.
Both untargeted and targeted MS methods proved to be reliable and robust methods to identify tryptic peptide biomarkers and provided a relative quantification of the levels of these peptides in CSF. We developed a protocol for the evaluation of SRM analysis of tryptic peptides in CSF. The newly developed assay procedure was very robust since it proved to be very stable during several measurement days (CV < 10%), it was reproducible across different sample preparation days, and was resistant to multiple freeze/thaw cycles. Therefore, this SRM assay may be useful for other CSF biomarker studies as well.
The SRM assay confirmed our findings for many tryptic peptides from the discovery experiment, illustrating the robustness of the shotgun proteomics for biomarker identification. For 14 tryptic peptides, we found lower CSF levels in AP compared to non-neurological controls and PD, both in the discovery and validation experiments, and they individually discriminated PD from AP with a diagnostic value up to 76%. Multivariate analysis by random forest modelling did not increase the discriminative value between PD and AP when only peptides were included in the model. The lower discriminative value generated by random forest modelling compared to individual tryptic peptides (53% vs. 76%) could be explained by the low number of variables (14 peptides) included in the analysis and by the fact that the model was developed in 70% of our cohort and validated in the remaining 30%. However, by including more variables, such as other CSF protein biomarkers and/or clinical assessments, the random forest algorithm was capable of providing a better discrimination between disease groups, increasing the accuracy to 86%. Interestingly, very comparable AUC values (0.86−0.88) were obtained for models 2, 3, and 4, suggesting that a combination of the tryptic peptides identified in the current study with established protein biomarkers (NfL, α-syn, amyloid β42, total tau, phosphorylated tau, and RT-QuIC analysis of misfolded α-syn) has similar additional diagnostic value as clinical data in combination with these established markers. Models that include CSF tryptic peptides and clinical assessments could be of great help to clinicians in establishing a correct diagnosis of parkinsonian disorders, but they need to be tested in independent cohorts.
Aside from a potential role in differential diagnosis, several of the 14 identified tryptic peptides, which are derived from 13 different proteins (Table 3), have a known role in neurodegeneration, which sheds new light on potential disease mechanisms in PD vs. AP. Six out of 13 proteins (Protocadherin Fat 2, Cadherin-2, Protocadherin gamma-C5, Neuronal cell adhesion molecule (2 tryptic peptides), Fibulin-1, Contactin-1) are involved in cell−cell adhesion, an important mechanism of synaptic function maintenance 13 . Two other tryptic peptides/proteins found in our study, SLIT and NTRK-like protein 1 14,15 , and Amyloid-beta precursor protein 16 , also play a role in synaptogenesis. Dysfunctional synapses contribute to neurodegeneration 17 , and dysregulation of these proteins may add to such dysfunction in AP syndromes. A meta-analysis of imaging studies showed that presynaptic dopaminergic function is 34% lower in PSP as compared to PD and MSA 18 . Moreover, in a study published after this meta-analysis, evidence based on DAT SPECT data was obtained supporting a faster decline of presynaptic function in MSA compared to PD as well 19 . Our findings add several potential molecular biomarkers to this imaging-based evidence of synaptic dysfunction. Immunohistochemistry studies on brain tissue, as well as animal and in vitro studies, may be useful to confirm the altered expression of these proteins in AP and their localization.
The adhesion protein Cadherin-2 may play a protective role in dopaminergic neurons [20][21][22] . Loss of Cadherin-2 compromises neuronal differentiation via the Wnt signalling pathway 23 . Lower levels of Cadherin-2 have previously been found in CSF from PD patients compared to controls 24 . We could, however, not replicate this difference in PD vs. non-neurological controls, but we did find lower levels in AP vs. PD. We could not retrieve any studies investigating the role of Cadherin-2 in MSA or PSP. However, Cadherin-2 is involved in the process of myelination in oligodendrocytes 25,26 , which are the affected cells in MSA. Lower levels of Cadherin-2 in MSA compared to PD at an early disease stage could be involved in the more rapid disease progression of MSA compared to PD, but further studies need to clarify the Cadherin-2 levels in MSA.
Lower CSF levels of the peptide LTVFPDGTLEVR (Leucine-rich repeat and immunoglobulin-like domain-containing nogo receptor-interacting protein 1, LINGO-1) in AP compared to PD could be related to demyelination in MSA as compared to PD. LINGO-1 is a transmembrane protein that negatively regulates oligodendrocyte differentiation and axon myelination 27 . The regulation occurs by inhibition of the RhoA pathway, decreasing the expression of myelin basic protein (MBP) 27 . Functional studies demonstrated the presence of LINGO-1 in dopaminergic neurons and oligodendrocytes [27][28][29] . A meta-analysis identified LINGO-1 polymorphisms related to decreased risk of PD, but not of MSA 30 . In MSA, accumulation of misfolded α-syn occurs in oligodendrocytes, which are the cells responsible for myelin maintenance. Myelin dysfunction in MSA precedes α-syn accumulation and neuronal loss 31 , therefore myelin dysfunction might be an important early mechanism of neurodegeneration in MSA. In previous studies of our group, we found increased levels of MBP in the CSF of MSA patients compared to PD patients 32,33 . Although the specific mechanism underlying the lower LINGO-1 levels in MSA compared to PD remains unclear, abnormal levels of the peptide LTVFPDGTLEVR may be an indication of early disturbances in oligodendrocyte myelin production in MSA, consistent with the increased CSF MBP levels in MSA.
The peptide VLEYLNQEK (secretogranin-2) was the only peptide in our study that discriminated PD from both controls and AP. Secretogranin-2 is a protein that is cleaved into peptides and secreted in vesicles, releasing the neuropeptide secretoneurin, which stimulates dopamine release in striatal neurons and basal ganglia 34,35 . Therefore, disruptions in secretogranin-2 levels might be related to altered levels of dopamine release in the synaptic cleft. Recently, one study showed co-localization of secretogranin-2 with aggregated α-syn and phosphorylated tau in brain tissue of a PD animal model, suggesting an involvement of these proteins in synaptic trafficking 36 . A previous proteomics study identified lower CSF levels of secretogranin-2 in PD compared to controls 24 , consistent with our results. Secretogranin-2 might be useful as an early biomarker to demonstrate dopamine disturbances in parkinsonian syndromes.
The remaining three identified tryptic peptides were derived from proteins involved in the regulation of cellular communication (Multiple epidermal growth factor-like domains protein 8), extracellular structural function (Extracellular matrix protein 1), and protein glycosylation (Protein O-linked-mannose beta-1,2-N-acetylglucosaminyltransferase 1), with no known relation with neurodegeneration or previous link with parkinsonism.
Several previous studies aimed to discriminate PD from AP by using CSF proteomic profiling. In one study 2,000 (poly)peptides in CSF of PD, AP (MSA, PSP, and CBD), and controls were analyzed using the method of surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF MS) 37 . In this study, none of the features could discriminate PD from controls, whereas four proteins or protein fragments (ubiquitin, beta2-microglobulin, and two fragments of secretogranin-1) discriminated either MSA or PSP from PD/controls. Four peptides of secretogranin 1 were identified in our discovery experiment, and at lower levels in MSA compared to PD, confirming these previous findings. However, the peptides belonging to this protein did not qualify for our SRM assay, and therefore we could not confirm it in our validation experiment. In yet another study, using Orbitrap MS, 5,043 protein-derived tryptic peptides were identified in CSF in a discovery cohort of PD, AP (MSA, PSP, and CBS), and controls 38 . The number of peptides is quite comparable to our findings (5,043 vs. 5,543 peptides in our study). In their discovery and validation experiments, up to 90 peptides were detected at significantly lower levels in AP compared to controls (p < 0.05), but there were no differences for PD vs. AP or PD vs. controls, as we observed in our study.
A few limitations apply to our study. First, the long storage time of CSF samples may have affected our results. A previous study investigated the stability of CSF proteins for up to 12 years of storage at −80°C 39 , and no differences were found over time. Furthermore, all PD and AP samples in our study were retrieved in the same period, and therefore we do not expect storage time to be a major factor affecting the differential levels in these patients. A second limitation may be related to the final diagnoses of the patients, which were based on clinical assessments and not on neuropathological examinations. However, given the very long follow-up of the patients in our cohorts (up to 12 years), and the independent assessment by two experienced movement disorder specialists, we believe that the rate of misclassification has been reduced to a minimum. Importantly, the long follow-up time allowed us to incorporate the rate of progression, response to therapy, and development of any red flags into the diagnostic process. We also included brain imaging findings in the diagnostic process. Based on these clinical parameters, a reliable 'silver standard' diagnosis can be made in most patients. Third, for 18 tryptic peptides selected from our discovery experiment for validation, the results of the two experiments did not match. This reinforces the need for robust independent validation before conclusions can be reached, a requirement that was met by the remaining 14 tryptic peptides. Fourth, apart from MSA, we only had a limited number of cases with other causes of AP, such as PSP, CBD, or DLB, in our validation study, because few such patients with available CSF were present in our longitudinal cohort.
Since AP comprises a heterogeneous group of disorders, including both synucleinopathies and tauopathies, the small number of AP cases other than MSA also limits the translation of our findings to potential disease mechanisms in e.g., PSP, and will probably mainly reflect changes in MSA.
One of the strongest aspects of our study is the use of two independent cohorts of patients for discovery and validation. In addition to that, we also performed our validation using a different MS technique (SRM) than in the discovery, and we confirmed the consistency of 14 tryptic peptides to discriminate PD from AP. A second strong point is the unique longitudinal study, in which patients were initially included with clear parkinsonian symptoms, but with an uncertain diagnosis at baseline, i.e., at a time in the diagnostic process where fluid biomarkers are needed most. As such, our cohort offers excellent opportunities for fluid biomarker discovery and validation, as we demonstrate here. Besides providing new insights for potential biomarkers to help clinicians to discriminate parkinsonian disorders, this may also provide new insights into differences in the underlying pathophysiological processes for PD as compared to AP.
In summary, proteomics is a powerful tool to identify peptides in CSF for discrimination of parkinsonian disorders. Our newly developed SRM assay proved to be very robust and offered a reliable relative quantification of tryptic peptides in CSF. Our validation experiment confirmed the potential of 14 CSF peptides to discriminate PD from AP, already at an early disease stage when there is still a high level of uncertainty about the underlying aetiology of the specific movement disorder. The discriminative value of these tryptic peptides could be improved by combining them with existing biochemical markers or clinical assessments. Finally, our study may provide new insights into the underlying pathophysiological processes of each disorder.

METHODS
Patients and samples
For both the discovery and validation experiments, we included participants from a longitudinal study 40 , who all had clear clinical signs of parkinsonism but an as yet unclear diagnosis at the time of inclusion, and who had been recruited from our movement disorders outpatient clinic between January 2003 and December 2006 at the Radboud University Medical Centre (Nijmegen, the Netherlands). In total, 25 CSF samples were included in the discovery experiment (PD, n = 10; MSA, n = 5; non-neurological controls, n = 10). For validation of the initial findings of the discovery phase we used 110 CSF samples from PD (n = 46), MSA (n = 17), PSP (n = 8), and non-neurological controls (n = 39).
At the time of inclusion, patients underwent a structured standardized neurologic examination by movement disorders specialists. Lumbar puncture was performed within 6 weeks after the initial visit. The design of this study, methodology, and patient inclusion have been extensively described 40 . Three and 12 years after inclusion, the diagnosis of all participants was critically revised and a silver standard clinical diagnosis was established by two independent movement disorder specialists. To establish this diagnosis, the clinical experts used the most recent clinical criteria at that time [1][2][3][4][5][6]41,42 , combined with the by then available long-term response to therapy, the rate of disease progression, and the possible development of red flags, which may alert clinicians to an alternative diagnosis. Patient characteristics are presented in Table 1. For correlations of the newly identified biomarkers from the present study with other, established protein biomarkers, we used previously published data on NfL, α-syn, total tau, phosphorylated tau, amyloid-β42, and α-syn RT-QuIC 8,12,40,43,44 . For details on the assays used for quantification of these protein biomarkers, see Supplementary Methods.
Clinical assessments at baseline and after 3 and 12 years of follow-up included the Hoehn and Yahr scores 45 , Unified Parkinson's Disease Rating Scale (UPDRS) 46,47 , Mini-Mental State Examination (MMSE) 48 , and International Cooperative Ataxia Rating Scale (ICARS) 49 .
For comparison, we selected a group of non-neurological control patients who had undergone a lumbar puncture because of a suspected central neurological disorder. All selected control cases were free of neurological disease, as determined after careful examinations. Moreover, their CSF parameters, such as leukocyte and erythrocyte count, glucose, blood pigments, lactate, and (if assessed) oligoclonal immunoglobulin G bands, were all within the reference ranges for their age group.
All CSF samples included in this study were collected in polypropylene tubes, centrifuged at 800 × g, aliquoted, and stored in polypropylene tubes at −80°C until use. All patients with PD or AP provided written informed consent and the study was approved by the local Medical Ethics Committee (Arnhem-Nijmegen; file no. 2002/188). The use of CSF leftovers from the control patients who had been seen as part of daily care in research projects was approved by the local Medical Ethics Committee.

Mass spectrometry−shotgun proteomics profiling
Total protein concentration in CSF was determined by using the 2D Quant kit (GE Healthcare Life Sciences, UK), according to the manufacturer's protocol, and 400 µg total protein was used as input for profiling. All samples were loaded on an affinity removal column for the depletion of the 14 most abundant proteins (MARS-14, Agilent Technologies, Santa Clara, CA, USA). After tryptic digestion, CSF samples were fractionated in 20 fractions using high pH reversed-phase C18 LC and each fraction was subsequently analyzed by nanoflow liquid chromatography (Bruker Daltonics; nano-Advance) connected online to an ultra-high resolution quadrupole time-of-flight tandem mass spectrometer (Qq-TOF; Bruker Daltonics; maXis 4G ETD) as described previously 50 .
Raw MS data were analyzed by MaxQuant software version 1.5 51 with pre-defined Qq-ToF parameter settings against the RefSeq (release 55) human protein sequence database. We set cysteine carbamidomethylation as a fixed modification, whereas N-terminal acetylation, methionine oxidation, and deamidation of glutamine and/or asparagine were set as variable modifications. For further statistical analysis, only peptides with intensity above the detection limit in at least 75% of the samples in one of the groups (PD, MSA, or non-neurological controls) were used.
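The detection filter described above (intensity above the detection limit in at least 75% of the samples in at least one group) can be illustrated with a minimal sketch. This is not the pipeline used in the study: the function name `passes_detection_filter`, the data layout (a dictionary of per-group intensity lists, with `None` marking a value below the detection limit) and the example numbers are assumptions made for illustration only.

```python
def passes_detection_filter(intensities_by_group, min_fraction=0.75):
    """Return True if the peptide is detected in at least `min_fraction`
    of the samples in at least one group.

    `intensities_by_group` maps a group label to a list of intensities;
    None marks a sample below the detection limit (hypothetical encoding
    chosen for this sketch).
    """
    for values in intensities_by_group.values():
        if not values:
            continue  # an empty group cannot satisfy the criterion
        detected = sum(v is not None for v in values)
        if detected / len(values) >= min_fraction:
            return True
    return False


# Invented example: detected in 4 of 5 MSA samples (80%), so the peptide is kept
# even though it is rarely detected in the other two groups.
example = {
    "PD": [1.2e5, None, None, 9.0e4, None, None, 1.1e5, None, None, None],
    "MSA": [2.0e5, 1.8e5, None, 2.2e5, 1.9e5],
    "control": [None] * 10,
}
print(passes_detection_filter(example))
```

The group sizes in the example mirror the discovery cohort (10 PD, 5 MSA, 10 controls); the intensities themselves are made up.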

Mass spectrometry−targeted proteomics using SRM
For the selection of tryptic peptides for the SRM assay, additional criteria were used: (1) p-value below 0.05 determined by Mann−Whitney U test comparing PD vs. MSA; (2) ratio of intensity (PD:MSA) of at least 1.5; (3) intensity values above the MS detection limit in at least 75% of samples in both the PD and MSA groups; (4) maximum peptide length of 20 amino acids; (5) uniqueness (assignment to only one protein); (6) information available in Uniprot 52 or PeptideAtlas 53 ; (7) exclusion of peptides susceptible to post-translational or chemical modifications, such as methionine and cysteine oxidation, a potential deamidation site, or N-terminal cyclization.
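As a rough illustration, the seven selection criteria can be expressed as a single filter. The sketch below is an assumption-laden example rather than the selection procedure actually used: the `Candidate` record, its field names and the sequence motifs used to flag modification-prone peptides (any Met or Cys, an Asn-Gly deamidation motif, an N-terminal Gln or Glu) are all hypothetical choices, and the p-value is taken as a precomputed input.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    sequence: str             # tryptic peptide sequence, one-letter code
    p_value: float            # Mann-Whitney U test, PD vs. MSA (precomputed)
    ratio_pd_msa: float       # intensity ratio PD:MSA
    detected_frac_pd: float   # fraction of PD samples above detection limit
    detected_frac_msa: float  # fraction of MSA samples above detection limit
    n_proteins: int           # number of proteins the sequence maps to
    in_reference_db: bool     # entry available in UniProt or PeptideAtlas


def modification_prone(sequence):
    """Crude sequence screen; the motifs are assumptions of this sketch."""
    has_oxidizable = any(aa in sequence for aa in "MC")  # Met/Cys oxidation
    has_deamidation_site = "NG" in sequence              # assumed Asn-Gly motif
    cyclizes_at_n_term = sequence[0] in "QE"             # assumed pyro-Glu risk
    return has_oxidizable or has_deamidation_site or cyclizes_at_n_term


def eligible_for_srm(c):
    """Apply criteria (1) to (7) from the text to one candidate peptide."""
    return (
        c.p_value < 0.05
        and c.ratio_pd_msa >= 1.5
        and c.detected_frac_pd >= 0.75
        and c.detected_frac_msa >= 0.75
        and len(c.sequence) <= 20
        and c.n_proteins == 1
        and c.in_reference_db
        and not modification_prone(c.sequence)
    )


# Sequence taken from the text; all numeric values are invented for illustration.
peptide = Candidate("VLEYLNQEK", 0.01, 1.6, 0.9, 0.8, 1, True)
print(eligible_for_srm(peptide))
```

Keeping each criterion as a separate boolean term makes it easy to report which rule excluded a given peptide, should that be of interest.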
The CSF samples for SRM and MS analysis were processed in randomized order using 50 µL of CSF from each patient as input. Prior to protein digestion, samples were subjected to overnight freeze-drying to concentrate the sample. On the next day, the sample was reconstituted in 10.6 µL, which was subsequently diluted four times with 31.6 µL of 50 mM ammonium bicarbonate and incubated with 1 µL of trypsin (1 μg trypsin/50 μg protein) for 4 h at 37°C. The digestion reaction was stopped by adding 4.8 µL of 10% trifluoroacetic acid. A cocktail of synthesized "heavy" peptides (JPT, Germany), isotope-labelled at the C-terminus of each target peptide at either a lysine (¹³C₆¹⁵N₂) or arginine (¹³C₆¹⁵N₄) residue, was added to each sample to allow peptide identification and relative quantification. Samples were cleaned by passing them over a 0.22 µm filter and stored at −80°C until MS analysis.
Samples (2 µL) were subjected to LC-MS analysis in randomized order on the Acquity MClass UPLC Xevo TQ-S (Waters), coupled with an ionKey/MS system using a Waters peptide BEH C18, 130 Å, 1.7 μm, 150 μm × 100 mm ionKey column for chromatographic separation using a 30 min linear gradient of acetonitrile ranging from 3 to 35% with 0.1% formic acid at a flow rate of 2 μl/min.
To optimize SRM settings in the SRM method development step, we used a pooled trypsin-digested CSF sample spiked with a cocktail of heavy labelled peptides (final concentration of 10 fmol for each peptide); specifically, the cone voltage and collision energy were optimized for each peptide fragment. For each peptide, we started with a selection of at least 10 peptide fragments per precursor (transitions). For the final multiplex SRM assay, at least 2 transitions with the highest signal intensity and lack of interference were selected for each peptide target. For each peptide fragment, retention time windows of 1 min were used, allowing both endogenous and heavy labelled peptides to have at least 8 data points per chromatographic peak with an average dynamic dwell time of 250 ms.
Our newly developed SRM method was validated using a pooled digested CSF sample mixed with a cocktail of heavy labelled peptides and the following criteria were investigated: (1) linearity to provide a calibration curve, by using a dilution series of the cocktail of heavy labelled peptides (0, 0.625, 1.25, 2.5, 5, 10, 20, and 40 fmol) spiked into pooled digested CSF in three replicates. Peptide fragments with a linear regression coefficient (R²) below 0.7 were excluded. The calibration curve was used to determine the best heavy labelled peptide concentration for the clinical samples and a new peptide cocktail was prepared; (2) intra-assay variation < 20% for 1 pooled digested CSF sample injected five times on the same day; (3) inter-assay variation < 20% for 1 digested CSF sample measured on 10 different days; (4) inter-assay sample preparation variation < 20% for 5 identical aliquots of pooled CSF samples, all digested and measured on the same day; (5) sample stability on the autosampler, which was set at 10°C, by injecting 1 sample repetitively from the same plate every 4 h for 24 h; (6) freeze/thaw effect < 20% for 1 pooled and digested CSF sample subjected to up to 5 freeze/thaw cycles; (7) freeze/thaw effect < 20% for a pooled CSF sample subjected to 3 freeze/thaw cycles prior to the digestion procedure. CV was calculated between technical replicates and a CV of up to 20% was regarded as acceptable. Results of these 7 criteria for the SRM method validation for all included tryptic peptides are shown in Supplementary Table 1a−1g, respectively. To correct for possible variation between the days of sample preparation of the clinical cohort, two pooled CSF samples were included as quality controls in each digestion cycle (see Supplementary Table 1h).
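The two quantities that recur in these validation criteria, the coefficient of variation between replicates and the R² of the calibration line, can be computed as in the sketch below. It is an illustration under stated assumptions: the text does not say whether the sample or population standard deviation was used (the sample form is assumed here), the function names are hypothetical, and the replicate and response values are invented. Only the spiked amounts of the dilution series are taken from the text.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Percent CV: sample standard deviation divided by the mean, times 100."""
    return 100.0 * stdev(values) / mean(values)


def r_squared(x, y):
    """R^2 of an ordinary least-squares line fitted to (x, y).

    For a straight-line fit with one predictor this equals the squared
    Pearson correlation between x and y.
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


# Five invented replicate injections of one pooled sample (intra-assay check).
replicates = [0.101, 0.097, 0.105, 0.099, 0.103]
cv = coefficient_of_variation(replicates)
print(f"CV = {cv:.1f}%, acceptable: {cv <= 20.0}")

# Spiked amounts (fmol) from the text; the responses are invented.
spiked = [0, 0.625, 1.25, 2.5, 5, 10, 20, 40]
response = [0.02, 0.66, 1.21, 2.60, 4.90, 10.3, 19.6, 40.5]
r2 = r_squared(spiked, response)
print(f"R^2 = {r2:.3f}, keep transition: {r2 >= 0.7}")
```

Thresholds of 20% for the CV and 0.7 for R² follow the acceptance criteria stated in the text.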
Total protein concentration in CSF of the clinical cohort was determined by turbidimetric benzethonium chloride method using a Cobas 8000 instrument (Roche Diagnostics, Switzerland) for automated measurement.

Data analysis
Skyline software version 20.1 (MacCoss Lab, University of Washington, USA) was used to process raw data from SRM assay to confirm peak detection, correct integration, and calculation of the peak area 54 . For data analysis, the relative quantification was determined by calculation of the ratio between endogenous and heavy labelled peptides. We normalized each ratio of endogenous : heavy labelled peptides for total CSF protein concentration as these markers were identified in the proteomics profiling where also a normalization on total protein content was applied.
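The relative quantification described above can be written out as a short calculation. The sketch assumes that normalization means dividing the endogenous:heavy peak area ratio by the total protein concentration; the text does not give the exact formula or any scaling factor, so both the function `relative_level` and the arithmetic should be read as one plausible interpretation. The peak areas are invented; 533 mg/L is the mean total protein reported for the PD group.

```python
def relative_level(endogenous_area, heavy_area, total_protein_mg_per_l):
    """Endogenous:heavy peak area ratio, normalized to total CSF protein.

    Division by total protein is an assumption of this sketch; the study
    may have applied a different scaling.
    """
    if heavy_area <= 0 or total_protein_mg_per_l <= 0:
        raise ValueError("heavy peak area and total protein must be positive")
    light_to_heavy = endogenous_area / heavy_area
    return light_to_heavy / total_protein_mg_per_l


# Invented peak areas for one peptide in one sample.
level = relative_level(endogenous_area=52_000, heavy_area=100_000,
                       total_protein_mg_per_l=533)
print(level)
```

Because the same amount of heavy peptide is added to every sample, the ratio cancels injection-to-injection variation in signal, which is the usual motivation for this type of internal standard.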
Analyses were performed in R software version 3.5.3 (Austria), IBM SPSS Statistics 25 (Armonk, NY, USA), or GraphPad Prism 5 (La Jolla, CA, USA). Groups were compared by using two-sided Student's T-test or Mann−Whitney U test in the case of two groups depending on the data distribution (parametric or non-parametric), and two-sided analysis of variance (ANOVA) with Bonferroni's multiple correction as a post hoc test or Kruskal−Wallis one-way analysis of variance with Dunn's as post hoc test when more than two groups were analyzed. Rank analysis of covariance was used for group comparisons taking age as a covariate, including Bonferroni's multiple correction as a post hoc test. Correlation of peptides with age and clinical parameters, such as disease duration, UPDRS, ICARS, MMSE scores, was performed using Spearman's rank correlation coefficient. Random forest was applied for multivariate analysis to generate decision trees to improve group discrimination. The models generated by random forest were developed in 70% of our cohort and validated in 30%. For random forest analysis, an imputation method (Amelia II, R package) was used to fill in missing values 55 . Receiver operating characteristic (ROC) was used to determine the diagnostic accuracy by calculating the area under the curve (AUC).
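To make the ROC and model-validation steps concrete, the sketch below computes an AUC directly from its rank interpretation and performs a simple 70%/30% split. It is written in Python for illustration, whereas the analyses in the study were run in R, SPSS, or GraphPad Prism. The function names, the random seed, the unstratified split and the peptide levels are assumptions; the text does not state how the split was drawn.

```python
import random


def roc_auc(positive_scores, negative_scores):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count one half)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def split_70_30(items, seed=0):
    """Shuffle a copy of `items` and split it into 70% and 30% parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Invented normalized levels of one peptide; PD is treated as the
# "positive" class because the peptides were lower in AP than in PD.
pd_levels = [0.12, 0.10, 0.11, 0.09, 0.13]
ap_levels = [0.08, 0.10, 0.07, 0.09]
print(roc_auc(pd_levels, ap_levels))

development, validation = split_70_30(range(71))  # 46 PD + 25 AP cases
print(len(development), len(validation))
```

The pairwise definition used here is equivalent to the Mann−Whitney U statistic divided by the number of case pairs, which is why a single peptide's AUC and its rank-based group comparison tell a consistent story.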

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.

DATA AVAILABILITY
The mass spectrometry shotgun proteomics profiling data have been deposited to the ProteomeXchange Consortium via the PRIDE 56