Molecular linkage between post-traumatic stress disorder and cognitive impairment: a targeted proteomics study of World Trade Center responders

Existing work on proteomics has found common biomarkers that are altered in individuals with post-traumatic stress disorder (PTSD) and mild cognitive impairment (MCI). The current study expands our understanding of these biomarkers by profiling 276 plasma proteins with known involvement in neurobiological processes using the Olink Proseek Multiplex Platform, comparing individuals with both PTSD and MCI with those with either disorder alone and with unaffected controls. Participants were World Trade Center (WTC) responders recruited through the Stony Brook WTC Health Program. PTSD and MCI were measured with the PTSD Checklist (PCL) and the Montreal Cognitive Assessment, respectively. Compared with unaffected controls, we identified 16 proteins associated with comorbid PTSD–MCI at P < 0.05 (six at FDR < 0.1), 24 proteins associated with PTSD only (two at FDR < 0.1), and 20 proteins associated with MCI only (one at FDR < 0.1), for a total of 50 unique proteins. The multiprotein composite score achieved AUCs of 0.84, 0.77, and 0.83 for PTSD–MCI, PTSD only, and MCI only versus unaffected controls, respectively. To our knowledge, the current study is the largest to profile a large set of proteins involved in neurobiological processes. The significant associations across the three case-group analyses suggest that shared biological mechanisms may be involved in the two disorders. If replicated in independent samples, the multiprotein composite score has the potential to add a new tool to help classify both PTSD and MCI.


Introduction
Studies of the long-term psychiatric and neurocognitive functioning of World Trade Center (WTC) responders during the two decades since September 11, 2001 have found high rates of impairment. The most prevalent psychiatric condition is post-traumatic stress disorder (PTSD), which is characterized by re-experiencing, avoidance, negative cognitions and mood, and arousal symptoms [1][2][3] . Nearly 20% of responders developed PTSD, and 10% continue to suffer from the disorder 1,4 . The most prevalent neurocognitive condition is mild cognitive impairment (MCI), which is characterized by declines in memory, learning, concentration, and decision-making that are not yet sufficient to cause functional limitations 5 . Critically, systematic reviews have identified consistent associations between PTSD and both neurocognitive dysfunction 6 and dementia 7 in cohorts of veterans and Holocaust survivors. In our WTC cohort, we observed a 2.67-fold increase in the incidence of MCI among responders with PTSD two decades after exposure 8 . Given this association, this paper uses proteomics analysis to undertake an in-depth characterization of the pathophysiology of MCI, PTSD, and their co-occurrence.
Proteomics is a promising strategy for characterizing the biological signatures of disorders, and it has been facilitated by the emergence of high-throughput technologies 9 . Proteins execute functions within cells and mediate communication between them, and thus are potentially involved in pathological processes underpinning PTSD and MCI. Proteomics, therefore, aims to capture the dynamics of protein expression and detail their interactions within a cell 10 , an important step when trying to elucidate cellular adaptation to environmental signals and cellular aspects of disease processes 11 . Proteomics offers a different level of understanding of these processes compared with genomics and transcriptomics because proteins undergo alternative post-translational modification (e.g., phosphorylation) essential for protein function; as a result, information from a single gene can encode different protein species 10 and form protein complexes that determine function 11 .
Existing work on proteomics has identified biomarkers that are altered both in individuals with PTSD and with MCI. For example, PTSD has been linked to alterations of serum proteins such as glial fibrillary acidic protein (GFAP), vascular endothelial growth factor (VEGF) 12 , β-amyloid 13 , and C-reactive protein (CRP) 14 . Similarly, MCI was associated with changes to VEGF, CRP, and cortistatin (CORT), among others 15 . Co-occurring PTSD and MCI have been examined in only one molecular study, a mouse model in which loss of the FMN2 gene was associated with both PTSD-like phenotypes (i.e., fear extinction) and age-accelerated memory impairment 16 . However, no studies to our knowledge have examined the extent to which protein signatures for PTSD, MCI, and their comorbidity differ in vivo in humans. This is important because of known interspecies variability and differences in proteomics 17 .
This study aims to fill the gap in molecular studies of PTSD and MCI by profiling a large set of proteins (k = 276) with known involvement in related processes. By comparing patients with PTSD, MCI, and comorbid PTSD-MCI with unaffected controls, we sought to determine whether markers of neurodevelopmental processes, cellular regulation, immunological function, cardiovascular disease, inflammatory processes, and neurological diseases are linked to PTSD and MCI [18][19][20] . We hypothesized that alterations in these processes reflect a combination of proteomic profiles that are observed in PTSD, MCI, and comorbid PTSD-MCI but not in unaffected individuals. In addition, we constructed multiprotein composite scores and examined their associations with PTSD and MCI symptom severity.

Participants
Participants were recruited through the Stony Brook WTC Health Program 21 . This study was approved by the Stony Brook University IRB. Written informed consent was obtained. The analysis focused on a subsample of male responders who completed their annual monitoring visit in 2019. We studied only male responders because <10% of the Stony Brook cohort is female, and women show notably different protein expression patterns from men 22 . Responders with a history of medical or neurodegenerative conditions, brain tumors, cancers, or cerebrovascular conditions were ineligible for the study.

Clinical measures and classification
Probable PTSD was measured with the Posttraumatic Stress Disorder Checklist-Specific Version (PCL-17) 23 , a 17-item self-report questionnaire modified to assess the severity of WTC-related DSM-IV PTSD symptoms over the past month on a scale of 1 (never bothered by) to 5 (extremely bothered by) (Cronbach α = 0.96). Probable PTSD was operationalized by a PCL total score >44. The unaffected sample was asymptomatic (PCL score <22).
MCI was measured using the Montreal Cognitive Assessment (MoCA), a widely used objective multidomain test 24 . A conservative cutoff of <22 was applied to reduce misclassification. Normal cognitive functioning was defined as MoCA >26 consistent with testing guidelines 25 . Unaffected controls (PCL <22 and MoCA >26) were subject to an additional medical record review to rule out responders with a clinical history of PTSD and related disorders.
The final sample (N = 181) included 34 responders with comorbid PTSD-MCI, 39 with PTSD only, 27 with MCI only, and 81 unaffected controls.
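The grouping rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify` is hypothetical, and the text does not spell out the MoCA requirement for the PTSD-only group or the PCL requirement for the MCI-only group, so the sketch assumes they mirror the control thresholds (MoCA >26 and PCL <22). The additional medical record review applied to controls is not modeled.

```python
def classify(pcl, moca):
    """Assign a responder to a study group, or None if no group applies.

    Thresholds for probable PTSD, asymptomatic status, MCI, and normal
    cognition come from the text; applying the control thresholds to the
    single-disorder groups is an assumption made for illustration.
    """
    ptsd = pcl > 44        # probable PTSD
    no_ptsd = pcl < 22     # asymptomatic
    mci = moca < 22        # conservative MCI cutoff
    normal = moca > 26     # normal cognitive functioning

    if ptsd and mci:
        return "PTSD-MCI"
    if ptsd and normal:
        return "PTSD only"
    if no_ptsd and mci:
        return "MCI only"
    if no_ptsd and normal:
        return "control"
    return None  # intermediate scores fall outside all four groups
```

Responders with intermediate scores (for example, a PCL of 30) return None here, reflecting the gap the thresholds leave between cases and unaffected controls.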

Proteomics profiling
Protein expression in plasma was profiled using the Olink Proseek Multiplex Platform. The Olink multiplex immunoassay was designed to provide an ultrasensitive, reproducible, and highly multiplexed method for measuring protein expression. The measurement was based on Proximity Extension Assay (PEA) technology 26 . More details are available online (https://www.olink.com). Three commercial Olink panels were profiled for each participant: Neurology, Neuro Exploratory, and Cardiovascular II (CVII). Thus, 276 proteins (92 proteins per panel) were targeted, covering processes related to neurological diseases, cellular regulation, immunology, cardiovascular function, inflammation, development, and metabolism.

Proteomics data preprocessing
A number of internal and external controls were added to the plasma samples for quality control to monitor protein-antibody reactions, the DNA extension step, and detection quality of the qPCR in order to estimate the background signal and to calculate the limit of detection (LOD) for Olink panels. Protein values below the LOD were replaced with the LOD 27 . Protein concentration was represented in arbitrary units on a log2 scale and termed Normalized Protein eXpression (NPX), i.e., a difference of one NPX corresponds to a doubling of protein concentration. The NPX value represented a relative quantification so that the data for a specific protein can be compared across different samples. Reference samples run on plates from different batches were included for batch-effect correction. The adjustment factor at protein level for each batch was calculated as median NPX of the bridging samples and subtracted from the NPX values of each sample. Batch-corrected log-transformed NPX was used in subsequent analyses (termed normalized NPX). We compared the reproducibility of the bridging samples using Pearson correlation. Supplementary Figure 1 shows the high reproducibility of the Olink panels across six representative sets of technical duplicates, with a mean correlation r = 0.97.
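The preprocessing steps above (LOD imputation, median-based batch adjustment, and the log2 interpretation of NPX) can be illustrated with a minimal Python sketch. The function names and the toy data layout (one protein, one batch, sample IDs mapped to NPX values) are assumptions made for illustration; this is not the authors' pipeline or Olink's software.

```python
from statistics import median


def impute_lod(values, lod):
    """Replace NPX values below the limit of detection with the LOD."""
    return [max(v, lod) for v in values]


def batch_correct(npx_by_sample, bridge_ids):
    """Subtract the batch adjustment factor for one protein in one batch.

    The factor is the median NPX of the bridging (reference) samples,
    as described in the text.
    """
    factor = median(npx_by_sample[s] for s in bridge_ids)
    return {s: v - factor for s, v in npx_by_sample.items()}


def fold_change(npx_a, npx_b):
    """Convert an NPX difference (log2 scale) into a concentration ratio."""
    return 2.0 ** (npx_a - npx_b)


# Toy example: two bridging samples and one study sample in a single batch.
batch = {"bridge1": 3.0, "bridge2": 5.0, "sample1": 6.0}
corrected = batch_correct(batch, ["bridge1", "bridge2"])
```

Because NPX is on a log2 scale, subtracting the bridging-sample median is equivalent to dividing concentrations by a batch-specific scaling factor.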

Differential proteomics analysis
To assess associations of PTSD and MCI with protein regulation, differential analyses were carried out using a linear model with normalized NPX as the dependent variable and case/control status as the independent variable, adjusting for age and race, on subsets of (a) 34 PTSD-MCI cases versus 81 unaffected controls, (b) 39 PTSD-only cases versus 81 unaffected controls, and (c) 27 MCI-only cases versus 81 unaffected controls. Statistically significant proteins were identified at P < 0.05, as well as at false discovery rate (FDR) < 0.1 within each panel 28 . To assess the consistency of the findings, a Monte-Carlo experiment was conducted by randomly partitioning the data into 50% discovery and 50% replication subsamples. A protein was considered replicated if it was significant at P < 0.10 in both the discovery and replication subsamples with effect sizes in the same direction. The random partitioning was repeated 100 times, and the number of times each protein was replicated was recorded. The correlation between the estimated beta coefficients of all proteins for case/control status across the three subset analyses was assessed using Pearson correlation coefficients. The overlap between the top proteins identified from each subset analysis was compared via a Venn diagram. The top proteins identified from this study were compared with recent omics studies of PTSD and Alzheimer's disease (AD).
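Two pieces of this procedure lend themselves to a short sketch: the FDR adjustment and the split-half replication rule. The Python below assumes the FDR was controlled with the Benjamini-Hochberg step-up procedure, which the text does not name explicitly, and takes a user-supplied `fit` callable (hypothetical) that returns the estimated beta and P value for a given subsample. Whether the random partitions were stratified by case/control status is not stated, so the sketch uses a plain shuffle.

```python
import random


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted P values (assumed FDR method)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def replicated(beta_disc, p_disc, beta_rep, p_rep, alpha=0.10):
    """Replication rule: both halves significant, same effect direction."""
    return p_disc < alpha and p_rep < alpha and beta_disc * beta_rep > 0


def replication_rate(sample_ids, fit, n_iter=100, seed=0):
    """Fraction of random 50/50 splits in which a protein replicates.

    `fit(ids)` is a hypothetical callable returning (beta, p) for the
    linear model fitted on the given sample IDs.
    """
    rng = random.Random(seed)
    ids = list(sample_ids)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(ids)
        half = len(ids) // 2
        beta_d, p_d = fit(ids[:half])
        beta_r, p_r = fit(ids[half:])
        hits += replicated(beta_d, p_d, beta_r, p_r)
    return hits / n_iter
```

To mirror the within-panel FDR described above, `bh_adjust` would be called once per panel of 92 P values rather than once across all 276 proteins.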

Disease-burden analysis
Among the proteins identified at FDR < 0.1 from the PTSD-MCI subset analyses, three competing models were fitted to ascertain which best fit the protein-regulatory pattern: H1, the protein expression of the PTSD-only subgroup was intermediary between PTSD-MCI and control (i.e., Control < PTSD only < PTSD-MCI or Control > PTSD only > PTSD-MCI); H2, the protein expression of the PTSD-only subgroup was similar to the PTSD-MCI subgroup (i.e., Control ≠ PTSD only = PTSD-MCI); or H3, the protein expression of the PTSD-only subgroup was similar to the unaffected controls (i.e., Control = PTSD only ≠ PTSD-MCI). For model H1, a linear model was fitted with subgroup coded as an ordinal predictor (1 = control, 2 = PTSD only, 3 = PTSD-MCI). For model H2, a linear model was fitted with subgroup coded as a binary predictor (0 = control, 1 = PTSD only or PTSD-MCI). For model H3, a linear model was fitted with subgroup coded as a binary predictor (0 = control or PTSD only, 1 = PTSD-MCI). All models were adjusted for age and race. The Bayesian Information Criterion (BIC) score was computed, and the model with the smallest BIC score was selected as the best-fitting model. Analyses were repeated by replacing the PTSD-only subgroup with the MCI-only subgroup. Proteins for which model H1 was the best-fitting model can be regarded as candidate biomarkers for disease burden characterized by co-occurrence of PTSD-MCI.
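The model comparison above can be sketched as follows. This is a simplified illustration, not the authors' code: it omits the age and race covariates, fits each coding with ordinary least squares on a single predictor, and uses the Gaussian BIC up to an additive constant, n ln(RSS/n) + k ln(n). The group labels and function names are hypothetical. Because all three codings have the same number of parameters, the BIC ranking here reduces to a ranking by residual sum of squares.

```python
import math

# Subgroup codings for the three competing models (from the text).
CODINGS = {
    "H1": {"control": 1, "single": 2, "comorbid": 3},  # ordinal
    "H2": {"control": 0, "single": 1, "comorbid": 1},  # single = comorbid
    "H3": {"control": 0, "single": 0, "comorbid": 1},  # single = control
}


def ols_rss(x, y):
    """Residual sum of squares from a least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def bic(rss, n, k):
    """Gaussian BIC up to an additive constant."""
    return n * math.log(rss / n) + k * math.log(n)


def best_model(groups, npx):
    """Return the coding with the smallest BIC, plus all scores."""
    n = len(npx)
    scores = {}
    for name, coding in CODINGS.items():
        x = [coding[g] for g in groups]
        scores[name] = bic(ols_rss(x, npx), n, k=2)  # intercept and slope
    return min(scores, key=scores.get), scores


# Toy data in which the single-disorder group sits between the other two.
groups = ["control"] * 3 + ["single"] * 3 + ["comorbid"] * 3
npx = [0.0, 0.1, -0.1, 1.0, 1.1, 0.9, 2.0, 2.1, 1.9]
winner, scores = best_model(groups, npx)
```

In the toy data the single-disorder group is exactly intermediate, so the ordinal coding (H1) gives the smallest BIC.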

Multiprotein composite score
To evaluate the utility of proteomics in classifying cases and controls, we applied the elastic net algorithm 29 . For each case/control subset, the top-ranking proteins by P values from the differential expression analysis were used as candidate feature sets. Leave-one-out (LOO) cross-validation prediction was used to evaluate model performance, i.e., the model was trained on N-1 samples, and used to predict the score in the left-out test sample, and the process was cycled through N samples. Within each training set, the optimal tuning parameters were determined via a fivefold cross-validation. The area under the ROC curve (AUC) was used as a metric for performance evaluation. Pearson correlation was calculated to estimate the association between the multiprotein composite scores and PTSD and MCI symptom-severity score.
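The evaluation scheme above (LOO prediction followed by AUC) can be sketched in plain Python. To keep the example dependency-free, the elastic net is swapped for a simple stand-in scorer based on group centroids, and the inner fivefold tuning loop is omitted; only the LOO loop and the rank-based AUC correspond to what the text describes. All function names are hypothetical.

```python
def fit_centroids(X, y):
    """Stand-in model (NOT elastic net): per-feature means by class."""
    def mean_vec(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]
    cases = mean_vec([x for x, label in zip(X, y) if label == 1])
    controls = mean_vec([x for x, label in zip(X, y) if label == 0])
    return cases, controls


def score(model, x):
    """Composite score: projection onto the case-minus-control direction."""
    cases, controls = model
    return sum((xi - (a + b) / 2) * (a - b)
               for xi, a, b in zip(x, cases, controls))


def loo_scores(X, y, fit, predict):
    """Train on N-1 samples, score the left-out sample, cycle through all N."""
    out = []
    for i in range(len(X)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        out.append(predict(model, X[i]))
    return out


def auc(scores, labels):
    """AUC as the probability that a case outscores a control (ties = 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: one protein, three controls and three cases.
X = [[0.0], [0.2], [0.1], [1.0], [1.2], [1.1]]
y = [0, 0, 0, 1, 1, 1]
held_out = loo_scores(X, y, fit_centroids, score)
```

Scoring each sample with a model that never saw it is what keeps the resulting AUC from being inflated by fitting and evaluating on the same observations, although feature selection on the full sample (as noted in the limitations) can still introduce optimism.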

Participant characteristics
The overall average age was 55.1 (SD = 7.78), and the mean ages of the four groups were similar. The majority of the sample was Caucasian, and no significant racial/ethnic differences among cases and controls were observed (Table 1).

Differential protein analysis associated with PTSD and MCI
Subset analysis of the comorbid PTSD-MCI case group versus controls identified 16 Olink proteins at P < 0.05, of which six attained FDR < 0.1. Eleven of the 16 proteins were upregulated in cases. The six proteins significant at FDR < 0.1 were NCAN, BCAN, CTSS, MSR1, MDGA1, and CPA2; all six proteins were replicated in >50% of iterations of the Monte-Carlo experiment. On the other hand, subset analysis of PTSD-only cases versus controls identified 24 proteins at P < 0.05, of which two attained FDR < 0.1. In total, 22 out of these 24 proteins were upregulated in cases. The two proteins significant at FDR < 0.1 were CD302 and FLRT2; both were replicated in >70% of iterations of the Monte-Carlo experiment. Finally, subset analysis of MCI-only cases versus controls identified 20 proteins at P < 0.05, of which only one attained FDR < 0.1. Seven out of these 20 proteins were upregulated in cases. The protein significant at FDR < 0.1 was PVR, which was replicated in >80% of iterations of the Monte-Carlo experiment. Altogether, 50 unique proteins were obtained from the combined lists in subset analyses (Table 2). Several identified proteins had been previously implicated in other omics studies of PTSD and AD. Additional details on comparison of these proteins with recent omics studies of PTSD and AD are provided in Supplementary Text and Supplementary Tables 2-4. The Venn diagram comparing the overlap between the top proteins in subset analyses (Fig. 1) suggested that CTSS was the only common protein identified by all subset analyses at P < 0.05, whereas EFNA4 was in common between PTSD-MCI and PTSD-only analyses; BCAN, MDGA1, CPA2, and EPHA10 were in common between PTSD-MCI and MCI-only analyses; PVR, CD200, and ATP6V1F were in common between PTSD-only and MCI-only analyses.
Among the 50 unique proteins identified above, 39/50 showed consistent sign/direction in the estimated beta coefficients across the three subset analyses. The remaining 11 proteins were not among the proteins shared by any two subset comparisons. Across all 276 proteins examined in these analyses, the estimated beta coefficients for PTSD only versus controls and MCI only versus controls were moderately correlated (r = 0.345, P < 0.05) as shown in Fig. 2, suggesting that shared biological mechanisms may be involved in the two disorders.
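The count of 50 unique proteins follows from the three list sizes and the overlaps reported in the Venn diagram by inclusion-exclusion. The short Python check below assumes that the shared proteins named in the text are the complete set of overlaps; the variable names are illustrative only.

```python
# List sizes at P < 0.05 from the three subset analyses.
n_comorbid, n_ptsd_only, n_mci_only = 16, 24, 20

# Overlaps named in the Venn diagram description (assumed exhaustive).
all_three = {"CTSS"}
comorbid_and_ptsd = {"EFNA4"} | all_three
comorbid_and_mci = {"BCAN", "MDGA1", "CPA2", "EPHA10"} | all_three
ptsd_and_mci = {"PVR", "CD200", "ATP6V1F"} | all_three

# Inclusion-exclusion: sum of sizes, minus pairwise overlaps, plus the triple.
unique = (n_comorbid + n_ptsd_only + n_mci_only
          - len(comorbid_and_ptsd) - len(comorbid_and_mci) - len(ptsd_and_mci)
          + len(all_three))
print(unique)  # 50
```

That is, 60 list entries minus 11 pairwise overlaps plus the one protein counted in all three lists gives 50.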

PTSD-MCI-associated proteins linked to disease burden
Among the six proteins significant at FDR < 0.1 in the PTSD-MCI versus unaffected control analysis shown in Table 2, BCAN and NCAN showed monotonically decreasing protein expression across the control, PTSD-only, and PTSD-MCI subgroups, whereas CTSS, MSR1, MDGA1, and CPA2 showed monotonically increasing protein expression (Supplementary Fig. 2). The BIC scores are reported in Supplementary Table 5. All the proteins except NCAN achieved the lowest BIC scores in the H1 model (i.e., the protein expression of the PTSD-only subgroup was intermediary between PTSD-MCI and control), suggesting that these proteins are associated with disease burden of co-occurring PTSD and MCI compared with PTSD only. For NCAN, the BIC scores of the H1 and H3 models (i.e., control = PTSD only ≠ PTSD-MCI) were comparable, indicating that both models fit NCAN equally well. On the other hand, when the PTSD-only subgroup was replaced with the MCI-only subgroup, H1 was the best model only for NCAN. The protein expression of BCAN, CTSS, MDGA1, and CPA2 indicated that the MCI-only subgroup was similar to PTSD-MCI since the BIC scores for the H2 model (i.e., control ≠ MCI only = PTSD-MCI) were the lowest, whereas for MSR1, the MCI-only subgroup was similar to controls. These results suggest that the dysregulations of BCAN, CTSS, MDGA1, and CPA2 were most strongly associated with MCI.

Multiprotein composite score
The leave-one-out (LOO) cross-validation achieved an AUC = 0.84 in PTSD-MCI classification (Table 3) using the top 37 proteins associated with PTSD-MCI at P < 0.1 listed in Supplementary Table 6 as candidate features. The AUC was lower at 0.81 using the 16 proteins associated with PTSD-MCI (P < 0.05). Similarly, the LOO cross-validation achieved AUC = 0.83 and 0.84 in MCI-only classification using the 20 and 41 MCI-only associated proteins (P < 0.05 and P < 0.1), respectively. However, the LOO cross-validation only achieved AUC 0.77 in PTSD-only classification (Table 3) using the 52 PTSD-only associated proteins at P < 0.1 listed in Supplementary Table 6. The AUC was lower (0.68) using 24 PTSD-only associated proteins at P < 0.05 (Supplementary Table 7). In all three classification models, using all 276 proteins as candidate features achieved a lower AUC, suggesting that adding in other protein signals may induce noise (Supplementary Table 7). Taken together, the results from multiprotein composite scores indicated that the panel of proteins included in this study had larger discriminative power for MCI compared with PTSD.

Table footnotes: The P values were computed from one-way ANOVA (for age) and chi-squared test (for race). *Statistically significant after accounting for the false discovery rate (FDR < 0.10).

Discussion
Prior studies have shown that chronic PTSD in the responders to the World Trade Center disaster is associated with systemic and neuropsychiatric conditions including MCI 30,31 . Furthermore, in some instances, we demonstrated that not only was there an association, but that PTSD helps to mediate the development and chronicity of these conditions, and may be linked to possible early dementia 32 . The current study was the largest study to evaluate the molecular link between PTSD and MCI in the same cohort. It profiled a large set of proteins involved in a number of neurobiological processes, including neurological diseases, cellular regulation, immunology, cardiovascular function, inflammation, development, and metabolism. In this study, we systematically assessed changes in the proteome of WTC responders suffering from PTSD with and without comorbid MCI nearly two decades after the traumatic event, in order to identify biomarkers that could inform us about the biologic changes in our patients as well as the nature of the relationship between these conditions. We found that both MCI and PTSD were associated with serologic proteinopathy. The results also suggested that comorbid PTSD-MCI was likely a more severe form of PTSD rather than a separate condition. Last, we found that protein dysregulation was more systematically associated with MCI. As such, the multiprotein composite score provided us with a novel method to characterize and monitor patients with both MCI and PTSD and, if confirmed in independent studies, may ultimately give us insights into potential novel therapeutic interventions.
We identified 16 proteins associated with PTSD-MCI at P < 0.05 (six at FDR < 0.1), 24 proteins associated with PTSD only (two at FDR < 0.1), and 20 proteins associated with MCI only (one at FDR < 0.1), resulting in a total of 50 unique proteins from the combined lists. It is important to note that protein expression in the blood does not represent protein production in any specific tissue, per se, but rather proteins secreted into the blood from multiple organs and tissues. This is in contrast to gene expression analysis that is derived from a specific tissue. Nonetheless, although overall comparison with recent omics studies in AD showed that most of the top genes identified in these studies did not overlap with our targeted panel of 276 proteins as described in Supplementary Text, there were some that did as described below. Among these 50 proteins, only Cathepsin S (CTSS) was in common across the three subset analyses. Our analyses identified positive correlations between the estimated effects across the three subset analyses (r = 0.35-0.45), suggesting shared biological mechanisms across these two phenotypes. Notably, the gene encoding Cathepsin S (CTSS) had been found to be upregulated in the discovery cohort of Dean Hammamieh 33 , and plays an important role in antigen presentation and immune responses 34 . Single-nucleotide polymorphisms (SNPs) that map to the CTSS gene have been found to be associated with late-onset Alzheimer's disease (AD) 35 . Other members of the Cathepsin family have also been implicated in AD (Cathepsins B and D) 36,37 and schizophrenia (Cathepsin K) 38 . On the other hand, MAM domain-containing glycosylphosphatidylinositol anchor protein 1 (MDGA1) and ephrin type-A receptor 10 (EPHA10), which were identified in both the PTSD-MCI and MCI-only analyses, have been found to be associated with pathologic and clinical diagnoses of AD in the transcriptomes of postmortem brain 39 .
MDGA1 is implicated in the radial migration of cortical neurons of the neocortex 40 , whereas EPHA10 is involved in mobility in neuronal and epithelial cells and memory formation 41 . Similarly, V-type proton ATPase subunit F (ATP6V1F) and OX-2 membrane glycoprotein (CD200), which were identified in both the PTSD-only and MCI-only analyses, have been found to be differentially expressed in the transcriptomes of peripheral blood cells of patients with PTSD 33,42,43 . Based on the transcriptome mega-analysis results of Breen Tylee 43 (DE genes at P < 0.05 for each trauma-specific case-control cohort as evident in Supplementary Table 2 of Breen study), ATP6V1F and CD200 showed consistent effect-size direction in transcriptomic regulation compared with the proteomics results in our data. Specifically, ATP6V1F was downregulated in the gene expression of emergency-department trauma survivors 42 , consistent with the protein expression in our data. In addition, loss of function of ATP6V1F has been shown to be a potential enhancer of tau toxicity, a hallmark of AD 44 . Yet, CD200 was upregulated in childhood trauma and interpersonal trauma subgroups 45 , consistent with our proteomics data. CD200 expression was shown to be downregulated in the hippocampus and inferior temporal gyrus of AD patients 46 .
The authors further showed that lower expression of CD200 receptor was observed in microglia compared with blood-derived macrophages. Thus, we hypothesized that the upregulation of CD200 in plasma samples of our study could be a consequence of cell migration to blood through the blood-brain barrier.
The top two proteins, namely neurocan (NCAN) and brevican (BCAN) core proteins, identified from analyses of PTSD-MCI versus controls showed monotonically decreasing protein expression patterns across the PTSD-only and MCI-only subgroups, suggesting that these proteins are candidate biomarkers for disease burden characterized by co-occurrence of PTSD and MCI. Genetic variation in NCAN has been shown to be a common risk factor for bipolar disorder and schizophrenia 47 , as well as in MCI 48 . In addition, NCAN and BCAN are members of the chondroitin sulfate proteoglycan (CSPG) protein families, and CSPGs are implicated in neurodegenerative diseases 49 . Specifically, CSPGs have been shown to accumulate in senile plaques in brains of patients with AD 49 , potentially suggesting that fewer CSPGs will penetrate into the blood in AD. Together with the previous epidemiologic findings that PTSD is associated with long-term cognitive decline 30,50 , this suggests that NCAN and BCAN may constitute novel biomarkers contributing to processes by which PTSD affects cognitive functioning.
The multiprotein composite score based on top PTSD-MCI and MCI-only associated proteins achieved a high accuracy (AUC = 0.84) in PTSD-MCI and MCI-only classification, respectively. On the other hand, the multiprotein composite score based on top PTSD-only associated proteins achieved AUC = 0.77 in PTSD-only classification. These results suggested that the proteins included in this study have a larger discriminative power for MCI compared with PTSD. We also found a robust association between the composite score and PTSD and MCI symptom severity. This suggested that the current multiprotein composite score may be further refined into a useful index that aids in classification.

Strengths and limitations
This study has several strengths, including a large-scale high-precision multiplexed proteomic analysis of a large number of neurological, inflammatory, and immune-related proteins using validated panels, and a common trauma in all participants including controls. Nonetheless, our findings must be considered in the context of several limitations. First, our study is cross-sectional, which can establish concurrent associations between protein expression, PTSD, and MCI. However, the direction of the associations cannot be determined. Longitudinal studies of linkages between change in symptom severity and change in protein expression are needed to determine the direction of the effects we observed. Second, potential confounders, such as the level of trauma exposure and comorbid medical conditions, were not considered. Third, the multiprotein composite score was constructed based on the proteins identified from the same study samples. Although we used a LOO cross-validation prediction scheme to reduce the bias in model evaluation, it is important to replicate the composite score in an independent validation cohort. Fourth, although our study covered a wide spectrum of proteins, it is a targeted proteomics study and may therefore miss changes in proteins that were unobserved in this study. In addition, the multiprotein composite score indicated that the current proteomics panel can discriminate MCI from control at high accuracy; however, the accuracy is lower in PTSD classification. It remains uncertain whether PTSD classification accuracy would be improved by surveying other proteins. Mass spectrometry is a competing platform for more comprehensive and hypothesis-free protein coverage. However, absent a targeted hypothesis, this platform requires a much larger sample size to rule out the greater numbers of false positives.

Table 3 Leave-one-out cross-validation prediction performance on models trained on subsets of (a) PTSD-MCI, (b) PTSD only, and (c) MCI only versus controls.
Classification | Candidate feature set | AUC | Correlation with PCL | Correlation with MoCA score
PTSD-MCI versus control | 37 PTSD-MCI-associated Olink proteins at P < 0.1 from Supplementary Table 6 | 0.84 | 0.57 (P < 0.001) | −0.54 (P < 0.001)
PTSD only versus control | 52 PTSD-only associated proteins at P < 0.1 from Supplementary Table 6 | 0.77 | 0.36 (P < 0.001) | −0.12 (P = 0.18)
MCI only versus control | 41 MCI-only associated proteins at P < 0.1 from Supplementary Table 6 | 0.83 | 0.24 (P = 0.01) | −0.52 (P < 0.001)

Conclusion
To conclude, the current study identified several novel protein biomarkers for PTSD, MCI, and their co-occurrence. Many of these proteins have previously been implicated in other neurological and psychiatric disorders, in particular AD and schizophrenia. We also found substantial similarities in the profile of protein alterations of PTSD and MCI. This coincides with the evidence of shared heritability and molecular similarities across common brain disorders 51 . Our study further derived a multiprotein composite score that, upon replication and pending further refinement, could aid development of a practical, plasma-based assay for classifying PTSD, MCI, and comorbid PTSD-MCI. Ultimately, the composite score could potentially be used to monitor patients longitudinally.