Diversity of Amyloid-beta Proteoforms in the Alzheimer’s Disease Brain

Amyloid-beta (Aβ) plays a key role in the pathogenesis of Alzheimer’s disease (AD), but little is known about the proteoforms present in AD brain. We used high-resolution mass spectrometry to analyze intact Aβ from soluble aggregates and insoluble material in brains of six cases with severe dementia and pathologically confirmed AD. The soluble aggregates are especially relevant because they are believed to be the most toxic form of Aβ. We found a diversity of Aβ peptides, with 26 unique proteoforms including various N- and C-terminal truncations. N- and C-terminal truncations comprised 73% and 30%, respectively, of the total Aβ proteoforms detected. The Aβ proteoforms segregated between the soluble and more insoluble aggregates, with N-terminal truncations predominating in the insoluble material and C-terminal truncations segregating into the soluble aggregates. In contrast, canonical Aβ comprised a minority of the identified proteoforms (15.3%) and did not distinguish between the soluble and more insoluble aggregates. The relative abundance of many truncated Aβ proteoforms did not correlate with post-mortem interval, suggesting they are not artefacts. This heterogeneity of Aβ proteoforms deepens our understanding of AD and offers many new avenues for investigation into pathological mechanisms of the disease, with implications for therapeutic development.

Retention time drifts were no more than ±1 min, with the exception of peptide LSQKFPK (from 18.5 min to 16.1-16.7 min) in the first run, likely due to column equilibration at the beginning of the experiment. D. Mass accuracy in parts-per-million (ppm) of 29 BSA peptides over the four-day course of the CDR3 cohort DDA experiment. Mass accuracy did not demonstrate a substantial drift (e.g., ±15 ppm). These system suitability metrics for DDA (i.e., untargeted) proteomic experiments are extracted from the data directly and do not require database searches (i.e., they are identification-free metrics). These metrics allow detection of sub-optimal instrument performance which, in untargeted proteomics experiments, can waste biological samples and lead to incorrect conclusions. Overall, these metrics were stable throughout the experiment.
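As an illustration of the identification-free checks described in this caption, the following minimal Python sketch computes a signed mass error in ppm and flags values that drift beyond a tolerance. The function names, the m/z values, the later-run retention times, and the use of ±15 ppm and ±1 min as pass/fail limits are assumptions made for illustration; only the LSQKFPK retention times (18.5 min expected, 16.1 min in the first run) come from the text above.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error of one measurement, in parts-per-million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def flag_drift(values, reference, tolerance):
    """Return True for every value further than `tolerance` from `reference`."""
    return [abs(v - reference) > tolerance for v in values]


# Hypothetical m/z measurements of one standard peptide across three runs.
theoretical_mz = 500.0000
observed_mz = [500.0010, 499.9990, 500.0100]
ppm_errors = [ppm_error(mz, theoretical_mz) for mz in observed_mz]

# Assumed limit of +/-15 ppm around zero error.
mass_flags = flag_drift(ppm_errors, reference=0.0, tolerance=15.0)

# LSQKFPK: 18.5 min expected, 16.1 min in the first run (from the caption);
# the two later values are hypothetical.
rt_flags = flag_drift([16.1, 18.4, 18.9], reference=18.5, tolerance=1.0)

print([round(e, 1) for e in ppm_errors], mass_flags, rt_flags)
```

With these inputs, only the third mass measurement and the first-run retention time are flagged, which mirrors how a single outlying run would stand out against otherwise stable metrics.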

Summary
In summary, we have developed a method for further enrichment of Aβ from the other, as-yet-uncharacterized protein components of the soluble Aβ aggregates and a top-down (undigested) nLC-MS/MS method for the mass spectrometric characterization of Aβ proteoforms (Fig. 1, Supp. Fig. 1-3, Supp. Table 1-2). Applying this combined approach to soluble aggregates and insoluble material in brains of 6 cases with severe dementia and pathologically confirmed AD, we found a diversity of Aβ peptides: 26 unique proteoforms in total.
There are many unanswered questions arising from the data presented:
1. How do the observed truncations and PTMs affect AD pathology, including the age of presentation of clinical dementia, the rate of cognitive decline, and plaque growth/deposition?
2. If a large portion of Aβ peptide in both soluble and insoluble aggregates is modified (e.g., truncated), how does this affect the binding, and thus the accuracy, of PET-PIB and other amyloid-binding agents for imaging studies?
3. If Aβ is ~0.1% of total soluble aggregates based on Aβ1-x ELISA, what is the composition of the ~99.9% as-yet-uncharacterized protein components of the soluble aggregates?
4. How do the truncations and PTMs affect Aβ properties such as solubility, aggregation propensity, toxicity, interaction with 'binding partners' (e.g., prion protein) 1, or recognition by microglia, the brain's immune response system?
5. Would we expect the same level of heterogeneity in the familial early-onset cases of AD?
6. What is the proteoform diversity in Aβ monomers and plaques, and which proteoforms are predominant in each fraction?
7. Is there any correlation between Apolipoprotein E genotypes and particular Aβ proteoforms?
8. Would any of these proteoforms be candidate pharmacodynamic markers for AD treatments?
We are in the early stages of characterizing the Aβ proteoforms present in the soluble Aβ aggregates across the spectrum of AD progression using our top-down nLC-MS/MS approach, followed by absolute quantitation of proteoforms that segregate AD from cognitively intact, high-pathology individuals. A full characterization of the Aβ proteoforms in a cross-sectional study of patient cohorts is beyond the scope of this communication.

Relationship to Previous Studies
Post-translational modifications (PTMs) are chemical modifications to proteins 2. PTMs can result in changes in the physicochemical and biochemical properties of proteins, increasing the range of functional outcomes and providing an additional level of regulatory control 3,4. The site localization, chemical nature of the PTM, and the microheterogeneity serve to precisely fine-tune the "information carrying capacity" of proteins within cellular networks 2,5. One of the first described Aβ PTMs was pyro-glutamate at the third and eleventh amino acid residues of Aβ in human AD brain 6. However, analysis of Aβ proteoforms has been largely restricted to examination of plaques 6-10 or, more recently, CSF 11-13. These studies, particularly the plaque-based studies, came to varying conclusions as to which Aβ proteoform was most abundant. This lack of consensus is likely due to variations in the methodology used to isolate Aβ and in the tools used to identify the sequence structure. Moreover, few studies have examined the PTMs present on soluble Aβ aggregates from either human tissue or transgenic animal models. PTMs on soluble Aβ examined to date include glutamine deamidation 14 and di-tyrosine crosslinking of synthetic Aβ 15, recombinantly expressed Aβ from CHO cells 16, and N-terminal extension 17 and O-linked glycosylation of CSF Aβ 18. One study identified phosphorylation of serine 8 in transgenic mice and human AD tissue 19. We found 26 unique Aβ proteoforms (Fig. 2) in severe AD brain. Of the 26 proteoforms identified, none were phosphorylated 19, glycosylated 18, nitrated 20, or di-tyrosine linked 20. Nor did we identify N-terminal extension; instead, we found extensive N- and C-terminal truncations. These negative results should be interpreted with caution, as we have not determined whether the methods we used would have been sufficient to detect these other PTMs or extended peptides.
However, the clear trend of N-terminally truncated Aβ proteoforms being more abundant in the more insoluble fraction than in the soluble aggregates is in line with early work demonstrating, by sedimentation analyses, that these types of peptides display enhanced aggregation 21.

Advantages of our approach
The advantages of our approach include: i) the application of top-down (i.e., intact, undigested) mass spectrometry to Aβ, which allows the unambiguous identification of the peptide and all its combinatorial truncations and PTMs; ii) the high resolution and high mass accuracy of measurements of the intact Aβ peptide and its fragment ions via tandem mass spectrometry (MS/MS), which provide precise peptide sequencing information; and iii) the discovery of both known and novel variations at the proteome level. Other techniques, like matrix-assisted laser desorption/ionization (MALDI), do not offer the same level of resolution and mass accuracy. In this study, it was the resolution and mass accuracy of the parent and fragment ions that allowed the unambiguous identification of peptide sequences and PTMs. Importantly, we employed MS/MS to fragment the peptides, producing fragment ions for direct de novo sequencing, a technique not often applied in MALDI-based studies 11. Finally, chromatographic separation of proteoforms before mass spectrometry analysis reduces sample complexity and improves resolution.

Limitations
While our approach offers numerous advantages, it is not without its limitations. There are numerous reports that detail types of proteoforms identified either by mass spectrometry or biochemistry 20, offering some insight as to what Aβ proteoforms we might expect to find in human AD brain. However, an extraction of soluble HMW Aβ aggregates as detailed in our previous work 22, followed by high-resolution mass spectrometry, has never been performed before; thus, in practice we had no a priori knowledge. In our approach, we used a mass spectrometry data acquisition method commonly referred to as data-dependent acquisition (DDA). In DDA, MS/MS scans of a given proteoform are acquired once the proteoform has been detected in the preceding MS scan. A limitation of this approach is that MS/MS spectra are acquired in a stochastically sampled manner; in other words, the most abundant proteoforms are targeted for MS/MS before less abundant proteoforms. Further, since this is an untargeted approach, due to the lack of a priori knowledge, not every MS/MS spectrum acquired is guaranteed to provide all the fragment ions needed to unambiguously determine a given sequence and/or PTM. An alternative approach in future studies would be data-independent acquisition (DIA), which also requires no a priori knowledge but, unlike DDA, fragments all precursors in a defined m/z window (MS/MS-based quantification 23). The result is effectively a lower limit of detection (greater sensitivity) compared to DDA (MS-based quantification) 23 due to increased selectivity 23,24. However, DIA has generally been used and optimized for "bottom-up" (i.e., proteolytically digested) samples 25 rather than "top-down" (undigested) samples. Coupled to these limitations, we used differential mass spectrometry (dMS) for proteoform quantitation in this study.
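The abundance-ordered precursor selection that underlies DDA can be illustrated with a minimal Python sketch. This is a simplified "top-N" scheme under stated assumptions, not the acquisition logic of the instrument used in this study: the function name, the value of N, the exclusion list, and all m/z and intensity values are hypothetical.

```python
def select_precursors(survey_scan, top_n, excluded=frozenset()):
    """Pick the `top_n` most intense precursors from one MS survey scan.

    `survey_scan` is a list of (m/z, intensity) pairs; precursors whose m/z
    is in `excluded` (e.g., recently fragmented ions) are skipped.
    """
    candidates = [(mz, inten) for mz, inten in survey_scan if mz not in excluded]
    candidates.sort(key=lambda pair: pair[1], reverse=True)  # most intense first
    return [mz for mz, _ in candidates[:top_n]]


# Hypothetical survey scan: four precursors with very different intensities.
scan = [(1083.5, 9.0e5), (866.9, 4.0e6), (1128.3, 2.0e4), (902.7, 7.0e5)]

first_cycle = select_precursors(scan, top_n=2)
# Excluding the most intense ion lets a lower-abundance precursor be sampled.
second_cycle = select_precursors(scan, top_n=2, excluded={866.9})

print(first_cycle, second_cycle)
```

In this toy example the least intense precursor (m/z 1128.3) is never selected in either cycle, which is the sense in which low-abundance proteoforms can go unfragmented in a DDA experiment.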
Labels, both chemical and isotopic, that are commonly employed in bottom-up experiments are not readily applicable to quantitation in top-down proteomics. It is often difficult to chemically label intact proteins 26-28. Stable isotope labels, for example, are not guaranteed to be incorporated throughout the entire protein and can yield data that are difficult to interpret because of overlapping isotopic envelopes and fragment ions 27. Thus, we used dMS, which is reported to detect changes as low as 2-fold in protein or peptide expression between sample sets 29-31. Until the issue of labeling intact proteins is addressed in the field, quantitation in top-down proteomics experiments will remain largely limited to dMS.
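To make the 2-fold figure concrete, the following Python sketch compares the mean intensity of one proteoform between two sample sets and tests whether the change reaches a 2-fold threshold. It is a deliberately reduced illustration, not the dMS workflow used in this study (which also involves steps such as feature alignment and statistical testing); the function names and intensity values are hypothetical.

```python
import math
from statistics import mean


def log2_fold_change(group_a, group_b):
    """log2 of the ratio of mean intensities (group_a over group_b)."""
    return math.log2(mean(group_a) / mean(group_b))


def exceeds_fold_change(group_a, group_b, min_fold=2.0):
    """True if the change in either direction is at least `min_fold`."""
    return abs(log2_fold_change(group_a, group_b)) >= math.log2(min_fold)


# Hypothetical label-free intensities of one proteoform in two fractions.
insoluble = [8.0e5, 1.0e6, 1.2e6]
soluble = [2.0e5, 2.5e5, 3.0e5]

change = log2_fold_change(insoluble, soluble)     # 4-fold, i.e. log2 = 2
detectable = exceeds_fold_change(insoluble, soluble)
too_small = exceeds_fold_change([1.0e6], [8.0e5])  # 1.25-fold

print(change, detectable, too_small)
```

Working on the log2 scale treats increases and decreases symmetrically, so a 2-fold drop and a 2-fold rise are held to the same threshold.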
Ultimately, the question of whether the N- and C-terminal truncations and post-translational modifications of Aβ reflect findings relevant to living AD patients or post-mortem artefact remains unanswered despite our correlation analysis. Initial attempts to discern whether C-terminal truncations were an artefact of sample preparation by using 18O water demonstrated incorporation of one to two heavy oxygen isotopes into the C-terminus of canonical Aβ1-42 (data not shown). Brain biopsy in living humans undergoing a craniotomy to relieve normal pressure hydrocephalus, followed by immediate preparation by our protocol 22, would help address this question. The fact that several truncated Aβ proteoforms were found in the CSF of living human AD patients 12,13,17,32 reduces the likelihood that these specific proteoforms are the result of post-mortem or sample preparation artefacts. Yet, whether the heterogeneity of the proteoforms observed is a reflection of the 'age' of the peptide in brain (with more truncations and PTMs reflecting a longer-lived peptide subjected to more processing) or whether Aβ is rapidly processed after canonical beta- and gamma-secretase cleavage of amyloid precursor protein remains unresolved 33,34. Future work utilizing either BACE1 inhibitors or general aspartic acid protease inhibitors in the homogenization buffer could be a step forward in resolving some aspects of this issue, particularly with Aβp11-42 and Aβ11-42 (Supp. Fig. 2 and Supp. Fig. 9). Further, while we take great care to remove all leptomeningeal and intraparenchymal vessels to the fullest extent possible, it is impossible with our current method and that of others (to the best of our knowledge) to fully exclude all blood vessels (including capillaries). Given this limitation, we cannot exclude the possibility that Aβ proteoforms from vessel walls contribute to our data. Aβ proteoforms from vessel walls may represent a different pool of proteoforms in AD brain.
Lastly, given the surprising level of heterogeneity observed in Aβ, it is now evident that our Aβ1-x ELISA underestimates the total amount of Aβ, since the assay uses HJ3.4 as the detection antibody, which requires Aβ to have a free (unmodified) canonical Aβ N-terminus. As a result, an assessment of any Aβ proteoforms lost more readily than others throughout the purification process and C8 micro-column procedure is lacking. In the absence of such an assay, future studies will employ an isotopically labeled internal Aβ standard for data normalization.
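The underestimation argument can be sketched in Python by parsing proteoform names and summing the abundance of only those species with a free canonical N-terminus. Everything here is illustrative: the naming pattern (a "p" prefix taken to mean an N-terminal pyroglutamate, as in Aβp11-42), the rule that only unmodified species starting at residue 1 are detected, and all abundance values are assumptions, not measurements from this study.

```python
import re

# Assumed naming pattern: "Aβ", optional "p", then start and end residues.
NAME_PATTERN = re.compile(r"Aβ(p?)(\d+)-(\d+)")


def parse_proteoform(name):
    """Split a name such as 'Aβp11-42' into (pyroglutamate?, start, end)."""
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized proteoform name: {name}")
    return match.group(1) == "p", int(match.group(2)), int(match.group(3))


def has_free_canonical_n_terminus(name):
    """True if the peptide starts at residue 1 with an unmodified N-terminus."""
    pyro, start, _ = parse_proteoform(name)
    return start == 1 and not pyro


# Hypothetical relative abundances (arbitrary units) for five proteoforms.
abundances = {
    "Aβ1-42": 30.0,
    "Aβ1-40": 10.0,
    "Aβ4-42": 15.0,
    "Aβ11-42": 20.0,
    "Aβp11-42": 25.0,
}

detected = sum(a for n, a in abundances.items() if has_free_canonical_n_terminus(n))
fraction_detected = detected / sum(abundances.values())

print(fraction_detected)
```

Under these invented numbers an N-terminus-dependent assay would report only 40% of the total, which illustrates why an assay or internal standard that does not depend on the N-terminus is needed.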

Priorities for future research
1. Future studies will include a broader exploration of the spectrum of Aβ proteoforms present in the AD brain across disease severity (mild cognitive impairment, mild dementia, and severe dementia) compared to non-demented controls with high pathology and older healthy controls. Certain Aβ proteoforms may play an important role in the pathogenesis of AD. For example, if the predominant pathogenic Aβ proteoforms were C-terminally truncated, monoclonal antibodies targeting C-terminally truncated neoepitopes of Aβ would have a higher chance of success in clinical trials than those targeting the N-terminus. Many monoclonal antibodies currently in clinical trials target the canonical N-terminus and have not shown success 35; likewise, the mid-domain monoclonal antibody solanezumab has also recently shown disappointing results 36. However, we cannot exclude the possibility that timing, dosage, or combination of therapeutics may still yield an effect in reducing disease progression.
2. The characterization of Aβ proteoforms could lead to the targeted analysis of CSF or plasma in non-demented patient cohorts to identify which proteoforms may be predictive markers for progression to clinical dementia.
3. Given the level of heterogeneity observed in Aβ, the development of an ELISA with capture and detection antibodies in the mid-domain would provide more accurate orthogonal measurements of total Aβ. However, the problems of cross-reactivity with other amyloid precursor protein fragments and mid-domain steric hindrance would need to be addressed. If successful, this would likely improve consistency among reports that use ELISA measurements of Aβ to make therapeutic, prognostic, or diagnostic assessments.

Implications
The present findings provide an initial profile of Aβ proteoforms derived from soluble and more insoluble Aβ aggregates in human AD brain. Readily apparent from the data is the diversity of Aβ in human tissue. These data clearly demonstrate our dearth of knowledge about Aβ and the properties of these modified forms, which will remain unknown until further investigation.
Overall, this work has three implications. i) It provides an operational template for advancing our understanding of Aβ proteoform abundance and expression during disease progression, which will inform our therapeutic efforts. ii) It will facilitate direct comparison of human Aβ proteoforms to those found in transgenic animal models to determine which, if any, faithfully recapitulate those found in human AD 37,38. Such an animal model would be the logical choice for novel AD therapeutic development, though it is possible that no existing model accurately mimics the human disease, which would prompt the development of new model systems. iii) It will enable direct comparison with Aβ proteoforms in the CSF to provide diagnostic and prognostic markers that distinguish AD patients from non-AD patients and monitor clinical trial treatment effects. Furthermore, the purification scheme and the top-down platform developed here may be used as a framework for the assessment of other human neurodegenerative diseases with hallmark proteinopathies, such as those involving alpha-synuclein, prion protein, and superoxide dismutase 1, and may therefore help identify relevant therapeutic targets.