Suprabasin-derived bioactive peptides identified by plasma peptidomics

Identification of low-abundance, low-molecular-weight native peptides using non-tryptic plasma has long remained an unmet challenge, leaving potential bioactive/biomarker peptides undiscovered. We have succeeded in efficiently removing high-abundance plasma proteins to enrich and comprehensively identify low-molecular-weight native peptides using mass spectrometry. Native peptide sequences were chemically synthesized and subsequent functional analyses resulted in the discovery of three novel bioactive polypeptides derived from an epidermal differentiation marker protein, suprabasin. SBSN_HUMAN[279–295] potently suppressed food/water intake and induced locomotor activity when injected intraperitoneally, while SBSN_HUMAN[225–237] and SBSN_HUMAN[243–259] stimulated the expression of proinflammatory cytokines via activation of NF-κB signaling in vascular cells. SBSN_HUMAN[225–237] and SBSN_HUMAN[279–295] immunoreactivities were present in almost all human organs analyzed, while immunoreactive SBSN_HUMAN[243–259] was abundant in the liver and pancreas. Human macrophages expressed the three suprabasin-derived peptides. This study illustrates a new approach for discovering unknown bioactive peptides in plasma via the generation of peptide libraries using a novel peptidomic strategy.

Human plasma is the most commonly sampled diagnostic biospecimen and is a potentially informative resource for characterizing proteomes 1 . Low-molecular-weight peptides that circulate in plasma as hormones, cytokines and growth factors constitute an essential part of the homeostatic regulatory mechanism of many physiological processes. Despite recent successes in tryptic plasma-based proteome analyses 2,3 , an important unmet challenge is to comprehensively identify plasma native peptide sequences that have undergone endogenous proteolytic processing 4 . Polypeptide sequences stored in protein databases consist almost exclusively of trypsin-digested sequence information of proteins derived from tissues/organs, secreted products and some biological fluids 1 . The concentration range of currently identified plasma proteins/peptides spans roughly 12 orders of magnitude, from highly expressed proteins to low-abundance bioactive/biomarker peptides 4,5 . This makes direct in-depth plasma peptidomics profiling one of the most challenging tasks in analytical biochemistry 6,7 and represents a major limiting factor for identifying bioactive/biomarker peptides and their receptors. Our previous approach to identify novel bioactive peptides used full-length cDNA library information and the synthesis of putative native peptide sequences estimated from the proteolytic processing of secretory proteins. We then explored the biological activities of these peptides and confirmed their immunoreactive presence in human peripheral circulation 8,9 . This in silico methodology enabled the discovery of potent bioactive peptides possessing physicochemical characteristics that prevent their identification by conventional methods; however, to confirm their discovery, exact native amino acid sequences in the peripheral circulation need to be demonstrated 10 .
We attempted to improve our high-yield plasma extraction technique 11 to allow comprehensive identification of undiscovered native peptides using mass spectrometry without enzymatic digestion of plasma proteins. Our original method, which we named "differential solubilization", efficiently depleted plasma of highly abundant proteins with minimal loss of low-molecular-weight peptides. This efficiently enriched native peptides that were both unbound and bound to carrier proteins, but not to the level required to detect important low-molecular-weight bioactive peptides by mass spectrometry. Here we describe optimization of the protocol to further remove residual plasma proteins from extracts enriched by the above method and to improve the efficiency of peptide

Results
Peptidomic strategy for the analysis of non-tryptic human plasma. We used 2-200 μL of pooled human plasma from four healthy individuals to deplete high-abundance proteins and to further remove residual high-molecular-weight proteins using our modified "differential solubilization" methodology 11,12 . We confirmed using tricine SDS-polyacrylamide gel electrophoresis that this procedure maximally enriched for low-molecular-weight native peptides that had been both bound and unbound to plasma carrier proteins as described 11,12 . Enriched eluates, with or without the reductive alkylation procedure, were either analyzed directly without prefractionation by liquid chromatography-tandem mass spectrometry (LC-MS/MS) or subjected to reversed-phase high performance liquid chromatography (RP-HPLC) for prefractionation into 8 or 13 fractions by a 'cyclic sample pooling technique' prior to LC-MS/MS 21 . To minimize hydrophobic peptide loss, we used siliconized tubes in all procedures and LC-MS/MS compatible surfactants for repetitive enrichment and solvent exchange during lyophilization and redissolution 22 . Subsequent LC-MS/MS analyses of both unseparated and prefractionated extracts yielded high resolution/high sensitivity data. These data were searched with PEAKS, a database search built on de novo sequencing 10 . The above process facilitated the direct detection of plasma native peptides at concentrations as low as the picomolar range. This resulted in the identification of more than 7959 distinct native peptide sequences with peptide identification false discovery rates (FDR) of 0% after excluding peptides derived from the keratin protein family. The peptidomics data identified by mass spectrometry have been deposited into the ProteomeXchange Consortium via the PRIDE 23 partner repository with the dataset identifier PXD003533, which contains other peptides identified by a PEAKS PTM search in addition to the 7959 peptides identified.
Synthetic polypeptides of the identified novel sequences. We sought to discover novel bioactive peptides using synthetic polypeptides designed from the identified sequences. We selected native peptide sequences that were less than 39 amino acid residues in length, identified with an FDR of 0%, uniquely assigned to a single secretory protein, and without any amino acid substitutions/modifications. Of the 104 peptides we initially synthesized, nine showed insufficient solubility and/or low purity by LC-MS/MS analysis. Thus, the remaining 95 peptides (Supplementary Table S1) were tested for their ability to induce cellular responses in cultured mammalian cells or to modulate spontaneous animal behaviors as described below.
(Fig. 3a,b), and was blocked by pretreatment with a dihydropyridine-sensitive calcium channel antagonist, nicardipine (Fig. 3e,

Anorexigenic and anti-dipsogenic effects of SBSN_HUMAN[279-295].
We next studied whether any of the 95 synthetic peptides could regulate appetite, thirst, or spontaneous behaviors more potently than already known anorexigenic/orexigenic peptides. Mice were intraperitoneally injected with 0, 10 or 100 pmol synthetic peptide prior to the start of a dark phase and eating, drinking, and locomotor activities were continuously monitored throughout the dark phase 24,25 . We found that another 17 amino acid residue peptide derived from suprabasin, GQGVHHTAGQVGKEAEK {SBSN_HUMAN[279-295] or suprabasin(279-295), monoisotopic mass 1732.8725} (Figs. 1a,b, 2c), suppressed food and water intake and induced spontaneous locomotor activity at both 10 pmol/mouse and 100 pmol/mouse (Fig. 7, ANOVA, p < 0.0001). An inhibitory effect of SBSN_HUMAN[279-295] on food intake was significant as early as 15-45 min following injection, and lasted for at least 120 min. These results demonstrated SBSN_HUMAN[279-295] to be a potent, endogenous, peripherally acting anorexigenic peptide in human plasma.
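The mass quoted for GQGVHHTAGQVGKEAEK can be checked with a short calculation. The sketch below sums standard monoisotopic residue masses, adds one water for the peptide termini, and then adds one proton. The residue table covers only the amino acids present in this peptide, and the function names are illustrative rather than taken from the study. On this arithmetic, the reported value of 1732.8725 appears to correspond to the singly protonated species [M+H]+, with the neutral peptide about one proton mass lighter; that reading is an inference from the calculation, not a statement made in the text.

```python
# Monoisotopic residue masses (Da) for the amino acids found in this peptide only.
MONO_RESIDUE = {
    "G": 57.021464,
    "A": 71.037114,
    "V": 99.068414,
    "T": 101.047679,
    "Q": 128.058578,
    "K": 128.094963,
    "E": 129.042593,
    "H": 137.058912,
}
WATER = 18.010565   # H2O added for the free N- and C-termini
PROTON = 1.007276   # mass of a proton


def neutral_mass(sequence):
    """Monoisotopic mass of the unmodified, uncharged peptide."""
    return sum(MONO_RESIDUE[aa] for aa in sequence) + WATER


def mh_plus(sequence):
    """Monoisotopic mass of the singly protonated peptide, [M+H]+."""
    return neutral_mass(sequence) + PROTON


peptide = "GQGVHHTAGQVGKEAEK"  # SBSN_HUMAN[279-295]
print(len(peptide), round(neutral_mass(peptide), 4), round(mh_plus(peptide), 4))
```

The same two functions can be reused for the other suprabasin-derived peptides once their residues are added to the table.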

Presence of suprabasin-derived peptides in human plasma, tissues and cells.
To determine accurate plasma levels of the three suprabasin-derived peptides, all of which have high hydrophilicity, we synthesized respective stable isotope labelled peptides (Fig. 2). We then spiked human plasma samples with serial dilution of the peptides prior to extraction. LC-MS/MS analyses were performed to generate the extracted ion chromatogram (XIC) intensities 26

Discussion
In this report, we describe a peptidomic strategy for non-tryptic plasma, which successfully identified a large number of distinct native peptides in human peripheral circulation. Using this peptidomic resource, we performed functional validation studies of synthesized peptides and ultimately unraveled the potent biological activities of three endogenous peptides. We did not employ proteomic differential display analysis or search for factors that showed distinct plasma level differences between healthy and disease samples; this is because most endogenous bioactive peptides are secreted locally and exert their effects in endocrine and/or paracrine fashions without significantly altering plasma concentrations or tissue expression levels. Therefore, we created the current strategy to first accurately identify as many native peptide sequences as possible that are universally present in healthy human peripheral circulation, and to functionally screen many of these factors to discover the most powerful endogenous regulators. We tested the ability of the initial series of 95 peptides to elicit intracellular responses in a cell culture model system and to modulate spontaneous animal behaviors. We subsequently performed extensive validation studies to confirm their functions and their systemic expressions in human tissues and cells. These initial attempts resulted in the discovery of three novel bioactive peptides derived from the same precursor protein, suprabasin, which have potent and desired biological activities.
Our results revealed SBSN_HUMAN[225-237] and SBSN_HUMAN[243-259] to be novel endogenous stimulators of NF-κB and inducers of cytokines, chemokines and growth factors, which are well-known to play pivotal roles in inflammation and cancer biology. NF-κB is an inducible dimeric transcription factor 27 and upon dissociation from the inhibitor protein, IκB-α, which is degraded by the 26S proteasome, it is translocated to the nucleus and binds to specific DNA sequences in the promoter regions of multiple genes 28
43 , for example, reduce food intake in mice after intraperitoneal administration of 100 µg/kg, 25 mg/kg and 4 µg/kg, respectively, which correspond to approximately 6 to 8 orders of magnitude higher doses than those we observed with SBSN_HUMAN[279-295]. In the current study, we carefully measured food and water consumption and locomotor activities using a system that accurately measures these spontaneous behaviors over the entire dark phase of the diurnal cycle 24,25 because the majority of spontaneous rodent activities occur during the dark phase 44 . Considering the plasma concentration of SBSN_HUMAN[279-295] determined to be 1.5 nM using a stable isotope-tagged peptide technique and that very low doses exerted anorexigenic and anti-dipsogenic effects, endogenous SBSN_HUMAN[279-295] is potentially involved in the maintenance and/or regulation of eating and drinking behaviors.
The efficiency and accuracy of our current success in enriching and identifying low-molecular-weight native peptides embedded in a vast amount of plasma proteins is analogous to correctly selecting and identifying several hundred sesame seeds randomly scattered in an Olympic-size swimming pool filled with an enormous variety of beans. Our technique to enrich native peptides in plasma is comparable to removing the 'beans' from the pool to as few as 100,000 without significantly losing or damaging the 'sesame seed' population, while subsequent fractionation of the sesame-seed-rich fluid enables accurate identification of these sesame seeds. Thus, our unique non-tryptic peptidomic strategy using a single drop of human plasma enables us to commence high throughput functional screening using the identified endogenous "orphan ligands" library.
Figure legend. (a-f) Synthetic SBSN_HUMAN[279-295] was intraperitoneally injected to ad libitum watered and fed mice approximately 30 min before the onset of the dark phase, and cumulative food intake (a,d), water intake (b,e) and spontaneous locomotor activities (c,f) were monitored throughout the entire dark phase of the diurnal cycle. Cumulative food and water intake are expressed in grams and mL, respectively, and physical activity is represented by infrared beam interruption counts for non-treated mice (open circles), and for mice treated with 10 pmol/mouse (a-c) or with 100 pmol/mouse (d-f) SBSN_HUMAN[279-295] (closed circle) during the initial 180 min period. *p < 0.05, **p < 0.01 compared with control animals. Data are expressed as the mean ± S.E.M. (n = 6-7 mice per group).
In conclusion, our current approach identified a huge number of endogenous peptide sequences far more rapidly than previous efforts and our initial trial to screen the biological activities of 95 synthetic peptides resulted in the discovery of three novel peptide hormones. Our human plasma peptidome strategy facilitated the discovery of novel bioactive peptides and biomarkers far more efficiently than other approaches; therefore, we propose to use either UniProt entry names or protein names with their amino acid positions, such as SBSN_HUMAN[279-295] or human suprabasin(279-295), rather than designating new names to each new bioactive peptide discovered.

Methods
Peptide extraction and reductive alkylation of human plasma. Blood samples were collected from four healthy volunteers into vacutainers containing Na₂-EDTA and immediately separated in a refrigerated centrifuge at 1000g for 20 min. Aliquots were immediately flash-frozen in liquid nitrogen and stored at −80 °C until processing. Thawed plasma was processed according to the differential solubilization method, as described previously 11,12 , but with the following modifications. A 50-μL plasma sample was diluted 1:2 with 100 μL denaturing solution (7 M urea, 2 M thiourea and 20 mM dithiothreitol), slowly dropped into 2 mL ice-cold acetone, with stirring at 4 °C for 1 h and then centrifuged at 19,000g for 15 min at 4 °C. The precipitate was resuspended in 1 mL 80% acetonitrile containing 12 mM HCl, mixed at 4 °C for 2 h and centrifuged again at 19,000g for 15 min at 4 °C. The low-molecular-weight peptide fraction in the supernatant was lyophilized and stored at −80 °C until use. Efficient depletion of high-abundance plasma proteins was confirmed using tricine SDS-polyacrylamide gel electrophoresis of the eluted samples as described 11,12 . Lyophilized peptides were re-dissolved in a solution of 1× Invitrosol (Life Technologies, CA, USA) and 100 mM ammonium hydrogen carbonate 22
All LC-MS/MS acquisition data in each MS group were searched together against the SwissPro_2020_03.fasta database (selected for Homo sapiens; 20,365 entries) using the PEAKS database search algorithm (Bioinformatics Solutions, Waterloo, Canada). PEAKS Studio (version X) performed peak picking, de-isotoping, charge deconvolution of fragment ions and a de novo peptide sequencing-based database search from MS and MS/MS spectra of peptides.
The search parameters were as follows: enzyme, no enzyme; fixed modification, carbamidomethyl (C, only for samples with reductive alkylation); variable modifications, acetyl (N-term, K), amidated (C-term), pyro-glu from Q (Q), oxidation (M), carbamidomethyl (N-terminal, only for samples with reductive alkylation); peptide ion mass tolerance, 6 ppm; fragment ion mass tolerance, 0.02 Da. The PTM algorithm of PEAKS Studio was applied to identify variable modifications and substitutions 46-48 . The FDR was set at 1%.
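To give a sense of scale for the 6 ppm precursor tolerance, the sketch below converts a ppm tolerance into an absolute m/z window and checks whether an observed value falls inside it. The function names are illustrative and are not part of PEAKS. The neutral mass of 1731.8652 Da used for SBSN_HUMAN[279-295] is an assumption, obtained by subtracting one proton from the mass reported for that peptide.

```python
PROTON = 1.007276  # mass of a proton in Da


def mz(neutral_mass, charge):
    """m/z of a peptide carrying `charge` protons."""
    return (neutral_mass + charge * PROTON) / charge


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def within_tolerance(observed, theoretical, tol_ppm=6.0):
    """True if the observed m/z lies within +/- tol_ppm of the theoretical m/z."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


# Assumed neutral monoisotopic mass of SBSN_HUMAN[279-295], doubly charged.
theoretical = mz(1731.8652, 2)
half_window = theoretical * 6.0 / 1e6  # absolute half-width of the 6 ppm window

print(round(theoretical, 4), round(half_window, 4))
print(within_tolerance(theoretical + 0.004, theoretical))  # inside the window
print(within_tolerance(theoretical + 0.006, theoretical))  # outside the window
```

For a doubly charged ion near m/z 867, 6 ppm corresponds to roughly ±0.005, which is tighter than the 0.02 Da tolerance applied to fragment ions.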
Peptide synthesis and dissolution. To explore the biological functions of the identified sequences, we selected peptides with the following characteristics for chemical synthesis and bioactivity screening: (1) uniquely assigned to precursor proteins, (2) having a gene name, (3) encoded by secretory proteins as defined by SwissProt keywords, (4) between 5 and 39 amino acid residues, (5) without any substitutions or modifications. For the initial round of chemical synthesis, 105 sequences were synthesized (Scrum, Tokyo, Japan) and reconstituted to 2-10 × 10⁻³ M in 10% acetonitrile/0.1% TFA. After dilution to 1:1000, the synthesized peptides were analyzed by LC-MS/MS to confirm their high purity and solubility.
Measurements of plasma peptide levels using stable isotope-labeled peptides. The following three stable isotope-labeled peptides were synthesized by Scrum (Tokyo, Japan) using L-lysine-N-9-fluorenylmethoxycarbonyl (¹³C₆, 98%; ¹⁵N₂, 98%) and L-arginine-N-9-fluorenylmethoxycarbonyl (¹³C₆, 98%; ¹⁵ (Fig. 2). The plasma concentrations were then extrapolated from the XICs generated using the respective endogenous peptides and the corresponding spiked stable isotope-labeled peptides as described 26 .
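One common way to turn XIC intensities into a concentration is a linear calibration against the spiked labeled peptide; the sketch below illustrates that idea only. The exact procedure used in the study follows reference 26 and is not reproduced in the text, so the linear model, the assumption that labeled and endogenous peptides give the same signal per mole, and all numbers are hypothetical placeholders rather than data from the paper.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_xic(xic, slope, intercept):
    """Invert the calibration line to estimate a concentration from an XIC intensity."""
    return (xic - intercept) / slope


# Hypothetical calibration series: spiked labeled peptide (nM) and its XIC intensity.
spiked_nM = [0.5, 1.0, 2.0, 4.0]
labeled_xic = [1.0e5, 2.0e5, 4.0e5, 8.0e5]

slope, intercept = fit_line(spiked_nM, labeled_xic)

# Hypothetical XIC intensity of the endogenous (unlabeled) peptide in the same run.
endogenous_xic = 5.0e5
estimate = concentration_from_xic(endogenous_xic, slope, intercept)
print(round(estimate, 3))
```

Because the labeled peptide is added before extraction, losses during sample preparation affect both forms alike, which is the usual reason for spiking at that stage.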
Cell culture. Human aortic smooth muscle cells (HAoSMCs) from Promocell (Heidelberg, Germany), rat aortic smooth muscle cells (A-10, CRL 1476) from American Type Culture Collection (USA), human monocytic leukemia cells (THP1) and human hepatocellular carcinoma cells (HepG2) from Riken CELL BANK (Ibaraki, Japan), and human keratinocyte cells (HaCaT) from Cell Lines Service (Eppelheim, Germany) were purchased. Cells were cultured using the appropriate medium and supplements recommended by the suppliers. Successive experiments using HAoSMCs were performed with passage 4-8 cultures. Human monocytes were separated from peripheral blood samples using Lymphocyte Separation Solution (Nacalai Tesque, Kyoto, Japan) as described 49 and incubated in RPMI1640 medium containing 10% fetal bovine serum for 3-4 days for differentiation into macrophages.
Real-time RT-PCR. Total mRNA was isolated from HAoSMCs using TRIzol reagent (Invitrogen, Carlsbad, CA, USA), reverse transcribed with a first-strand cDNA synthesis kit (Takara Bio, Shiga, Japan), and quantified using a CFX96 Touch™ Real-Time PCR Detection System (Bio-Rad Laboratories)-based RT-PCR protocol using KAPA SYBR (Nippon Genetics, Tokyo, Japan) as described 52 . After reverse transcription, the reaction mixtures were denatured at 94 °C for 3 min followed by 40 cycles of PCR at 94 °C for 10 s, 55 °C for 10 s, 72 °C for 30 s. PCR primers were synthesized by Eurofins Genomics (Tokyo, Japan) and their sequences are shown in Supplementary Table S3.

Determination of intracellular free Ca²⁺ concentration [Ca
Peptide binding to the surface of cells. A10 cells and HAoSMCs plated on glass coverslips were deprived of serum for 16 h and incubated for 30 min with each FAM-labeled suprabasin-derived peptide (10⁻⁶ M). Cells were then washed twice with PBS and fixed with 4% paraformaldehyde for 30 min. The nuclei were counterstained using DAPI Fluoromount-G (SouthernBiotech, Birmingham, AL, USA) and fluorescence was detected using an LSM710 confocal microscope (Carl Zeiss) as described 53,54
Samples were injected onto a C18 0.075 × 120 mm analytical column (Nano HPLC Capillary Column; Nikkyo Technos, Tokyo, Japan) attached to an EASY-nLC 1000 HPLC system (Thermo Fisher Scientific). The mobile phase consisted of 0.1% formic acid and 90% acetonitrile (solvent A), and the mobile phase gradient was programmed as follows: 0-5% A (0-2 min), 5-25% A (2-42 min), 25-55% A (42-56 min), 55-95% A (56-57 min), and 95% A (57-60 min). Separated peptides were subjected to analysis on a Q-Exactive™ instrument (Thermo Fisher Scientific) operated in data-dependent mode to automatically switch between full-scan MS and MS/MS acquisition. Full-scan mass spectra (m/z 350-900) were acquired on an Orbitrap instrument (Thermo Fisher Scientific) with 70,000 resolution at m/z 200 after accumulation of ions to a 3 × 10⁶ target value. The 20 most intense peaks with charge states more than two were selected from the full scan with an isolation window of 2.4 Da and fragmented in the higher energy collisional dissociation cell with a normalized collision energy of 27%. Tandem mass spectra were acquired on the Orbitrap mass analyzer with a mass resolution of 35,000 at m/z 200 after accumulation of ions to a 5 × 10⁵ target value. The ion selection threshold was 2 × 10³ counts, and the maximum allowed ion accumulation times were 100 ms for full MS scans and 50 ms for tandem mass spectra.
Typical mass spectrometric conditions were as follows: spray voltage, 2 kV; no sheath and auxiliary gas flow; heated capillary temperature, 250 °C; and dynamic exclusion time, 30 s.
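The 60-min gradient program given above can be written as a small lookup for checking the solvent composition at any time point. This is a sketch that assumes linear ramps within each segment, which the text does not state explicitly; the table layout and function name are illustrative. The solvent is labeled "A" to follow the wording of the text.

```python
# Gradient segments from the text: (start_min, end_min, start_percent_A, end_percent_A).
GRADIENT = [
    (0, 2, 0, 5),
    (2, 42, 5, 25),
    (42, 56, 25, 55),
    (56, 57, 55, 95),
    (57, 60, 95, 95),
]


def percent_a(t_min):
    """Percent solvent A at time t_min, assuming a linear ramp within each segment."""
    if t_min < 0 or t_min > 60:
        raise ValueError("time must lie within the 0-60 min program")
    for start, end, p_start, p_end in GRADIENT:
        if start <= t_min <= end:
            fraction = (t_min - start) / (end - start)
            return p_start + (p_end - p_start) * fraction
    raise ValueError("no gradient segment covers this time")


for t in (0, 2, 22, 49, 56.5, 60):
    print(t, percent_a(t))
```

Most of the run (2-42 min) is spent on the shallow 5-25% ramp, which is where the separation of the enriched peptides would mainly take place.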
Database searches were performed using the SEQUEST algorithm 57 incorporated in Proteome Discoverer 1.4.0.288 software (Thermo Fisher Scientific). The search parameters were as follows: enzyme, trypsin; variable modification, oxidation (M); static modification, carbamidomethyl (C); peptide ion mass tolerance, 6.0 ppm; fragment ion mass tolerance, 0.02 Da. The false discovery rate for peptide identification was set at 1%.
In vivo analysis of water/food intake and spontaneous locomotor activity. Adult male C57BL/6J mice weighing 25-30 g (CLEA Japan, Tokyo, Japan) were maintained essentially as described 25,58 , in a controlled temperature environment (22-25 °C), with a 12-h light-dark cycle (lights on 08:00-20:00), and with free access to food (standard laboratory chow; CE2, CLEA Japan, Tokyo, Japan) and water. After at least 7 days of habituation with saline injections on weekdays, mice were intraperitoneally injected with indicated doses of synthetic peptides dissolved in 100 μL ddH₂O or with 100 μL ddH₂O alone using a 27 G needle without anesthesia approximately 30 min before the start of the dark period. Water and food intake and the locomotor activity were recorded using a simultaneous monitoring system (ACTIMO-100 M combined with MFD-100, Shinfactory, Fukuoka, Japan) essentially as described 24,25 . In brief, the monitoring system used beam sensors located every 2 cm along the floor of the enclosure and ACTIMO-DATA software (Shinfactory, Fukuoka, Japan) to detect animal movements. Data were imported in real-time using the Spike2 analysis program (Cambridge Electronic Design, Cambridge, UK). Water intake was measured by a drop counting system. A food container was designed to prevent mice from dragging food into their bedding and to avoid spillover and the minimum quantity of measurable food using the microbalance was 0.01 g. Water and food intake were recorded simultaneously every 3 min. Mice were housed in individual chambers for 3 days for them to become familiar with the recording environment. The experiments were performed during the dark phase (20:00-08:00) in a room that was completely isolated from external noises.
activated mariculture keyhole limpet hemocyanin (Pierce) and used to immunize Japanese white rabbits on days 1, 15, 29, 43 and 57.
Blood was collected prior to the first injection and on days 36, 50 and 64 post-injection and antibody titer was determined by ELISA. The polyclonal antisera were purified using a Melon™ Gel IgG Spin Purification Kit (Thermo Fisher Scientific) to remove non-relevant proteins that are often present in high abundance 10 .
Immunofluorescence staining. The expression of the three suprabasin-derived peptides was also analyzed in human-derived cells in culture. Cells were washed twice with PBS, fixed with 4% paraformaldehyde for 15 min, blocked with Blocking One (Nacalai Tesque), incubated at 37 °C for 30 min with specific antibodies against SBSN_HUMAN[225-237], SBSN_HUMAN[243-259] or SBSN_HUMAN[279-295], diluted to 1:500 with 1% BSA and 0.1% sodium azide in PBS, and visualized by a 30 min incubation with a goat anti-rabbit Alexa Fluor 594 secondary antibody (1:1000, Abcam) at 37 °C. Cell nuclei were counterstained using DAPI Fluoromount-G (SouthernBiotech). Laser scanning confocal microscopy was performed using an LSM710 confocal microscope (Carl Zeiss) 53

www.nature.com/scientificreports/
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.