In silico identification and functional validation of linear cationic α-helical antimicrobial peptides in the ascidian Ciona intestinalis

We developed a computing method to identify linear cationic α-helical antimicrobial peptides (LCAMPs) in the genome of Ciona intestinalis based on its structural and physicochemical features. Using this method, 22 candidates of Ciona LCAMPs, including well-known antimicrobial peptides, were identified from 21,975 non-redundant amino acid sequences in Ciona genome database, Ghost database. We also experimentally confirmed the antimicrobial activities of five LCAMP candidates, and three of them were found to be active in the presence of 500 mM NaCl, nearly equivalent to the salt concentration of seawater. Membrane topology prediction suggested that salt resistance of Ciona LCAMPs might be influenced by hydrophobic interactions between the peptide and membrane. Further, we applied our method to Xenopus tropicalis genome and found 11 LCAMP candidates. Thus, our method may serve as an effective and powerful tool for searching LCAMPs that are difficult to find using conventional homology-based methods.


Results
Genome-wide search for discovery of novel Ciona LCAMPs. We developed a computing method for the detection of LCAMPs in the Ciona genome based on the structural and physicochemical features of known LCAMP precursors: short length, secretory peptide, and cationic amphipathic α-helix of mature peptide. The Ciona genomic database called Ghost database, includes 21,975 non-redundant complete amino acid sequences, from which 1,949 sequences with 50-100 residues in length were extracted.
Firstly, to find LCAMP precursors from 1,949 sequences, we predicted signal peptides using SignalP 4.1 server. In total, 252 sequences were predicted to be secretory peptides, and the sequences corresponding to the signal peptides were eliminated from them. Furthermore, the secretion of their peptides into extracellular space was confirmed using TMHMM server, and then, eight sequences were eliminated as transmembrane proteins. Next, sequences containing cysteine residues were eliminated from the 244 subsequences without signal peptides, and cationic amphipathic α-helical structure was predicted using HeliQuest server. HeliQuest screening was performed for 165 subsequences without cysteine residues and 80 putative cationic amphipathic helical segments were identified. The secondary structure of HeliQuest-identified cationic amphipathic segments was further confirmed. Using PROTEUS server, it was found that 44 segments had the potential to form α-helical structures. Subsequently, the aggregation propensity of cationic amphipathic α-helical segments was predicted using AGGRESCAN, in which normalized average of aggregation propensity (Na 4 vSS) of the proteins was evaluated. Forty-four cationic amphipathic α-helical segments had Na 4 vSS values within a broad range of − 65.6 to 14.8. A previous study reported that antimicrobial peptides have Na 4 vSS values within the range of − 40 to 60 21 . Therefore, we investigated whether cationic amphipathic α-helical segments with Na 4 vSS value less than − 30 have antimicrobial activity. Four selected segments, KH.C1.365, KH.L63.9, KH.S1007.2, and KH.S930.2 with Na 4 vSS values of − 32.9, − 65.6, − 48.1 and − 44.4, respectively, exhibited no or weak antimicrobial activities against Escherichia coli ( Supplementary Fig. S1). Consequently, as potential candidates for Ciona LCAMPs, we identified 24 segments with Na 4 vSS value between − 30 and 30. Finally, the prediction of subcellular localization was performed to eliminate cellular components from LCAMP candidates. Using DeepLoc-1.0 server, 22 genes were identified as LCAMP candidates (Table 1).
Ciona LCAMP candidates identified by our screening method include known LCAMPs such as Ci-MAMfamily genes 20 and a number of uncharacterized proteins. In addition to Ciona genome, we predicted LCAMPs in X. tropicalis genome using our screening method. Eleven LCAMP candidates containing seven known LCAMPs 22 were selected from 26,132 non-redundant protein sequences in the X. tropicalis genome (Supplementary  Table S1), which suggests that our screening method is able to support the identification of LCAMPs from various genomes. LCAMP precursors are known to have a highly conserved signal peptide 23 . Two conserved signal peptides were found in Ciona LCAMP candidates identified by our screening method (Fig. 1). In five members of Ci-MAM family, signal peptides composed of 21-22 amino acid residues were well-conserved (80% majority-rule consensus sequence: MDRKIVFALXLVXXLXVSXXXA). In addition, another conserved signal peptide composed of 23 amino acid residues (80% majority-rule consensus sequence: MNKSALLXLLXXGLXVLXEXXXA) was identified in six LCAMP candidates, suggesting that they belong to a novel LCAMP family. Previous studies showed that Ciona LCAMP precursors possess an anionic helical region in addition to a cationic amphipathic region 19,20 . Using HeliQuest server, anionic helical sequences were found at the C-terminus of 17Ciona LCAMP precursor candidates (Table 1), indicating that most Ciona LCAMP precursors are composed of a signal peptide, a cationic amphipathic region, and an anionic helical region in that order. Table 1 19,20 . We examined the effects of five novel Ciona LCAMPs in the presence of NaCl on antimicrobial activity against E. coli. Reflective of the habitat of C. intestinalis, salt resistance was found in all peptides at the concentration of 12.5 μM (Fig. 2). In particular, KH.C1.640, KH.S908.1, and KH.S921.1 peptides retained potent antimicrobial activity in the presence of up to 500 mM NaCl. Previous studies suggest that helical stability, hydrophobicity and amphipathicity in LCAMPs are important for salt resistance [24][25][26][27] . However, each property of Ciona LCAMPs cannot explain salt resistance per se, as shown in Table 3. Therefore, we predicted the membrane interaction property of Ciona LCAMPs using AmphipaSeek server, which predicts that a protein or peptide can be monotopic and anchored via an amphipathic helix inserted parallel to the membrane interface 28 . The prediction showed that a membrane anchor region was found in higher salt resistant  Table 3). This suggests that in-plane membrane anchoring might be responsible for salt resistance.

LCAMP gene expression in Ciona hemocytes.
To investigate the expression of Ciona LCAMPs in response to microbial infections, we performed RT-PCR analysis on Ciona hemocytes stimulated with bacterial lipopolysaccharide (LPS). First, the expression of Ci-mam-A (KH.C1.100), known as an inducible antimicrobial peptide 20 , was examined using our culture system. As a previous study, Ci-mam-A expression was induced by LPS stimulation (Fig. 3). The expression of eight LCAMP genes in LPS-treated hemocytes was also further assessed by RT-PCR. Ci-meta4 and two novel LCAMP genes (KH.C1.640 and KH.S908.1) were transcriptionally upregulated by LPS, whereas there were no changes in four LCAMP genes containing Ci-pap-A (Fig. 3). NF-κB is activated in cells challenged with LPS and other inflammatory stimuli and is involved in the transcriptional activation of responsive genes 29 . To confirm the involvement of Ciona LCAMPs in the immune response, we predicted NF-κB binding sites in the upstream regions of each gene. Using Match-1.0 public, putative NF-κB binding sites were identified in the upstream regions of LPS-inducible LCAMP genes, but not in the upstream regions of KH.C7.94 and KH. S1531.4 genes ( Table 4). These suggests that four inducible LCAMPs (Ci-MAM-A, Ci-META4, KH.C1.640, and KH.S908.1) may play a role in the innate immune response.

Discussion
In this study, we applied a computational approach to identify novel LCAMPs in C. intestinalis and identified 22 potential LCAMP candidates using their known physicochemical characteristics and subcellular localization predictions. Furthermore, we experimentally validated the broad-spectrum antimicrobial activity of five selected LCAMPs. Two of them were induced with LPS and putative NF-κB binding sites were identified in the upstream region of their genes, suggesting that they may play a role in the innate immune response. However, our data do not rule out the involvement of the other three genes in the immune response, as they may be regulated by uncharacterized innate immune signaling pathways.  www.nature.com/scientificreports/ In order to explore the proteins and peptides with similar function among different species, homology search tools such as BLAST have generally been used. Several defensin homologs were identified by BLAST search in plant and vertebrate genomes [30][31][32] . In these studies, some defensins that could be missed by BLAST search were further identified by a hidden Markov model based on a conserved cysteine-rich defensin motif. In contrast to defensin with three unique cysteine frameworks, it is difficult to find LCAMPs among different species using commonly used homology search tools due to poor sequence conservation of LCAMPs, especially in the cationic amphipathic α-helical region 33 . Therefore, computational approaches without sequence homology information have been designed to discover novel LCAMPs from sequence databases. In marine mussels, novel antimicrobial peptides, myticalins, were identified in the genomes and transcriptomes using bioinformatics analyses 13 . In Ciona, two LCAMP families, Ci-MAM and Ci-PAP, were identified from a hemocyte EST database based on two structural criteria of LCAMPs, signal peptide, and cationic α-helical region 19,20 . We developed an in silico method for screening LCAMPs based on seven criteria, size, cationicity, hydrophobicity, amphipathicity, helicity, aggregation propensity, and subcellular location. Thus, we identified peptides that potentially have antimicrobial activity based on the structural and physicochemical features and eliminated non-antimicrobial peptides which cannot come into contact with microbes using subcellular localization prediction. In the prediction of these properties, we used web-based tools that can handle a large number of sequences simultaneously (see "Materials and methods"). This enables screening of LCAMPs in genome databases. Consequently, we identified novel LCAMPs containing KH.S921.1 from C. intestinalis genome database using our in silico method. In addition to C. intestinalis genome, we applied this method to X. tropicalis and succeeded in identifying the cationic amphipathic region including known LCAMPs. Our newly developed method is effective in screening LCAMPs in the genomes of various species. However, our in silico method does not completely exclude false positives. Experimental validation is necessary to validate novel LCAMPs in other organisms using our screening method.
A number of LCAMP precursors are known to possess a signal peptide, cationic amphipathic region, and anionic proregion 34,35 . In C. intestinalis, almost all candidates shared a typical primary structure composed of a signal peptide, a cationic amphipathic region, and an anionic helical proregion in that order. The signal peptide is highly conserved within the species, and this conservation tendency increases within the individual LCAMP families. For instance, the signal peptides of Ci-MAM family genes are well-conserved 20 (Fig. 1). We identified a novel conserved signal peptide composed of 23 amino acid residues, indicating that six LCAMP precursors containing KH.S921.1 with their conserved signal peptide form a novel LCAMP family. In this novel LCAMP family, the anionic proregion is short or absent, whereas in the Ci-MAM family, it is relatively conserved. LCAMP precursors require proteolytic processing in which anionic proregions are removed by proteolytic cleavage to attain their active forms. In frog, LCAMP precursors are processed at the RXXR-, KK-, or RR-motifs corresponding to common cleavage sites in proregions 22,23 . However, in Ciona LCAMP candidates, acidic amino acid residues, E and D, were found around the boundary of putative cationic active region and proregion, but their basic processing motifs were not. In ascidian, LCAMP precursors may be processed at the N-terminal of the acidic amino acid residue. Previous studies showed that ascidian LCAMPs, Styelins and Clavanins, are cleaved at the N-terminal of D and DD, respectively 36,37 .
Herein, we demonstrated salt resistance of Ciona LCAMPs. Many antimicrobial peptides lose their activities under physiological salt concentration, although they exhibit significant in vitro activity against bacteria in the presence of low salt concentration. Considering the therapeutic application, the salt-dependent inactivation of antimicrobial peptides is a major obstacle 38 . Marine organisms, including C. intestinalis, are considered a good resource of salt-resistant antimicrobial peptides. So far, several salt-resistant antimicrobial peptides have been identified in marine organisms 39 . We demonstrated that three synthetic peptides derived from KH.C1.640, KH.S908.1, and KH.S921.1 retained potent antimicrobial activity in the presence of up to 500 mM NaCl. Additionally, AmphipaSeek prediction strongly suggested these peptides appear to insert parallel to membrane interface. We propose that salt resistance is influenced by hydrophobic interaction between the peptide and membrane. The action mechanism of LCAMPs, such as dermaseptins and cecropins, is explained by "carpet model" 40 . According to this model, the peptides first align parallel to the phospholipid bilayer membrane, remaining in contact with the lipid head groups and effectively coating the surrounding area. When a threshold concentration of the peptide is reached, the peptides self-assemble within the membrane plane and this reorientation leads to a local disturbance in membrane stability, causing the formation of large cracks, leakage of cytoplasmic Scientific RepoRtS | (2020) 10:12619 | https://doi.org/10.1038/s41598-020-69485-y www.nature.com/scientificreports/ components, disruption of the membrane potential, and ultimately disintegration of the membrane. High-salt concentration tends to disturb the first step of the action mechanism, especially the electrostatic interaction between cationic peptides and negatively charged bacterial membranes. It seems likely that salt-resistant LCAMPs could be bound to the membrane surface by the hydrophobic interaction, even if the electrostatic interaction was disturbed under high-salt conditions. In this study, we demonstrated the effectiveness of a combination of computational and experimental approaches used to identify putative and novel LCAMPs from ascidian and frog genomes. LCAMP is a promising candidate for a new class of antibiotics, with a broad spectrum of antimicrobial activity, ease of synthesis, and a novel mechanism of action against microbes. Our in silico screening method will allow us to discover novel LCAMPs from genetic resources and shed light on the relationship between the structure and function of LCAMPs.

Materials and methods
Genome-wide in silico screening for antimicrobial peptides. The amino acid sequences used for discovery of novel LCAMPs were retrieved from two genome databases, Ciona intestinalis type A (recently called Ciona robusta) sequences from Ghost database (https ://ghost .zool.kyoto -u.ac.jp/cgi-bin/gb2/gbrow se/ kh/) 41 and Xenopus tropicalis sequences from Xenbase (https ://www.xenba se.org/entry /) 42 . LCAMP prediction was performed for sequences with 50-100 residues in length. LCAMP candidates were selected based on signal peptide sequence, hydrophobicity, hydrophobic moment, net charge, α-helix structure, and aggregation propensity. The signal peptide searches and cleavage site predictions were carried out using SignalP 4.1 server (https ://www.cbs.dtu.dk/servi ces/Signa lP-4.1/) 43 . Upon the removal of the signal peptide region, the sequences were used for the prediction of membrane-spanning regions using TMHMM server v.2.0 (https ://www.cbs.dtu.dk/ servi ces/TMHMM /) 44 to eliminate transmembrane proteins. Subsequently, cationic amphipathic helix prediction was performed using HeliQuest (https ://heliq uest.ipmc.cnrs.fr/) 45 . HeliQuest program slides an 18-residue window along the protein sequence and calculates hydrophobicity, hydrophobic moment, and net charge for each generated segment. Appropriate screening parameters were determined by analyses of cecropin-A (Gen-Bank: AAA29185), moricin (BAB13508), Ci-PAP-A (ABR45664), Ci-MAM-A (ACA97856), dermaseptin SI (CAD92232), and PGLa (CAA25963): hydrophobicity between 0 and 0.6, hydrophobic moment between 0.1 and 1.0, and net charge between 3 and 10. Sequence statistics used in HeliQuest screening were as follows: minimal number of polar residues, minimal number of uncharged residues (serine, threonine, asparagine, glutamine, and histidine), minimal number of glycine residues, and maximal number of charged residues were 6, 1, 0, and 12, respectively, and cysteine was excluded. For identification of an anionic helical region in the LCAMP precursor, parameters were limited: hydrophobicity between − 0.7 and 0.4, hydrophobic moment between 0 and 0.5, and net charge between − 10 and − 3. Secondary structures of putative cationic amphipathic segments were further confirmed by PROTEUS (https ://www.prote us2.ca/prote us/) 46 , and their aggregation propensities were calculated using AGRRESCAN (https ://bioin f.uab.es/aggre scan/) 47 . Finally, subcellular localization was predicted using DeepLoc-1.0 (https ://www.cbs.dtu.dk/servi ces/DeepL oc/) to confirm the secretion of LCAMP candidates 48 . Sequence analysis. Peptide sequences were aligned using ClustalW. The prediction of NF-κB binding sites upstream of the initiating methionine codon of each gene (less than 2,000 base pairs) was carried out using Match-1.0 public (https ://www.gene-regul ation .com/pub/progr ams.html) 49 . The values of the cut-off for core and matrix similarities were 0.75 and 0.8, respectively. Antimicrobial activities of the peptides. Peptides used in this study were chemically synthesized using the Fmoc method and obtained from Eurofins Genomics (Tokyo, Japan Antimicrobial activities of the peptides were measured as described previously 50 . Synthetic peptides were added to the microbial suspension in the individual wells of a microplate at final concentrations of 3.125, 6.25, 12.5, and 25 μM. The relative activities (RA) were calculated by the equation RA = ((A PC − A NC ) − A SA )/ (A PC − A NC ), where A PC , A NC , and A SA represent the absorbance of positive control, negative control, and samples, respectively. The bacteria and yeast strains used were Escherichia coli (NBRC14237), Staphylococcus aureus (NBRC12732), Pseudomonas aeruginosa (NBRC12582), and Saccharomyces cerevisiae (NBRC10217), which were purchased from Biological Resource Center, National Institute of Technology and Evaluation (NBRC, Tokyo, Japan). The interaction between Ciona LCAMPs and membrane interface was predicted using AmphipaSeek server (https ://npsa-pbil.ibcp.fr/cgi-bin/npsa_autom at.pl?page=/NPSA/npsa_amphi pasee k.html) 28  The animals were bled by removal of the tunic and puncture of the heart, and washed with sterile Ca 2+ -and Mg 2+ -free artificial seawater (CMF-ASW; 463 mM NaCl, 11 mM KCl, 25.5 mM Na 2 SO 4 , 2.15 mM NaHCO 3 , 5 mM HEPES (pH 8.0), 5 mM EDTA). The coelomic fluid (hemolymph) and wash solution were collected and filtered through a nylon net. After centrifugation at 1,000×g for 30 s, the hemocyte pellet was washed with CMF-ASW and resuspended in CMF-ASW. Coelomic fluid was further centrifuged and the resulting supernatant (hemolymph) was used for the cultivation of hemocytes.
Hemocytes were plated in a 24-well plate and cultured at 16 °C in filtered seawater containing 20% hemolymph in the presence of penicillin (final concentration of 50 units/mL) and streptomycin (final concentration of 50 μg/mL). After incubation for 36 h, the cells were cultured in 20% hemolymph/seawater without antibiotics and then stimulated with 0.1 mg/mL LPS from E.coli (Sigma-Aldrich, St. Louis, MO, USA) for 1 h at 16 °C.
Reverse transcription-polymerase chain reaction (RT-PCR) analysis. Total RNA was extracted from uninduced and LPS-induced cells and was reverse transcribed using PrimeScript RT reagent Kit (TAKARA Bio Inc., Shiga, Japan) with oligo dT primer. The primers used in this study are mentioned in Supplementary  Table S2. PCR cycles were as follows: initial denaturation at 94 °C for 2 min; 35 or 40 cycles at 94 °C for 30 s, 48 °C for 30 s, and 68 °C for 30 s. EF1α (elongation factor 1α) and PPIA (peptidylprolyl isomerase A) were used as internal control for RT-PCR analyses. The PCR products were separated by electrophoresis on 2% agarose gel and stained with ethidium bromide.