Encrypted Antimicrobial Peptides from Plant Proteins

Examples of bioactive peptides derived from internal sequences of proteins have been known for decades. The great majority of these findings appear to be fortuitous rather than the result of a deliberate, methodology-based enterprise. In the present work, we describe the identification and the biological activities of novel antimicrobial peptides unveiled as internal fragments of various plant proteins by means of our hypothesis-driven search strategy. All putative encrypted antimicrobial peptides were selected based on their physicochemical properties, which were iteratively filtered by an in-house computer program named Kamal. The selected peptides were chemically synthesized and evaluated for their interaction with model membranes. Sixteen of these peptides showed antimicrobial activity against human and/or plant pathogens, some with a wide spectrum of activity and an inhibition efficacy similar or superior to that of classical antimicrobial peptides (AMPs). These original and previously unforeseen molecules constitute a broader and indisputable body of evidence produced by our group, illustrating that the intragenic concept is a workable reality that should be carefully explored not only for microbicidal agents but also for many other biological functions.

In Greek philosophy, and also within the historical knowledge accumulated in Physics, Aristotle introduced the concept of first cause or prime mover, meaning the first of all substances or the commencement of everything that exists in nature. Assuming that complex and superior beings are not logically supposed to be the unique cause (origin, explanation) of themselves, under the framework proposed by the thinker, things in the universe should be caused by previous and original ones. Therefore, these causes exist in succession, as a chain of events extending regressively.
The paradigmatic cosmological theory of the Big Bang, initially suggested in Georges Lemaître's pivotal work, "Un univers homogène de masse constante et de rayon croissant rendant compte de la vitesse radiale des nébuleuses extra-galactiques" 1 and his subsequent Primeval-Atom hypothesis 2,3; the Universal Common Ancestry theory 4, strengthened by the recent advances in genome sequencing; and the present models of development and differentiation of stem cells 5,6 seem to share the same archetypical structure as proposed by the Aristotelian framework. This means that all these theories and models conceive a fundamental starting point, of its own substance, that over a certain period of time moves towards an increasing structural and functional complexity until the inevitable finitude. By analogy, the complete genome of any living form could also be considered as the primordial starting point in a chain of events. That is, a linearly organized genetic material undergoes metabolic processing, yielding structures with forms and functions that, in many ways, as a direct function of time and environmental conditions, appear to be different or substantially apart from their origin. Considering a broader and interspecific scenario, the actual differences in shape, size and behavior that our senses perceive in all living creatures may also be seen as successful products of the internal possibilities (potentialities) of the prime genetic material exposed to a number of evolutionary factors over time.
In our previous work 7 , inspired by this theoretical reasoning, protein sequences were perceived as the prime source from where antimicrobial peptides (AMPs) could be located in one or more regions of the complete polypeptide chain of some mature and expressed proteins. In strict tandem with Aristotle's thought, the primary

Results
Plant genomes as sources of IAPs. Publicly available protein collections from Theobroma cacao, Arabidopsis thaliana, Citrus sinensis and Gossypium raimondii were submitted to the software Kamal v1.0 alpha to filter putative IAPs encrypted within proteins. Kamal uses calculated physicochemical properties derived from protein primary structure as descriptors of peptide activity, as previously demonstrated by our group 7. Nine filters were sequentially applied to the plant genomes to uncover putative IAPs, as described in Table 1. Kamal generates a list of protein fragments with a higher probability of antimicrobial activity, which should be curated by the user with the aid of auxiliary tools available online 20. After extensive user examination of the putative IAPs generated by Kamal, twenty-one peptides were selected for solid phase peptide synthesis (Table 2). The helical wheel projections and physicochemical properties of Tc02, Tc06 and Tc08 are depicted in Fig. 1 as an illustration. The distribution of amino acid residues of these IAPs in theoretical helical segments emulates the characteristic amphiphilic character of AMPs 20. Furthermore, to compare the physicochemical properties of putative IAPs prospected from the T. cacao genome with those of randomly selected peptides, quantile plots were generated. Tc02, Tc06 and Tc08 have a net charge, an isoelectric point 21 and an aggregation tendency (Na4vSS) 22 higher than the third quartile of a sample of ten thousand peptides of 16 to 22 amino acid residues prospected randomly from the same T. cacao protein collection. Moreover, average hydrophobicity 23 and hydrophobic moment 24 are in the third quartile of the reference set (Fig. 1). These three peptides are representative of the putative IAPs derived from plant genomes selected for solid phase peptide synthesis (Supplementary Material 01, Table S1 and Figure S1). Therefore, putative IAPs selected for synthesis tend to be more cationic, more hydrophobic, and more prone to aggregation than the average random T. cacao protein fragment with the same number of residues.

The interaction of IAPs with model membranes. The secondary structure of synthetic IAPs was investigated in buffer and after the addition of dimyristoylphosphatidylcholine (DMPC) and 2:1 dimyristoylphosphatidylcholine:dimyristoylphosphatidylglycerol (DMPC:DMPG) large unilamellar vesicles (LUVs) at 50-fold molar excess. In buffer, most peptides presented far-ultraviolet circular dichroism (far-UV CD) spectra compatible with random coil structures (Table 3). Nevertheless, some were already structured as α-helices or β-sheets even before the addition of vesicles. Peptides Tc02 and Gr02 are structured as stable α-helical segments in buffer, and membrane addition accentuated their helical content (Fig. 2). IAPs such as At02, Gr01 and Tc08 transitioned from random coil structures in buffer to α-helical segments upon membrane interaction, similarly to the classical AMP DS01 7,16,25 (Fig. 2). Other molecules presented far-UV CD spectra compatible with β-sheets and were considered separately. The helical percentages of the 21 IAPs evaluated herein, in buffer and following the addition of LUVs, are described in Table 3. These data corroborate our previous results that IAPs behave as most membrane-active α-helical AMPs do, undergoing structural change in the presence of membranes. We further investigated the thermotropic behavior of phospholipid membranes enriched with 4 mol% IAPs by differential scanning calorimetry (Fig. 3). The putative IAPs varied in their ability to alter the P′β → Lα main phase transition of DMPC and 2:1 DMPC:DMPG LUVs. Peptides Tc02 and Gr02 disturbed the main phase transition of vesicles similarly to DS01 (Fig. 3). These two IAPs induced a two-component main phase transition, with a more cooperative endotherm at a lower temperature (sharp component) superimposed on a less cooperative endotherm at a higher temperature (broad component). These alterations are characteristic of deep peptide penetration into the acyl interior of model membranes, a feature previously associated with antimicrobial activity 7. Other peptides, such as Gr01, disturbed the main phase transition of LUVs to a lesser extent, indicating that although this peptide folds into an α-helical segment, it interacts more superficially with model membranes (Fig. 3).
IAPs are potent and broad antimicrobial agents with low hemolytic activities. The antimicrobial activity of the 21 putative IAPs against Gram-positive and Gram-negative bacteria, yeasts and filamentous fungi was evaluated. A panel of plant and human pathogenic microorganisms was assayed. Sixteen IAPs, or approximately 76% of the synthesized IAPs, showed inhibitory activity at concentrations up to 256 µM against at least one of the evaluated microorganisms (Tables 4 and 5). Five of them, Tc02, Tc08, At02, Gr01 and Gr02, were as potent as DS01 16 and Asc-8 17, with a similar breadth of activity spectrum. Moreover, some specificity towards particular groups of microorganisms was observed. Tc02 inhibited the growth of yeasts and filamentous fungi more efficiently than Gram-positive and Gram-negative bacteria, while Tc08 presented the opposite pattern of activity. The same is true for At04, which showed almost negligible activity against most microorganisms, except for Staphylococcus aureus. Tc06, At01, Cs01 and Zm01 were also as potent as AMPs in inhibiting the growth of yeasts. Other peptides, such as At02, Gr01 and Gr02, inhibited the growth of microorganisms without regard to specific membrane structures or phospholipid compositions. The microbicidal activities of these IAPs were also evaluated by standard dilution assays and are presented as Supplementary Material (Supplementary Material 02, Tables S2 and S3). Inhibitory and microbicidal concentrations were very similar for the most active peptides, except for Tc06, At01, Cs01 and Zm01, which presented non-detectable microbicidal concentrations against various microorganisms while inhibiting their growth at concentrations lower than 32 µM, indicating a microbistatic rather than a microbicidal activity.

Table 2. Peptide IDs, source proteins, scanned organisms, peptide sequences and number of residues of the current generation of IAPs. * Carboxyamidated peptides. ** Protein not available in the NCBInr database. This protein was obtained from the predicted proteins downloaded from cocoagendb.cirad.fr. *** Protein not available in the NCBInr database. This protein was obtained from a list of oxidative-stress from A. thaliana 55.
The sixteen synthetic IAPs that displayed antimicrobial activity were tested for their cytotoxicity using human erythrocytes as a model. Although the IAPs described herein were filtered to hinder the growth of plant pathogens (Table 1), erythrocyte lysis is used as a standard to evaluate peptide selectivity 16,17,26. IAPs induced erythrocyte lysis at variable concentrations, similar to the AMPs Asc-8 17 and DS01 16 (Table 6 and Supplementary Material 03, Table S4). While Asc-8 is a known hemolytic molecule, DS01 did not induce significant alterations in red or white blood cells at the tested concentrations of up to 45 μM 16. IAPs induced 50% hemolysis at concentrations intermediate between those of these two AMPs. Tc02, Gr01 and Gr02, three of the peptides with the highest antimicrobial activities, induced 50% hemolysis at 64 μM, a concentration approximately 32 times higher than that required for fungicidal activity, or 12 times higher than that necessary for antibacterial activity.

IAPs from T. cacao inhibit the germination of Moniliophthora perniciosa basidiospores.
The eleven IAPs filtered from the T. cacao genome were evaluated for their potential to inhibit the germination of basidiospores of M. perniciosa, the causal agent of witches' broom disease 27. M. perniciosa basidiospores germinate and infect cocoa meristems 4 hours after contact with them 28. Five IAPs and the amphibian peptides DS01 and Asc-8 inhibited the germination of 1 × 10⁶ basidiospores after 4 hours of incubation in the evaluated

Figure 1. … 23), Aggregation (Na4vSS parameter from Aggrescan 22), GOR IV 53 alpha helix, extended configuration and coil structures was generated for 10,000 randomly selected peptides ranging from 16 to 22 amino acids from the T. cacao genome. Color bars in each quantile plot represent the physicochemical parameters for the IAPs Tc02 (red), Tc06 (blue) and Tc08 (green). Helical wheel plots for the same IAPs are also represented, demonstrating that once structured as α-helical segments, these peptides show an amphiphilic character similar to classical AMPs. Hydrophobic residues are represented in green, positively charged residues in red, negatively charged residues in blue and polar uncharged residues in light blue. Tc02, Tc06 and Tc08 were selected based on their tested antimicrobial activities (Tables 4 and 5) to illustrate the physicochemical parameters of the filtered IAPs when compared to randomly selected peptides from T. cacao. Physicochemical properties and helical wheel plots of the remaining IAPs are shown in Supplementary Material 01.
IAPs from G. raimondii inhibit areolate mildew development on cotton leaves. IAPs were also able to inhibit the growth of the cotton pathogen Ramularia areola 29 in vitro (Table 4). The IAPs Gr01 and Gr02, prospected from G. raimondii proteins, along with other peptides such as Tc06, At03 and Cs02, originally prospected from other plants but found with 100% identity in the cotton genome, were evaluated for their potential to inhibit disease development on cotton leaves infected with R. areola spores. Disease severity was evaluated on a scale from 1 (no symptoms) to 5 (over 50% of leaf area showing disease symptoms) (Fig. 5a). IAPs and 2 × 10⁶ spores were incubated for 20 minutes and this solution was applied to plant leaves. Gr01 and Tc06 significantly controlled areolate mildew (p < 0.05, one-way ANOVA) at concentrations down to 16 and 32 µM, respectively (Fig. 5b). Moreover, IAPs showed qualitatively similar results to the commercial fungicide used as control. The number of spores collected from the leaves was also significantly decreased in the presence of the IAPs (Supplementary Material 05, Figure S3). Surprisingly, Gr02 did not significantly reduce areolate mildew severity or spore concentration. This demonstrates that in vitro biological activities might not be observed in vivo, and novel assays are necessary to accurately address these results.

Discussion
We are persuaded that mature proteins represent a feasible and copious source from which smaller bioactive polypeptide chains, particularly antimicrobials, may be obtained. In the present report, we confirm our previous findings 7 by providing additional proof of this concept, exploring plant protein sequences as sources of derived peptide primary structures and evaluating their respective biological activities against varied pathogens. The IAPs found in the plant genomes described herein have equivalent potency and selectivity, and display a similar activity spectrum towards microorganisms, as AMPs from various sources 15,30-32. Moreover, these IAPs fold into α-helical segments upon membrane interaction, producing a range of membrane disturbances in the same manner as mature antimicrobial peptides from frog skin secretions 7,33 (Figs 2 and 3). The disturbances introduced by the IAPs in model membranes, as depicted in Fig. 3, are typical of molecules that are localized at the polar/nonpolar interface of the phospholipid bilayer, with a tendency to disturb the acyl interior of membranes, similar to DS01 and other AMPs in the literature 34. Furthermore, membrane disturbance is directly associated with peptide helicity, and the same is observed for the IAPs (Fig. 2). Our data demonstrate that classical antimicrobial peptides, selected for their membrane-disruptive potential by millions of years of evolution 35, can be mimicked in several aspects by internal protein segments with comparable physicochemical properties, but with no identifiable adaptive pressure for such activity when encrypted in parent proteins. Once identified and characterized, these IAPs can be used to confer resistance to a variety of pathogens by transforming plants with fragments of their own genetic material, providing an alternative to transgenic technology. This was demonstrated by our

To the best of our knowledge, this is the first time that molecules with significant antimicrobial activity have been obtained from T. cacao and G. raimondii. The IAPs Tc02, Tc06 and Tc10 inhibited the germination of M. perniciosa basidiospores and can be considered promising agents against other T. cacao pathogens, such as Phytophthora spp. and Moniliophthora roreri 36. Despite the encouraging results, cocoa transformation is challenging due to the long growth period required for the plant to yield its first fruits 37. Transformation of Micro-Tom tomatoes represents an alternative for evaluating the intragenic approach in the control of M. perniciosa infection 38. For cotton plants, the inhibition of areolate mildew symptoms (Fig. 5b1 and b2) and the decrease in spore concentration (Supplementary Material 05, Figure S3), with results comparable to those of a commercial fungicide, indicate that peptides Gr01 and Tc06 are suitable candidates for cotton plant transformation.
The evidence we have encountered so far by applying our methodology, alongside most of the native bioactive peptide data available in the literature 7,16,31,39, compels us to infer that other sources of proteins in animals, algae and microorganisms should yield results similar to the ones we have evaluated here. Putative IAPs can be prospected either from clade-specific proteins or from evolutionarily conserved homologous genes that span different taxa. The latter seems to be the case for the IAPs Tc06 and Cs02. Tc06 is derived from cytochrome P450 family 87, found with 100% identity in many eudicots and with more than 75% identity in some monocots such as Oryza sativa and Zea mays (Table 2), while the IAP Cs02 is a fragment of the photosystem II CP47 chlorophyll apoprotein, found identically in Bacteria and Eukarya (Supplementary Material 06, Figure S4). However, at the present phase of our investigations, some fundamental questions emerging from the current scenario appear to deserve higher priority than the much-needed systematic search for novel bioactive peptides in other protein sources that would reinforce our findings:
1. Are these facts pointing to a general phenomenon?
2. If so, what would be its biological and physiological implications?
3. Could protein degradation (into bioactive peptides) be considered an overlooked step of specific protein post-processing and/or function in nature, and not only an earlier stage of proteolysis destined to supply amino acids to various metabolic pathways?
4. What would be the implications of this concept for biotechnology, agriculture and human health?
From our perspective, these questions have no easy answers. Each one of them appears to open such a wide range of theoretical and experimental research possibilities that, at present, they are beyond our workforce and probably a lifetime's effort.
It is our understanding that, although the empirical results on microbicidal activities presented here constitute a robust and assorted body of material evidence on which our prime hypothesis is anchored, they seem to represent just the tip of a much larger aspect of reality that needs to be investigated rigorously, not only for different types of antimicrobial properties and modes of action, but also for other important biological functions such as analgesic, anti-inflammatory and hypotensive activities. In fact, our group recently identified encrypted opioid peptides in the dehydrin protein family from Coffea spp., showing similar results to Leu-Enkephalin 40,41. We also identified a conserved Met-Enkephalin in an F-box protein from T. cacao (NCBI EOY06230.1, residues Lys 176 to Lys 184) with cleavage sites for serine proteases similar to those found in human pro-enkephalins (unpublished data). Such results highlight how abundant encrypted peptides can be, suggest another facet of genome plasticity and therefore offer new extrinsic principles to biodiversity 42-44. Mechanistic features and physicochemical studies of the antimicrobial peptides evaluated in this work are available in the second part of the current paper, entitled "Towards an experimental classification system for membrane active peptides".

Material and Methods
IAPs search criteria and peptide synthesis. The predicted proteins from Theobroma cacao were downloaded from the Cacao Genome Database (www.cacaogenomedb.org) in October 2013. The Arabidopsis thaliana, Gossypium raimondii and Citrus sinensis genomes were downloaded from the Phytozome portal (https://phytozome.jgi.doe.gov) in February, July and August 2014, respectively. All genomes were searched for putative IAPs using a set of defined physicochemical parameters in the Kamal v1.0 alpha software 7. Physicochemical descriptors were updated from our previous report and are listed in Table 1. A total of 21 peptides were chemically synthesized using the Fmoc/t-butyl strategy 45. Peptide chain elongation was performed on Rink Amide resin, yielding C-terminally amidated peptides. Two known AMPs from frog skin secretions, Dermaseptin DS01 16 and Asc-8 17, were also synthesized. Helical wheel plots were created using an online tool (http://lbqp.unb.br/NetWheels/).
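The kind of physicochemical screening described above can be illustrated with a minimal sketch. This is not Kamal: the nine filters of Table 1 are not reproduced here, the Kyte-Doolittle hydropathy scale is an assumed choice, and the thresholds `min_charge` and `min_moment` are hypothetical values chosen only for illustration. Only the 16 to 22 residue window range is taken from the text.

```python
import math

# Kyte-Doolittle hydropathy values (assumed scale, not necessarily Kamal's).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def net_charge(seq):
    """Coarse net charge near neutral pH: +1 per Lys/Arg, -1 per Asp/Glu."""
    return sum(seq.count(aa) for aa in "KR") - sum(seq.count(aa) for aa in "DE")


def mean_hydrophobicity(seq):
    """Average hydropathy per residue."""
    return sum(KD[aa] for aa in seq) / len(seq)


def hydrophobic_moment(seq, angle_deg=100.0):
    """Per-residue hydrophobic moment for an ideal alpha-helix (100 degrees per residue)."""
    delta = math.radians(angle_deg)
    sin_sum = sum(KD[aa] * math.sin(i * delta) for i, aa in enumerate(seq))
    cos_sum = sum(KD[aa] * math.cos(i * delta) for i, aa in enumerate(seq))
    return math.hypot(sin_sum, cos_sum) / len(seq)


def scan(protein, min_len=16, max_len=22, min_charge=2, min_moment=0.4):
    """Slide windows over a protein and keep fragments passing both thresholds.

    The thresholds are hypothetical placeholders, not the published filters.
    """
    hits = []
    for length in range(min_len, max_len + 1):
        for start in range(len(protein) - length + 1):
            frag = protein[start:start + length]
            if any(aa not in KD for aa in frag):
                continue  # skip windows containing non-standard residues
            charge = net_charge(frag)
            moment = hydrophobic_moment(frag)
            if charge >= min_charge and moment >= min_moment:
                hits.append((start, frag, charge, moment))
    return hits
```

Each hit is a tuple of start position, fragment sequence, net charge and hydrophobic moment. Candidates found this way would still need the manual curation step described in the Results.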

Mass spectrometry analyses.
All crude synthetic peptides were analyzed by mass spectrometry to confirm peptide mass and amino acid sequence. Experiments were carried out on an UltrafleXtreme MALDI-TOF/TOF (Bruker Daltonics) controlled by FlexControl 3.0 software (Bruker Daltonics), using α-cyano-4-hydroxycinnamic acid matrix (Fluka) for ionization. Peptide monoisotopic masses were obtained in reflector mode over a range of 700-3500 m/z with external calibration using Peptide Calibration Standard II (Bruker Daltonics). Peptide MS/MS spectra were obtained by LIFT fragmentation after analysis of the MS spectra and selection of precursor ions. The FlexAnalysis 3.0 software (Bruker Daltonics) was used for mass spectrometric data analysis. Peptide primary structures were inferred by manual interpretation of fragmentation spectra.

Fractions were manually collected and analyzed by mass spectrometry to confirm the elution time of each synthetic peptide. Peptides were then purified on a preparative scale using the same parameters described above. Fractions of interest were collected and analyzed by mass spectrometry to confirm purity. Purified synthetic peptides containing Trp or at least three Tyr residues were quantified using calculated molar absorption coefficients 46. The remaining peptides were quantified using the UV absorbance of the peptide bond according to the literature 47.
Large unilamellar vesicles (LUVs) preparation. DMPC and 2:1 DMPC:DMPG (w/w) were dissolved in chloroform and methanol (3:1 v/v) at 10 mg/mL, dried using a rotary evaporator and left for 3 hours under high vacuum in a freeze dryer. Phospholipids were then dissolved in 20 mM sodium phosphate-NaOH, 150 mM NaCl, pH 7.4, and hand-shaken until a cloudy solution formed, which was passed 19 times through a 100 nm polycarbonate membrane at 30 °C using a mini-extruder (Avanti Polar Lipids) to form large unilamellar vesicles (LUVs). Phospholipid concentration was estimated according to the ammonium ferrothiocyanate method 48.

Differential scanning calorimetry (DSC). Thermograms were obtained using a VP-DSC (GE Healthcare) over a temperature range from 10 to 40 °C at a scanning rate of 0.5 °C/min. Blank thermograms using buffer alone and 0.5 mM DMPC or 2:1 (w/w) DMPC:DMPG LUVs in buffer were acquired as references. Peptides were added to fresh samples of 0.5 mM LUVs at a concentration of 20 μM (0.04 mol/mol peptide/phospholipid) at room temperature, immediately followed by DSC data acquisition. Each sample was subjected to several thermal scans until there were no distinguishable changes in the thermal profile of the main phase transition (P′β → Lα) of phospholipids between scans. Data were concentration-normalized, baseline-subtracted (linear connect) and fitted to a non-two-state transition with two peaks determined by the user, using the MicroCal Origin software v7.0. Re-scans for selected cases were acquired using fresh peptide and LUV solutions to check the reproducibility of the data.

Table 6. Hemolytic activity. ** IAP concentration at which at least 50% of red blood cells remain intact.
Circular Dichroism (CD). Experiments were conducted on a Jasco J-815 spectropolarimeter (Jasco International Co.). Spectra were acquired at room temperature from 200 to 260 nm as an average of 4 readings

Figure 5. Ramularia areola infection test. Disease severity index ranging from 1 (without symptoms) to 5 (over 50% of leaf area showing symptoms) on both the abaxial (a1) and adaxial (a2) surfaces. Areolate mildew severity in cotton leaves after treatment with a commercial fungicide (PrioriXtra) and different concentrations of the IAPs Gr01 (b1) and Tc06 (b2). Gr01 and Tc06 showed results comparable to the fungicide at concentrations starting at 16 µM and 32 µM, respectively.

The microbicidal activity of the IAPs was also evaluated. After the growth period shown in Supplementary Material 07 for each organism, 10 µL aliquots from the minimum inhibitory concentration and from increasing concentrations up to 256 µM were transferred to agar plates and incubated under the same conditions (Table 2). YPD agar and Mueller-Hinton agar were used for fungi/yeasts and bacteria, respectively. Meanwhile, the susceptibility microplates were incubated again to evaluate whether any growth would occur after a longer period. For both approaches, the minimal killing concentration (MKC) was defined as the lowest concentration at which no growth was observed after incubation. Three biological replicates were performed for each test.
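The MKC rule stated above can be written as a small helper. This is a sketch under assumptions: the function name and data layout are hypothetical, and the way replicates are combined (a concentration counts as "no growth" only if none of its replicates grew, and every higher tested concentration must also be clear) is our reading rather than a rule given in the text.

```python
def minimal_killing_concentration(growth):
    """Return the MKC in µM, or None if growth occurs at the highest concentration.

    growth maps each tested concentration (µM) to a list of booleans,
    one per replicate, where True means growth was observed.
    """
    mkc = None
    # Walk from the highest concentration downwards and stop at the
    # first concentration where any replicate shows growth.
    for conc in sorted(growth, reverse=True):
        if any(growth[conc]):
            break
        mkc = conc
    return mkc


# Hypothetical plate readout with three replicates per concentration.
example = {
    256: [False, False, False],
    128: [False, False, False],
    64: [False, True, False],
    32: [True, True, True],
}
print(minimal_killing_concentration(example))  # 128
```

Returning `None` corresponds to the "non-detectable" microbicidal concentrations mentioned in the Results, where growth persisted up to the highest tested concentration.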
Evaluation of cytotoxicity of IAPs. The cytotoxicity of the selected IAPs was evaluated using human red blood cells, in strict accordance with relevant guidelines and regulations (Ethical committee - UnB # 1.939.989). Different concentrations of IAPs (128 µM-0.5 µM) were incubated with a final concentration of 2.5 × 10⁵ red blood cells mL⁻¹. Briefly, 100 µL of red blood cells was incubated with 100 µL of each IAP concentration, prepared in PBS, for 30 minutes at 37 °C with constant shaking at 120 rpm. PBS and 0.1% (v/v) Triton X-100 were used as negative and positive controls, respectively. After incubation, the solution was centrifuged at 900 × g for five minutes, and 50 µL of the supernatant was transferred to flat-bottom polystyrene 96-well microplates containing 50 µL of PBS. Microplates were read on a BioTek Multireader at a wavelength of 540 nm. All tests were performed in 3 biological replicates, composed of 3 biological repetitions each. The negative-control reading was subtracted from the obtained data, which were then expressed relative to the positive control, defined as 100% lysis.
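The normalization described in the last sentence can be sketched as follows. The function name is hypothetical, and the sketch assumes that the negative-control reading is also subtracted from the positive control before scaling, which the text does not state explicitly.

```python
def percent_hemolysis(a_sample, a_negative, a_positive):
    """Percent lysis from absorbance readings at 540 nm.

    a_negative: PBS control (0% lysis); a_positive: Triton X-100 control (100% lysis).
    Assumes the negative control is subtracted from both sample and positive control.
    """
    span = a_positive - a_negative
    if span <= 0:
        raise ValueError("positive control must read higher than negative control")
    return 100.0 * (a_sample - a_negative) / span


# Hypothetical readings: a sample halfway between the two controls.
print(round(percent_hemolysis(0.55, 0.05, 1.05), 1))  # 50.0
```

Applying this to each concentration of a dilution series gives the dose-response values from which a 50% hemolysis concentration, as reported in Table 6, can be read.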
Inhibition of M. perniciosa basidiospore germination by T. cacao IAPs. The potential of T. cacao IAPs to inhibit the germination of M. perniciosa basidiospores was evaluated. Briefly, 1 × 10⁶ basidiospores mL⁻¹ were incubated for one hour with different concentrations of T. cacao IAPs (256-0.5 µM) at 24 °C, in a final volume of 100 µL. Autoclaved dH2O was used as a control. After incubation, 3 droplets of 10 µL each were placed on a slide coated with agar (1.2% (w/v)). Slides were incubated for 4, 24 and 48 hours at 24 °C and visualized using a DMRXA light microscope (Leica). Approximately 150-200 basidiospores per droplet from each time point were counted and compared to the control condition, which was defined as 100% germination. All experiments were performed in biological triplicates, with 3 technical repetitions each.
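The comparison to the control condition can be sketched as below. The function names and the example counts are hypothetical. Averaging the droplets before normalizing is an assumption, since the text only says that counts were compared to the control, defined as 100% germination.

```python
def germination_rate(droplets):
    """Mean fraction of germinated spores over droplets.

    droplets is a list of (germinated, counted) pairs, one per droplet.
    """
    return sum(g / n for g, n in droplets) / len(droplets)


def relative_germination(treated, control):
    """Germination of a treatment as a percentage of the water control (100%)."""
    return 100.0 * germination_rate(treated) / germination_rate(control)


# Hypothetical counts for three droplets per condition.
control = [(150, 200), (120, 160), (135, 180)]   # 75% germinated in each droplet
treated = [(45, 180), (40, 160), (50, 200)]      # 25% germinated in each droplet
print(round(relative_germination(treated, control), 1))  # 33.3
```

Percent inhibition is then 100 minus the relative germination.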
Experimental groups were composed of 13 pots, with two plants each. From each sixty-one-day-old cotton plant, two leaves were selected for treatment. Sixteen days after application, disease severity and number of spores per leaf per experimental group were evaluated. For severity analysis, a score from 1 (no symptoms) to 5 (symptoms over 50% of foliar area) was attributed to each leaf. The final score for each treatment was an average of the four leaves from each pot. Spore counting was performed using a Neubauer chamber. Briefly, four leaves from each pot were scraped and the spores counted, yielding a final spore count per pot. Data and statistical analyses were performed using Excel, GraphPad Prism and PAST software.