TcTI, a Kunitz-type trypsin inhibitor from cocoa associated with defense against pathogens

Protease inhibitors (PIs) are important biotechnological tools of interest in agriculture. Usually they are the first proteins to be activated in plant-induced resistance against pathogens. Therefore, the aim of this study was to characterize a Theobroma cacao trypsin inhibitor called TcTI. The ORF has 740 bp encoding a protein with 219 amino acids, molecular weight of approximately 23 kDa. rTcTI was expressed in the soluble fraction of Escherichia coli strain Rosetta [DE3]. The purified His-Tag rTcTI showed inhibitory activity against commercial porcine trypsin. The kinetic model demonstrated that rTcTI is a competitive inhibitor, with a Ki value of 4.08 × 10⁻⁷ mol L−1. The thermostability analysis of rTcTI showed that 100% inhibitory activity was retained up to 60 °C and that at 70-80 °C, inhibitory activity remained above 50%. Circular dichroism analysis indicated that the protein is rich in loop structures and β-conformations. Furthermore, in vivo assays against Helicoverpa armigera larvae were also performed with rTcTI in 0.1 mg mL−1 spray solutions on leaf surfaces, which reduced larval growth by 70% compared to the control treatment. Trials with cocoa plants infected with Moniliophthora perniciosa (Mp) showed a greater accumulation of TcTI in resistant varieties of T. cacao, so this regulation may be associated with different isoforms of TcTI. This inhibitor has biochemical characteristics suitable for biotechnological applications as well as in resistance studies of T. cacao and other crops.


Materials and methods
In silico analysis of the sequence of TcTI and phylogenetic analysis. The complete nucleotide sequence of the Theobroma cacao trypsin inhibitor was obtained from the cocoa EST database (http://esttik.cirad.fr/, accession number KZ0ACR1YI20FM1).
The amino acid sequence was obtained using the Expasy Translation tool. The signal peptide prediction was conducted using the SignalP 4.0 Server 34. Conserved domains were identified using Pfam (http://pfam.xfam.org/search/sequence) 35, and further analysis was carried out with MEROPS (http://merops.sanger.ac.uk/).
The phylogenetic tree was constructed using the Mega 7.0 software, based on the neighbor-joining method. The tree was built from seven trypsin inhibitor gene sequences found in the Cirad cacao genomic bank (http://esttik.cirad.fr/). From the alignment analyses, sequences of other homologous proteins with over 80% similarity were also considered for the construction of the phylogenetic tree.
To analyze the transcriptional profile of TcTI in T. cacao infected by Moniliophthora perniciosa (Mp), reference files containing the transcripts of Theobroma cacao cv 'Comum' (Forastero genotype) in FASTA format (GCA_000208745.2_Criollo_cocoa_genome_V2/) were used. These files were obtained from the GenBank database (https://ftp.ncbi.nlm.nih.gov/genomes/genbank/plant/Theobroma_cacao/latest_assembly_versions/) through the RNA Galaxy workbench 2.0 platform (https://rna.usegalaxy.eu), using the Salmon extension 36. Twenty-one transcripts corresponding to ~21 kDa proteins with characteristics of Kunitz-type trypsin inhibitors (KPIs) were selected. The relative quantification of each transcript corresponding to the TcTI gene and its gene family was performed based on public RNA-Seq data from ten T. cacao libraries (five from the control condition and five from the condition of infection by Mp in the biotrophic phase) 37, available at NCBI's SRA (https://www.ncbi.nlm.nih.gov/sra) under number SRA066232. In summary, the identified transcripts corresponded to infected apical meristem tissues of T. cacao 30 days after inoculation with a suspension of M. perniciosa basidiospores, representing the green broom stage. To find the proteins associated with the analyzed transcripts, a BLAST analysis using the BlastX command (https://blast.ncbi.nlm.nih.gov/) was performed, and a heat map of the expression profile of the transcripts was generated using the ComplexHeatmap package in the R statistical software (R Core Development Team 2013) 38.
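As an illustration of how a relative quantification of this kind can be summarized before plotting a heat map, the following minimal Python sketch computes a log2 fold change between infected and control libraries from TPM-like abundance values. It is not the pipeline used in the study (which relied on Salmon in Galaxy and ComplexHeatmap in R); the transcript names, the abundance values and the pseudocount of 1 are all hypothetical assumptions chosen for the example.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def log2_fold_change(control_tpm, infected_tpm, pseudocount=1.0):
    """log2 ratio of mean infected to mean control abundance.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for transcripts that are absent in one condition.
    """
    ratio = (mean(infected_tpm) + pseudocount) / (mean(control_tpm) + pseudocount)
    return math.log2(ratio)


# Hypothetical abundances: five control and five infected libraries per transcript.
tpm = {
    "transcript_A": {"control": [6.0, 8.0, 7.0, 7.0, 7.0],
                     "infected": [2.0, 4.0, 3.0, 3.0, 3.0]},
    "transcript_B": {"control": [1.0, 1.0, 1.0, 1.0, 1.0],
                     "infected": [6.0, 8.0, 7.0, 7.0, 7.0]},
}

fold_changes = {
    name: log2_fold_change(v["control"], v["infected"]) for name, v in tpm.items()
}

for name, lfc in fold_changes.items():
    status = "repressed" if lfc < 0 else "induced"
    print(f"{name}: log2FC = {lfc:+.2f} ({status})")
```

A negative value corresponds to a transcript that is repressed under infection, and a positive value to one that is induced.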
Expression and purification of rTcTI in E. coli. The E. coli culture transformed with the recombinant plasmid was grown in LB (Luria-Bertani) medium and incubated at 37 °C under stirring at 200 rpm until the OD at 600 nm reached 0.5 to 0.7. Expression was induced by adding 0.4 µmol L−1 of IPTG (isopropyl β-D-thiogalactoside) for 4 h at 37 °C. The cell culture was then centrifuged at 10,000 g, and the pellet was resuspended in binding buffer with lysozyme (100 μg mL−1) for 30 min. The total bacterial extract was sonicated in 20 s pulses interspersed with 30 s of rest on ice, using an amplitude of 70% in the ultrasound processor (pGEX 30). This process was repeated until the extract lost its viscosity, indicating total membrane disruption. Purification of the rTcTI protein was performed as described by Pirovani et al. 33.
Inhibitory activities. Activity against porcine trypsin. The trypsin inhibitory assay was performed using BApNA (Nα-benzoyl-DL-arginine 4-nitroanilide hydrochloride) as a substrate. A total of 10 µL of trypsin solution (0.3 mg mL−1 in 2.5 mmol L−1 Tris-HCl, pH 7.5) was incubated for 15 min at 37 °C with 60 µL of rTcTI inhibitor solution (0.5 mg mL−1) and 120 µL of 50 mmol L−1 Tris-HCl buffer, pH 7.5. The molar ratio of rTcTI to trypsin was 1:1 for all assays. Reactions were started by adding 200 µL of 1.25 mol L−1 BApNA solution. The colorimetric results were measured by absorbance at 410 nm. A residual trypsin activity of 100% was assigned to the control readings (no inhibitor). The activity at the remaining inhibitor concentrations was calculated relative to the controls using the following equation: Residual activity [%] = [∆ABS410 CI/∆ABS410 SI] × 100, where ∆ABS410 SI corresponds to the change in absorbance at 410 nm in the absence of the inhibitor, and ∆ABS410 CI corresponds to the change in absorbance at 410 nm in the presence of the inhibitor.
Test of thermostability of rTcTI. Aliquots of 7 µmol L−1 of the recombinant protein rTcTI were incubated at temperatures of 40, 50, 60, 70, 80, and 90 °C for 10 min. Aliquots of 20 µL of protein from each thermal treatment were placed in triplicate in an ELISA plate, and 10 µL of porcine trypsin (Sigma) at a concentration of 0.5 mg mL−1 was added and incubated at 37 °C for 15 min. Afterwards, 200 µL of 1.2 mmol L−1 BApNA was added to all treatments. Chromogenic substrate hydrolysis was followed with the same parameters used for the inhibition curve. Residual activity at the different temperatures was calculated by the same method used for the inhibition curve, with the control reaction (the same heat treatment without the inhibitor) stipulated as 100% activity.
Structural analysis. Circular dichroism (CD). The rTcTI protein was subjected to CD analysis in a J-815 spectropolarimeter (JASCO). The protein was purified from the E. coli extract and subsequently dialyzed against 50 mmol L−1 phosphate buffer, pH 7.4, to remove salts that could interfere with the analysis. To identify secondary structure elements, spectra were scanned from 190 to 250 nm in 1 mm quartz cuvettes. Data were collected with a scan rate of 50 nm min−1 and 0.5 nm range. Readings were performed at 26 and 96 °C after 5 min of incubation at each temperature, and the average of three consecutive measurements was used for the analysis. The percentage of each secondary structure based on the CD spectrum was calculated with the K2D3 software 41.
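The residual activity equation given for the trypsin assay above can be sketched in a few lines of Python. The function mirrors the stated formula (∆ABS410 with inhibitor divided by ∆ABS410 without inhibitor, times 100); the absorbance readings and the function names are hypothetical values introduced only for this example, not data from the study.

```python
def residual_activity(delta_abs_with_inhibitor, delta_abs_without_inhibitor):
    """Residual trypsin activity (%) = (dABS410 CI / dABS410 SI) * 100."""
    if delta_abs_without_inhibitor <= 0:
        raise ValueError("the control absorbance change must be positive")
    return delta_abs_with_inhibitor / delta_abs_without_inhibitor * 100.0


def inhibition(delta_abs_with_inhibitor, delta_abs_without_inhibitor):
    """Inhibition (%) expressed as the complement of the residual activity."""
    return 100.0 - residual_activity(delta_abs_with_inhibitor,
                                     delta_abs_without_inhibitor)


# Hypothetical absorbance changes at 410 nm (control first, then two
# inhibitor concentrations in micromol per litre).
control_delta = 0.80
readings = {0.5: 0.56, 1.2: 0.096}

for concentration, delta in readings.items():
    activity = residual_activity(delta, control_delta)
    print(f"{concentration} umol/L rTcTI: {activity:.1f}% residual activity")
```

The same function applies to the thermostability test if the control value is taken from the reaction that received the same heat treatment without inhibitor.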

Determination of the inhibition constant [Ki].
Molecular modeling of TcTI. The structure of the trypsin inhibitor from Theobroma cacao was inferred using the Swiss-Model server (http://swissmodel.expasy.org/) 42.
All experiments were performed in accordance with relevant guidelines and regulations. The accumulation of trypsin inhibitors (TIs) in cocoa meristems infected by the fungus M. perniciosa was analyzed. Two varieties of cocoa were selected for the experiment, Catongo and TSH1188, which are susceptible and resistant, respectively. Thirty days after seed germination, 240 seedlings of each variety were selected and kept in the greenhouse at the CEPLAC premises. A total of 120 seedlings of each variety were separated for inoculation with a suspension of M. perniciosa basidiospores (2 × 10⁵ mL−1) 45 and maintained for 24 h in a humid chamber at 25 °C. This step allowed the germination of M. perniciosa spores, their penetration and consequent plant infection.
The meristems (2-3 cm segments) were collected at 1, 5, 45 and 60 days after inoculation. Approximately 10-20 meristems were collected for each stage. Total proteins were extracted following the protocol described by Pirovani et al. 46. The protein extract was quantified with the 2D Quant Kit (GE Healthcare) according to the manufacturer's recommendations. The protein extract concentrations were normalized for the analysis of TI accumulation by the western blot technique 40, and bands were quantified with the Gel.Quant 3.1 software. The protein extracts of the different treatments were also normalized for analysis on 2D SDS-PAGE gels, as described by Santos et al. 47.
Bioassays against larvae of Helicoverpa armigera. The insects were obtained from the colony maintained at the Costa Lima Quarantine Laboratory, Embrapa Meio Ambiente, reared on artificial diet according to the method described by Vilela et al. 48. For the biological tests, H. armigera larvae at 7 ± 1 days of age and after 2 h of starvation were used. Each replicate consisted of one larva placed individually in a Petri dish (diameter = 9 cm) and maintained in a growth chamber at 27 ± 1 °C, 70 ± 5% relative humidity and a 12:12 h (light:dark) photoperiod. There were 13 replicates per treatment. Soybean leaves were cleaned and immersed in a solution of 0.1 mg mL−1 rTcTI in 10 mmol L−1 phosphate buffer at pH 7.2 with Triton X-100 (0.01% v/v). Treated leaves were allowed to dry at room temperature and were offered to the H. armigera larvae for 24 h. The rTcTI-treated and control leaves (10 mmol L−1 phosphate buffer, pH 7.2 with Triton X-100 [0.01% v/v]) were then removed and all larvae were maintained on artificial diet (the same used to maintain the insect colony) under controlled conditions. The larvae were weighed before the treatments and one, three, and seven days after the bioassay, and mortality was recorded until the death of all larvae. The percent reduction in weight gain was calculated according to the method described by Halder et al. 49 and expressed as a percentage. The amount of leaf surface consumed was determined using a LI-COR leaf area meter (LI-3100A; LI-COR Biosciences, Lincoln, NE, USA) before and after 24 h of exposure to the larvae. Analysis of variance (ANOVA) and the Scott-Knott test were applied to the data on leaf area consumption, larval weight and mortality, using the Sisvar software with p-value < 0.05.
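The exact expression used for the percent reduction in weight gain is not recoverable from the text above. The sketch below therefore assumes a commonly used form, the difference between control and treated weight gain divided by the control weight gain, times 100. This form, the function names and the larval weights are assumptions made for illustration and may differ from the formulation of Halder et al.

```python
def weight_gain(initial_mg, final_mg):
    """Weight gained by a larva (or a group mean) over the interval, in mg."""
    return final_mg - initial_mg


def percent_reduction_in_weight_gain(control_gain, treated_gain):
    """Assumed form: ((control gain - treated gain) / control gain) * 100."""
    if control_gain <= 0:
        raise ValueError("the control weight gain must be positive")
    return (control_gain - treated_gain) / control_gain * 100.0


# Hypothetical mean weights (mg) before the treatment and 24 h later.
control = weight_gain(initial_mg=20.0, final_mg=30.0)   # 10.0 mg gained
treated = weight_gain(initial_mg=20.0, final_mg=22.81)  # about 2.81 mg gained

reduction = percent_reduction_in_weight_gain(control, treated)
print(f"Reduction in weight gain: {reduction:.1f}%")
```

Under this assumed form, a treated group that gains as much as the control gives 0%, and a treated group that gains nothing gives 100%.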
Collection of biological samples. The authors declare that all biological material used in the study was collected and treated in accordance with the institutional and national norms of the Executive Commission of the Cacao Crop Plan (Comissão Executiva do Plano da Lavoura Cacaueira-CEPLAC) and the Embrapa Environmental Research unit (Embrapa Meio Ambiente).

TcTI sequence analysis. Sequence analysis of the trypsin inhibitor identified in the EST cocoa library 39 revealed an open reading frame sequence of 660 bp encoding a protein with 219 amino acids (Fig. 1b). The analysis also predicted molecular weight and isoelectric point of 23.96 kDa and 5.71, respectively. The molecular weight and isoelectric point of the mature polypeptide without the signal peptide were 21.14 kDa and 5.15, respectively. The analysis of the predicted amino acid sequence of TcTI using the SignalP 4.0 program predicted the presence of a signal peptide with a cleavage site between residues A26 and D27 (Fig. 1a). The protein without the signal peptide showed four possible sites of O-glycosylation, corresponding to T61, T28, T29, and S158 (Fig. 1a). Another analysis using the Pfam program showed that the Kunitz-type protease inhibitor (KPI) domain comprised 20 amino acids and was located between amino acids V21 and V50.
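The relation between the ORF length and the protein length reported above is simple codon arithmetic, shown here as a short Python sketch. The function names are hypothetical; the only inputs are the figures stated in the text (219 residues, cleavage between A26 and D27, 21.14 kDa for the mature polypeptide).

```python
def orf_length_bp(n_residues, include_stop=True):
    """Length of an ORF in base pairs: three bases per codon, plus a stop codon."""
    codons = n_residues + (1 if include_stop else 0)
    return 3 * codons


def mature_length(n_residues, last_signal_residue):
    """Residues left after removing a signal peptide that ends at the given position."""
    return n_residues - last_signal_residue


total_residues = 219
orf_bp = orf_length_bp(total_residues)           # 219 codons + stop codon
mature_residues = mature_length(total_residues, 26)  # cleavage between A26 and D27

# Average mass per residue of the mature polypeptide, from the reported 21.14 kDa.
average_residue_mass_da = 21.14 * 1000 / mature_residues

print(orf_bp, mature_residues, round(average_residue_mass_da, 1))
```

The resulting average of roughly 110 Da per residue is in the range usually expected for proteins, so the reported mature mass and the predicted cleavage site are mutually consistent.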
Phylogenetic cluster analysis with protein sequences from other plant groups showed that the genes for cocoa trypsin inhibitors were grouped into three different branches (Fig. 2). The TcTI protein showed high similarity with inhibitors from species of the same genus, Theobroma, and from a related genus, Herrania. The protein in the cacao genomic bank most similar to TcTI corresponded to the Tc00_p067240 locus, and TcTI was slightly more distant from the Tc00_g042540 locus, although in the same branch (Fig. 2). The Tc02 chromosome gene products were grouped in the same branch, showing 99% identity, and were closer to TcTI than the Tc05 chromosome gene products. The products of the genes referring to miraculin (Tc05_g020940 and Tc05_020950) located on the Tc05
Inhibitory properties and Ki determination of rTcTI. The rTcTI produced in a heterologous system in this study was active, since residual trypsin activity was markedly reduced with increasing concentration of rTcTI in the reaction medium (Fig. 3a). The trypsin activity decreased to 70% at a concentration of 0.5 μmol L−1 rTcTI. An increase in concentration from 0.5 to 1.2 μmol L−1 of rTcTI reduced the residual activity to approximately 12% (Fig. 3a). The maximum reaction velocity (Vmax) and the Michaelis-Menten constant (Km) had to be determined to calculate the inhibition constant, Ki. Vmax values were 6.23 mmol L−1 min−1 in the absence of the inhibitor and 6.22 and 5.86 mmol L−1 min−1 in the presence of rTcTI at concentrations of 66.6 and 133.3 nmol L−1, respectively. In the double-reciprocal plot, the curves showed a very similar intersection point on the 1/Vo axis, indicating that the inhibitor did not affect the Vmax values. The inhibition constant, Ki, determined according to the double-reciprocal model was 4.08 × 10⁻⁷ mol L−1 for porcine trypsin in the presence of the substrate BApNA (Fig. 3b).
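The double-reciprocal (Lineweaver-Burk) reasoning described above can be sketched as follows. For a competitive inhibitor the intercept 1/Vmax is unchanged while the apparent Km increases to Km(1 + [I]/Ki), so Ki = [I] / (Km,app/Km − 1). The code fits 1/v against 1/[S] by ordinary least squares. The kinetic data are synthetic, generated from an assumed Km together with the Vmax, inhibitor concentration and Ki quoted in the text, so the example only demonstrates that the procedure recovers the Ki it was given. It is not a reanalysis of the study's measurements.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through (x, y); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lineweaver_burk(substrate, velocity):
    """Estimate (Km, Vmax) from 1/v = (Km/Vmax)(1/[S]) + 1/Vmax."""
    slope, intercept = linear_fit([1 / s for s in substrate],
                                  [1 / v for v in velocity])
    vmax = 1 / intercept
    return slope * vmax, vmax


def competitive_ki(km, km_app, inhibitor_conc):
    """Ki for competitive inhibition, from Km,app = Km * (1 + [I]/Ki)."""
    return inhibitor_conc / (km_app / km - 1)


# Synthetic, noise-free data. ASSUMED_KM is hypothetical; the other constants
# are the values quoted in the text.
VMAX = 6.23            # mmol L-1 min-1
ASSUMED_KM = 1.0e-3    # mol L-1 (assumption)
KI = 4.08e-7           # mol L-1
INHIBITOR = 133.3e-9   # mol L-1
substrate = [0.25e-3, 0.5e-3, 1.0e-3, 2.0e-3, 4.0e-3]


def michaelis_menten(s, km):
    return VMAX * s / (km + s)


v_free = [michaelis_menten(s, ASSUMED_KM) for s in substrate]
v_inhibited = [michaelis_menten(s, ASSUMED_KM * (1 + INHIBITOR / KI))
               for s in substrate]

km, vmax = lineweaver_burk(substrate, v_free)
km_app, vmax_app = lineweaver_burk(substrate, v_inhibited)
ki = competitive_ki(km, km_app, INHIBITOR)
print(f"Vmax = {vmax:.2f}, Vmax with inhibitor = {vmax_app:.2f}, Ki = {ki:.3g} mol/L")
```

With real, noisy data a nonlinear fit of the Michaelis-Menten equation is often preferred, because the reciprocal transform gives disproportionate weight to the low-substrate points.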
Thermostability studies of rTcTI. The rTcTI was subjected to thermal stability testing by incubation for 10 min at temperatures of 40-90 °C. The rTcTI showed tolerance to heat treatment up to 60 °C. The treatment at 70 °C induced a decrease in the percentage of inhibition, but the protein still retained over 90% of its inhibitory activity. Incubation at 80 °C promoted a drop of 40% in its inhibitory activity (Fig. 4). The rTcTI completely lost its inhibitory capacity when treated at 90 °C, indicating complete protein denaturation.
Analysis of the secondary structure of TcTI. The structural analysis of rTcTI by CD measurements showed a predominance of irregular secondary structure, with absence of α-helix and the presence of β-strands. The presence of β-strands was indicated by the negative peak at approximately 209-220 nm (Fig. 5a). This analysis was performed at two different temperatures, 26 °C and 96 °C, to detect possible variations in the rTcTI structure during heat denaturation. The rTcTI spectrum was similar for both treatments, with the same type of negative peak at 220 nm, as indicated by the dotted line for the treatment at 96 °C (Fig. 5a). The CD spectrum was analyzed with the K2D3 server, which indicated that the secondary structure of TcTI has 37.8% beta sheet, 35.2% random coil and 8.8% alpha helix.
The 3D model was predicted by comparative analysis with other inhibitors deposited in the Protein Data Bank. To build the three-dimensional model, the 3IIR template structure (Murraya koenigii) 43 was used, with an identity of 44.8%. According to the Ramachandran plot (Supplementary Fig. 1), 96.9% of residues are in energetically favorable regions. TcTI showed 12 regions of antiparallel β-strands and a small α-helical turn. This protein presented an inhibitory loop (Lys), as indicated in Fig. 5b.
(Fig. 6a,b,c,d and Supplementary Fig. 3). However, at 45 DAI the identification of TIs was higher in both genotypes. Five TI isoforms were down-accumulated in TSH1188, but at lower intensity than in Catongo, which in turn showed strong down-accumulation of TI isoforms (Fig. 6e,f,g,h and Supplementary Table 1). It is interesting to note that TSH1188 also showed up-accumulation of 7 TIs at 45 DAI, while Catongo showed only down-accumulation of these proteins. Among the up-accumulated TIs from TSH1188, five putative miraculin-like proteins were found. BLAST analysis of the proteins also showed similarity with TIs of T. cacao.
All identified PI spot sequences and further proteomic information can be found in Supplementary Table 1.
Bioassays with T. cacao and the WBD pathogen. The analysis of the protein profile of meristems showed differences in the proteins expressed under the different treatments. The fragment of approximately 23 kDa presented a more intense variation of expression in comparison with the others, being present in both the initial and final stages after inoculation (Fig. 7a,b and Supplementary Fig. 4). This band corresponded to the trypsin inhibitor. In the final stages, the difference in the protein profile between the infected and control treatments was more accentuated than in the initial stages (Fig. 7b). The accumulation of the trypsin inhibitor was analyzed in meristems at different times after inoculation with M. perniciosa. In the initial stages of infection, at 1 DAI (days after inoculation), there was greater accumulation of inhibitors in the infected treatment for the resistant variety (TSH1188), in contrast to the susceptible variety (Catongo), in which there was a decrease in the accumulation of this inhibitor (Fig. 7c).
www.nature.com/scientificreports/
At 5 DAI, inhibitor accumulation was lower for both varieties, although in the TSH1188 control treatment the inhibitor was more abundant according to the western blot test. In Catongo, the accumulation of the inhibitor in the initial stages of infection and in the respective controls was lower, remaining at the basal expression level. In the final stages of infection, between 45 and 60 DAI, the abundance of the inhibitor decreased considerably in comparison with the control treatments (Fig. 7b,d).
At 60 DAI, considered an advanced stage of the disease, it was not possible to detect significant accumulation of these inhibitors by the western blot technique. This occurred for both varieties. It was not possible to detect the presence of the inhibitor at 45 DAI in the Catongo variety under the experimental conditions, but slight accumulation of the inhibitor was detected in TSH1188.

Comparison of the transcriptional profile of the KPI gene family. To determine the expression pattern of the trypsin inhibitor gene family, the relative quantification of 21 transcripts encoding ~21 kDa proteins with identity and coverage above 96% and 50%, respectively, was performed (Supplementary Table 2). Based on the BlastX results, compared to the findings of Gesteira et al. 29, the transcript lcl_NW_017234724.1_mrna_XM_018129750.1_34764 corresponded directly to the TcTI protein (EOY21251.1) (Fig. 8). However, the transcript corresponding to TcTI (lcl_NW_017234724.1_mrna_XM_018129750.1_34764) had a repressed expression profile. Some of the transcripts with up-regulated expression corresponded to miraculin (lcl_NC_030854.1_mrna_XM_007029298.2_19347 and lcl_NC_030854.1_mrna_XM_007029299.2_19348), a protein that accumulated in the resistant genotype TSH1188 in the present study.

Assessment of the effect of TcTI on development of H. armigera larvae. Helicoverpa armigera larvae were fed with soybean leaves sprayed with rTcTI at 1 mg mL−1. Among the larvae that ingested the protease inhibitor (rTcTI), 33.3% died in the second larval instar, while only 8.3% of individuals in the control group died in this same instar (Fig. 9d). The reduction in weight of the larvae treated with the inhibitor occurred especially during the first 24 h. The number of larvae fed with the rTcTI-treated leaves decreased compared to the larvae in the control treatment (Fig. 9b). The larvae of the rTcTI treatment had a 71.9% reduction in weight gain compared to the larvae in the control group. Three and seven days later, the body mass gain of the larvae that ingested the PI was 62.9% and 11.0% lower, respectively, than that of the control larvae (Fig. 9c). The no-choice bioassay showed that rTcTI did not interfere with larval consumption, since there was no significant difference in the average leaf area consumed between the control larvae and those treated with rTcTI according to the Mann-Whitney test at 5% probability (Fig. 9a).
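For readers unfamiliar with the Mann-Whitney test mentioned above, the sketch below implements the U statistic with a two-sided p-value from the normal approximation, using only the Python standard library. It omits the tie and continuity corrections that statistical packages usually apply, so its p-values are approximate. The leaf area values are hypothetical and are not the measurements of this study.

```python
import math


def average_ranks(values):
    """1-based ranks of the values, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Return (U, two-sided p) using the normal approximation without corrections."""
    n1, n2 = len(sample_a), len(sample_b)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, p_value


# Hypothetical leaf area consumed (cm2) by control and rTcTI-treated larvae.
control_area = [4.1, 3.8, 4.5, 3.9, 4.2, 4.0]
treated_area = [4.05, 3.7, 4.4, 4.3, 3.6, 4.6]

u_stat, p = mann_whitney_u(control_area, treated_area)
verdict = "significant" if p < 0.05 else "not significant"
print(f"U = {u_stat}, p = {p:.3f} ({verdict} at the 5% level)")
```

For small samples an exact p-value is preferable to the normal approximation used here.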

In silico analysis indicated that TcTI is a Kunitz-type inhibitor. The study of cocoa trypsin inhibitors is of particular interest for the investigation of plant-pathogen interactions, since the trypsin inhibitor gene has been identified as differentially expressed during the Theobroma cacao-Moniliophthora perniciosa interaction, which causes witches' broom disease 29. In Brazil, M. perniciosa is responsible for drastic production and yield losses of cocoa beans in the south of the state of Bahia 50. The trypsin inhibitor sequence fragment detected in EST libraries of the T. cacao-M. perniciosa interaction 29 was used to identify the complete inhibitor sequence in the cacao EST database (http://ESTtik.cirad.fr/) 39. The analysis of conserved domains identified a region related to the family of Kunitz-type trypsin inhibitors (Fig. 1a). The complete TcTI sequence comprises 219 amino acids with an estimated molecular weight of 23.9 kDa, similar to most Kunitz-type inhibitors, which have a mass of 18-24 kDa 28. Although TcTI has six cysteine residues, four of them occupy regions conserved in the Kunitz family, forming two disulfide bridges. This shows that the TcTI structure is similar to that of other Kunitz-type inhibitors. In addition, inhibitors that have more than two disulfide bridges and more than four cysteine residues are grouped in the same category due to their similar protein structure 51. This suggests that the additional disulfide bridges do not promote any drastic modification of the three-dimensional structure, as demonstrated by the three-dimensional model and also seen in the PDB 3D model (3IIR). The latter presents the two regular disulfide bridges of Kunitz members and contains seven cysteine residues 43.
When analyzing the sequence of the Kunitz domain of TcTI, high similarity was found with inhibitors of species of the same genus (Theobroma) and of species of the genus Populus. The library described by Argout et al. 52 shows that Theobroma cacao shares some families of genes with Populus trichocarpa. Another similarity was noticed with the protein miraculin, which acts by limiting cell damage in conditions of biotic stress. This protein has been identified in witches' broom-resistant cocoa genotypes 47. Similarity was also noted with sporamin, classified as a Kunitz TI protein, present in sweet potatoes. It is a vacuolar storage protein whose gene levels are highly regulated in response to biotic and abiotic stresses 53,54. This high similarity was confirmed using the ExpasyBlast tool with the sequence of TcTI and other proteins related to the cocoa trypsin inhibitor, compared with TIs of other species. These analyses showed that although these genes are very similar, small variations exist, so they need to be grouped in different clusters (Fig. 2).

rTcTI accumulates in soluble E. coli extract and exhibits competitive-like inhibition. Eukaryotic proteins generally accumulate in insoluble extracts of E. coli for different reasons, such as incorrect folding, failure of post-translational modification or problems in DNA coding. Since the TcTI ORF showed 28.6% rare codons for E. coli (Supplementary Fig. 2), rTcTI was expressed in Rosetta [DE3], a specialized strain with optimized tRNA for rare codons. This strategy was successful, since rTcTI accumulated abundantly in the soluble E. coli extract and the purified protein was active against porcine trypsin, as shown by the inhibitory assay (Fig. 3a). Other protease inhibitors from cacao have also been successfully expressed in the E. coli Rosetta [DE3] strain 33,55,56. At a concentration of 0.25 mol L−1, rTcTI presented 60% inhibition of porcine trypsin activity. One TI from a scorpion presented an inhibition curve highly similar to that obtained for the plant inhibitor rTcTI, but the concentration of the inhibitor was tenfold greater than that analyzed for rTcTI 57.
The inhibition shown by rTcTI is of the competitive type, since the intersection of the curves occurred on the ordinate axis, indicating changes in the Km values and minimal changes in the Vmax values (Fig. 3b). This is a usual characteristic of inhibitors from the Kunitz family and indicates that the inhibitor might interact directly with the catalytic site of the enzyme, the same site that binds the substrate. rTcTI seems to be a powerful trypsin inhibitor with regard to its Ki value (4.08 × 10⁻⁷ mol L−1), indicating high affinity between rTcTI and porcine trypsin. Some Kunitz inhibitors with lower Ki values and greater inhibitory capacity are considered good candidates for strong interaction with enzymes, as is the case of the Kunitz TIs from soybean (Ki 3.2 × 10⁻⁹ mol L−1) 58, Trigonella foenum-graecum (Ki 3.01 × 10⁻⁹ mol L−1) 59, Entada acaciifolia (Ki 1.75 × 10⁻⁹ mol L−1) 60, and Pithecellobium dumosum (Ki 5.7 × 10⁻¹⁰ mol L−1) 5. These inhibitors can present even lower Ki values for digestive proteases of insect pests, and these interactions may be even more specific 61.
rTcTI is stable at temperatures up to 60 °C. The rTcTI protein presented moderate stability at high temperatures, as shown by the thermal effect study (Fig. 4). High inhibitory activity was maintained up to 70 °C, and when the 10 min treatment was raised to 80 °C, a minimum of 10% of its inhibitory capacity was still observed. A similar profile was found for the trypsin inhibitor of Vigna radiata, which maintained stable activity up to 90 °C. The activity decreased stepwise with increasing time since hatching 62. The Cassia grandis CgTI inhibitor also showed high thermostability at 60 °C, maintaining 100% of its inhibitory activity, followed by only a slight reduction in activity at 80 °C 63.
The thermostability of rTcTI may be related to structures that are partly random coil (Fig. 4). The disordered regions may promote greater flexibility of the protein, with a relatively low loss of activity up to 60 °C.
TcTI has a secondary structure rich in non-ordered regions. The TcTI protein has 5.4% α-helix and 33.3% β-strand based on the CD spectrum. This indicates that the largest part of the protein does not have a well-defined secondary structure (Fig. 5a). The presence of β-sheets is indicated by the negative peak in the range of 220 nm in the CD spectrum (Fig. 5a). Similar spectra are also found for other inhibitors of the Kunitz type 57,65. As already mentioned, the secondary structure of rTcTI is predominantly composed of random coil and some β-sheet regions. The scan at 96 °C presented a spectrum similar to the previous scan at 26 °C, showing that the native protein presented a profile similar to that of the denatured protein (Fig. 5a). However, the central structured part of the protein is substantially responsible for its activity and is not easily restored, as shown by the activity loss during the thermal treatment (Fig. 4). These characteristics are noteworthy for a Kunitz-type inhibitor, since these inhibitors do not have well-established α-helices. Similar structures have been described for other Kunitz inhibitors 59,60,66-68. Homology modeling indicated that TcTI is rich in random coil (35.23%) and contains β-sheet structures (37.82%), corroborating the CD analyses (Fig. 5). The inhibitory site of the

Transcriptional profile of the KPI gene family operates in the initial stage of Mp infection. The T. cacao genome contains a large gene family of Kunitz-type trypsin inhibitors, whose transcripts, corresponding to ~21 kDa proteins, were detected in the cDNA library of the interaction between T. cacao and M. perniciosa by Gesteira et al. 29. The differential expression of the 21 selected genes (Fig. 8) can change according to the pathogen, inoculation time, stage of infection and plant species 74. However, having a diversity of KPIs (Kunitz-type protease inhibitors) can also be advantageous for the plant in interacting with pathogens.
Considering that the Mp fungus was in the advanced infection phase [biotrophic phase] and the transcript corresponding to TcTI showed negative regulation, this pattern may have been caused by the early expression of trypsin inhibitors, possibly indicating that cell damage occurred at the onset of infection 75. Additionally, in contrast to the transcriptional profile of TcTI, which did not accumulate 60 days after infection/inoculation (Fig. 8) according to the data of Teixeira et al. 37, Santos et al. 47, in experiments with the cacao genotypes Catongo and TSH 1188, characterized the dynamics of proteins involved in the development of WBD and found up-regulated proteins related to trypsin inhibitors in the resistant cacao genotype at 60 DAI. Gesteira et al. 29 analyzed ESTs from the same two cacao genotypes at 90 DAI and identified more than 30 expressed genes corresponding to trypsin inhibitors in the Mp-resistant genotype (TSH 1188).
This discrepancy between the protein profile and the transcriptional profile can be explained by post-translational changes undergone by proteins that influence their accumulation 76. In addition, this difference in the transcript-protein expression pattern may also be associated with the variety of cacao used by Teixeira et al. 37, which differs from the genotypes used here and in the studies carried out by Gesteira et al. 29 and Santos et al. 47.
TI isoforms accumulate in the early stages of WBD in a resistant cocoa genotype. Gesteira et al. 29 identified 32 trypsin inhibitor ESTs only in meristems of a resistant variety of T. cacao, from a pool of different stages of infection by M. perniciosa. According to the immunodetection analysis of T. cacao meristems affected by WBD, the dynamics of inhibitor accumulation differed between the contrasting cocoa genotypes (Fig. 7). In the early stages of infection, between 1 and 5 days after exposure, there is greater accumulation of these inhibitors in the resistant variety TSH1188, as an initial response to infection by the fungus. In the most advanced stages of the disease, between 45 and 60 days, the accumulation of these inhibitors decreases significantly. Similar behavior was noticed in the study by Santos et al. 47, who traced the protein profile of two varieties of T. cacao infected with WBD, TSH 1188 and Catongo, the same ones used in the present study. They found that PIs were differentially expressed in both varieties, but with advancing disease, the resistant variety TSH 1188 presented down-regulated PIs. The same result was observed when the profile of the PIs was analyzed in 2D gels. In addition, different isoforms and PI intensities can be seen in the different Mp treatments of T. cacao (Fig. 7 and Supplementary Table 1).
In resistant cocoa plants, a large amount of hydrogen peroxide (H₂O₂) is produced at the beginning of Mp infection, which contributes to the control of infection and to plant resistance. This response is accompanied by increased expression of some genes, such as Glp (germin-like oxalate oxidase protein from cacao), which acts in the formation of H₂O₂ as a temporary defense response of the plant 77.
Alves et al. 78 also found differential expression of genes of the GPX (glutathione peroxidase) family in cocoa plants inoculated with Mp, at the initial stage of infection. Greater accumulation of PRs was also found from the onset of the disease to 45 days after infection, in the biotrophic phase of the disease 37,47,79. The phylloplanin gene (TcPHYLL) and other plant defense genes are also up-regulated, which increases the levels of transcripts in cocoa seedling tissues inoculated with the Mp fungus, indicating that this gene is related to a biotic stress response induced early in infection 80. Increased expression after inoculation in resistant plants, followed by decreased expression, is a pattern observed for defense proteins in T. cacao plants, such as the legumain TcLEG9 81. Additionally, Alves et al. 78 detected increased expression of GPX family genes in the green broom phase in susceptible cocoa plants.
Pirovani et al. 33 proposed a direct role of cacao cystatins in defense against Mp, and also described their action in the development of programmed cell death symptoms. In this context, Cardoso et al. 55 characterized a cysteine protease (TcCYSPR04), suggesting that within 72 h after the MpNEP-plant interaction, several isoforms of cysteine proteases participate in physiological events in the molecular battlefield of the interaction between T. cacao and Mp. However, after the initial phase of infection, with the onset of the biotrophic phase of the disease, a pattern between the pathogen and host is established that can last from 45 to 90 days, characterized as the transition to the saprophytic phase, causing new peaks of expression of defense-related genes. It is thus possible to infer from our data that the pattern of accumulation of PIs followed the gene expression pattern commonly observed for defense genes in resistant cocoa genotypes. In susceptible genotypes, by contrast, there is no increase in H₂O₂ production at the beginning of the infection; it occurs later, in the transition to the biotrophic phase of the disease 82.
Moreover, studies suggest that Kunitz-type trypsin inhibitor genes are in a constant evolutionary process, with possible modifications between different genotypes of the same plant species 74,83. This constant evolution reflects the importance of these proteins in the T. cacao × M. perniciosa pathosystem, since the detection of small structural differences can provide useful information about their specificity and mechanism of action. These isoforms therefore need to be characterized to strengthen the biotechnological application of PIs. Another group of inhibitors from the phytocystatin family also accumulated abundantly in the tissue of mature leaves, but only in infected leaves. No significant accumulation was seen by Pirovani et al. 33. This decrease in inhibitors in cocoa plants may occur due to the response mechanism of cocoa, which activates the programmed cell death signaling pathway, affecting the expression of some proteases, as described for cysteine protease, which increases in infected tissues 46,55. In meristems under normal conditions, the accumulation of the inhibitor gradually increases during the maturation stages, showing that these inhibitors are part of the natural protection of meristem tissues (Fig. 7b).
However, when we analyzed the PIs accumulated in TSH 1188 throughout the development of WBD, five putative miraculin-like proteins were found. The expression pattern of the transcripts corresponding to miraculin is shown in Fig. 8. They were regulated in T. cacao at 60 DAI. Miraculins are glycoproteins related to cell stress, including biotic stress, that limit cell damage, and they share amino acid sequence similarity with Kunitz-type inhibitors 84,85. Miraculin genes described as Kunitz-type protease inhibitors were identified in studies by Teixeira et al. 37, a finding that corroborates the idea that the defense responses of T. cacao are already induced in the biotrophic phase of the fungus Mp, which corresponds to the green broom stage of the disease.
rTcTI affects larval growth of H. armigera. rTcTI can inhibit the growth of pests such as H. armigera, as shown by the feeding assays carried out to evaluate its effects on the larvae. These larvae presented lower weight gain when fed soybean leaves treated with rTcTI (Supplementary Table 3).
In the most advanced stages, after 7 days of life, the larval weight differences were not statistically significant; that is, larval weight gain was less affected by the rTcTI inhibitor. This may be related to our feeding of the larvae with rTcTI only in the first 24 h, allowing recovery after exposure ended. Trypsin and chymotrypsin are the main enzymes in the insect digestive system, where they are regulated according to the specificity of the PIs in the diet 14,30,61,86.
Mortality rates were not significantly different, although rTcTI caused high mortality in the second larval instar (45%). Mehmood et al. 87 also used Kunitz trypsin inhibitors (AnTI) to inhibit the mycelial growth of Aspergillus niger and Fusarium oxysporum, and found statistically significant mortality rates. This was due to the use of a higher concentration of the inhibitor (50 µg) than that used in this study. Therefore, at a concentration of 1.5 µg, rTcTI did not affect the average lifespan of H. armigera larvae fed leaves containing the protease inhibitor, although it caused the death of larvae in the second instar. This may be associated with the concentration of the inhibitor and the incubation time, as well as with a response of the larvae to inhibitor exposure. According to Philippe et al. 74, insects can develop defense mechanisms and produce proteases that are less susceptible to the action of inhibitors. Other studies have revealed that tools for inhibiting larval growth do not drastically affect insect mortality. For this reason, the use of these substances can be considered an important strategy for pest control, since they do not interfere in the selection of resistant populations 7,88.
Spraying protease inhibitors on leaves seems to be an efficient feeding technique. The protease inhibitors are anti-digestive and reduce gut protease activity, thus eliciting compensatory consumption by insects 89 . However, there was no significant difference between the consumed areas of the leaves sprayed with the rTcTI inhibitor compared to the control leaves.

Conclusion
The present study confirms that TcTI is a Kunitz-type trypsin inhibitor with excellent potential for biotechnological application due to its biochemical characteristics, such as competitive-type inhibition, stability at elevated temperatures and high inhibitory capacity.
Its inhibitory effect on the Mp fungus indicates its functioning as a defense molecule in the pathosystem M. perniciosa × T. cacao, in addition to corroborating other studies that have reported its important role in inhibiting the mycelial growth of pathogenic fungi. Studies of its conformation found a secondary structure rich in non-ordered regions, which may be related to the flexibility of the protein in presenting thermostability. These