Introduction

Protein acetylation is an evolutionarily conserved post-translational modification in both eukaryotes and prokaryotes1. Acetylation of lysine residues is a dynamic and reversible process that was first discovered in histone proteins nearly fifty years ago2. Acetylation of histones and transcription factors in the nucleus has been extensively studied in the regulation of gene transcription3. The discovery of lysine acetylation in non-histone proteins showed that this modification is not restricted to the nucleus, which has greatly expanded our understanding of its functions4,5. Over the past ten years, lysine acetylation has been found in almost every cellular compartment, including the mitochondria and the cytoplasm6,7,8, and has been shown to play important roles in various cellular processes, including protein-protein interactions9, enzymatic activity10,11, metabolic pathways9,10,12,13,14, cell morphology7, calorie restriction15,16, and protein-nucleic acid interactions17,18. Lysine acetylation is therefore believed to be a major signaling modulator because of its wide occurrence and diverse cellular functions19.

Proteome-wide lysine acetylation profiles of many prokaryotes2,12,20,21,22,23,24,25,26,27 and eukaryotes6,7,8,15,28,29,30,31,32,33,34 have been investigated by mass spectrometry (MS)-based proteomics. These studies have provided solid evidence for the biological functions of lysine acetylation. Identifying the acetylome at the proteomic level greatly increases knowledge of lysine acetylated proteins and broadens the global view of their functional landscape28. However, few studies have addressed the lysine acetylome of plant fungal pathogens. To date, only two recent papers have reported proteome-wide analyses of lysine acetylation in plant pathogens, Botrytis cinerea and Fusarium graminearum28,35.

In this study, we performed a large-scale identification of lysine acetylated proteins in the rice blast fungus, Magnaporthe oryzae, which causes rice blast, one of the most devastating diseases of rice worldwide36,37,38. We identified 2,720 lysine acetylation sites in 1,269 proteins, which account for about 10.3% of the total proteins in this fungus. Several amino acid residues surrounding the lysine acetylation sites were conserved, giving motifs such as KacR, KacK, and KacH. The lysine acetylated proteins were predicted to be involved in diverse cellular functions and formed a protein-protein interaction network of 820 nodes and 7,709 edges. Dozens of lysine acetylated proteins were found to be important for vegetative hyphal growth and fungal pathogenicity. In summary, our results provide the first lysine acetylome of M. oryzae and suggest that protein lysine acetylation plays important roles in fungal development and pathogenicity.

Results

Proteome-wide features of lysine acetylation sites and proteins

To perform a large-scale lysine acetylome analysis of M. oryzae, we employed an integrated approach comprising protein extraction, trypsin digestion, HPLC fractionation, affinity enrichment, and high-resolution LC-MS/MS, followed by database search and bioinformatics analysis. Samples were obtained from vegetative hyphae of strain P13139 cultured with shaking in liquid complete medium. Two biological replicates were performed to confirm the reliability of our data. The MS data were searched against the M. oryzae database concatenated with a reverse decoy database and then subjected to quality control (QC) validation. First, we checked the mass error of all identified peptides. The mass errors were distributed around zero and most were less than 0.02 Da, indicating that the mass accuracy of the MS data met the requirement (Fig. 1a). Second, most peptides were between 8 and 20 amino acids long, which agrees with the properties of tryptic peptides (Fig. 1b). The sample preparation therefore met the required standard.

Figure 1

QC validation of MS data and summary of acetylated proteins. (a) Mass error distribution of all identified peptides. (b) Peptide length distribution. (c) Venn diagram showing the number of acetylation sites in the two biological replicates. (d) Distribution of acetylated proteins by number of acetylation sites.

Acetylation sites detected in both replicates were used in our analysis (Fig. 1c). A total of 2,720 lysine acetylation sites were identified, distributed across 1,269 proteins (Fig. 1d and Table S1), which account for about 10.3% of the total predicted proteins in M. oryzae39,40. The number of lysine acetylation sites per protein ranged from one to ten (Fig. 1d). There were 678 proteins with only one acetylation site, accounting for 53.4% of the acetylated proteins. Meanwhile, 257 (20.3%), 143 (11.2%), 72 (5.7%), and 34 (2.7%) proteins contained two, three, four, and five acetylation sites, respectively (Fig. 1d). The remaining 6.7% of modified proteins contained more than five acetylation sites.
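As a quick arithmetic check on the distribution above, the shares can be recomputed from the reported protein counts. The following Python sketch is illustrative only: the variable and function names are hypothetical, and the class with more than five sites is derived by subtraction from the 1,269 total rather than taken from Table S1.

```python
# Counts reported in the text: proteins carrying exactly 1..5 acetylation sites.
TOTAL_ACETYLATED_PROTEINS = 1269
proteins_by_site_count = {1: 678, 2: 257, 3: 143, 4: 72, 5: 34}


def site_count_shares(counts, total):
    """Return the percentage of proteins per site count, plus the remainder."""
    shares = {k: 100.0 * n / total for k, n in counts.items()}
    # Proteins not covered by the listed classes (assumed to be ">5 sites").
    remainder = total - sum(counts.values())
    shares[">5"] = 100.0 * remainder / total
    return shares


shares = site_count_shares(proteins_by_site_count, TOTAL_ACETYLATED_PROTEINS)
for label, pct in shares.items():
    print(f"{label} site(s): {pct:.1f}%")
```

Rounded to one decimal place, the recomputed shares agree with the reported percentages to within 0.1 percentage point.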

Lysine acetylation motif investigation

To determine the pattern of lysine acetylation sites, the software motif-x was used to analyze the amino acid sequences from the −10 to +10 positions around the identified acetylation sites. As shown in Fig. 2a, a total of 11 conserved motifs were identified in the acetylated proteins, i.e. KacR, KacK, KacH, KacXK, KacXR, KacN, KacS, KacT, KacF, KacXXR, and KacV (Kac represents the acetylated lysine and X represents any amino acid residue). Among them, the motifs KacR, KacK, and KacH were highly conserved and ranked as the top three, together accounting for 39.4% of all identified acetylated peptides (Fig. 2b). Most of these motifs are conserved in other species2,7,14,15,23,24,25,26,27,30,32,34. Within these motifs, several amino acid residues were conserved; for instance, arginine (R), lysine (K), histidine (H), and asparagine (N) were located downstream of the acetylated lysines. A heat map of the amino acid composition surrounding the acetylation sites was also generated (Fig. 2c). Enrichment of R, H, and K was observed at the +1 position and of cysteine (C) at the −1 position.
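The positional pattern described above can be illustrated by tallying the residue found at a fixed offset from the acetylated lysine. The Python sketch below only counts residues; it does not reproduce the statistical enrichment test that motif-x performs against a background. The function name, the '_' padding character, and the toy windows are assumptions made for illustration.

```python
from collections import Counter


def residue_counts_at_offset(windows, offset, center=10):
    """Count residues found `offset` positions from the central lysine.

    `windows` are 21-residue strings with the acetylated K at index `center`;
    '_' is a padding character (an assumption) for sites near protein termini.
    """
    counts = Counter()
    for w in windows:
        if len(w) != 2 * center + 1 or w[center] != "K":
            raise ValueError(f"not a K-centred window: {w!r}")
        residue = w[center + offset]
        if residue != "_":  # skip padding positions
            counts[residue] += 1
    return counts


# Toy windows, invented for illustration only (not data from this study).
toy_windows = [
    "AAAAAAAAAAKRAAAAAAAAA",
    "AAAAAAAAACKHAAAAAAAAA",
    "AAAAAAAAAAKRAAAAAAAAA",
    "__________KKAAAAAAAAA",
]
plus_one = residue_counts_at_offset(toy_windows, +1)
minus_one = residue_counts_at_offset(toy_windows, -1)
```

Comparing such counts with the frequencies expected from a background proteome is what turns a tally into an enrichment estimate.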

Figure 2

Acetylation motifs and conservation of acetylation sites. (a) Acetylation motifs and conservation of acetylation sites. (b) Number of identified modification sites in each acetylated protein. (c) Heat map of the amino acid compositions of the acetylation sites.

Annotation of acetylated proteins

To investigate the functions of the lysine acetylated proteins, we first predicted their subcellular localization with WoLFPSORT. As shown in Fig. 3a, most of the lysine acetylated proteins were located in the cytoplasm (31.8%), the mitochondria (30.0%), and the nucleus (23.1%). Of the remaining lysine acetylated proteins, 63 (5.0%) and 54 (4.3%) were located in the extracellular space and the plasma membrane, respectively. These results suggest that lysine acetylated proteins have diverse subcellular localizations, particularly in intracellular compartments.

Figure 3

Characteristics of identified acetylated proteins. (a) Pie chart showing the subcellular localization of acetylated proteins. (b) Gene Ontology functional classification of acetylated proteins based on molecular function, cellular component, and biological process. (c) Protein domain enrichment analysis of acetylated proteins. (d) KEGG pathway-based enrichment analysis of acetylated proteins.

We then classified the lysine acetylated proteins based on their predicted functions. All proteins were first annotated with GO terms. As shown in Fig. 3b, the most significantly enriched biological process was translation (116 proteins), and the main molecular function was structural molecule activity (89 proteins). We also investigated protein domains and found that domains related to nucleophile aminohydrolases and the translation protein SH3-like domain were significantly enriched among the lysine acetylated proteins (Fig. 3c). KEGG pathway analysis showed that 20 pathways were enriched for acetylated proteins (Fig. 3d). The most abundant was the ribosome pathway, which contained 83 acetylated proteins. Other enriched pathways included biosynthesis and carbon metabolism, among others.

Protein-protein interaction network feature

To further understand the biological processes regulated by acetylation, we performed protein-protein interaction (PPI) network analyses of the lysine acetylated proteins. A PPI network including all protein interactions in different developmental stages was established with the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database (Fig. 4a; Table S2). The lysine acetylated proteins formed a highly organized network of interacting proteins. Network analyses were performed at high STRING confidence. Overall, the network had significantly more interactions than expected; it contained 820 nodes and 7,709 edges, with an average node degree of 18.8.
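The average node degree follows directly from the node and edge counts, since each undirected edge contributes to the degree of two nodes. A minimal Python sketch (the function name is hypothetical) reproduces the value:

```python
def average_node_degree(n_nodes: int, n_edges: int) -> float:
    """Mean degree of an undirected graph: every edge touches two nodes."""
    if n_nodes <= 0:
        raise ValueError("network must contain at least one node")
    return 2.0 * n_edges / n_nodes


# Node and edge counts reported for the acetylated-protein PPI network.
avg_degree = average_node_degree(820, 7709)
print(f"{avg_degree:.1f}")  # prints 18.8
```

With 820 nodes and 7,709 edges this gives 2 × 7,709 / 820 ≈ 18.8, matching the reported value.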

Figure 4

Protein-protein interaction networks of identified acetylated proteins. Red and blue nodes indicate acetylated and non-acetylated proteins, respectively. (a) Overview of the interaction network of acetylated proteins. (b, c) Interaction networks of acetylated proteins associated with the ribosome (b) and the proteasome (c).

The established PPI network was then analyzed with MCODE41, and 38 highly interconnected clusters were identified within the network (Table S2). Several dominant clusters contained many functionally related proteins. The top-ranked cluster was the ribosome: of the 77 ribosomal proteins, 73 were lysine acetylated (Fig. 4b; Table S2). The second was the proteasome: of the 31 proteasome proteins, 30 were lysine acetylated (Fig. 4c; Table S2). Beyond these top clusters, the largest group of clusters was related to metabolic processes, e.g. cluster_4, cluster_5, cluster_6, cluster_10, and cluster_11, among others. Several clusters were related to protein processing: cluster_7 to protein folding, cluster_8 and cluster_26 to translation initiation, and cluster_30 and cluster_37 to protein transport. Some clusters contained proteins with other functions; for example, cluster_12 was involved in the MAPK signaling pathway, and cluster_13 was composed of serine/threonine-protein phosphatases.

Dozens of lysine acetylated proteins are involved in vegetative hyphal growth and pathogenicity

To investigate the roles of the 1,269 identified lysine acetylated proteins in fungal growth and pathogenicity, we compared them with gene repositories built from literature annotations reported over the last two decades. Thirty-five previously reported proteins were found to be important for vegetative hyphal growth (Table 1). Importantly, 27 of them were essential for pathogenicity or important for virulence. Pfam and KEGG analysis showed that some lysine acetylated proteins were involved in signal transduction pathways. For example, Chm1 encodes a PKA protein kinase and MoCmk1 encodes a protein kinase, both of which are important for fungal growth and pathogenicity42. Some lysine acetylated proteins were involved in amino acid metabolism, for example 5-methyltetrahydropteroyltriglutamate-homocysteine S-methyltransferase MoMet6 and L-aminoadipate-semialdehyde dehydrogenase MoLys2, which are also important for fungal growth and plant infection43,44. To test whether other acetylated proteins identified in this study were involved in vegetative hyphal growth, we searched our proteins against a previously reported ATMT mutant library45. We found 16 disrupted genes that are potentially important for vegetative hyphal growth, some of which also contribute to fungal pathogenicity (Table S3). Taken together, these findings suggest that dozens of lysine acetylated proteins are involved in vegetative hyphal growth, and several of them in fungal pathogenicity.

Table 1 List of acetylated proteins involved in vegetative hyphal growth and fungal pathogenicity.

Discussion

Researchers have remarked that lysine acetylation can provide a new target for the development of effective drugs or vaccines based on an understanding of its regulatory mechanism46. However, studies of its function in plant pathogenic fungi are almost entirely absent. As a pioneering study of the lysine acetylome in the rice blast fungus, this work helps to clarify the importance of acetylation in plant pathogenic fungi and provides useful, accessible data for further study. This lysine acetylome contains 2,720 acetylation sites in 1,269 proteins, which account for about 10.3% of the total predicted proteins in this fungus. In view of the thousands of acetylated proteins discovered by in silico analysis techniques, we speculate that many lysine sites are dynamically acetylated during different developmental stages and under diverse environmental stimuli. We will therefore continue to gather acetylation data to obtain a global view of the lysine acetylome in M. oryzae and disseminate them to researchers. Moreover, the ability to detect acetylated proteins under different conditions and at different developmental stages gives researchers unprecedented opportunities to understand infection-related morphogenesis and asexual development. Together, these findings broaden the known roles of reversible acetylation in M. oryzae and open up new possibilities for investigation in the field.

By comparing the characteristics of lysine acetylated proteins among different species, such as motifs, subcellular localizations, and annotated functions, we found that most are highly conserved. Moreover, all of the annotated protein-protein interaction networks among the lysine acetylated proteins have been reported in other species9,10,11,12,13,14,15,16,17,18,28,35. These analyses suggest that protein lysine acetylation has been conserved during evolution. However, 173 proteins with lysine acetylation sites were annotated as function unknown (Table S1). Dissecting the roles of these new proteins might enrich the overview of the lysine acetylome of M. oryzae.

We found that dozens of previously reported proteins important for vegetative hyphal growth were lysine acetylated. These proteins are involved in diverse functions, such as signal transduction, amino acid metabolism, energy transfer, the cytoskeleton, and transcriptional regulation. Importantly, two protein kinases, MoCmk1 and Chm1, have been reported to play important roles in fungal pathogenicity47,48,49. Unfortunately, the roles of the acetylated lysine sites have not been functionally investigated. Recently, dynamic crosstalk between receptor tyrosine kinases and lysine acetylation was revealed by quantitative profiling of lysine acetylation in cultured carcinoma cell lines50. Moreover, crosstalk between phosphorylation and lysine acetylation in a kinase motif associated with myocardial ischemia and cardioprotection was dissected by structure-based analysis51. The acetylated lysine sites identified in our study will therefore provide valuable information for investigating the interactions between lysine acetylation and other post-translational modifications. Furthermore, more than 10 novel proteins with predicted protein kinase domains were found to contain lysine acetylation sites in this study. It will be interesting to characterize their functions in development and plant infection, as well as the roles of their lysine acetylation sites. Recently, a circadian-regulated protein, Twilight/Twl, which plays key roles in conidiation and pathogenesis in M. oryzae, was found to contain one lysine acetylation site52. Deacetylation of Twilight/Twl, driven by light-induced phosphorylation, leads to its translocation from the cytoplasm into the nucleus. Because acetylated Twilight/Twl was detected only under dark conditions, it seems reasonable that the lysine acetylation site in Twilight/Twl could not be identified in our study. This study also strongly suggested important roles of lysine acetylation during development and pathogenicity.

Taken together, our study provides a comprehensive view of lysine acetylation sites during vegetative hyphal growth in M. oryzae. It will help to clarify the roles of lysine acetylated proteins at the level of post-translational modification.

Materials and Methods

Protein extraction

The sample was first ground in liquid nitrogen, and the cell powder was then transferred to a 5 ml centrifuge tube and sonicated three times on ice using a high-intensity ultrasonic processor (Scientz) in lysis buffer (8 M urea, 1% Triton X-100, 65 mM DTT, and 0.1% Protease Inhibitor Cocktail). The remaining debris was removed by centrifugation at 20,000 g at 4 °C for 10 min. The protein was then precipitated with cold 15% trifluoroacetic acid (TFA) for 2 h at −20 °C. After centrifugation at 4 °C for 10 min, the supernatant was discarded. The remaining precipitate was washed three times with cold acetone. The protein was redissolved in buffer (8 M urea, 100 mM NH4HCO3, pH 8.0), and the protein concentration was determined with a 2-D Quant kit (GE Healthcare) according to the manufacturer’s instructions.

Trypsin digestion

For digestion, the protein solution was reduced with 10 mM DTT for 1 h at 37 °C and alkylated with 20 mM iodoacetamide (IAA) for 45 min at room temperature in darkness. The protein sample was then diluted with 100 mM NH4HCO3 to a urea concentration below 2 M. Finally, trypsin (Promega) was added at a 1:50 trypsin-to-protein mass ratio for the first digestion overnight and at a 1:100 trypsin-to-protein mass ratio for a second 4 h digestion.

HPLC Fractionation

The sample was then fractionated by high-pH reverse-phase HPLC using an Agilent 300Extend C18 column (5 μm particles, 4.6 mm ID, 250 mm length). Briefly, peptides were first separated with a gradient of 2% to 60% acetonitrile in 10 mM ammonium bicarbonate (pH 10) over 80 min into 80 fractions. The fractions were then combined into 8 fractions and dried by vacuum centrifugation.

Affinity Enrichment

To enrich acetylated lysine (Kac) peptides, tryptic peptides dissolved in NETN buffer (100 mM NaCl, 1 mM EDTA, 50 mM Tris-HCl, 0.5% NP-40, pH 8.0) were incubated with pre-washed antibody beads (PTM Biolabs) at 4 °C overnight with gentle shaking. The beads were washed four times with NETN buffer and twice with ddH2O. The bound peptides were eluted from the beads with 0.1% TFA. The eluted fractions were combined and vacuum-dried. The resulting peptides were cleaned with C18 ZipTips (Millipore) according to the manufacturer’s instructions, followed by LC-MS/MS analysis.

LC-MS/MS Analysis

Three parallel analyses were performed for each fraction. Peptides were dissolved in 0.1% formic acid (FA) and loaded directly onto a reversed-phase pre-column (Acclaim PepMap 100, Thermo Scientific). Peptide separation was performed on a reversed-phase analytical column (Acclaim PepMap RSLC, Thermo Scientific). The gradient consisted of an increase from 6% to 23% solvent B (0.1% FA in 98% acetonitrile) over 24 min, 23% to 35% over 8 min, a climb to 80% in 4 min, and a hold at 80% for the last 4 min, all at a constant flow rate of 280 nl/min on an EASY-nLC 1000 UPLC system. The resulting peptides were analyzed with a Q Exactive™ Plus hybrid quadrupole-Orbitrap mass spectrometer (Thermo Fisher Scientific).
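The gradient program can be written down as a small table of segments, which makes the total run time and the solvent B percentage at any time point easy to check. The Python sketch below assumes linear ramps within each segment, which the text does not state explicitly; the names GRADIENT, total_runtime, and percent_b_at are hypothetical.

```python
# (duration in min, %B at start, %B at end), as described in the text.
GRADIENT = [
    (24.0, 6.0, 23.0),
    (8.0, 23.0, 35.0),
    (4.0, 35.0, 80.0),
    (4.0, 80.0, 80.0),  # hold at 80% B
]


def total_runtime(segments):
    """Total gradient length in minutes."""
    return sum(duration for duration, _, _ in segments)


def percent_b_at(t, segments=GRADIENT):
    """Solvent B percentage at time t (min), assuming linear ramps."""
    if t < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for duration, start, end in segments:
        if t <= elapsed + duration:
            fraction = (t - elapsed) / duration
            return start + fraction * (end - start)
        elapsed += duration
    raise ValueError("time is past the end of the gradient")
```

Under these assumptions the four segments add up to a 40 min run per injection.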

Database Search

The resulting MS/MS data were processed using MaxQuant with the integrated Andromeda search engine (v.1.4.1.2). Tandem mass spectra were searched against the Magnaporthe oryzae database (14,835 sequences) concatenated with a reverse decoy database. Trypsin/P was specified as the cleavage enzyme, allowing up to 3 missed cleavages, 4 modifications per peptide, and 5 charges. Mass error was set to 10 ppm for precursor ions and 0.02 Da for fragment ions. Carbamidomethylation on Cys was specified as a fixed modification, and oxidation on Met, acetylation on Lys, and acetylation on the protein N-terminus were specified as variable modifications. False discovery rate (FDR) thresholds for protein, peptide, and modification site were set at 1%. The minimum peptide length was set at 7. All other parameters in MaxQuant were set to default values. The site localization probability threshold was set at >0.75.
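To illustrate the target-decoy principle behind an FDR threshold, the Python sketch below estimates FDR as the ratio of decoy to target hits at or above a score cutoff. It is a simplified illustration and not MaxQuant's actual procedure; the function name, the (score, is_decoy) input layout, and the toy scores are all assumptions.

```python
def score_threshold_at_fdr(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR stays within max_fdr.

    `psms` is an iterable of (score, is_decoy) pairs; higher scores are better.
    FDR is estimated as (#decoys / #targets) among hits at or above the cutoff.
    Returns None if no cutoff satisfies the limit.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    best = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest point in the ranking that is still acceptable.
        if targets and decoys / targets <= max_fdr:
            best = score
    return best


# Toy matches, invented for illustration only (not data from this study).
toy_psms = [(10, False), (9, False), (8, False), (7, True), (6, False), (5, True)]
strict_cutoff = score_threshold_at_fdr(toy_psms, max_fdr=0.01)
loose_cutoff = score_threshold_at_fdr(toy_psms, max_fdr=0.25)
```

In the toy list, the strict limit stops above the first decoy, whereas the looser limit admits one decoy among four targets.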

Annotation and functional enrichment analysis

Gene Ontology (GO) annotation was derived from the UniProt-GOA database (http://www.ebi.ac.uk/GOA/). Proteins were classified by GO annotation into three categories: biological process, cellular component, and molecular function. Domain functional descriptions of the identified proteins were annotated with the InterPro domain database (http://www.ebi.ac.uk/interpro/). The KEGG database (http://www.genome.jp/kegg/) was used to identify enriched pathways, which were classified into hierarchical categories according to the KEGG website. A two-tailed Fisher’s exact test was employed to test the enrichment of the identified acetylated proteins against all database proteins. Correction for multiple hypothesis testing was carried out using standard FDR methods. GO terms, domains, and pathways with a corrected p-value < 0.05 were considered significant. WoLFPSORT (http://www.genscript.com/wolf-psort.html) was used to predict subcellular localization.
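The enrichment statistics can be sketched with the Python standard library alone. The block below is a minimal illustration, not the pipeline used in this study: the 2x2 table layout is an assumption, and because the text specifies only standard FDR methods, the Benjamini-Hochberg procedure is assumed for the correction. Function names and toy values are hypothetical.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout: a = acetylated proteins in the category, b = acetylated
    proteins outside it, c = other proteins in the category, d = other
    proteins outside it.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy inputs, invented for illustration only (not data from this study).
p_toy = fisher_exact_two_tailed(3, 1, 1, 3)
adjusted_toy = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

A term would then be called significant when its adjusted p-value falls below 0.05, as stated above.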

Motif analysis

Motif analysis was performed with the software motif-x53, which analyzed a model of sequences consisting of the amino acids at specific positions of modified 21-mers (10 amino acids upstream and downstream of the site) in all protein sequences. All protein sequences in the database were used as the background parameter, and the other parameters were left at their defaults.
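Building the 21-mer input for such an analysis amounts to cutting a window of 10 residues on either side of each modified lysine. The Python sketch below shows one way to do this; the function name, the 1-based site convention, and the '_' padding for sites near a terminus are assumptions, since the text does not say how terminal sites were handled.

```python
def extract_window(sequence, site, flank=10, pad="_"):
    """Return the (2*flank + 1)-mer centred on a 1-based lysine position.

    Positions beyond either terminus are filled with `pad` (an assumption;
    the text does not say how terminal sites were handled).
    """
    index = site - 1
    if not 0 <= index < len(sequence):
        raise ValueError("site lies outside the sequence")
    if sequence[index] != "K":
        raise ValueError(f"residue {site} is {sequence[index]!r}, not K")
    left = sequence[max(0, index - flank):index]
    right = sequence[index + 1:index + 1 + flank]
    return left.rjust(flank, pad) + "K" + right.ljust(flank, pad)


# Toy sequence, invented for illustration only (not a protein from this study).
toy_protein = "MSTKRAGLVNDEQWKHFPYIC"
window_near_n_term = extract_window(toy_protein, 4)
window_near_c_term = extract_window(toy_protein, 15)
```

Each window keeps the modified lysine at the central position, so the +1 residue of a KacR or KacH motif always sits at index 11.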