Large-scale identification of lysine acetylated proteins in vegetative hyphae of the rice blast fungus

Lysine acetylation is a major post-translational modification that plays important regulatory roles in diverse biological processes in both eukaryotes and prokaryotes. However, the roles of lysine acetylation in plant fungal pathogens have been less studied. Here, we provide the first lysine acetylome of vegetative hyphae of the rice blast fungus Magnaporthe oryzae through a combination of highly sensitive immunoaffinity purification and high-resolution LC-MS/MS. This lysine acetylome had 2,720 acetylation sites in 1,269 proteins. The lysine acetylated proteins were involved in diverse cellular functions and formed a protein-protein interaction network of 820 nodes and 7,709 edges. Several sequence motifs around the lysine acetylation sites were conserved, including KacR, KacK, and KacH. Importantly, dozens of lysine acetylated proteins were found to be important to vegetative hyphal growth and fungal pathogenicity. Taken together, our results provide the first comprehensive view of the lysine acetylome of M. oryzae and suggest that protein lysine acetylation plays important roles in fungal development and pathogenicity.

Protein acetylation is an evolutionarily conserved post-translational modification in both eukaryotes and prokaryotes 1. Acetylation of lysine residues in proteins is a dynamic and reversible process that was first discovered in histone proteins nearly fifty years ago 2. Acetylation of histones and other transcription factors in the nucleus has been extensively studied in the regulation of gene transcription 3. Since the discovery of lysine acetylation in non-histone proteins, it has become clear that this modification is not restricted to nuclei, which has greatly expanded our understanding of its functions 4,5. In the past ten years, lysine acetylation has been found to occur in almost every compartment of the cell, such as the mitochondria and the cytoplasm 6-8, and to play important roles in various cellular processes including protein-protein interactions 9, enzymatic activity 10,11, metabolic pathways 9,10,12-14, cell morphology 7, calorie restriction 15,16, and protein-nucleic acid interactions 17,18. Therefore, lysine acetylation is believed to be a major signaling modulator because of its wide occurrence and diverse cellular functions 19.
In this study, we performed large-scale identification of lysine acetylated proteins in the rice blast fungus, Magnaporthe oryzae, which causes rice blast, one of the most devastating diseases of rice throughout the world 36-38. We identified 2,720 lysine acetylation sites in 1,269 proteins, which account for about 10.3% of the total proteins in this fungus. Several sequence motifs surrounding the lysine acetylation sites were conserved, including KacR, KacK, and KacH. The lysine acetylated proteins were predicted to be involved in diverse cellular functions and formed a protein-protein interaction network of 820 nodes and 7,709 edges. Dozens of lysine

Enrichment of the amino acid residues R, H, and K was observed at the +1 position, and enrichment of cysteine (C) was observed at the −1 position.

Annotation of acetylated proteins.
To investigate the functions of the lysine acetylated proteins, we first annotated their subcellular localization with WoLF PSORT. As shown in Fig. 3a, most of the lysine acetylated proteins were located in the cytoplasm (31.8%), the mitochondria (30.0%), and the nucleus (23.1%). Of the remaining lysine acetylated proteins, 63 (5.0%) and 54 (4.3%) were located in the extracellular space and the plasma membrane, respectively. These results suggest that lysine acetylated proteins are distributed across diverse subcellular locations, especially intracellular compartments.
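The percentages quoted for the two smaller localization classes can be checked directly against the total of 1,269 acetylated proteins. The short sketch below only redoes that arithmetic; the variable and function names are illustrative and are not part of the original analysis.

```python
# Counts and total taken from the text; names are illustrative only.
TOTAL_ACETYLATED = 1269
location_counts = {"extracellular space": 63, "plasma membrane": 54}


def percent_of_total(count, total=TOTAL_ACETYLATED):
    """Return a count as a percentage of the total, rounded to one decimal."""
    return round(100 * count / total, 1)


for location, count in location_counts.items():
    print(f"{location}: {count} proteins = {percent_of_total(count)}%")
```

Both values agree with the 5.0% and 4.3% reported in the text.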
We then classified the lysine acetylated proteins based on their predicted functions. All proteins were first annotated with GO terms. As shown in Fig. 3b, the most significantly enriched biological process was translation (116 proteins). The main molecular function was structural molecule activity (89 proteins). We also investigated protein domains and found that functional domains related to nucleophile aminohydrolases and the translation protein SH3-like domain were significantly enriched in the lysine acetylated proteins (Fig. 3c). KEGG pathway analyses showed that 20 pathways were enriched for acetylated proteins (Fig. 3d). The most abundant one was the ribosome pathway, which contained 83 acetylated proteins. Other enriched pathways included biosynthesis and carbon metabolism.
Protein-protein interaction network features. To further understand the biological processes regulated by acetylation, protein-protein interaction (PPI) network analyses of the lysine acetylated proteins were performed. A PPI network including all protein interactions in different developmental stages was established with the search tool for the retrieval of interacting genes and/or proteins (STRING) database (Fig. 4a; Table S2). We found that the lysine acetylated proteins formed a highly organized network of interacting proteins. We performed network analyses on the established network at high STRING confidence. Overall, the interaction network of the lysine acetylated proteins had significantly more interactions than expected, and this PPI network contained 820 nodes and 7,709 edges, with an average node degree of 18.8.
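The reported average node degree follows from the node and edge counts if the network is treated as an undirected graph, where each edge contributes to the degree of two nodes. The sketch below reproduces that calculation; it assumes an undirected network with no self-loops, which the text does not state explicitly.

```python
# Node and edge counts as reported for the PPI network.
N_NODES = 820
N_EDGES = 7709


def average_degree(n_nodes, n_edges):
    """Mean degree of an undirected graph: every edge touches two nodes."""
    return 2 * n_edges / n_nodes


avg = average_degree(N_NODES, N_EDGES)
print(f"average node degree = {avg:.1f}")
```

The result rounds to 18.8, matching the value given above.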
The established PPI network was then analyzed with MCODE 41, which identified 38 highly interconnected clusters within the network (Table S2). Several dominant clusters comprised many functionally related proteins. The top-ranked cluster was the ribosome. Among the 77 ribosome proteins, 73 were lysine acetylated (Fig. 4b; Table S2). The second-ranked cluster was the proteasome. Among the 31 proteasome proteins, 30 were lysine acetylated (Fig. 4c; Table S2). In addition to these top clusters, the largest group of clusters was related to metabolic processes, e.g. cluster_4, cluster_5, cluster_6, cluster_10, and cluster_11. Moreover, several of the identified clusters were related to protein processing. For instance, cluster_7 was related to protein folding, cluster_8 and cluster_26 were related to translation initiation, and cluster_30 and cluster_37 were related to protein transport. Some clusters contained proteins with other functions. For example, cluster_12 was involved in the MAPK signaling pathway, and cluster_13 was composed of serine/threonine-protein phosphatases.

Dozens of lysine acetylated proteins are involved in vegetative hyphal growth and pathogenicity.
To investigate the roles of the 1,269 identified lysine acetylated proteins in fungal growth and pathogenicity, we searched these proteins against gene repositories built from literature annotations reported over the last two decades. Thirty-five reported proteins were found to be important to vegetative hyphal growth (Table 1). Importantly, 27 of them were essential to pathogenicity or important for virulence. Pfam and KEGG analyses showed that some lysine acetylated proteins were involved in signal transduction pathways. For example, Chm1 encodes a PAK protein kinase and MoCmk1 encodes a protein kinase, both of which are important to fungal growth and pathogenicity 42. Some lysine acetylated proteins were involved in amino acid metabolism, for example, the 5-methyltetrahydropteroyltriglutamate-homocysteine S-methyltransferase MoMet6 and the L-aminoadipate-semialdehyde dehydrogenase MoLys2, which are also important to fungal growth and plant infection 43,44. To test whether other acetylated proteins identified in this study were involved in vegetative hyphal growth, we searched our proteins against a previously reported ATMT mutant library 45. We found that 16 disrupted genes were potentially important for vegetative hyphal growth, some of which also contributed to fungal pathogenicity (Table S3). Taken together, these findings suggest that dozens of lysine acetylated proteins are involved in vegetative hyphal growth, and that several of them are involved in fungal pathogenicity.

Discussion
Researchers have remarked that lysine acetylation can provide a new target for the development of effective drugs or vaccines based on an understanding of its regulatory mechanism 46. However, studies of its function in plant pathogenic fungi have been almost entirely absent. As a pioneering study of the lysine acetylome in the rice blast fungus, this work helps researchers to understand the importance of acetylation in plant pathogenic fungi and provides useful and accessible data for further study. This lysine acetylome contains 2,720 acetylation sites in 1,269 proteins, which account for about 10.3% of the total predicted proteins in this fungus. Compared with the thousands of acetylated proteins discovered by in silico analysis, we speculate that many lysine sites are dynamically acetylated at different developmental stages and under diverse environmental stimuli. Therefore, we will continue to gather acetylation data to obtain a global view of the lysine acetylome in M. oryzae and disseminate them to researchers. Moreover, the ability to detect acetylated proteins under different conditions and at different developmental stages gives researchers unprecedented access to infection-related morphogenesis or asexual development. Together, these findings widen the known roles of reversible acetylation in M. oryzae and open up new possibilities for investigation in the field.
By comparing characteristics of the lysine acetylated proteins, such as motifs, subcellular localizations, and annotated functions, among different species, we found that most of them are highly conserved. Moreover, all of the annotated protein-protein interaction networks among the lysine acetylated proteins have been reported in other species 9-18,28,35. These analyses therefore suggest that protein lysine acetylation has been conserved during evolution. However, 173 proteins with lysine acetylated sites were annotated as function unknown (Table S1). Dissecting the roles of these new proteins might enrich the overview of the lysine acetylome of M. oryzae.
We found that dozens of previously reported proteins important to vegetative hyphal growth were lysine acetylated. These proteins were involved in diverse functions, such as signal transduction, amino acid metabolism, energy transfer, the cytoskeleton, and transcription regulation. Importantly, two protein kinases, MoCmk1 and Chm1, were reported to play important roles in fungal pathogenicity 47-49. Unfortunately, the roles of the acetylated lysine sites have not been functionally investigated. Recently, dynamic crosstalk between receptor tyrosine kinases and lysine acetylation was revealed by quantitative profiling of lysine acetylation in cultured carcinoma cell lines 50. Moreover, crosstalk between phosphorylation and lysine acetylation in a kinase motif associated with myocardial ischemia and cardioprotection was dissected by structure-based analysis 51. Therefore, the acetylated lysine sites identified in our study will provide valuable information for investigating the interactions between lysine acetylation and other post-translational modifications. Furthermore, over 10 novel proteins with predicted protein kinase domains were found to contain lysine acetylated sites in this study. It will be interesting to characterize their functions in development and plant infection and the roles of their lysine acetylated sites. Recently, a circadian-regulated protein, Twilight/Twl, which plays key roles in conidiation and pathogenesis in M. oryzae, was found to contain one lysine acetylated site 52. The de-acetylated form of Twilight/Twl, driven by light-induced phosphorylation, translocates from the cytoplasm into the nucleus. Because acetylated Twilight/Twl was detected only in the dark, it seems reasonable that the lysine acetylated site in Twilight/Twl could not be identified in our study. This study also strongly suggested important roles of lysine acetylation during development and pathogenicity.
Taken together, our study provides a comprehensive view of lysine acetylated sites during vegetative hyphal growth in M. oryzae. It will help to clarify the roles of proteins with acetylated lysines at the post-translational modification level.

Materials and Methods
Protein extraction. The sample was first ground in liquid nitrogen, then the cell powder was transferred to a 5 ml centrifuge tube and sonicated three times on ice using a high intensity ultrasonic processor (Scientz) in lysis buffer (8 M urea, 1% Triton X-100, 65 mM DTT, and 0.1% Protease Inhibitor Cocktail). The remaining debris was removed by centrifugation at 20,000 g at 4 °C for 10 min. Finally, the protein was precipitated with cold 15% trifluoroacetic acid (TFA) for 2 h at −20 °C. After centrifugation at 4 °C for 10 min, the supernatant was discarded. The remaining precipitate was washed three times with cold acetone. The protein was re-dissolved in buffer

Annotation and functional enrichment analysis. Gene Ontology (GO) annotation was derived from the UniProt-GOA database (http://www.ebi.ac.uk/GOA/). Proteins were classified by GO annotation into three categories: biological process, cellular component, and molecular function. Functional descriptions of the domains of identified proteins were annotated with the InterPro domain database (http://www.ebi.ac.uk/interpro/). The KEGG database (http://www.genome.jp/kegg/) was used to identify enriched pathways. These pathways were classified into hierarchical categories according to the KEGG website. A two-tailed Fisher's exact test was employed to test the enrichment of the identified acetylated proteins against all database proteins. Correction for multiple hypothesis testing was carried out using standard FDR methods. GO terms, domains, and pathways with a corrected p-value < 0.05 were considered significant. WoLF PSORT (http://www.genscript.com/wolf-psort.html) was used to predict subcellular localization.
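The enrichment procedure described above (a two-tailed Fisher's exact test per category, followed by FDR correction and a 0.05 cutoff) can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the category names and counts are hypothetical, and the Benjamini-Hochberg procedure is assumed as the FDR method, which the text does not specify.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts per category:
# (acetylated with term, acetylated without term,
#  other proteins with term, other proteins without term)
tables = {
    "category_A": (8, 2, 1, 5),
    "category_B": (5, 5, 5, 5),
    "category_C": (9, 1, 2, 8),
}

names = list(tables)
pvalues = [fisher_exact_two_tailed(*tables[name]) for name in names]
adjusted = benjamini_hochberg(pvalues)
significant = [n for n, q in zip(names, adjusted) if q < 0.05]

for name, p, q in zip(names, pvalues, adjusted):
    print(f"{name}: p = {p:.4g}, corrected p = {q:.4g}")
print("significant:", significant)
```

For proteome-scale tables, a library routine such as `scipy.stats.fisher_exact` would be the more practical choice; the pure-Python version is used here only to keep the sketch self-contained.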

Motif analysis.
Motif analysis was performed with the motif-x software 53 by analyzing the sequence model formed by the amino acids at specific positions of the modified 21-mers (10 amino acids upstream and downstream of the site) in all protein sequences. All of the protein sequences were used as the background database parameter, and the other parameters were left at their default values.
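The 21-mer windows that serve as input to the motif search can be built as in the sketch below. It is an illustration only: the sequence and site positions are made up, the function name is hypothetical, and padding windows that run past a protein terminus with "_" is an assumption, since the text does not say how such sites were handled.

```python
from collections import Counter


def extract_window(sequence, site, flank=10, pad="_"):
    """Return the (2 * flank + 1)-mer centred on a modified lysine.

    `site` is the 1-based position of the lysine in `sequence`.
    Windows that extend past either terminus are padded with `pad`.
    """
    index = site - 1
    if not 0 <= index < len(sequence) or sequence[index] != "K":
        raise ValueError(f"position {site} is not a lysine")
    left = sequence[max(0, index - flank):index]
    right = sequence[index + 1:index + 1 + flank]
    return left.rjust(flank, pad) + "K" + right.ljust(flank, pad)


# Made-up protein sequence with lysines at positions 9 and 29.
protein = "ACDEFGHIKLMNPQRSTVWYACDEFGHIKLMN"
windows = [extract_window(protein, site) for site in (9, 29)]

# Tally the residue immediately downstream of the acetylated lysine (+1).
flank = 10
plus_one = Counter(window[flank + 1] for window in windows)

for window in windows:
    print(window)
print("+1 residues:", dict(plus_one))
```

Tallies of this kind, compared against the same positions in background windows, are what a motif tool evaluates when it reports enrichment at +1 or −1.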