A Comprehensive Assessment of the Genetic Determinants in Salmonella Typhimurium for Resistance to Hydrogen Peroxide

Salmonella is a Gram-negative bacterium that infects a wide range of hosts. Salmonella Typhimurium causes gastroenteritis in humans, and can survive and replicate in macrophages. An essential mechanism used by macrophages to eradicate Salmonella is the production of reactive oxygen species. Here, we used proteogenomic approaches to determine the candidate genes and proteins that have a role in resistance of S. Typhimurium to H2O2. For Tn-seq, a highly saturated Tn5 insertion library was grown in vitro in Luria-Bertani broth (LB) medium as well as LB containing either 2.5 (H2O2L) or 3.5 mM H2O2 (H2O2H). We identified two sets of overlapping genes that are required for resistance of S. Typhimurium to H2O2L and H2O2H, and the result was validated via phenotypic evaluation of 50 selected mutants. The enriched pathways for resistance to H2O2 included DNA repair, aromatic amino acid biosynthesis, Fe-S cluster biosynthesis, iron homeostasis, flagellar genes, H2O2 scavenging enzymes, and DNA adenine methylase. In particular, we identified aromatic amino acid biosynthesis (aroB and aroK) and a putative iron transporter system (ybbK, ybbL, and ybbM) as novel mechanisms for resistance to H2O2. The proteomics revealed that the majority of essential proteins, including ribosomal proteins, were downregulated upon exposure to H2O2. A subset of the proteins identified by Tn-seq were analyzed by targeted proteomics, and 70% of them were upregulated upon exposure to H2O2. The identified candidate genes will deepen our understanding of the mechanisms of S. Typhimurium survival in macrophages, and can be exploited to develop new antimicrobial drugs.

IMPORTANCE Salmonella infection is frequently caused by consumption of contaminated food or water. The infection may lead to gastroenteritis or typhoid fever, depending on the Salmonella serovar. Even though the bacterium encounters the immune defense arsenals of the infected host, including reactive oxygen species in phagocytes, the bacterium can survive and replicate. In this study, proteogenomic approaches were used to identify the genes and proteins that have a role in resistance to H2O2. In addition to the H2O2 scavenging and degrading enzymes, aromatic amino acid biosynthesis and iron homeostasis were identified among the most important pathways for H2O2 resistance. These findings will deepen our knowledge of the mechanisms of Salmonella survival in phagocytes and other niches with oxidative stress, and also provide novel targets to develop new antimicrobial therapeutics.


Introduction
Salmonella is a Gram-negative bacterium that infects humans and animals. Salmonella enterica has numerous serovars, which include typhoidal and non-typhoidal strains. In contrast to the typhoidal salmonellae, which are human-restricted pathogens, the non-typhoidal salmonellae (NTS), serovars Enteritidis and Typhimurium, are able to infect a wide range of hosts, causing gastroenteritis (1). The NTS strains, including Salmonella enterica serovar Typhimurium, account for 11% (1.2 million cases) of the total foodborne illnesses caused by different pathogens in the United States (2). It has been estimated that Salmonella is responsible for 93.8 million cases of gastroenteritis, leading to 155,000 deaths worldwide annually (3). The pathogen remains a continuous threat to food safety and public health.
To initiate an infection and survive inside the host, Salmonella needs to overcome a myriad of host defense mechanisms. As Salmonella reaches the intestine and breaches the epithelial tissue, it enters the macrophages and activates different virulence strategies in order to survive and replicate in them (4). An essential mechanism used by the phagocytes to kill and eradicate Salmonella is
have been discovered and the underlying mechanisms have been explored (11,12). Various approaches and techniques have been employed to study global response of Salmonella or related
iscA, yjeB, yhgI), and transcription regulation (rcsA, oxyR, rpoE, yjeB, arcA, argR, rbsR, rpoS, fadR, rcsB, furR, flhD).

Validation of Tn-seq results using individual mutants
For the selected 50 genes among the 137 genes identified by Tn-seq, the growth phenotype was determined using individual single deletion mutants in LB, H2O2L, and H2O2H. A gene was considered to play a role in resistance to H2O2 if (i) the lag phase time increased, (ii) the growth rate was reduced, or (iii) the maximum OD600 decreased in the presence of H2O2 in comparison to the wild type strain grown in the same conditions. Of the 50 single deletion mutants, 42 mutants were shown to have a role in resistance to H2O2 (Fig. 2 and Data Set S2). One gene, yhaD, was identified by all 3 analysis tools, but it did not show the expected phenotype. The fliD gene was also identified by ARTIST, but did not show any phenotype distinguishable from the wild type. The remaining 6 genes that did not show the phenotype were identified by Tn-seq Explorer. Based on the results of the individual mutant assay, we conclude that 84% (42/50) of the genes identified by the Tn-seq analysis and tested using single deletion mutants have a role in resistance to H2O2. These results indicate that our Tn-seq analysis identified the genes in S.
for fitness under the selection conditions, the identified genes are expected to express their proteins under the conditions to perform their cellular functions. Often the proteins required for fitness under a given condition are overexpressed under the condition, but it may not be the case for some proteins. In this study, we had a unique opportunity to comparatively analyze both Tn-seq and the MS data to understand the relationship between genetic requirements and changes in expression level under the condition of interest, which was H2O2 in this study.
We also obtained the list of essential genes based on our Tn-seq data, which could not tolerate insertions by definition, and if we were not certain about essentiality of a gene from our Tn-seq data, the gene was searched for essentiality in the previously reported list of Salmonella essential genes (22). The comprehensive list of essential genes allowed us to study any correlation between the essentiality and the changes in protein expression. Among the 246 proteins, there were 78 essential and 168 non-essential proteins. Among the 78 essential proteins, 25 were upregulated whereas 53 were downregulated. On the contrary, the majority (n = 96) of the detected non-essential proteins were upregulated, while 72 non-essential proteins were downregulated. To further examine the quantitative relationships closely, we focused on the 64 genes/proteins identified by both methods (Data Set S3). Among the 64 genes/proteins, 57 genes showed negative Log2FC based on Tn-seq data, and 41 proteins among the 57 were upregulated at the protein level. However, only 12 proteins had p values of ≤ 0.05 (AhpC, ArcA, Crr, DksA, FliC, IcdA, OxyR, Pgm, RecA, RpoS, SlpA, and WecE). Using KEGG pathway analysis, 150 proteins among the 246 were enriched in 21 pathways (Table S3). Interestingly, of all 59 30S and 50S ribosomal proteins in S. Typhimurium, 37 (63%) were downregulated in response to H2O2. Moreover, of the 8 identified proteins in the TCA cycle, 6 proteins were downregulated, including 2 essential proteins.
Although the DDA method can be used to search for all proteins in a complex sample, it is prone to miss the identification of important proteins because fragmentation of tryptic peptides from these proteins may not be triggered when their peptide ion intensities fall below the threshold set.
To quantify the proteins of the genes identified by Tn-seq more precisely and accurately, we used a targeted proteomic approach employing liquid chromatography coupled with triple quadrupole mass spectrometry (LC-QQQ-ESI-MS). Here, tryptic peptides of the protein were targeted for fragmentation (MS/MS) independent of their intensities, as described in Materials and Methods, and the observed sequence-specific fragment ion intensities from three unique tryptic peptides were utilized for protein quantitation. Of the 137 Tn-seq identified genes, we selected 33 genes to quantify their proteins in response to H2O2 by using targeted proteomics (Data Set S3). Interestingly, 23 (70%) of the 33 tested proteins were upregulated in response to H2O2. This shows a good agreement between the results of the Tn-seq and the targeted proteomics.
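The counts reported in this section can be cross-checked with a few lines of arithmetic. The following Python sketch is illustrative only and is not part of the study's analysis pipeline; the variable names and the helper `percent` are hypothetical, and the only inputs are the numbers quoted in the text above.

```python
# Counts quoted in the text (DDA proteomics, targeted proteomics, mutant assays).
essential = {"up": 25, "down": 53}
non_essential = {"up": 96, "down": 72}


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Totals implied by the up/down counts.
total_essential = sum(essential.values())
total_non_essential = sum(non_essential.values())
total_proteins = total_essential + total_non_essential

# Percentages stated in the text, recomputed from the raw counts.
ribosomal_down_pct = percent(37, 59)   # ribosomal proteins downregulated
targeted_up_pct = percent(23, 33)      # targeted proteins upregulated
validated_pct = percent(42, 50)        # Tn-seq genes confirmed by mutant assay

print(total_essential, total_non_essential, total_proteins)
print(ribosomal_down_pct, targeted_up_pct, validated_pct)
```

The totals reproduce the 78 essential, 168 non-essential, and 246 detected proteins, and the recomputed percentages match the 63%, 70%, and 84% given in the text.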

Aromatic amino acid biosynthesis and H2O2
Interestingly, our Tn-seq data revealed that the aromatic amino acid biosynthesis and metabolism pathway plays a role in conferring resistance in Salmonella to H2O2 (Fig. 3A and 3B). Five genes in the aromatic amino acid biosynthesis pathway, aroB, aroD, aroE_2, aroK, and aroA, were identified by Tn-seq, and the fitness of the mutants was significantly reduced in the presence of H2O2. To confirm this, 4 of these genes were evaluated using individual mutant assays. The Salmonella aroK mutant showed the strongest phenotype, because it failed to grow in the presence of H2O2L or H2O2H during 24 h incubation time. Also, the aroB mutant exhibited a strong phenotype, significantly extending lag phase for both H2O2 conditions. The aroE_2 mutant also exhibited an extended lag time, but the aroA mutant did not show any difference in growth phenotype in the presence of H2O2. In addition, targeted proteomics also showed that all these 5 proteins were upregulated in response to H2O2. The most upregulated protein was aroK, followed by aroE_2, aroA, aroB, and aroD (Fig. 3C and Data Set S3).
ROS damage a variety of biomolecules via the Fenton reaction, which consequently leads to metabolic defects, specifically auxotrophy for some aromatic amino acids (10). E. coli mutants that lack superoxide dismutase enzymes are unable to grow in vitro unless the medium is supplemented with aromatic (Phe, Trp, Tyr), branched-chain (Ile, Leu, Val), and sulfur-containing (Cys, Met) amino acids (36). We identified the genes in the aromatic amino acid biosynthesis pathway that are critically important for resistance to H2O2. In this pathway, aroK catalyzes the production of shikimate 3-phosphate from shikimate, which consequently leads to the production of tryptophan, phenylalanine, tyrosine and some metabolites from the chorismate precursor in E. coli.
Further, the aroK mutant in E. coli displays increased susceptibility to protamine, a model cationic antimicrobial peptide. It has been suggested that resistance to protamine is probably due to the aromatic metabolites and product of the aroK gene, which act as a signal molecule to stimulate the CpxR/CpxA system and Mar regulators (37). In our Tn-seq data, cpxR/cpxA and marBCRT were in the list of non-required genes, but the proteomics data indicated that CpxR was upregulated. Also, the aroK mutant in E. coli is resistant to mecillinam, a beta-lactam antibiotic specific to penicillin-binding protein 2. It has been concluded that AroK has a secondary activity in addition to the aromatic amino acid biosynthesis, probably related to cell division (38). In addition, the aroK gene presents a promising target to develop a non-toxic drug in
the aroB gene is attenuated in BALB/c mice (41). In addition to aroK and aroB, aroE_2 was also shown to be important for resistance to H2O2, because deletion of aroE_2 reduced the growth rate by 35% in the presence of H2O2 and also increased the lag phase time. All 3 of these genes in this pathway were required for systemic infection of Salmonella in BALB/c mice in a more recent study (18). We observed that there was a strong correlation between the fitness based on Tn-seq data, growth rates measured by individual mutant assays, and upregulation of their proteins quantified via targeted proteomics. This demonstrates the power of the proteogenomic approach in discovering and characterizing the genes that are required for growth under a specific condition.

The ybbM, ybbK, and ybbL have a role in H2O2 resistance
The mutants with single deletion in each of ybbK, ybbL, and ybbM genes on the same pathway showed a strong phenotype against the activity of H2O2 in a dose-dependent manner.
Based on Tn-seq data, the fitness of ybbM was -1.16 and -1.79 for H2O2L and H2O2H, respectively (Fig. 4A)
This data emphasizes that ahpC, sodB, and tpx may be the primary players in scavenging and degrading H2O2 in our experiment. Why did Tn-seq not detect any of these genes, while proteomics detected only these 3 proteins among others? It may reflect the functional redundancy in the genetic network that prevented single deletions in one of these genes from exhibiting a fitness defect. Alternatively, when these mutants were grown together with all other mutants in the library, the functional protein lacking in one mutant due to Tn5 insertion could have been compensated by the other mutants in the library.
In addition to these genes, oxyR was detected by Tn-seq (Fig. 1C) and DDA proteomics. The
can be involved in the resistance of Salmonella to oxidative stress, which warrants future study into this direction.
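The ybbM fitness values quoted above (-1.16 and -1.79) are easier to interpret on a linear scale. The sketch below assumes that Tn-seq fitness is a log2 fold change in mutant abundance relative to the LB control; the text refers to Log2FC for the Tn-seq data, but the exact definition is not given here, so this is an assumption. The function name `log2fc_to_ratio` is hypothetical.

```python
def log2fc_to_ratio(log2fc):
    """Convert a log2 fold change into a linear abundance ratio."""
    return 2.0 ** log2fc


# Fitness of ybbM as quoted in the text, ASSUMED to be log2 fold changes.
ybbm_fitness = {"H2O2L": -1.16, "H2O2H": -1.79}

# Linear ratio of mutant abundance (treated / control) per condition.
ybbm_ratio = {cond: log2fc_to_ratio(v) for cond, v in ybbm_fitness.items()}

for cond, ratio in ybbm_ratio.items():
    print(f"{cond}: {ratio:.2f}")
```

Under that assumption, ybbM mutants would be depleted to roughly 45% and 29% of their control abundance at 2.5 and 3.5 mM H2O2, respectively, which is consistent with the dose-dependent phenotype described for the individual mutants.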

Fe-S cluster biogenesis system and H2O2
Salmonella requires the genes from the Fe-S cluster biogenesis system in order to resist H2O2. Our Tn-seq analysis identified 5 genes in this system as required for the resistance. In the isc operon (Fe-S cluster), iscA, hscB, and hscA were among the genes required to resist H2O2. In particular, hscA is at the top of the gene list identified by Tn-seq. In E. coli, this operon is regulated by iscR, the iron sulfur cluster regulator (58); in Salmonella the gene iscR encoding this transcription regulator is named yfhP. The HscB and HscA chaperones are believed to be involved in the maturation of [2Fe-2S] proteins (59, 60). The second operon that is involved in Fe-S protein biogenesis is suf, the sulfur mobilization operon. Tn-seq found that two genes in this operon were required for
and SufD form a pseudo-ABC-transporter that could act as a scaffold (60); this operon is regulated by OxyR (14). The other known genes in these two operons that are present in Salmonella are iscA, sufA, sufB, and sufD; they showed a reduced fitness, while their p values were greater than 0.05. The damage of Fe-S clusters is not only a problem for the defective proteins, but it also fuels the
Besides the important pathways described above, there were many additional genes also important for resistance to H2O2. Among those, the 3 unrelated genes, rpoS, pgm, and tonB, are important ones that deserve more attention. The rpoS mutant showed reduced fitness and its protein was
Tn-seq, individual mutant assays, and proteomic analysis in Data Set S1, S2, and S3, respectively.

Conclusions
We applied Tn-seq and proteomic analysis to find the genes and proteins that are required in S. Typhimurium to resist H2O2 in vitro. As the concentration of H2O2 increased, the growth rate was reduced, the lag time was extended, the fitness of mutants decreased, and some proteins were differentially expressed. Validation of Tn-seq results with individual mutant assays indicated the accuracy of the identified genes in response to the two H2O2 concentrations. The targeted proteomics had a good agreement with Tn-seq. We found about 80 genes that have not been associated with resistance to H2O2 previously. Salmonella employs multiple pathways to resist H2O2, and the most important ones are ROS detoxifying enzymes, amino acid biosynthesis (aroK and aroB), putative iron transporters (ybbK, ybbL, ybbM), iron homeostasis, Fe-S cluster repair, DNA repair, flagellar and DNA adenine methylase genes. The genes identified in this study will broaden our understanding of the mechanisms used by Salmonella to survive and persist against ROS in macrophages.
Our unbiased system-wide approach, Tn-seq, was successful in identifying novel genetic determinants that have not been implicated previously in Salmonella resistance to oxidative stress. Furthermore, the combined use of a quantitative proteomic approach has provided additional insights on the function or mode of action of the identified genetic determinants in resisting oxidative stress. As expected, the majority of the proteins important for resistance to H2O2 were upregulated in response to the same stressor. However, the expression level did not increase for some proteins, in spite of their known roles in resistance to H2O2. Interestingly, the downregulation of Dps and other proteins was counterintuitive to the common mode of protein regulation and function, yet it may point to some unknown aspects of how Salmonella regulates the expression of those proteins to better cope with the oxidative stress during infection in macrophages.

Selection of the mutant library for Tn-seq analysis
The transposon library was thawed at room temperature and diluted 10^-1 in fresh LB broth. To activate the library, the diluted library was incubated at 37°C with shaking at 225 rpm for an hour. Then, the culture was washed twice with PBS and resuspended in LB broth medium. The library was inoculated to 20 ml LB broth and LB broth supplemented with either 2.5 or 3.5 mM H2O2 (H2O2L and H2O2H, respectively); the seeding density was 3.5 x 10^6 CFU per ml. When the cultures reached mid-exponential phase, OD600 of 2.7 (~1.17 x 10^8 CFU/ml), the incubation was stopped, and the culture was immediately harvested by centrifugation and stored at -20°C.
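As a rough illustration of the selection depth, the seeding and harvest densities given above imply about five population doublings. The sketch below assumes simple exponential growth with no cell death, which exposure to H2O2 may well violate, so the result is an approximation and not a value reported in the study. The function name `doublings` is hypothetical.

```python
import math


def doublings(start_cfu_per_ml, end_cfu_per_ml):
    """Number of population doublings between two cell densities.

    Assumes pure exponential growth without cell death.
    """
    return math.log2(end_cfu_per_ml / start_cfu_per_ml)


seeding_density = 3.5e6    # CFU/ml at inoculation, from the text
harvest_density = 1.17e8   # CFU/ml at harvest, from the text

n_doublings = doublings(seeding_density, harvest_density)
print(f"Approximate doublings during selection: {n_doublings:.2f}")
```

Because every mutant in the library passes through the same small number of doublings, modest differences in growth rate translate into the log2 fold changes measured by Tn-seq.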

Preparation of Tn-seq amplicon libraries
Genomic DNA was extracted from the harvested cells using DNeasy Blood & Tissue kit (Qiagen), and quantified using Qubit dsDNA BR Assay kit (Invitrogen). As described above, 20% of the mutants in the library were the result of the integration of pBAM1 into the chromosome. To remove the Tn5-junction sequences originating from the plasmid in the Tn-seq amplicon libraries, genomic DNA was digested with PvuII-HF (New England Biolabs), which digests immediately outside the inverted repeats on both sides of Tn5 in pBAM1, and purified with DNA Clean & Concentrator-5 kit (Zymo Research). Then, a linear PCR extension was performed using a Tn5-specific primer in order to produce single-stranded DNA corresponding to Tn5-junction sequences. To increase the specificity in extending into Tn5-junction sequences, the linear PCR was conducted with a dual priming oligonucleotide Tn5-DPO (5'-AAGCTTGCATGCCTGCAGGTIIIIICTAGAGGATC-3') that is specific to the Tn5 end (71). The PCR reaction contained 25 µl Go Taq Colorless Master Mix (Promega), 20 µM Tn5-DPO primer, 100 ng gDNA, and 50 µl MQ-H2O. The PCR cycle consisted of the initial denaturation at 95°C for 2 min, followed by 50 cycles at 95°C for 30 sec, 62°C for 45 sec, and 72°C for 10 sec. The PCR product was purified with DNA Clean & Concentrator-5 kit and eluted in 13 µl TE buffer. After that, a C-tail was added to the 3' end of the single-stranded DNA. The C-tailing reaction consisted of 2 µl terminal transferase (TdT) buffer (New England Biolabs), 2 µl CoCl2, 2.4 µl 10 mM dCTP, 1 µl 1 mM ddCTP, 0.5 µl TdT and 13 µl purified linear PCR product. The reaction was performed at 37°C for 1 h and the enzyme was inactivated by incubation at 70°C for 10 min. The C-tailed product was purified with DNA Clean & Concentrator-5 kit and eluted in 12 µl TE. Next, the exponential PCR was performed with forward primer, P5-

Tn-seq data analysis
The preliminary data analysis was conducted using a supercomputer in the High Performance Computing Center (AHPCC) at the University of Arkansas. The libraries that were multiplexed