Impact of Three Different Mutations in Ehrlichia chaffeensis in Altering the Global Gene Expression Patterns

The rickettsial pathogen Ehrlichia chaffeensis causes a tick-borne disease, human monocytic ehrlichiosis. Mutations within certain genomic locations of the pathogen aid in understanding the pathogenesis and in developing attenuated vaccines. Our previous studies demonstrated that mutations in different genomic sites in E. chaffeensis caused variable impacts on their growth and attenuation in vertebrate and tick hosts. Here, we assessed the effect of three mutations on transcriptional changes using RNA deep-sequencing technology. RNA sequencing aided in detecting 66–80% of the transcripts of wildtype and mutant E. chaffeensis. Mutation in an antiporter gene (ECH_0379) causing attenuated growth in vertebrate hosts resulted in the down regulation of many transcribed genes. Similarly, a mutation downstream to the ECH_0490 coding sequence resulted in minimal impact on the pathogen’s in vivo growth, but caused major changes in its transcriptome. This mutation caused enhanced expression of several host stress response genes. Even though the ECH_0660 gene mutation caused the pathogen’s rapid clearance in vertebrate hosts and aids in generating a protective response, there was minimal impact on the transcriptome. The transcriptomic data offer novel insights about the impact of mutations on global gene expression and how they may contribute to the pathogen’s resistance and/or clearance from the host.


Chandramouli Kondethimmanahalli & Roman Ganta
Ehrlichia chaffeensis is a tick-transmitted intracellular bacterial pathogen causing human monocytic ehrlichiosis (HME) and it also infects dogs, deer, goats, and coyotes [1][2][3][4] . Mutations at certain genomic locations, leading to gene expression changes, impact the pathogen's ability to cause infection and persistence in a host 5,6 . The genome of E. chaffeensis may have evolved within a host cell environment leading to the development of mechanisms to undermine the host immune response 7 . Pathogenesis-associated E. chaffeensis genes are likely highly active in a host microenvironment and consistent with this hypothesis, differential gene expression in response to host cell defense is known to occur 8 . Progress has been made towards identifying genes crucial for Ehrlichia survival in a host cell environment [9][10][11] . However, to date only a few abundantly expressed genes are identified as associated with pathogenesis. Defining the genes involved in pathogenesis and virulence, and documenting their differential expression may aid in the discovery of novel proteins valuable as targets for therapeutic interventions and vaccine development for HME.
Genetically mutated intracellular pathogens are important resources for studying microbial pathogenesis, and also aid in the efforts of vaccine development 12,13 . Our previous study demonstrated the feasibility of transposon-based mutations in E. chaffeensis 5,6 . We also found that some insertion mutations resulting in transcriptional inactivation of membrane protein genes cause attenuation of the growth of the pathogen in vertebrate hosts. Insertions within the coding regions of ECH_0379 and ECH_0660 genes offered varying levels of protection against infection in a vertebrate host 14 . In this study, we hypothesized that the mutations' specific genomic locations may impact global gene expression and contribute to the pathogen's altered survival, infection progression, and replication in a host cell environment. To test this hypothesis, we assessed the impact of three mutations, reported earlier by Cheng et al. 5 , on global gene transcription. We selected two mutants with mutations within the coding regions of the ECH_0660 gene encoding for a phage like protein (ECH_0660) and the ECH_0379 gene encoding for an anti-porter protein (ECH_0379). Insertion mutation in ECH_0660 gene is located at the nucleotide position 213 of the 555 base long open reading frame. Similarly, mutation in ECH_0379 gene is located at the nucleotide position 682 of the 1056 base long open reading frame. The third insertion mutant strain, ECH_0490, has the insertion mutation 166 nucleotides downstream from the stop codon of ECH_0490 gene.
High throughput RNA sequencing (RNA seq) technologies have proven to be reliable and robust tools for determining global transcriptome activity in obligate intracellular bacteria 12,[15][16][17] . Comparative genomic studies identified several classes of virulence factors involved in secretion and trafficking of molecules between the pathogen and host cells and modulation of the host immune response [18][19][20] . However, studies focused on Ehrlichia gene expression have been limited mostly to outer membrane proteins genes, Type IV Secretion System (T4SS) genes, tandem repeat protein (TRP) genes, and ankyrin repeat genes (Anks) 9,19,21-23 . Among them, genes encoding for T4SS proteins and p28-OMP proteins have been found to be critical for pathogenicity 9,24 .
The obligate intracellular nature of E. chaffeensis poses a challenge in obtaining cell-free Ehrlichia from host cells 25 . Technical constraints in isolating Ehrlichia RNA from highly abundant host RNA remain an impediment to profiling pathogen transcripts 26 . To overcome this limitation, we used an effective cell lysis strategy followed by density gradient centrifugation. Further, we enriched Ehrlichia RNA by efficiently removing polyadenylated RNA (poly(A) RNA) and eukaryotic and prokaryotic ribosomal RNAs from host and bacterial RNA mixtures. Sequencing of the enriched RNA aided in the detection of transcripts for 66-80% of the E. chaffeensis genes annotated in the genome, GenBank #CP000236.1. Comparison of transcript levels from wildtype and mutant strains revealed the highest degree of modulation in immunogenic and secretory protein genes, particularly in the mutant strains of ECH_0490 and ECH_0379, while minimal changes were observed in the ECH_0660 mutant strain.

Results
Isolation and purification of cell-free E. chaffeensis from host cells. The major challenge of undertaking transcriptome studies of intracellular pathogens is the difficulty in isolating host cell-free bacteria and subsequently recovering high-quality bacterial RNA. RNA from rickettsial organisms, including E. chaffeensis, constitutes only a very small fraction of isolated total RNA 27,28 . Because of the presence of highly abundant host cell RNA, recovery of bacterial RNA is a challenge for executing RNA seq analysis experiments. In this study, we first purified the host cell-free bacteria from infected host cells (canine macrophage cell line, DH82) by employing an efficient cell lysis method, coupled with density gradient centrifugation protocols. Host cell lysis was performed to efficiently rupture the host cells without causing major damage to the bacteria. E. chaffeensis organisms are about 0.5 to 1 µm in diameter. Therefore, infected host cell lysate was filtered through a 2 µm membrane to remove most of the host cell debris. A high-speed Renografin density gradient centrifugation of the resulting E. chaffeensis cell suspension aided in pelleting bacteria while host cell debris remained at the top layer of the solution. After total RNA isolation and DNase treatment, Bioanalyzer analysis revealed that despite the prior fractionation of host cell-free bacteria, the host 28S and 18S RNA remained at high concentrations in the recovered RNA. Bacterial mRNA enrichment was carried out by depleting the host poly(A) RNA and eukaryotic ribosomal RNA using a bacterial RNA enrichment protocol, resulting in nearly undetectable levels of host 28S and 18S RNA (Supplementary Figures; Fig. S1 and Fig. S2). The absence of contaminating E. chaffeensis genomic DNA in the purified RNA samples was confirmed by real-time quantitative PCR using E. chaffeensis 16S rRNA gene primers 27 .
We also confirmed the absence of DNA sequences in the RNA seq raw data by aligning 20 randomly selected E. chaffeensis intergenic non-coding DNA sequences (data not shown).
Differential transcriptional regulation of T4SS and p28-OMP gene cluster genes in mutant ECH_0490. In the ECH_0490 mutant strain, 37 genes were significantly down-regulated and 17 genes were up-regulated (Table 3).
Mutation in ECH_0660 gene led to minimal transcriptional alterations. While we observed drastic gene expression changes in both the ECH_0379 and ECH_0490 mutants, the ECH_0660 mutant transcriptome showed minimal variation compared to wildtype; we observed only five genes as notably differentially expressed in this mutant (Table 4). The genes included nitrogen regulation protein (NtrY) (ECH_0299) and the ABC transporter permease protein (ECH_0972) as down-regulated genes, whereas the heme exporter protein CcmA (ECH_0295) and chaperonin (ECH_0364) were up-regulated. We also identified several genes that were differentially expressed in both ECH_0379 and ECH_0490 (Table 5). The ribonuclease D (ECH_0300) and potassium uptake protein (ECH_1093) genes were down-regulated in both ECH_0379 and ECH_0490. The T4SS protein VirB4 gene was down-regulated in the ECH_0490 mutant, whereas this gene was up-regulated in the ECH_0379 mutant. In contrast, ClpB was down-regulated in the ECH_0379 mutant and up-regulated in the ECH_0490 mutant.

Validation of RNA seq data by quantitative real-time reverse transcription PCR. Quantitative real-time reverse transcriptase-PCR (qRT-PCR) analysis was carried out on thirteen randomly selected genes identified as differentially transcribed according to the RNA seq data. To generate qRT-PCR data, we first normalized RNA samples to a constitutively expressed E. chaffeensis gene coding for the 16S rRNA as previously described in Cheng et al. 6 . The primers and genes selected for the qRT-PCR analysis are listed in Table S3. Transcript abundance for 7 down-regulated genes in the ECH_0379 mutant (ECH_0466, mrpC, ClpB, ECH_0033, NtrY, TrkH, and ECH_0972) was validated (Fig. 3A). Similarly, 6 up-regulated genes from the ECH_0490 mutant strain, including four transcripts belonging to an OMP gene cluster (OMP-p28, OMP-1B, OMP-1N, OMP-p28-2) and one each from the ClpB and RpoH genes, were verified by qRT-PCR (Fig. 3B). Likewise, the down-regulation of transcripts for the ECH_0299 and ECH_0972 genes was confirmed in the ECH_0660 mutant by qRT-PCR (Fig. 3C).

Discussion
Isolation of cell-free bacterial RNA from highly abundant host RNA is the first challenge in transcriptional profiling of intracellular pathogens 25,28,29 . Rickettsiales require culturing in host cells and then need to be purified before extracting RNA for transcriptome evaluation experiments. To document the impact of three transposon mutations on E. chaffeensis transcription, we first developed a method for isolation and purification of host cell-free E. chaffeensis organisms, from which we isolated RNA that was then subjected to next generation sequencing (NGS) analysis. To isolate cell-free E. chaffeensis, we started with an efficient host cell lysis protocol, followed by filtration of the whole cell lysate and then a Renografin density gradient centrifugation. The second challenge was to obtain host cell-free RNA for transcriptome profiling. Previous studies report that bacterial RNA enrichment methods enrich bacterial RNA reads to only 3-10% 29,30 . Isolation of host cell-free bacteria and the bacterial RNA purification steps implemented in our study allowed a greater enrichment of E. chaffeensis RNA. In our current studies, we were able to enrich the bacterial RNA, which helped in generating up to 19% high mapping RNA reads. Notably, deep RNA sequencing analysis aided in mapping 80% of E. chaffeensis genes expressed in infected macrophage host cells. Among the highly expressed genes, the p28-OMP multigene cluster was dominant in the transcriptome. The E. chaffeensis p28-OMP multigene locus contains 22 tandemly arranged genes coding for the bacterial immunodominant proteins [31][32][33] . The presence of all 22 transcripts in the RNA seq data suggests that the gene cluster is among the most abundantly expressed genes. These observations are consistent with our previous proteomic study where we reported the p28-OMP genes' expression abundance 33 . NADH dehydrogenase I complex genes were also highly expressed in E. chaffeensis.
NADH dehydrogenase counters the phagosomal NOX2 response to inhibit host cell apoptosis 34 . T4SS effector proteins in some pathogenic bacteria are considered as important in [37][38][39] . The RNA seq analysis identified several transcripts encoding for T4SS proteins, including VirB3, B4, B6, B8, B9, B10, and B11. Chaperone protein genes DnaK, DnaJ, GroE, and ClpB were also highly expressed in both wildtype and mutant strains. The presence of such proteins involved in cell homeostasis and the oxidative stress response is reported in other Rickettsiales [39][40][41] , suggesting that their gene products are also critical for the E. chaffeensis stress response if the pathogen proteome is altered in the same way as the transcriptome reported in the current study. Indeed, our recent study suggests that the stress response proteins are important for E. chaffeensis 11 . Other highly expressed genes included those encoding for house-keeping ribosomal proteins involved in protein synthesis, putative membrane proteins, ABC transporter, and lipoprotein, all of which are likely important for the pathogen's protein synthesis, transport, trafficking, and effector secretion into the host cells. Genes for ATP synthase subunit, cytochrome c oxidase, DNA polymerases, GTP-binding protein and translation elongation factors involved in energy metabolism, cell division, and transcriptional regulation were also among the highly expressed genes in both wildtype and mutant organisms. The extent of transcriptome coverage is higher than that previously reported for E. chaffeensis in ISE6 and AAE2 tick cells 8 . This is substantial both for the enhanced detection of intracellular pathogen transcripts and for the abundance of gene expression observed. Higher coverage of the transcriptome likely resulted from deep sequencing of the RNAs by next-generation sequencing compared to microarray analysis 8 .
This global set of highly expressed genes may represent products involved in pathogenicity, replication and survival of E. chaffeensis in the host cell environment 42,43 . Four transcripts that code for ankyrin repeat proteins, which are shown to mediate protein-protein interactions 44 , were also identified in the transcriptome. Notably, the transcriptome from the wildtype and mutant organisms contained 216 transcripts that code for hypothetical proteins with unknown function. As these were within the core transcriptome, we anticipate that they represent an important set of transcribed genes for E. chaffeensis replication. Transcription from a large number of genes in the ECH_0379 mutant was found to be reduced compared to wildtype. Genes representing antiporters, ABC transporters, chaperones, metabolic enzymes, and transcription regulators are among the down-regulated genes (Table 2). We predict that the mutation in the antiporter protein gene caused a metabolic depression. Antiporter and transport proteins play an important role in the transport of ions and solutes across the cell membranes of bacteria 45 . Antiporters are integral membrane proteins that perform secondary transport of Na+ and/or K+ for H+ across a phospholipid membrane 5 . The E. chaffeensis genome contains several genes having homology to antiporter proteins or their subunits, suggesting that they are needed for the pathogen's intraphagosomal replication and survival in a host. In particular, antiporters aid bacteria in maintaining pH, salt, and temperature conditions 46 . We observed a significant decline in transcription of antiporter genes such as monovalent cation/H+ antiporter subunit C (ECH_0469) and ECH_0466. Disrupting the antiporter function or preventing their expression may affect the pathogen's growth in vivo. Indeed, mutation in the ECH_0379 gene resulted in the attenuated growth of the organism in both an incidental host (dog) and in the reservoir host (white-tailed deer) 5,6 .
ABC transporters also are involved in uptake of ions and amino acids and may play an important role in a pathogen's ability to infect and survive in a host cell environment 47 . The ECH_0379 mutant had low levels of transcriptional activity of the genes ECH_0517 and ECH_0972 encoding for ABC transporters, which function at different stages in the pathogenesis of infection 47,48 . These proteins promote the survival of pathogens in the host microenvironments 49 . The mutation possibly interferes with transport mechanisms, thereby affecting its ability to infect and survive in host cells 5,6 . The mutation may have also caused alterations to the transcriptions of genes involved in physiological responses, such as regulating the pathogen's metabolic activities. We also found down-regulation of several transcripts encoding for metabolic enzymes: glutamate-cysteine ligase, DNA/pantothenate metabolism flavoprotein family protein, ATPase, uroporphyrinogen-III synthase, diaminopimelate decarboxylase, biotin-acetyl-CoA-carboxylase ligase, and argininosuccinate lyase. In general, a pathogen's survival in an intracellular environment depends on its ability to derive nutrients from the host cell 50 . Pathogenic bacteria use metabolic pathways and virulence-associated factors that undermine the host immune system so that they can derive nutrients from their host cells 51 . It is possible that the downregulation of the transcripts from the aforementioned genes in the ECH_0379 mutant hampers the bacterial metabolic response and its capacity to derive nutrients from the host. The mutation also caused decreased expression of genes encoding DNA replication and repair protein, formamidopyrimidine-DNA glycosylase, dimethyladenosine transferase, and leucyl-tRNA synthetase. This may have also contributed to defects in pathogen's intracellular growth and survival. 
Our prior studies suggest that despite the mutant's attenuated growth, it failed to offer complete protection against wildtype infection challenge 14 . If the changes in the transcriptome correlate with changes in the proteome, variations in the mutant organisms' protein expression relative to the wildtype E. chaffeensis may result in an altered host response, thus making the host less effective in initiating a protective host response when exposed to the mutant organisms 14 .
Pathogenic bacteria produce T4SS effectors to weaken host cell gene expression, which contributes to bacterial virulence 52,53 . RNA seq data suggested reduced expression of various T4SS component protein gene transcripts in the ECH_0490 mutant. We also observed decreased transcription of chaperone protein genes, several genes involved in the transcription and translational machinery, and exonuclease and DNA-binding regulator genes in the ECH_0490 mutant strain. In contrast, ClpB (a major stress response heat shock protein) and RpoH (stress response RNA polymerase transcriptional subunit) showed increased transcription in the mutant.
Chaperone proteins play a key role in protein disaggregation and in aiding the pathogen to overcome the likely host cell-induced stress 54 . ClpB reactivates aggregated proteins accumulating under stress conditions and it was abundantly expressed during the replication stage of E. chaffeensis 54,55 . Preventing or reducing protein aggregation and the associated protein inactivation during the bacterial growth within a host cell may benefit the pathogen in enhancing its survival 11 . The RNA polymerase transcription regulator, RpoH, is also important for the pathogen's continued growth as it aids in promoting the expression of stress response proteins 10 . Consistent with the prediction, increased expression of ClpB and RpoH was observed in the current study for the ECH_0490 mutant. The enhanced expression from these two important genes likely enables the mutant to grow similarly to wildtype E. chaffeensis in vertebrate and tick hosts, as reported in our previous studies 5,6 . Outer membrane proteins perform a variety of functions such as invasion, transport, immune response, and adhesion that are vital to the survival of Ehrlichia species, including E. chaffeensis and E. ruminantium in a host 32,[56][57][58][59] . The ECH_0490 mutant had increased abundance of OMPs compared to wildtype organisms. We found seven transmembrane genes coding for the immunodominant P28/OMP family of proteins (OMP_p28, OMP_p28-2, OMP-1B, and OMP-1N) and membrane proteins (ECH_0039, ECH_0009, and ECH_0230) to be up-regulated. Significant changes in the abundance of the outer membrane proteins may be associated with overall changes in the membrane architecture, thereby altering the pathogen's susceptibility to host defense.
Scientific Reports | (2018) 8:6162 | DOI:10.1038/s41598-018-24471-3
The transcriptional changes noted in the ECH_0490 mutant may not have had any negative impact on the pathogen, as the mutant grows similar to the wildtype pathogen both in white-tailed deer (the reservoir host) and in dogs (an incidental host), and in its tick host, Amblyomma americanum 5,6 . Transcriptional activity assessment of the genes ECH_0490 (lipoic acid synthetase) and ECH_0492 (putative phosphate ABC transporter), both of which are located up and down stream to the transposon insertion mutation, respectively, suggested that the mutation has no effect on these genes' transcription (Fig. S4). The diverse changes in the transcriptome of the mutant, while having no impact near the mutation site, suggest that the mutation impacted global gene expression and yet did not adversely affect the pathogen's survival in vertebrate and tick hosts 5,6 .
The most notable observation was the apparent minimal variation in the transcriptome of the ECH_0660 mutant compared to the wildtype E. chaffeensis. Importantly, mutation within the ECH_0660 gene causes severe growth defects in vivo in vertebrate hosts 5,6 . Further, infection with this mutant also initiates a strong host response and confers protection against wildtype pathogen infection challenge 14,60 . In the current study, we observed only minor changes in gene expression in this mutant compared to wildtype. The minor changes included genes encoding for putative nitrogen regulation protein, ABC transporter, heme exporter protein and GroES, but the variations were far fewer than the numerous changes described for the previous two mutants. Together, these data suggest that the mutation in the ECH_0660 gene led to fewer transcriptional alterations. Assuming that the proteomes of the wildtype and mutant strains of E. chaffeensis are altered in the same way as the transcriptomes, the ECH_0660 mutant proteome may be very similar to that of the wildtype bacterium. The greater degree of similarity between this mutant and the wildtype may enable the vertebrate hosts to recognize this mutant as closer to the wildtype organism, thus inducing a stronger host response that mimics wildtype infection 14,60 . The replication defect reported earlier with this mutant may have resulted from the loss of gene expression from a few genes such as ECH_0659 and ECH_0660, while most of the transcriptome remained similar to the wildtype.

Conclusions
RNA deep sequencing studies in intracellular bacteria are still a major challenge. The RNA seq data reported here provide the first snapshot of comparative transcriptomics of E. chaffeensis. Sequencing of enriched bacterial RNA from wildtype and mutant strains yielded a high coverage of genes. A mutation in the ORF of ECH_0379 gene caused drastic down-regulation of genes leading to metabolic depression, which may have contributed to the mutant's attenuation in vertebrate hosts. While a mutation downstream to the protein coding sequence of ECH_0490 gene induced global changes in gene expression, up regulation of stress response regulatory genes may have helped the mutant survive in the vertebrate hosts and tick hosts. A mutation within ECH_0660 gene coding sequence resulted in few transcriptional changes, thus keeping the integrity of its transcriptome similar to wildtype. While the transcriptome data are suggestive of protein expression variations, additional experimental validation from protein analysis studies is necessary to confirm the results. Together, this study offers the first detailed description of transcriptome data for E. chaffeensis, suggesting that variations observed in the pathogen's ability to survive in a host and the host's ability to induce protection against the pathogen may be the result of global changes in the gene expression, which in turn may impact changes in the pathogen's proteome.

Materials and Methods
In vitro cultivation and cell-free E. chaffeensis recovery. E. chaffeensis Arkansas isolate wildtype and the mutants were grown in the canine macrophage cell line, DH82 58,61 . Isolation and purification of cell-free E. chaffeensis wildtype and its mutants were carried out as outlined in Fig. S5. Briefly, the bacterial infection rate in DH82 cells was assessed with Diff-Quik staining. After 72 h of infection, when the infection reached about 80-90%, the culture from four T-150 confluent flasks was harvested and centrifuged at 500 × g for 5 min. Cellular pellets were resuspended in 1 × phosphate buffered saline (PBS) containing protease inhibitors (Roche, Indianapolis, IN) and cells were homogenized on ice with 15-20 strokes through a 23 G needle on a 10 mL syringe. Efficiency of homogenization, 80-90% lysis, was checked under a light microscope. Whole cell lysate was centrifuged at 500 × g for 5 min at 4 °C. The resulting supernatant containing cell-free Ehrlichia organisms was filtered through a 2 µm sterile membrane filter (Millipore, Billerica, MA). Cell-free Ehrlichia from filtrates were pelleted by centrifuging at 15,000 × g for 15 min and the pellet was suspended in PBS and then layered onto 30% diatrizoate meglumine and sodium solution (Renografin) MD-76R (Mallinckrodt Inc, St. Louis, MO). The suspension was centrifuged for 1 h at 100,000 × g at 4 °C in a S50-ST swinging bucket rotor (Beckman, Indianapolis, IN). The pellet of cell-free Ehrlichia was washed at 15,000 × g for 15 min and used for experiments.
Bacterial mRNA enrichment and sequencing. Figure S6 outlines the workflow for bacterial mRNA enrichment, cDNA library preparation, and RNA sequencing. Briefly, RNA from wildtype and mutants was isolated from purified cell-free Ehrlichia using TRIzol Reagent (Sigma-Aldrich, St. Louis, MO). RNA samples were then treated with DNase I (Invitrogen, Carlsbad, CA) and bacterial RNA was enriched by removing host 18S rRNA, 28S rRNA, and polyadenylated mRNA using the MICROBEnrich Kit (Ambion, Foster City, CA). The quantity and integrity of bacterial RNA before and after enrichment were assessed using a NanoDrop 2000 spectrophotometer (Thermo Scientific, Waltham, MA) and an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA). mRNA was isolated from total RNA samples using the Ribo-Zero Magnetic Kit (Epicentre, Madison, WI) and then fragmented into short fragments as per the manufacturer's protocols. Subsequently, cDNA was synthesized using the mRNA fragments as templates. Libraries of cDNAs for wildtype and mutants were prepared using the TruSeq RNA Sample Prep Kit (Illumina, Ingolstadt, Germany). Sample libraries were quantified using Agilent Bioinformatics analysis. The original image data were transferred into raw sequence data via base calling.
Raw reads were subjected to quality assessment to determine whether they were qualified for mapping (Fig. S5). The bases with low quality (<20) were excluded from the analysis. Raw reads were then filtered to remove adapter sequences and low quality reads, and the clean reads were aligned to the E. chaffeensis Arkansas strain complete genome as per the first annotated GenBank # CP000236.1 using SOAPaligner/SOAP2 62 . We opted to use this accession number because our prior publications, and those of other investigators, have widely used it to refer to gene names and numbers. Not more than five mismatches were allowed in the alignment, which is a standard cut off used for the alignment analysis. The alignment data were used to calculate the distribution of reads on reference genes and determine the gene coverage. Alignment results were subjected to a quality check before proceeding with the analysis of DGE. The gene expression level was calculated using the RPKM method of normalizing for total read length and the number of sequencing reads 63 . We used p-value < 0.05, False Discovery Rate (FDR) ≤ 0.001, and the absolute value of Log2 Ratio ≥ 1 as the threshold to judge the significance of differences in gene expression. The FDR uses accurate p-values as a measure of control in multiple sample comparison of RNA seq data. Corrections for false positive and false negative errors were performed using the method described by Benjamini and Yekutieli 64 .
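The expression and significance criteria described above can be illustrated with a minimal sketch. It assumes the usual RPKM definition (reads per kilobase of gene per million mapped reads) and the harmonic-sum form of the Benjamini and Yekutieli adjustment; the function names, the pseudocount, and any input values are hypothetical and are not taken from the analysis pipeline used in this study.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log2_ratio(rpkm_mutant, rpkm_wildtype, pseudo=0.01):
    """Log2 ratio of mutant to wildtype expression.

    The pseudocount is an assumption added here to avoid division by zero
    for genes with no mapped reads; it is not described in the text.
    """
    return math.log2((rpkm_mutant + pseudo) / (rpkm_wildtype + pseudo))


def benjamini_yekutieli(p_values):
    """Adjusted p-values from the Benjamini-Yekutieli step-up procedure."""
    m = len(p_values)
    if m == 0:
        return []
    c_m = sum(1.0 / i for i in range(1, m + 1))  # harmonic-sum correction
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted


def is_significant(p_value, fdr, l2r):
    """Thresholds stated in the text: p < 0.05, FDR <= 0.001, |log2 ratio| >= 1."""
    return p_value < 0.05 and fdr <= 0.001 and abs(l2r) >= 1
```

A gene would be reported as differentially expressed only when all three conditions hold, so a two-fold change with an FDR above 0.001 is excluded.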
Quantitative real-time reverse transcription PCR. SYBR Green detection-based quantitative real-time reverse transcription PCR (qRT-PCR) assays were performed to validate the gene expression changes observed in the RNA seq data analysis. The wildtype and ECH_0379, ECH_0490, and ECH_0660 mutant RNAs used in generating the RNA seq data were also used to determine transcript levels by SYBR Green qRT-PCR using a SuperScript® III Platinum SYBR Green One-Step qRT-PCR Kit (Invitrogen, Carlsbad, CA). RNA was reverse transcribed from all the replicates using SuperScript III and then quantitative PCRs were performed in a 25 μL reaction containing 0.5 μM each of forward and reverse primers. Thermal cycler conditions were: 94 °C for 15 sec, 60 °C for 30 sec, and 74 °C for 15 sec for 40 cycles. Thirteen randomly selected differentially transcribed genes were used in validation experiments using a StepOnePlus™ Real-Time PCR instrument (Applied Biosystems, Foster City, CA) and the data were analyzed by StepOne Software v2.3. E. chaffeensis 16S rRNA was quantitated by real-time RT-PCR as described in 27 and used for normalization of RNA concentrations among different RNA batches, prior to performing the validation experiments. For qRT-PCR data, the delta-delta Ct (ΔΔCt) calculation was employed to calculate the relative change in expression; the fold change was obtained by averaging the replicate values of gene expression, and the standard error was determined. Semi-quantitative one-step RT-PCR (Life Technologies, Carlsbad, CA) targeting the E. chaffeensis genes ECH_0490 and ECH_0492 near the transposon mutation downstream to the ECH_0490 gene was performed with 30 cycles of amplification using the gene specific primers as described in a previous study 6 . Briefly, RNA from wildtype and the ECH_0490 mutant was used as the template for RT-PCR. One tube without reverse transcriptase or template RNA was used as the negative control. One tube with DNA as the template was used as the positive control. Thermal cycler conditions were as follows: 50 °C for 1 h for the reverse transcription step, followed by 35 cycles of 94 °C for 30 sec, 55 °C for 30 sec, and 72 °C for 30 sec, and a final 2-min extension step at 72 °C.
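The ΔΔCt calculation mentioned above can be sketched as follows. This is a minimal illustration, assuming the standard 2^(-ΔΔCt) form with 100% amplification efficiency and a reference gene (such as 16S rRNA) measured alongside each target; the function names and any Ct values passed in are hypothetical and do not reproduce the StepOne Software computation.

```python
import math
from statistics import mean, stdev


def fold_change_ddct(ct_target_mutant, ct_ref_mutant,
                     ct_target_wildtype, ct_ref_wildtype):
    """Relative expression of a target gene in a mutant versus wildtype.

    Assumes the 2^(-ddCt) model, i.e. perfect doubling per PCR cycle.
    """
    delta_mutant = ct_target_mutant - ct_ref_mutant
    delta_wildtype = ct_target_wildtype - ct_ref_wildtype
    ddct = delta_mutant - delta_wildtype
    return 2.0 ** (-ddct)


def mean_and_standard_error(fold_changes):
    """Average replicate fold changes and report the standard error of the mean."""
    n = len(fold_changes)
    sem = stdev(fold_changes) / math.sqrt(n) if n > 1 else 0.0
    return mean(fold_changes), sem
```

Under these assumptions, a target that crosses threshold one cycle later in the mutant than in wildtype, with identical reference Ct values, gives a fold change of 0.5, that is, a two-fold down-regulation.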