Novel multiparameter correlates of Coxiella burnetii infection and vaccination identified by longitudinal deep immune profiling

Q-fever is a flu-like illness caused by Coxiella burnetii (Cb), a highly infectious intracellular bacterium. There is an unmet need for a safe and effective vaccine for Q-fever. Correlates of immune protection to Cb infection are limited. We proposed that analysis by longitudinal high dimensional immune (HDI) profiling using mass cytometry combined with other measures of vaccination and protection could be used to identify novel correlates of effective vaccination and control of Cb infection. Using a vaccine-challenge model in HLA-DR transgenic mice, we demonstrated significant alterations in circulating T-cell and innate immune populations that distinguished vaccinated from naïve mice within 10 days, and persisted until at least 35 days post-vaccination. Following challenge, vaccinated mice exhibited reduced bacterial burden and splenomegaly, along with distinct effector T-cell and monocyte profiles. Correlation of HDI data to serological and pathological measurements was performed. Our data indicate a Th1-biased response to Cb, consistent with previous reports, and identify Ly6C, CD73, and T-bet expression in T-cell, NK-cell, and monocytic populations as distinguishing features between vaccinated and naïve mice. This study refines the understanding of the integrated immune response to Cb vaccine and challenge, which can inform the assessment of candidate vaccines for Cb.

Longitudinal immunological assessment of Cb vaccination and challenge. We assessed the longitudinal profile of cellular immune responses to vaccination and challenge in tgHLA-DR3 mice in two independent replicate studies (Fig. 1A). Each study included 16 mice divided into naïve and vaccinated groups (n = 8 per group per study) that were sub-divided into challenge and uninfected groups (n = 4 per group per study, Fig. 1A). One mouse assigned to the naïve-challenge group died on day 35, prior to challenge. On day 42 postvaccination, a subset of naïve and vaccinated mice was challenged i.n. with Cb. Blood was collected on days 10, 24, and 35 post-vaccination and days 2 and 10 post-infection (days 44 and 51 post-vaccination) (Fig. 1B). For each cytometry timepoint, surface antigens were antibody-labelled following collection, and samples fixed to inactivate viable Cb (Supplementary Table 1). Following confirmation of inactivation and release from biocontainment, intracellular epitopes were labeled, and samples analyzed by mass cytometry.
Pre-challenge antibody titers and quantitative measures of infection were also determined. Cb-specific antibodies increased from Days 10 to 35 post-vaccination (Fig. 1C). Following sacrifice on Day 10 post-challenge, measures of bacterial burden, splenomegaly, and histopathology were determined. Challenge with Cb significantly increased splenomegaly (%BW) and bacterial burden in naïve mice as compared to unchallenged mice (either naïve or vaccinated). In vaccinated mice, challenge did not significantly increase splenic measures of Cb burden or %BW, as compared to unchallenged mice ( Fig. 1D-E). Vaccination similarly reduced histology scores for spleen, heart and lung. However, both vaccination and challenge independently resulted in elevated liver histopathology (Fig. 1F). For vaccinated challenged mice, all but one exhibited moderate hepatic histopathology (score = 1). One vaccinated challenged mouse had elevated spleen %BW, heart and spleen histopathology, though no bacterial burden. Among the three vaccinated challenged mice with detectable bacterial burden, only one was found to have both lung and spleen histopathology. Thus, vaccination with inactivated whole-cell Cb elicited protective immunity sufficient to reduce bacterial burden and pathological measures overall, with evidence of vaccine enhancement of splenic clearance by 10 days post-challenge in a subset of mice.

Identification of immune cell populations consistent between replicate experiments. The
high-parameter nature of mass cytometry data can provide multi-marker definitions of immune populations for a detailed description of immune responses. Computational tools can simultaneously assess the abundance of markers on individual cells to identify clusters of highly similar cells. We used a barcoding strategy to antibody Figure 1. Clinical outcomes of Coxevac vaccination and Cb challenge in tgHLA-DR3 mice. (A) Treatment groups and numbers of mice for the tgHLA-DR3 study (B) Experimental schedule. Mice were injected subcutaneously with saline or 10 µg Coxevac on day 0. After 42 days mice were challenged intranasally with live Cb. Whole blood was collected for CyTOF analysis on Day 10, 35, 44 and 51. Sera were collected for serology on Day 10, 24 and 35. (C) Antibody production against Cb was evaluated at Day 10, 24, and 35 post-vaccination by ELISA (D) Spleen-to-body-weight ratio and (E) spleen bacterial burden (genome equivalents (GE) determined by qPCR) were assessed for each of the experimental groups. Significant differences between experimental groups in panels (C-E) were assessed by one-way ANOVA with the Tukey post-hoc multiple comparison correction (****p = 0.0001, ***p < 0.0003, *p < 0.01). (F) Histopathological scores from lung, liver, spleen, and heart. Kruskal-Wallis test was used to compare between groups ( ### p = 0.0003, ## p = 0.005).
label and acquire all samples from each timepoint as one cohesive unit. However, technical variations between timepoints and between replicate experiments resulted in batch effects that precluded simultaneous identification of populations from all timepoints and replicates by clustering analysis.
We developed an approach to combine data from the two replicate experiments and thereby improve statistical power for comparisons among treatment groups ( Supplementary Fig. 2). Briefly, each timepoint from individual replicate experiments was analyzed independently and cell clusters conserved between replicate timepoints were identified (e.g. Day 10 experiment 1 and Day 10 experiment 2). First, cell profiling data from each timepoint was analyzed individually to identify cell clusters computationally using X-shift 32 . Next, the resulting clusters from replicate timepoints were matched on the basis of their phenotypic profiles and candidate pairs refined according to abundance and overall distribution between treatment groups. For the resulting "matched clusters" representing clusters conserved between two replicate timepoints, the data were combined to improve statistical power of comparisons among treatment groups (see "Methods").
temporal changes in abundance of phenotypic immune cell classes. Matched clusters from all timepoints were organized hierarchically to identify phenotypically similar clusters from across the experiment for T-cell, B-cell, and innate populations (Fig. 2). The x-axis dendrogram indicates the influence of each marker in defining cell classes. The antibody panel, which recognized a broad range of immune populations, provided subsets of markers that dominate the heatmap for each of the three major populations. Through hierarchical clustering, we identified "phenotypic classes" of cells which contain matched clusters bearing similar, though not identical, marker profiles. Some classes contain clusters from multiple days or multiple clusters from a single day. Phenotypic classes containing a single matched cluster were included, as they represented features observed in two independent experiments. The analysis of T-cells resulted in 77 matched clusters in 19 phenotypic classes ( Fig. 2A). For B-cells, 92 matched clusters separated into 20 phenotypic classes (Fig. 2B). Within innate immune populations, 60 matched clusters segregated into 14 phenotypic classes (Fig. 2C).
The capacity of each phenotypic class to predict classification of an individual mouse as naïve or vaccinated was determined for pre-challenge timepoints using an elastic net (EN) logistic regression analysis (Fig. 3A,B) 33 . EN analysis was conducted on post-challenge data to identify those classes that distinguish vaccinated challenged and naïve challenged mice (Fig. 3C,D). To evaluate the temporal distribution of key classes, the frequency of each phenotypic class through the time course was determined (Fig. 3E, Supplementary Fig. 4A-C). Following vaccination, on Day 10 T-cell and innate immune classes distinguish naïve and vaccinated groups, with activated T-cells linked to vaccinated mice. After 35 days, predictive B-cell classes appear, with Day 10 T-cell and innate cell trends continuing. Challenge strongly polarized the predictive immune populations within 2 days post-challenge. By Day 51, the most predictive clusters were activated CD4+ T-cells and monocyte/macrophages found in naïve challenged mice. Detailed observations related to phenotypic families, as well as subsequent evaluation of individual clusters and correlation to experimental outcomes, are summarized in Table 1 and described further in the Supplementary Material.

correlation of cell clusters to measures of vaccination and infection. Grouping cell clusters into
phenotypic classes provides information on the distribution of cell populations through time and enables detection of phenotypic cell classes with predictive value. However, this approach does not assess the relationship of individual populations to either experimental group or to measures of vaccination and infection. To determine the link of individual populations to measured physiological data and treatment group, we conducted correlation and EN logistic regression analyses.
First, we determined the correlation of each matched cluster to antibody titer, splenic %BW, bacterial burden, and histopathological scoring. Hierarchically organizing clusters on the basis of correlation values, independent of timepoint or cellular marker profiles, delineated seven modules which were visualized using a t-SNE plot 34 Table 2). The spatial distribution of clusters within the t-SNE is a function of the similarity between cell clusters with respect to their correlations to measures of vaccination and infection ( Fig. 4B-I). The subset of clusters in Module 1 (Supplementary Table 3 For each module, the median correlation of constituent clusters to antibody titer and measures of infection was determined (Fig. 4J, Supplementary Table 3). Module 1 had the strongest association with antibody titer and a negative correlation to all measures of infection. Modules 2 and 3 also correlated to increased antibody titer, though with elevated correlation to splenic measures, Module 2 also showed an increased correlation to liver histopathology, as compared with 1 & 3. Module 4 followed by Module 5 was the most linked to elevated histopathological scores and negatively correlated to antibody titer; their adjacent positioning on the t-SNE is consistent with their similar correlation properties. Overall, Modules 1, 4, and 5 had the strongest correlations to vaccination-induced antibody production or measures of infection. The module assignments for key populations are noted in Table 1. predictive cells clusters to distinguish vaccinated from naïve mice. We next identified individual clusters and combinations of clusters that best differentiated, at each timepoint, between either vaccinated and naïve mice or between challenged naïve and vaccinated mice at each timepoint. Individual clusters were assessed as predictors of group assignment by EN logistic regression. The predictive clusters for each timepoint are graphed to show their relative rank, as well as position on the correlation module t-SNE map ( To investigate the immune response in a more integrated manner, combinations of clusters that most clearly distinguished vaccinated from naïve mice following either vaccination or challenge with Cb were determined. The predictive capacity for every possible combination of up to 10 clusters for each timepoint was determined using four computational selection methods ( Supplementary Fig. 3). The optimal combination of clusters for each timepoint was selected based on the consensus rank among the four measures, with strong agreement across methods regarding accuracy and sensitivity (Table 2). Notably, for each day, the selected combination of clusters was ranked highest by the neural net (NN) method at all timepoints, other than Day 51, when a requirement that one or more clusters linked to vaccinated mice were included meant the combination ranked second by NN was chosen, wherein innate cluster 4 was exchanged for innate cluster 5 (Supplementary Table 3). Consistent with the individual cluster EN analysis, post-vaccination timepoints required more clusters to achieve maximum performance as compared to timepoints post-challenge: six clusters representing all major immune cell populations were required 10 days after vaccination to best distinguish vaccinated and naïve mice. By Day 35 post-vaccination, the combination of a single B-cell population and four innate immune populations, without T-cell populations, was sufficient to discriminate between vaccinated and naïve mice. Following challenge, vaccinated mice could be distinguished from naïve mice with four populations at two days post-challenge, and with two populations on Day 10 after challenge. At both post-challenge timepoints, the indicated T-cell and innate populations ( Table 2) were sufficient to distinguish these groups, without requiring consideration of B-cell populations.
Together, these alterations in distribution of phenotypic classes and their constituent clusters provide immunological signatures that distinguish treatment groups (Table 1). Vaccination increased the abundance of CD8+ Ly6C+ CD73+ T-cells as well as CD4 + Ly6C+ T-cells by Day 10 post-vaccination. By Day 35 postvaccination these CD4+ cells were not apparent, while CD8+ cells were also expressing CD44+, suggesting transition to a central memory phenotype. Both CD4+ and CD8+ T-cell populations expressing CD62L and Ly6C were substantially increased 10 days post-challenge. Post-challenge, these increases were more prominent in naïve mice, indicating an overall greater response from activated naïve T-cells.
These observations indicate that Ly6C+ cells in both naïve and memory peripheral T-cell populations are increased following exposure to Cb, either through vaccination or intranasal challenge. Notably, the majority of differences between treatment groups observed in innate cell populations were higher frequencies of circulating monocyte, macrophage, and NK cell populations in naïve mice compared to vaccinated mice, following either vaccination or challenge. In particular, T-bet+ NK populations, likely representing mature cells, were reduced following vaccination, suggesting their mobilization to sites of inoculation. Overall, the number, distribution, and respective phenotypes of clusters that best distinguish groups of mice at each timepoint ( Table 2) identify hallmarks of the immunological processes induced by vaccination and the differential response to challenge in naïve and vaccinated mice.
We sought to determine if there were immunologic differences between the vaccinated mice that had no detectable bacterial burden by Day 10 post-challenge and the three vaccinated mice that retained detectable Cb (Fig. 1E). For each timepoint, combinations of up to three clusters were ranked by linear discriminant analysis (LDA) for ability to distinguish between naïve challenged mice and the Cb-negative vaccinated challenged mice. Next, to identify cell populations that distinguish Cb-positive vaccinated mice, the distributions of cluster combinations were compared between the three challenge groups: naïve, Cb-negative, and Cb-positive. On Day 10 post-vaccination three T-cell clusters distinguished challenged naïve and vaccinated mice and also showed a distribution in the three Cb-positive mice that trended towards naïve challenged mice (Fig. 6A). These three clusters correspond to naïve Ly6C+ CD4+ and CD8+ populations increased in vaccinated mice and a naïve Ly6C-CD8+ population enriched in naïve mice. Comparison of the three groups of mice on Day 35 identified a combination of mature NK cells and a monocyte population associated with naïve mice and an activated memory B-cell population in vaccinated mice (Fig. 6B). At two days post-challenge, a combination of CD4+ Ly6C+ Tbet+ T-cells and CD68+ mononuclear cells, both expressing TGFβ in vaccinated mice, along with a macrophage population in naïve challenged mice, was selected by the analysis (Fig. 6C). At Day 10 post-challenge distinctions between the three groups were less robust. Nonetheless, a combination of central memory CD8+, effector memory CD4+ T-cells, and a mature NK population all enriched in vaccinated mice was identified (Fig. 6D). Despite the small group size, these data suggest signatures associated with vaccine efficacy that can be further investigated in studies that include larger groups of mice or for evaluation of candidate Cb vaccines. Figure 3. Distribution of phenotypic classes across vaccination and challenge. For each timepoint, an EN regression analysis was used to determine the probability of assignment to either experimental group, using the median frequency of each phenotypic class per mouse on that day. The graphs identify cell classes with highest predictive value for distinguishing vaccinated vs naïve mice pre-challenge (A,B) or vaccinated challenged v naïve challenged mice (C,D) based on the magnitude of their logit coefficient, which is a measure of the influence of change in the abundance of the cluster between experimental groups on the predictive value of the cluster. Phenotypic class names are as assigned in Fig. 2 . 3B), and the frequency of cells expressing Ly6C, T-bet, or CD73 was determined. In this analysis, we observed increased expression of each of these individual markers following vaccination and challenge (Fig. 7), consistent with the changes in cellular phenotypes identified computationally (Table 1)  www.nature.com/scientificreports/ Thus, manual gating analysis confirms the key features and markers identified based on computational clustering of immune populations. A Circos plot was generated to visually summarize key data from all aspects of the experiment, while retaining information for individual mice (Fig. 8) 35 . The distinct profiles of naïve and vaccinated groups are readily distinguished, as is the impact of challenge to further separate treatment groups. Features of naïve challenged mice are particularly evident at Day 51, as are features of vaccinated challenged mice, such as the abundance of mature NK cells that are shared with the unchallenged naïve and vaccine groups. Other features of vaccination persist from 10 days post-vaccination until 2 days post-challenge, demonstrating the long-lasting impact of vaccination, even in the face of challenge with Cb. Pre-challenge naïve and vaccinated mice are largely distinguished by differential abundance monocyte/macrophage populations, mature NK cells, and activated Th1 CD4 and CD8 T-cell populations. Following challenge, vaccinated and naïve mice exhibit distinct distributions of activated naïve CD4 T-cells, Th1 polarized.
CD4 effector memory T-cells, and activated central memory CD8 cells, as well as subsets of monocytes and activated macrophages. Interestingly, the Circos plot also illustrates a naïve challenged mouse (#5) with reduced bacterial burden and splenomegaly with an immune profile similar to vaccinated challenged mice on Day 51.

Discussion
We report a longitudinal study of Cb vaccination and challenge in a transgenic humanized murine model and describe for the first time the dynamics of all major immune cell populations in the response to Cb. Detailed profiling of circulating immune cells by mass cytometry, together with discrete measures of vaccination and infection with Cb, were used to identify novel hallmarks of the immune response to Cb. We developed an analysis workflow to meet the challenge of batch effects and interrogate data from two replicate studies. Subsequent statistical analysis delineated immune populations, both individually and in combination, that distinguish naïve from vaccinated mice through the course of the study. Further analysis determined the correlation of immune populations to measures of vaccination and challenge. Together, these data reinforce the interconnected dynamics of the immune system, highlight cellular populations and markers that are important drivers of the immune response to Cb, and will inform evaluation of future candidate vaccines.
Mass cytometry enabled detailed analysis of T-cell, B-cell, and innate (NK cell, granulocyte, and monocyte/ macrophage) populations in small blood volumes (< 200 μL) from mice under BSL3 containment, and metalconjugated antibodies were amenable to the extensive fixation necessary to inactivate Cb. Technical considerations of working with small blood volumes under BSL3 containment required that samples were processed and antibody-labeled upon collection, rather than banked for simultaneous analysis of all study samples. Consequently, batch effects prevented simultaneous computational analysis of data across timepoints and replicate experiments, thereby limiting statistical power. Several batch correction algorithms have been described for CyTOF data; however, these tools were not sufficient to enable simultaneous clustering-based analysis of the data from this study, or they require additional controls not included in our study [36][37][38] . To analyze the multiple timepoints from two independent replicate experiments, we developed a novel workflow that leverages established clustering and statistical analysis methods 32,33,39,40 . The strategy computationally defined cell clusters from each timepoint separately for each replicate experiment and then matched clusters between replicate experiments by phenotype. The resulting matched clusters retain source identity from each mouse as well as information for each individual cell. The matched clusters from all timepoints were organized hierarchically, on the basis of their expression profiles, to identify phenotypic classes and provide longitudinal context. Subsequent statistical analysis enabled identification of populations associated with vaccination status as well as correlation to serological data and measures of infection.
Our results suggest a potential role for Ly6C+ naïve T-cells soon after initial exposure to Cb, through either vaccination or challenge. We observed multiple post-exposure CD4 and CD8 T-cell populations distinguished by increased expression of Ly6C, a GPI-anchored glycoprotein expressed by hematopoietic derived cells 41,42 . Typically, Ly6C is used to distinguish subsets of murine myeloid populations. In peripheral CD4+ and CD8+ T-cells, Ly6C expression is indicative of engagement with cognate antigen and enhanced proliferation and effector function [43][44][45][46][47] . Ly6C is associated with T-cell activation in the context of intracellular pathogens, most often viruses, and adoption of a Th1 phenotype by CD4+ T-cells 43,44,48 . CD4+ Ly6C+ T-cells are reported to produce elevated levels of cytokines, including IL-27 and IFNγ. In CD8+ T-cells, Ly6C+ cells may identify a subset of memory cells and its expression may facilitate homing to lymphoid tissues 49,50 . While Ly6C lacks a sequencebased homologue in human immune cells, studies of human and murine monocytes suggest that CD14 may be a functional homologue 51

, although expression of CD14 by T lymphocytes remains an open question.
Somewhat surprisingly, neither CD25 nor CD69, two additional markers of T-cell activation, were identified by our analysis as features of T-cell populations that distinguish the treatment groups. This raises the question regarding the role of Ly6C in the context of Cb infection and the mechanism(s) by which it impacts T-cell function, as well as for other markers of T-cell functional status in the context of Cb infection (e.g. KLRG1, CXCR3).
One of the most notable features of the overall response was the expression pattern of T-bet, a transcription factor canonically associated with Th1 type immune responses. Computational analysis revealed that distinct populations of T-bet+ CD4, CD8, and innate cells distinguish challenge responses in vaccinated and naïve mice, suggesting a multi-faceted role for T-bet in response to Cb. Indeed, a report published as this manuscript was in preparation demonstrated that T-bet deficient mice exhibited significant differences from CD4-deficient mice in a vaccine and challenge study, indicating a role for T-bet beyond Th-1 responses 29 . T-bet influences both innate and adaptive immunity by governing the maintenance of maturation in NK cells, enhancing T-cell function and IFNγ production in response to viruses and protozoa, and aiding in B-cell activation to promote viral clearance [52][53][54][55] . Expression of T-bet in CD8+ T-cells induces IFNγ production, which can in turn lead to increased www.nature.com/scientificreports/ T-bet expression in B-cells and promote antibody class switching 53,55,56 . Indeed, we observed T-bet expression not only in T-cells, but also B-cell and innate populations, including NK cells. Post-challenge, increased expression of T-bet in B-cells, which likely promotes class switching to IgG2a, was detected in vaccinated tgHLA-DR3 mice 57 . This pattern of change is generally consistent with a Th1 response to an intracellular pathogen. The broad role we describe here for T-bet in the in vivo response against Cb is in contrast to previous reports indicating that TBX21, the gene for T-bet, is downregulated by Cb during in vitro granuloma formation 58 . These divergent observations likely result from differences in human PBMCs stimulated and cultured in vitro 58, as compared to circulating immune cells obtained from mice following vaccination and challenge. CD8 and NK cell populations expressing CD73 were enriched in vaccinated mice following both vaccination and challenge. CD73 and CD39 are ectonucleotidases that convert extracellular ATP to adenosine, a suppressor of inflammatory function [59][60][61][62][63] . In T-cells, CD39 and CD73 promote adhesion and co-stimulation while curtailing effector function, and naïve CD8+ T-cells downregulate CD73 expression following activation 64,65 . In the context of Cb, we observed expression of CD73 on naïve CD8 T-cells in vaccinated mice on Day 10. However, on both Day 10 and Day 35 post-vaccination, CD73+ CD44+ CD62L+ CD8 T-cells (central memory) are enriched in vaccinated mice, which may reflect previously naïve CD8 cells that are in transition towards memory and helping to suppress inflammation following vaccination.
There are additional considerations for interpretation of data generated in this study. The small blood volumes available in a longitudinal study and BSL3 containment prevented inclusion of ex vivo stimulation and allowed only for an abbreviated incubation with a Golgi secretion inhibitor. This likely limited the detection of cytokines, including IFNγ, an established integral facet of the response to Cb 18,26,66 . The tgHLA-DR3 mice in this study are on the BL/6 background, which exhibit less severe infection and mortality than some other strains 67 . In this regard, BL/6 background mice may more closely approximate the acute non-lethal disease observed in humans.
This study identified multi-faceted roles for multiple population markers and delineated minimal sets of cell clusters to distinguish treatment groups as well as vaccinated mice that controlled Cb efficiently versus those that did not. In particular, the observed importance of T-bet provides evidence supporting a Th1-biased immune response to Cb. Similarly, the capacity of T-cell subsets to delineate naïve and vaccinated mice after 10 days suggests a robust early T-cell response to vaccination. The profile of the immune alterations following Coxevac vaccination and subsequent challenge detail the response to an effective, albeit potentially problematic, vaccine. Future candidate vaccines that aim to provide effective protection while minimizing reactogenicity will likely need to recapitulate canonical features of effective vaccination (IFNγ and antibody production) along with facets of the alterations in immune cell populations including participation of monocytes, depletion of circulating mature NK cells, and the involvement of both naïve and memory T-cell populations.

Materials and methods
Mice. Female wild-type BL/6 mice, NCI A/JCr (AJ) and Balb/c mice (6-8 weeks old) to assess inactivation of Cb were obtained from Jackson Labs (Bar Harbor, ME, US). Female mice transgenic for HLA-DR3 (5-11 weeks old) were obtained under commercial license 68 . Mice were maintained under BSL3 conditions in microisolator cages (Smart Flow, Tecniplast, Westchester, PA, USA) at the Regional Biocontainment Laboratory, Colorado State University, Fort Collins, CO, USA. Animals were provided water and rodent chow ad libitum and evaluated daily to detect changes in body weight, body condition, behavior, and activity level. All studies were performed in accordance with all institutional guidelines and regulations under a protocol approved by the Colorado State University Institutional Animal Care Committee 844) and following approval by the Animal Care and Use Review Office (ACURO).

inactivation of Cb in infected samples.
To allow for analysis outside the BSL3 facility, a sample inactivation procedure was established to ensure that fixed blood samples from infected mice were safe for transport and for subsequent immune marker assessment by CyTOF. Fixation in 4% paraformaldehyde for 1 h was determined sufficient to eliminate infectivity in spleen cell suspensions used to inoculate A/J mice, which are highly permissive for Cb infection 67 . For each timepoint, barcoded and labeled leukocyte samples from infected mice were incubated in 4% formaldehyde PBS solution for two hours to inactivate any infectious Cb and stored frozen. Due to limited blood volumes, to confirm Cb inactivation for each timepoint, a representative splenic sample from an infected mouse that had been previously aliquoted and stored following disassociation by serial passage through 70 mm and 40 mm filters was inactivated in parallel with the blood and stored in the same manner. The parallel inactivation at each timepoint of the banked Cb infected spleen sample provided a common control across experiments to confirm Cb inactivation and enable blood sample release from the BSL3 facility. To confirm inactivation, the representative splenic sample was injected i.p. into naïve susceptible NCI A/JCr mice and on Day 10 post-inoculation with inactivated materials the spleens were recovered from the A/JCr mice. Inactivation was confirmed by assessing splenomegaly and bacterial burden as detected by qPCR.  Serology. Plasma was processed for serology as described previously 71 . Whole blood was collected as described above. A 10 μL aliquot of plasma was recovered after centrifugation (300 g for 3 min) and stored at -80C. Anti-Cb antibody levels in the plasma were determined by ELISA (Q Fever Ab Test, IDEXX Laboratories, Westbrook, ME, USA). The secondary antibody was replaced with peroxidase conjugated protein A/G (Pierce Biotechnology, Rockford, IL, USA) or goat anti-mouse IgG (H + L) (Jackson ImmunoResearch Laboratories, Rockford, IL, USA). ELISA measurements were reported as absorbance values measured at 450 nm (OD 450 ) and baseline corrected, using plasma from naïve animals.

Mass cytometry.
Whole blood collected at each timepoint was depleted of erythrocytes using the eBioscience 10X multispecies RBC lysis buffer. Following lysis, samples were barcoded, surface labelled with antibody, and fixed prior to storage at − 80 °C until release following Cb inactivation confirmation. Subsequently samples were thawed, labelled intracellularly, and analyzed by mass cytometry (see Supplementary Material).
Data analysis. Following acquisition of mass cytometry data, the resulting files were processed in accordance with previously reported methods [72][73][74] . For each sample, data were normalized using bead standards to account for machine variance, using proprietary software from Fluidigm. Viable single cells, T-cells, B-cells, and innate cells were identified by manual gating with FlowJo 10 software (Supplementary Fig. 3a). Values from mass cytometer channels containing non-biological data were removed and the resulting FCS files used for downstream analysis. Statistical tests were performed as indicated in the figure legends using GraphPad Prism V8, unless otherwise noted. Additional details provided in supplementary methods.
clustering analysis and cluster matching. First, the data from each replicate experiment were considered independently. For each timepoint the data were manually divided into T-cell, B-cell, and innate cell populations ( Supplementary Fig. 3). Then resulting populations from each timepoint were clustered independently to identify the distinct subpopulations of cells present. Cell clusters were defined based on the abundance of the markers detected on each cell, grouping together cells with highly similar profiles using the Vortex implementation of the X-shift algorithm, a k-nearest neighbors method 32 , according to the labels as indicated in Fig. 2 and in Supplementary Information. Cluster analysis used between 3.4e5 and 1.2e6 cells per timepoint from each experiment. Subsequently, the differential abundance of each cluster between treatment groups was determined for individual timepoints by pairwise t tests and visualized using SPADEVizR 39 ( Supplementary Fig. 6).
To identify phenotypically similar clusters conserved between two cognate replicate timepoints, cluster profiles from the replicate experiments were compared using Pearson's correlation. The list of putative pairings was ranked according to correlation value. Clusters with a frequency of less than 0.02% of input cells were removed from the analysis and a lower limit of r = 0.6 was used to limit the number of candidate pairings. For each cluster, starting with the highest correlation value, candidate pairings were manually evaluated with the basis of phenotype (cluster profile), overall distribution between treatment groups, and cluster size. In some instances, candidate pairings with good phenotypic correlation and similar abundance did not have similar distributions between experimental groups; however, this did not preclude acceptance as a matched pair. As a result of this analytical approach, clusters with no corresponding match in the replicate experiment or with a frequency of less than 0.02% of total input were excluded from further analysis.
Heatmaps, logistic regressions, correlation matrix, discriminant analysis, and t-SNE. Details related to the production of heatmaps, Elastic Net logistic regression, correlation matrix calculation, t-SNE map generation, and linear discriminant analysis are provided in Supplementary Material. Briefly, heatmaps and the correlation matrix were both created used the R packages Pheatmap, using the Ward.D2 method for hierarchical clustering, Pvclust for bootstrap analysis, and rcorr function for Spearmann correlation analysis 75,76 . NbClust was used to determine the appropriate segmentation parameters 77 . Elastic Net analysis of phenotypic classes and individual clusters, as well as multi-mode classification to identify the minimal set of highly predictive clusters and linear discriminant comparison of challenged mice, was performed using the R package caret 78 . Detailed methods provided in Supplementary Material. Figure 6. Identification of features characterizing vaccine efficacy following challenge. Linear discriminant analysis (LDA) models were constructed for each timepoint to rank combinations of up to 3 clusters based on the ability to distinguish naïve challenged mice (N = 7) from Cb-neg vaccinated challenged mice (N = 5) and subsequently assess the distribution of Cb-pos mice (N = 3). The discriminant functions with high predictive accuracy were used to project the three groups onto corresponding feature space and identify combinations of cell populations where the Cb-pos mice trend towards the naïve vaccinated mice. LDA plots depict the distribution of each group, where the center of the x-axis denotes the point where the naïve challenged and Cb-neg vaccinated challenged groups are maximally separated and the y-axis reflects the count (or density estimate). Adjacent to each LDA plot is a scatter plot for the respective constituents. The title of each plot indicates the cluster identity and the frequency parent on the y-axis. www.nature.com/scientificreports/ circos plot. The details of how the Circos plot was constructed is detailed in the Supplementary data section 35 . The Circos plot was constructed with the R package Circlize 79 and was structured to include 31 sections, representing each mouse included throughout the duration of the study: 8 sections each for Coxevac unchallenged, Coxevac challenged and Naive unchallenged, and 7 sections for Naive challenged. For each timepoint, data from the two clusters with the highest positive or negative correlation coefficient to vaccinated or vaccinated challenged mice were used. The heatmap mode was utilized to represent the correlation coefficient with the value for each cluster rescaled to 0-1. Similarly, antibody titer, histopathology score, spleen %bw, and bacterial burden were integrated into the Circos plot. Scatterplot mode was used to show the value of antibody and body weight, while the histogram mode was used to indicate the pathology. A gradient color scheme was applied to reflect the value of bacteria burden.

Data availability
Raw data, processed data, and source code for reproduction of the results are publicly available via https ://flowr eposi tory.org/id/FR-FCM-Z2LH and https ://githu b.com/reeve s-lab.