Metabolomic Profiling of Mice Serum during Toxoplasmosis Progression Using Liquid Chromatography-Mass Spectrometry

Better understanding of the molecular changes associated with disease is essential for identifying new routes to improved therapeutics and diagnostic tests. The aim of this study was to investigate the dynamic changes in the metabolic profile of mouse sera during T. gondii infection. We carried out untargeted metabolomic analysis of sera collected from female BALB/c mice experimentally infected with the T. gondii Pru strain (Genotype II). Serum samples were collected at 7, 14 and 21 day post infection (DPI) from infected and control mice and were subjected to liquid chromatography-quadrupole time-of-flight mass spectrometry (LC-Q-TOF-MS)-based global metabolomics analysis. Multivariate statistical analysis identified 79 differentially expressed metabolites in ESI+ mode and 74 in ESI− mode in sera of T. gondii-infected mice compared to the control mice. Further principal component analysis (PCA) and partial least squares-discrimination analysis (PLS-DA) identified 19 dysregulated metabolites (5 in ESI+ mode and 14 in ESI− mode) related to the metabolism of amino acids and energy metabolism. The potential utility of these metabolites as diagnostic biomarkers was validated through receiver operating characteristic (ROC) curve analysis. These findings provide putative metabolite biomarkers for future study and allow for hypothesis generation about the pathophysiology of toxoplasmosis.


Results
Characteristics of infected mice. Control mice did not exhibit any clinical signs during the entire course of infection. However, infected mice showed time-dependent increase in the progression of illness. Infection has been assessed via the detection of T. gondii B1 gene in various mouse tissues and all infected mice were tested positive (data not shown). At 7 DPI mice showed mild signs, which progressed to typical clinical signs of acute infection at 10 DPI. By 14 DPI mice began to recover and at 21 DPI all mice restored their normal physical state. Using H&E staining, spleen and brain samples were examined for histopathological damages caused by T. gondii. As expected, all control mice tissues did not show any obvious pathological changes. In contrast, infected mice showed splenomegaly at 14 DPI (Fig. S1). Also, a reduced white pulp with an expansion of the red pulp compartment was observed at 14 DPI (Fig. 1A). Brain of infected mice had inflammatory reactions characterized by focal mononuclear cell infiltrates and glial nodules at 14 and 21 DPI. A number of multifocally distributed tissue cysts filled with bradyzoites were detected at 21 DPI (Fig. 1C).
Comparative metabolite profiling of serum from infected mice and controls. We next analyzed all total ion chromatograms of mice sera. We noted a stable retention time without peaks' drifts ( Fig. S2). Representative TIC chromatograms of serum samples from infected mice are shown (Fig. 2). Baseline correction, peak deconvolution, alignment, normalization and median-centering are applied to process the data, and there are 509 and 532 retention time-exact mass pairs determined in each sample profile analyzed in positive or negative ion mode, respectively. The majority of peaks in the chromatograms are identified as endogenous metabolites and most of these metabolites are involved in multiple biochemical processes, especially in energy and lipid metabolism. The LC-MS system stability and reproducibility for the large-scale sample analysis were observed in the PCA scores plot representation of QC samples by using the metabolite profiles obtained in positive and negative ion modes, as shown in Fig. S3. Six QC samples were run for serum samples throughout the entire analysis. There was no drift in the PCA scores plot representation of QC samples, and 87.2% and 93.2% of RSDs were less than 30% for serum samples in the positive mode and negative mode, respectively. Therefore, the LC-MS system repeatability and stability were deemed acceptable.
To investigate the metabolic variations during the disease progression, all observations acquired in both ion modes were first analyzed using two components PCA score trajectories plot (Fig. S4). The score plot did not illustrate good classification of the infected mice groups and their corresponding control groups. Subsequently, the cross-validated two-component PLS-DA models were used for further multivariate analysis, and showed a satisfactory modeling. To reduce the system error and interference of impurity peak, the OPLS-DA model of SIMCA-P were then established for the following analysis.
Temporal variation in metabolic profiling. To better define the metabolic variations in mice serum after T. gondii infection, all observations acquired in both ion modes were analyzed using OPLS-DA. Generally, the model is believed to be reliable when the Q 2 > 0.4. As shown in Fig. 3 the scores plots of OPLS-DA model discriminated the infected groups from their corresponding control groups, which exhibited satisfactory classification. The differential metabolites at different times post infection were selected according to the VIP threshold (VIP > 1) in the OPLS-DA model and the p-value of student's t-test (p < 0.05) after FDR correction. As shown in Fig. 4A-C the serum metabolic states of the infected mice deviated significantly from their corresponding controls in the ESI+ mode. The serum metabolic profiling in the infected mice changed most significantly from their controls at 7 DPI. As the infection time increases the number of differential metabolites decreased, indicating a restoration of the dysregulated metabolic state in a time-dependent manner. Meanwhile, metabolic profile observed in ESI-mode also exhibited the same trend (Fig. S5). Of the 79 differential metabolites detected in ESI+ mode, 43 were shared between at least two of the three infection stages (Fig. 4D). Also, we detected 74 significant differential metabolites in ESI-mode, of which 43 were shared between at least two of three infection stages (Fig. 4E). As shown in Table 1, we detected 15 metabolites under ESI+ and 7 metabolites under ESI-that varied at all three time points. The related pathways of each metabolite were also listed by searching the KEGG pathway database (http://www.genome.jp/kegg/), and the majority of these 22 metabolites were involved in lipid metabolism and amino acid metabolism. infection on host serum metabolism, we examined the changes in the chemical composition of serum extracts during different stages of the infection as defined by histopathology. As shown in Fig. 5 score plot from supervised OPLS-DA showed good discrimination between different infected animal groups in both positive and negative ion modes, which indicated that their metabolic profiles were distinct. Those significantly changed metabolites were then used to construct heat maps for unsupervised clustering. As shown in Fig. 6, heat maps of the differential metabolites detected in ESI+ mode showed a clear clustering for each group in agreement with the OPLS-DA results. Mice at 14 and 21 DPI were recovering from the acute infection, and exhibited the minimal difference in the number of differential metabolites (Fig. 6C). At 7 DPI mice would be soon developing an acute infection stage, exhibited the larger number of differential metabolites when compared with other two infected groups (Fig. 6A,B). We also constructed heat maps using differential metabolites detected in ESI-mode. Even though sample clusters overlapped slightly, most samples were clearly grouped into two differentiated clusters and subsequent statistic analysis revealed dramatic time-dependent changes in the number of differential metabolites (Fig. S6).
Identification of potential biomarkers. The potential biomarker metabolites that significantly contributed to the discrimination of infected mice from controls were identified by using the receiver operating characteristic (ROC) curve analysis in MetaboAnalyst 3.0 21 . Firstly, univariate ROC curve analyses were applied to quantify the predictive performance of each potential biomarker. The sensitivity and specificity trade-offs were calculated for each selected metabolite using the area under the ROC curve (AUC). This analysis identified 5 significantly differential metabolites in ESI+ mode and 14 significantly differential metabolites in ESI-mode with AUC > 0.8, which were selected after standardization of data as shown in Table 2. 3D PCA models and heat maps were constructed using the marker metabolite intensities as variables. As shown in Fig. 7 and Fig. S7, the infected mice group and the control group were separated into two different regions.
To evaluate the utility of metabolite combination for the diagnosis of T.gondii infection, multivariate exploratory ROC analysis was performed. ROC curves are generated by Monte-Carlo cross validation (MCCV) using balanced subsampling. Different biomarker modules were created through the built-in feature selections. As shown in Fig. 8, the AUC values obtained in this analysis were 0.996 (95% C.I. 0.951-1) and 1(95% C.I. 1-1) for all the 5 marker metabolites in ESI+ mode and 14 marker metabolites in ESI-mode, respectively (an ideal model would have an AUC of 1). Together, these results indicate that the indentified potential biomarkers, alone or in combination, can discriminate the infected samples from the controls with high accuracy and could be powerful indicators for Toxoplasmosis monitoring.

Discussion
Host-pathogen interactions have been studied using several "omics" techniques, such as genomics, transcriptomics, and proteomics, and more recently metabolomics, a newly established omics' methodology based on the analysis of small metabolites to study chemical changes in biological properties 22 . The use of these omics methodologies has revealed important information about many biological systems and has greatly contributed to the development of systems biology. In the present study we performed a LC-Q/TOF-MS-based untargeted metabolomic study to investigate the global effect of T. gondii infection on the serum metabolism of BALB/c mice. Our results show that T. gondii infection causes significant dynamic changes in the metabolite balance of the serum of T. gondii-infected mice and identified novel biomarkers with potential diagnostic value.
In the OPLS-DA score plot, each data point represents one mouse serum sample, and the distance between points in the plot indicates the similarity between samples. Fifteen metabolites detected in ESI+ mode and 7 metabolites in ESI-mode were found to vary during the 21 day of the infection period (Table 1). They represent perturbations in multiple pathways, such as pyrimidine metabolism, lipid metabolism and tryptophan metabolism 23 . Among these metabolites, kynurenine, phosphatidylethanolamine (PE) (18:0)/phosphatidylcholine (PC) (15:0) were increased, whereas, the level of choline, glycerophosphocholine, serotonin, cytidine, N-Acetyl-DL-tryptophan, arachidonic acid and eicosatrienoic acid were decreased. These infection-specific metabolic alterations are consistent with the current understanding of host-T. gondii interaction biology. Metabolic function is known to influence the activity of the immune system 24 . In this regard, T. gondii infection induces major immunological changes, that in some cases are linked to the production of amino acid metabolites, such as kynurenine (consistent with our finding), a product of indoleamine 2,3 dioxygenase (IDO) enzymatic degradation of tryptophan; a key antiparasitic mechanism to control parasite growth 25 . The induction of IDO by interferon-γ (IFN-γ ) and some pro-inflammatory cytokines is associated with depleted plasma tryptophan (in agreement with our finding), which may interfere with brain 5-HT synthesis, and increased production of anxiogenic and depressogenic tryptophan catabolites 26 . Evidence showed that kynurenine might be involved in depression and cognitive deficits 20,27,28 , which explains the depressive and anxiety symptoms exhibited by the infected mice. Serotonin is another metabolite in tryptophan metabolism and its depletion is also associated with depression, and anxiety 29 . The exact mechanism of this behavioral manipulation is unknown, but parasites cysts to disturb dopamine metabolism has been suggested as one of the most appealing causes 30 . A previous study showed that infection of mice by T. gondii caused a 14% increase in whole-brain dopamine levels upon establishment of chronic infection 31 . In agreement with this finding our study showed a increased level of serum DA in the infected mice compared to the controls only at 21 DPI (Fig. 4C), around the beginning of the chronic infection 32 . This time-dependent differential expression of DA may explain the link between abnormal behavioral alterations and T. gondii infection.
A clear metabolites' separation has been observed not only between infected mice and the corresponding controls, but also between different infected mouse groups throughout the course of the infection (Fig. 6). The alterations in the serum metabolic profile of T. gondii-infected mice paralleled the clinical course of the disease, where the significant changes in metabolites were detected in infected mice at 7 DPI, followed by restoration of the metabolic profile to a normal level as infection proceeds to the chronic stage, indicating a re-balance of the disturbed metabolic state coinciding with regaining apparently normal physical status.
Previous applications of LC-MS based unbiased metabolomic technologies has enabled researchers to identify biomarkers for various diseases, such as cancer 33,34 and other types of disease 35,36 , and to investigate host-parasite interaction and metabolic alterations in trypanosomiasis, schistosomiasis and malaria in a high-throughput manner [37][38][39] . In agreement with previous studies, our results demonstrated the power of the untargeted LC-MS-based metabolomic's approach in profiling the metabolic changes in serum from T. gondii-infected mice. We identified 19 infection-specific metabolites, 5 in ESI+ mode and 14 in ESI-mode, in the serum samples. PCA models and heat maps constructed based on those selected metabolites showed reasonable separations between infected and control groups. ROC analysis supported the robustness of the PCA models by cross validation, in which all ROC value were high. Among these compounds, the 5 detected in ESI+ mode, including kynurenine, serotonin, glycerophosphocholine and choline and N-Acetyl-DL-tryptophan, showed higher sensitivity and specificity (Fig. 8). These compounds play critical roles in mediating host-pathogen interaction as described above. Most of the metabolites identified in ESI-mode decreased in the infected group, including oleic acid, eicosapentaenoic acid, arachidonic acid (the main precursor of eicosanoid hormones), HOME, N-Acetyl-DL-tryptophan, fumaric acid, citric acid, glutathione, malic acid and cytidine, probably due to the increased metabolomic demands during infection. Overall, these findings provide insight into the pathophysiology of T. gondii infection and have the potential to improve toxoplasmosis diagnosis and management. Mice. Female BALB/c mice were purchased from Lanzhou University, China. All mice were bred and maintained under specific-pathogen-free (SPF) conditions in the experimental facility at Lanzhou University, where they received sterilized food and water ad libitum. All mice were kept in a temperature-controlled room (22 ± 0.5 °C) on reverse 12/12 h light/dark cycle. Mice were acclimated for about seven days prior to the start of the experiment and were weighed (weight > 20 g) to establish a baseline for detecting any reduction in body weight caused by infection.    Confirmation of infection. Following anesthetizing animals, 6 mice from infected or control groups were sacrificed at 7, 14 and 21 DPI. Mice tissues, such as brain, blood, liver, spleen, lung, small intestine and kidney were collected and examined for the presence of T. gondii. Briefly, genomic DNA was extracted from these tissues using TIANamp Genomic DNA kit (TianGen ™ , Beijing, China) according to the manufacturer's instructions.
Then, a PCR targeting the T. gondii B1 gene was performed to detect infection using the specific primers (5′ -AAC GGG CGA GTA GCA CCT GAG GAG-3′ and 5′ -TGG GTC TAC GTC GAT GGC ATG ACA AC-3′ ) 40 . Positive control (DNA from parasites) and negative control (PBS) samples were included in each test. Infection was also confirmed by histopathological examination of brain and spleen collected from the mice after euthanasia. Tissue samples were fixed in 10% neutral buffered formalin solution for 2 weeks and then were dehydrated by gradually soaking in alcohol and xylene and embedded in paraffin. The paraffin-embedded specimens were cut into 5-μ m sections, stained with hematoxylin-eosin (H&E), and examined under a digital optical microscope (Olympus, Tokyo, Japan).
Non-targeted metabolic profiling and spectral acquisition. The blood samples from different infection groups (7d, n = 6; 14d, n = 6; 21d, n = 6) and their corresponding control groups (n = 6) were prepared as described previously 41 . Blood was collected by retro-orbital bleed into Eppendorf tubes directly and allowed to clot for 30 min followed by centrifugation at 3000 g for 10 minutes at 4 °C to collect the serum fraction.  Supernatants were frozen immediately at − 80 °C and shipped on dry ice. A total of 100 μ l serum sample was thawed on ice at 4 °C and then precipitated by a solvent of acetonitrile-water (7:3), treated with ultrasonic wave for 15 min. The mixture was centrifuged at 12,000 rpm for 15 min at 4 °C and then the supernatant was transferred to 1.5 ml polypropylene tube. Metabolomic analysis was performed using an Agilent 1290 Infinity LC System coupled to an Agilent 6530 Q-TOF/MS (Agilent Corp.) (LC-Q/TOF-MS) equipped with an electrospray ionization (ESI) source. The separation of all samples was performed on a C18 column (Agilent Technologies, Santa, Clara, CA) (dimension 100 × 2.1 mm, 1.8 μ m particle size, 100 Å pore size) with the column temperature maintained at 40 °C. The LC-MS system was run in a binary gradient solvent mode consisting of 0.1% (v/v) formic acid in water (solvent A) and 0.1% formic acid in acetonitrile (solvent B) and the flow rate was 0.4 ml/min. The linear gradient is described in Table 3.
The sample injection volume was 4 μ l. Sample analysis was carried out in both positive and negative ion modes. The mass scanning range was 50− 1000 m/z. Nitrogen was used as the dry gas and cone gas with parameters described in Table 4. The mass spectrometry was operated in V optics mode and the data acquisition rate was set to 0.03 s, with interscan delay of 0.02 s. To ensure accuracy and reproducibility, LockSpray was used in all analyses and Leucine-enkephalin was selected as the lockmass for positive ([MH] + = 556.2771) and negative ([MH] − = 554.2615) ion modes.
We used Quality control (QC) samples to assess the reproducibility and reliability of the LC-MS system. QC samples were prepared by mixing equal volumes (10 μ l) from each serum sample as they were aliquoted for analysis. This "pooled" sample was used to provide a "mean" profile representing all the analytes encountered during the analysis 42 . The pooled "QC" sample was injected five times at the beginning of the run to ensure system equilibrium and then every 6 samples to further monitor the stability of the analysis 43 .
Data processing and statistics. Raw data were acquired by using LC-MS-Q-TOF MassHunter Workstation QTOF Acquisition software (B.03.01) (Agilent Technologies, Santa Clara, CA, USA) in an untargeted mode. LC-MS raw data file was converted into common format, and then directly processed by the XCMS toolbox (http://metlin.scripps.edu/xcms/) 44 . Peak picking, peak grouping and retention time correction were all involved in data pre-treatment, which was done through the XCMS software that was implemented with the freely available R programming language (v 2.13.1). Retention time (RT)-m/z data pairs were used to identify ion intensities of those detected peaks. The consistent variables (ion intensity information) were obtained by filtering peaks with 80% missing values and those with isotope ions from each group.
In order to remove the offsets and adjust the importance of high and low abundance metabolites to an equal level, the data was pre-processed by both mean-centering and variance-scaling prior to multivariate analysis. Then, the resulting three-dimensional matrix, which includes assigned peak indices (retention time-m/z pairs), sample names and variables, was further subjected to multivariate data analysis.
The resulting scaled datasets were imported into SIMCA-P + 11.0 (Umetrics, Umea, Sweden) where multivariate analyses such as principal components analysis (PCA) and orthogonal partial least-squares-discriminant analysis (OPLS-DA) were carried out to investigate and visualize the pattern of metabolite changes in an un-supervised and supervised manner, respectively. PCA was used to validate quality of the analytical system performance and to observe possible outliers. OPLS-DA was applied to obtain an overview of the complete data set and discriminate the variables that are responsible for variation between the groups. The quality of the models was evaluated with the relevant R 2 and Q 2 as well discussed elsewhere 45 . The differential metabolites were selected when the statistically significant threshold of variable influence on projection (VIP) values obtained from the   OPLS-DA model was larger than 1.0. Meanwhile, the p values from a two-tailed Student's t-test on the normalized peak areas were less than 0.05 46 . Log 2 fold change (FC) was used to show how these selected differential metabolites varied between groups. The dataset of selected differential metabolites was imported into MetaboloAnalyst 3.0 for heatmap generation and multivariate statistics. The areas (AUC) under the receiver operating characteristic curves (ROC) were constructed to evaluate the effectiveness of potential biomarkers. Results were considered significant when p value is less than 0.05. Putative metabolites were first derived by searching the exact molecular mass data from redundant m/z peaks against the online HMDB (http://www.hmdb.ca/), METLIN (http://metlin.scripps.edu/) and KEGG (www.genome.jp/kegg/) databases. A specific metabolite was sieved out when a match with a difference between observed and theoretical mass was less than 10 ppm. Then, the metabolite molecular formula of matched metabolites was further identified by the isotopic distribution measurement 47 .