Exploring the rumen fluid metabolome using liquid chromatography-high-resolution mass spectrometry and Molecular Networking

The primary and secondary metabolite content of the rumen is intimately related to its community of bacteria, protozoa, fungi, archaea and bacteriophages, to ingested feed and to the host. Despite the myriad of interactions and novel compounds to be discovered, few studies have explored the rumen metabolome. Here, we present the first study to apply ultra-high performance liquid chromatography-tandem mass spectrometry and a Molecular Networking approach, combined with various extraction methods, to the cell-free rumen fluid of a non-lactating Holstein cow. Putative molecules were annotated by matching accurate-mass fragmentation spectra against the Global Natural Products Social Molecular Networking library and public spectral libraries, or were annotated manually. The combination of five extraction methods resulted in 1,882 observed molecular features. Liquid-liquid extraction yielded the greatest number of molecular features, 1,166 (61.96% of total). Sixty-seven compounds, such as hydrocinnamic acid, azelaic acid and monensin, were annotated using the Global Natural Products Social Molecular Networking library and public libraries. Only 3.56% of the observed molecular features (67) had a positive match with available libraries, which shows the potential of the rumen as a reservoir of novel compounds. The use of untargeted metabolomics in this study provided a snapshot of the rumen fluid metabolome. The complexity of the rumen will long remain unknown, but the use of new tools should be encouraged to foster advances in understanding the rumen metabolome.

246 molecular features were identified, of which 116 were derivatized and extracted using a commercial kit and analyzed using direct flow injection tandem MS. Recently, Artegoitia et al. 9 analyzed rumen fluid collected after slaughter of beef cattle using liquid chromatography-mass spectrometry (LC-MS) to explore potential metabolite markers related to average daily gain. Thirty-three metabolites were reported to be associated with differences in average daily gain. Metabolome investigations require a comprehensive approach to represent the molecular features present in a sample. The combination of extraction methods, analytical techniques and annotation tools is essential to cover metabolites from various chemical classes, especially for complex samples. The recently developed MS-based annotation tool Molecular Networking (MN) allows users to visually and structurally evaluate related metabolites with similar fragmentation patterns 10,11. Thus, molecular families can be clustered within groups on a chemical map, making it possible to comprehensively interpret large metabolomics datasets.
In this study, Liquid-Liquid Extraction (LLE), Solid Phase Extraction (SPE) and three variations of the Quick, Easy, Cheap, Effective, Rugged, and Safe (QuEChERS) extraction method were used to provide a snapshot of the rumen fluid metabolome of a non-lactating dairy cow. The different extraction methods were intended to extract metabolites varying in chemical properties, such as polarity and pKa. We also present the first study using an MS-based MN approach to explore the rumen fluid metabolome.

Animal and diet.
A multiparous non-lactating Holstein cow weighing 572 kg and fitted with a ruminal cannula (10 cm; Bar Diamond Inc., Parma, ID, USA) was used. All animal care and experimental procedures were conducted under the surveillance of the Animal Care and Use Committee of the Universidade Estadual de Maringá, Brazil (protocol no. 9013160518) and met the guidelines of the National Council for the Control of Animal Experimentation (CONCEA). The diet was offered for 21 days and consisted, on a DM basis, of 60% corn silage, 24% corn grain, 10% wheat bran, 4.8% soybean meal, and 1.2% mineral mixture (which contained 480 mg/kg of monensin). The cow was fed at 07 h 00 and 15 h 00 for ad libitum intake and was housed in an individual stall with free access to clean water.
Rumen fluid sampling. On day 21, rumen content was sampled (300 mL) through the ruminal cannula 10 min before the morning feeding. The rumen content was filtered through 4 layers of cheesecloth into an amber glass container on ice. The pH of the rumen fluid (6.6) was determined immediately after sampling using a pH meter (Tecnal, SP, Brazil). Filtered rumen fluid was centrifuged within 30 min at 1,000 g for 5 min at 4 °C, the pellet was discarded, and the supernatant was subsequently centrifuged at 13,000 g for 30 min at 4 °C. The cell-free supernatant was then filtered through 0.45 µm pore size membranes and subsequently through 0.22 µm pore size membranes, both using a Millipore filtering system. Aliquots were stored at −20 °C for one day until extraction procedures were performed.
Liquid-liquid extraction of rumen fluid metabolites. One aliquot of cell-free rumen fluid (10 mL) was mixed with 10 mL of ethyl acetate and 1.0 g of NaCl for the LLE (Supplementary Fig. S1). The solution was vortexed for 2 min and allowed to separate into two phases. The upper organic layer was decanted, and the bottom layer was mixed with another 10 mL of ethyl acetate and 1.0 g of NaCl. This procedure was repeated one more time. The organic phases were combined and concentrated under nitrogen flow (LLE-1). Similar procedures were repeated using ethyl acetate with 1% acetic acid (v:v; LLE-2) or ethyl acetate with 1% ammonium hydroxide (v:v; LLE-3) instead of ethyl acetate alone. Extracts were analyzed using ultra-high performance liquid chromatography-tandem mass spectrometry (UHPLC-MS/MS).
Solid phase extraction of rumen fluid metabolites. Solid phase extractions (Supplementary Fig. S2) were performed using different solid phase characteristics (C18 and CN) and elution solvents (acetonitrile and methanol). C18 and CN cartridges contained 1 g of sorbent and had a reservoir volume of 6 mL. The elution rate was 6 mL min⁻¹. C18 cartridges were activated with 5 mL of acetonitrile followed by conditioning with 10 mL H2O. Homogenized cell-free rumen fluid (6 mL) was loaded into C18 cartridges and collected for analysis (SPE-1B). The cartridge was washed with 10 mL H2O, followed by elution with 6 mL acetonitrile (SPE-1A). C18 cartridges were activated with 5 mL of methanol followed by conditioning with 10 mL H2O. Homogenized cell-free rumen fluid (6 mL) was loaded into C18 cartridges and collected for analysis (SPE-2B). The cartridge was washed with 10 mL H2O, followed by elution with 6 mL methanol (SPE-2A). CN cartridges were activated with 5 mL of acetonitrile followed by conditioning with 10 mL H2O. Homogenized cell-free rumen fluid (6 mL) was loaded into CN cartridges and collected for analysis (SPE-3B). The cartridge was washed with 10 mL H2O, followed by elution with 6 mL acetonitrile (SPE-3A). CN cartridges were activated with 5 mL of methanol followed by conditioning
Original QuEChERS extraction of rumen fluid metabolites. Homogenized cell-free rumen fluid (10 mL) was mixed with an equivalent volume of acetonitrile and vortexed for 1 min for the QuEChERS procedure 12 (Supplementary Fig. S3). Four grams of anhydrous MgSO4 and 1 g of NaCl were added to the solution, which was vortexed for 1 min. The solution was centrifuged for 5 min at 5,000 g. The supernatant was collected, 2 mL were split into 2 aliquots for cleanup, and the remainder (5 mL) was concentrated under nitrogen flow (OQ-1). A 1 mL aliquot of the upper acetonitrile layer was mixed with 150 mg anhydrous MgSO4 and 25 mg PSA sorbent, vortexed for 30 s and centrifuged for 1 min at 6,000 g (OQ-2). The OQ-2 procedure was repeated with 25 mg of C18 instead of PSA (OQ-3). Extracts were analyzed using UHPLC-MS/MS.

Buffered QuEChERS extraction of rumen fluid metabolites.
Homogenized cell-free rumen fluid (10 mL) was mixed with an equivalent volume of 1% acetic acid in acetonitrile (v:v), 4 g anhydrous MgSO4 and 1.7 g NaOAc·3H2O for the buffered QuEChERS procedure 13 (Supplementary Fig. S4). The solution was vortexed for 1 min and centrifuged for 3 min at 11,000 g. The supernatant was collected, 2 mL were split into 2 aliquots for the cleanup step, and the remainder (5 mL) was concentrated under nitrogen flow (BQ-1). A 1 mL aliquot was mixed with 150 mg anhydrous MgSO4 and 50 mg PSA sorbent, vortexed for 20 s and centrifuged for 1 min at 6,000 g (BQ-2). The BQ-2 procedure was repeated with 50 mg of C18 instead of PSA (BQ-3). Homogenized cell-free rumen fluid (10 mL) was also mixed with an equivalent volume of 1% acetic acid in methanol (v:v), 4 g anhydrous MgSO4 and 1.7 g NaOAc·3H2O. The solution was vortexed for 1 min and centrifuged for 3 min at 11,000 g. The supernatant was split into 2 aliquots of 1 mL for the cleanup step. A 1 mL aliquot was mixed with 150 mg anhydrous MgSO4 and 50 mg PSA sorbent, vortexed for 20 s and centrifuged for 1 min at 6,000 g (BQ-4). The BQ-4 procedure was repeated with 50 mg of C18 instead of PSA (BQ-5). Extracts were analyzed using UHPLC-MS/MS.

Acid-base QuEChERS extraction of rumen fluid metabolites. Homogenized cell-free rumen fluid was mixed with 1% acetic acid in acetonitrile (v:v) and 6 g anhydrous Na2SO4 according to Wang et al. 14 with minor modifications (Supplementary Fig. S5). The solution was vortexed for 2 min and centrifuged for 10 min at 6,000 g, and the upper layer was collected (acid phase). The bottom layer was mixed with 10 mL of a solution of 1% ammonium hydroxide in acetonitrile (v:v). The extract was vortexed for 2 min and centrifuged for 10 min at 6,000 g, and the upper layer was collected (basic phase). A 5 mL aliquot from each of the acid and basic phases was taken and the two were mixed, and the remaining aliquots were concentrated separately under nitrogen flow (ABQ-1 and ABQ-2, respectively). The combined 10 mL aliquot was split into two aliquots. One aliquot was mixed with 0.5 g NaAc, 50 mg C18 and 75 mg PSA sorbent, vortexed for 2 min and centrifuged for 10 min at 6,000 g (ABQ-3). The other aliquot was mixed with 0.5 g NaAc and 50 mg C18, vortexed for 2 min and centrifuged for 10 min at 6,000 g (ABQ-4). Extracts were analyzed using UHPLC-MS/MS.

Untargeted metabolomics analysis.
Extracts were analyzed using an ultra-high performance liquid chromatograph (Shimadzu, Nexera X2, Japan) coupled to a hybrid quadrupole time-of-flight high-resolution mass spectrometer (Impact II, Bruker Daltonics Corporation, Germany) equipped with an electrospray ionization source. Chromatographic separation was performed on an Acquity UHPLC® CSH™ C18 column (135 Å pore size, 1.7 µm particle size, 2.1 × 100 mm; Waters, UK) at a flow rate of 0.2 mL min⁻¹. The gradient of solvents A (H2O with 0.1% formic acid; v:v) and B (acetonitrile with 0.1% formic acid; v:v) was as follows: 5% B 0-1 min, 50% B 1-5 min, 95% B 5-10 min, maintained at 95% B 10-16 min, 5% B 16-18 min, and maintained at 5% B 18-21 min, at 40 °C. The instrument was operated in positive and negative ionization modes, with the capillary voltage set at 4500 and 3000 V, respectively, and an end plate offset potential of −500 V. The dry gas was set to 8 L min⁻¹ at 180 °C with a nebulization gas pressure of 4 bar. Data were collected from m/z 50 to 1300 at an acquisition rate of 5 Hz, and the 4 ions of interest were selected by auto MS/MS scan fragmentation. The molecular networking approach required the conversion of raw mass spectrometry data into the mzXML file format, followed by upload to the Global Natural Products Social Molecular Networking (GNPS) platform to generate the MN, according to GNPS guidelines 10 (Supplementary Method S1). The GNPS approach consists of comparing fragmentation spectra (MS/MS experiments) and grouping molecules with similar chemical structures. Each spectrum is represented as a node in the visual MN, and spectrum-to-spectrum alignments are represented as lines that connect nodes, evidencing ions that are correlated with each other. Connections between nodes reflect similarities between the fragmentation spectra of the ions, and similarity between fragmentation patterns is evaluated via vector relation.
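The spectrum-to-spectrum comparison described above can be illustrated with a minimal sketch in which each MS/MS spectrum is treated as a vector of peak intensities and two spectra are compared by a cosine score. This is not the GNPS implementation: the function name `cosine_score`, the 0.02 Da matching tolerance, the square-root intensity scaling and the toy spectra are all assumptions made for illustration.

```python
import math


def cosine_score(spec_a, spec_b, tol=0.02):
    """Cosine similarity between two centroided MS/MS spectra.

    Each spectrum is a list of (m/z, intensity) pairs. Two peaks are
    paired when their m/z values differ by at most `tol` (assumed, in Da).
    """
    # Square-root scaling damps dominant peaks (an illustrative choice).
    a = [(mz, math.sqrt(inten)) for mz, inten in spec_a]
    b = [(mz, math.sqrt(inten)) for mz, inten in spec_b]
    norm_a = math.sqrt(sum(w * w for _, w in a))
    norm_b = math.sqrt(sum(w * w for _, w in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0

    # All candidate peak pairs within tolerance, largest products first.
    candidates = sorted(
        (
            (wa * wb, ia, ib)
            for ia, (mza, wa) in enumerate(a)
            for ib, (mzb, wb) in enumerate(b)
            if abs(mza - mzb) <= tol
        ),
        reverse=True,
    )

    # Greedy one-to-one assignment: each peak is used at most once.
    used_a, used_b, dot = set(), set(), 0.0
    for product, ia, ib in candidates:
        if ia not in used_a and ib not in used_b:
            used_a.add(ia)
            used_b.add(ib)
            dot += product
    return dot / (norm_a * norm_b)


# Hypothetical toy spectra, not taken from the study.
spectrum_1 = [(100.05, 10.0), (150.10, 40.0), (200.15, 5.0)]
spectrum_2 = [(100.05, 12.0), (150.11, 35.0), (250.20, 8.0)]
unrelated = [(300.00, 20.0), (400.00, 20.0)]

print(round(cosine_score(spectrum_1, spectrum_2), 3))
print(cosine_score(spectrum_1, unrelated))
```

In a network built this way, an edge would be drawn between two nodes only when the score exceeds a chosen threshold, so spectra that share most of their fragments end up in the same cluster.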
Structurally related molecules exhibit similar fragmentation patterns; therefore, molecular families tend to group together in the MN, in what are referred to as clusters. In addition, it is possible to determine the m/z difference between nodes, defining spectral proximity between all MS/MS spectra in a dataset. This tool facilitates the analysis of large datasets. It also permits the comparison of molecular features to the GNPS spectral library and all publicly available data. Molecular networking was visualized using Cytoscape 15. Metabolites with a positive match with the GNPS library had both parent and fragment ions manually compared with the GNPS spectral library and publicly available data (Supplementary Figs S6-S72). Mass error was lower than 10 ppm (Supplementary Table S1). A Venn diagram was constructed in Microsoft Excel from molecular features data exported from GNPS (Fig. 1).
Results
Of the 1,882 molecular features observed, 67 were annotated (Supplementary Tables S1 and S2), which represents 3.56% of the observed molecular features. Data compiled by extraction method (LLE, SPE, original QuEChERS, buffered QuEChERS and acid-base QuEChERS) were used to generate a Venn diagram (Fig. 1). A total of 186 molecular features (9.88% of total) were present in all five tested extraction methods, of which 27 were identified. Liquid-liquid extraction yielded the greatest number of molecular features: 1,166, which was 61.96% of total. Moreover, LLE extracts had 614 exclusive molecular features (32.62% of total). The LLE also provided the greatest number of annotated molecular features: 57 of 67. Compounds such as histamine and tyramine (Supplementary Table S1) were only present when ammonium hydroxide in ethyl acetate was used (LLE-3). Syringic acid, decanedioic acid, dodecanedioic acid and 1,11-undecanedioic acid were present when acidified ethyl acetate (LLE-2) was used. Phenacylamine, niranthin and monensin were present in all LLE extracts.
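The shared and exclusive counts behind a Venn diagram, and the percentages quoted above, can be derived with simple set operations and a division by the 1,882 total. The sketch below is only an illustration: the feature identifiers in `toy_features` are hypothetical, the helper names are assumptions, and the study itself built its diagram in Microsoft Excel.

```python
TOTAL_FEATURES = 1882  # total molecular features reported in the text


def pct_of_total(count, total=TOTAL_FEATURES):
    """Share of the total, as a percentage rounded to two decimals."""
    return round(100 * count / total, 2)


def shared_by_all(features_by_method):
    """Features detected by every extraction method."""
    return set.intersection(*features_by_method.values())


def exclusive_to(features_by_method, method):
    """Features detected by `method` and by no other method."""
    others = set().union(
        *(feats for name, feats in features_by_method.items() if name != method)
    )
    return features_by_method[method] - others


# Hypothetical feature identifiers, for illustration only.
toy_features = {
    "LLE": {"f1", "f2", "f3", "f4", "f7"},
    "SPE": {"f1", "f2", "f5"},
    "OQ": {"f1", "f3"},
    "BQ": {"f1", "f2", "f6"},
    "ABQ": {"f1", "f4", "f6"},
}

print(shared_by_all(toy_features))        # features common to all five methods
print(exclusive_to(toy_features, "LLE"))  # features found only by LLE

# Counts taken from the text, converted to shares of the 1,882 total.
for label, count in [("LLE total", 1166), ("LLE exclusive", 614),
                     ("all five methods", 186), ("annotated", 67)]:
    print(label, pct_of_total(count))
```

Keeping the feature lists as sets per method makes it straightforward to recompute any region of the diagram if an extraction method is added or removed.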
The main mammalian enterolignan produced in the rumen, enterolactone (EL), was extracted by the LLE-1, LLE-3, SPE-1, SPE-2, SPE-3, SPE-4 and ABQ-4 methods. The molecular features 3-(2-hydroxyphenyl)propanoate, 12,13-EpOME and 12,13-DiHOME were present in multiple extractions.
The original QuEChERS extraction method resulted in the extraction of 531 molecular features (24.92% of total) and had the lowest number of exclusive molecular features (38; 2.02% of total). A total of 34 molecular features extracted using the original QuEChERS were annotated. Acid-base QuEChERS and buffered QuEChERS extracted a total of 707 and 531 molecular features, respectively, corresponding to 37.57 and 28.21% of the total observed molecular features. Acid-base QuEChERS and buffered QuEChERS had more exclusive molecular features than the original QuEChERS: 109 (5.79% of total) and 177 (9.40% of total), respectively. A total of 50 and 45 identified molecular features were extracted by the acid-base QuEChERS and buffered QuEChERS methods, respectively.
Mass spectrometry-based Molecular Networking of the rumen fluid. The five extraction methods yielded a diversity of annotated compounds, such as amino acids, dicarboxylic acids, carboxylic acids, lactones, lignans, fatty acid derivatives and indole compounds. Molecular features were visualized using the GNPS (Fig. 2). Four clusters with identified compounds are presented. Cluster A (Fig. 2A) was formed by 12,13-EpOME (m/z 297.243) and 12,13-DiHOME (m/z 315.253). Cluster B (Fig. 2B)

Discussion
The extraordinary activity of microorganisms is based on their remarkable metabolic diversity and genetic adaptability, which makes them an important source of genetic resources for biotechnological advancement and sustainable development. The success of biotechnological processes is directly related to the diversity of microorganisms and the molecules they produce as a result of primary and secondary metabolism, as well as to the conservation of the genetic resources they provide. The rumen fluid is a complex matrix composed of a myriad of microorganisms, which interact among themselves and with the host, and degrade plant material (i.e. cellulose, hemicellulose, lignin, starch, protein and a small amount of oil) from various sources. Intra- and interspecies interactions are responsible for the synergistic effect on the production of volatile fatty acids and microbial protein in the rumen. Although the ability of the ruminant to degrade cellulose depends on its microbiota, a reference microbial genome catalog of the rumen was published only recently, and a large portion of microorganisms remains unknown 16. Furthermore, many of the rumen bacteria remain uncultured and uncharacterized, a problem that is even greater for eukaryotes 17. Thus, the use of LC-MS favors the exploration of the interactome, rather than the metagenome. This is an important feature, as integrative omics tools are only now becoming available. For example, the exploration of the metabolome can result in the discovery of novel compounds, such as antimicrobials 18 or other bioactive molecules 19.
To date, the majority of rumen metabolome studies have used UHPLC-MS, gas chromatography-mass spectrometry and nuclear magnetic resonance, with the latter being the most used due to its reliability and absolute quantification 20. However, UHPLC-MS has the advantage of greater sensitivity and coverage. Thus, a single UHPLC-MS run can generate a huge amount of data. For example, we obtained 432,391 MS spectra in this study. To overcome the difficulty of analyzing this massive amount of data, the GNPS MN was used to comprehensively visualize the dataset and to prospect for novel compounds. Another challenge of metabolomics studies lies in our ability to annotate unknown molecular features or their origins. Indeed, there is evidence that only 1.8% of spectra in an untargeted metabolomics experiment can be annotated 21. To overcome this issue, metabolite spectra can be compared, based on parent and fragment ion similarities, to publicly available molecular matches.
Recent studies exploring the rumen fluid metabolome have mainly focused on how the metabolites are influenced by different diets. For example, organic acids, amino acids, amines, sugars and nucleosides/nucleotides, which are the core of the rumen fluid, are affected when dairy cows are fed diets varying in levels of concentrate 22. Using targeted approaches and a combination of analytical methods on the rumen fluid of cows fed diets varying in levels of barley, Saleem et al. 8 observed 246 metabolites, including phospholipids, inorganic ions, gases, amino acids, dicarboxylic acids, fatty acids, volatile fatty acids, glycerides, carbohydrates and cholesterol esters. Aiming to identify markers of rumen function that could lead to differences in average daily gain, Artegoitia et al. 9 observed 1,429 molecular features using an untargeted metabolomics approach. However, a large number of animals was used in that study (n = 16).
In our study, 1,882 molecular features were observed and 67 were identified, which was 3.56% of total. The sources of the uncharacterized molecular features may include molecules secreted by ruminal microorganisms, plants, the host, and administered drugs. This offers a possibility to further explore the ruminant in its environment. For example, bacterial secondary metabolites can be used to identify antimicrobial resistance 23. Moreover, due to the vast microbial interaction and competition inside the rumen, the rumen is a pool of novel antimicrobials 24. Furthermore, the GNPS MN is a collaborative platform that improves with the spectra deposited by users. The rumen is still a poorly explored environment, and the deposition of rumen metabolome datasets is needed.
The liquid-liquid extraction method resulted in the greatest number of molecular features (61.96% of total) and also in the greatest number of annotated compounds. Liquid-liquid extraction extracts compounds of moderate polarity as well as nonpolar or hydrophobic compounds. The pH of the extraction can also be explored, as it influences the solubility of acidic and basic compounds. Thus, pH modifiers can increase the abundance of ionizable compounds in the organic phase 25,26. Alternatively, SPE methods using octadecylsilane (C18) and cyano (CN) phase cartridges focus on compounds of low and medium polarity, respectively. To increase the number of extracted compounds, solvents with different elution strengths can be used. Lastly, QuEChERS was used as an alternative extraction method for compounds of medium and low polarity, with lower material and labor costs than SPE. QuEChERS extraction methods are based on an extraction step involving partitioning with an organic solvent, followed by a clean-up step with sorbents, which improves the selectivity of the method.
The molecular feature 3-(2-hydroxyphenyl)propanoic acid ([M]−; m/z 165.056), which was present in multiple extractions, is metabolized from cinnamic acid 27 and is produced by the anaerobic bacterium Clostridium xylanolyticum 28. Another annotated compound was EL, which has the potential to improve milk stability and the immune system of the dairy cow, and to prevent cardiovascular diseases, osteoporosis and diabetes in humans 29 (Fig. 2C), being the latter produced by lactic acid bacteria 31,32. Two oxylipins, 12,13-EpOME (m/z 297.243) and 12,13-DiHOME (m/z 315.253), which were also identified by Artegoitia et al. 9, had a match with the GNPS. These are bioactive lipids derived from polyunsaturated fatty acids such as linoleic acid. Epoxidation of linoleic acid produces epoxyoctadecenoic acid (12,13-EpOME), and hydration of the epoxide produces dihydroxyoctadecenoic acid (12,13-DiHOME). The modulation of these products can be associated with processes of inflammation in dairy cattle 33,34.
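The relationship between the two oxylipins can be checked arithmetically from the m/z values reported above: the difference between the two nodes should correspond to the mass of one water molecule if 12,13-DiHOME is the hydration product of 12,13-EpOME. The sketch below is a minimal illustration under stated assumptions: the helper names are hypothetical, the molecular formula C18H32O3 for 12,13-EpOME and the [M+H]+ adduct are assumptions not stated in the text, and the 10 ppm limit is the mass error criterion given in the methods.

```python
# Monoisotopic atomic masses in Da.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727646  # mass of a proton, used for the assumed [M+H]+ adduct


def mono_mass(formula):
    """Monoisotopic mass of a formula given as {element: count}."""
    return sum(MONOISOTOPIC[element] * n for element, n in formula.items())


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


# m/z values reported in the text for the two nodes of Cluster A.
epome_mz = 297.243
dihome_mz = 315.253

# Difference between the nodes versus the mass of H2O.
delta = dihome_mz - epome_mz
water = mono_mass({"H": 2, "O": 1})
print(f"node difference: {delta:.3f}  water: {water:.3f}")

# Assumed formula C18H32O3 and assumed [M+H]+ adduct for 12,13-EpOME.
epome_theoretical = mono_mass({"C": 18, "H": 32, "O": 3}) + PROTON
print(f"12,13-EpOME mass error: {ppm_error(epome_mz, epome_theoretical):.1f} ppm")
```

The same difference calculation applies to any pair of connected nodes, which is how recurring mass shifts between neighbors in a cluster can hint at the chemical transformation that relates them.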
One important feature of the MN is the possibility to visualize structural similarities between molecules and the connections based on the similarities of their corresponding MS/MS fragmentation patterns. Therefore, the confident annotation of a compound in the MN can lead to the identification of neighboring nodes, which allows a more in-depth annotation and the discovery of novel compounds. Indeed, we were able to explore the MN using this approach.
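The idea of using an annotated node as a starting point for its neighbors can be sketched as a simple graph lookup. The example below is only an illustration, not the procedure used in the study: the function name `unannotated_neighbors`, the node identifiers and the edge list are hypothetical.

```python
def unannotated_neighbors(edges, annotations):
    """Map each annotated node to its directly connected unannotated nodes.

    `edges` is a list of (node, node) pairs from the network, and
    `annotations` maps annotated node identifiers to compound names.
    """
    # Build an undirected adjacency map from the edge list.
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    return {
        node: sorted(n for n in adjacency.get(node, ()) if n not in annotations)
        for node in annotations
    }


# Hypothetical network: nodes 1-3 form one cluster, nodes 4-5 another.
toy_edges = [(1, 2), (2, 3), (4, 5)]
toy_annotations = {2: "library-matched compound"}

candidates = unannotated_neighbors(toy_edges, toy_annotations)
print(candidates)  # nodes worth inspecting manually next to the annotated one
```

Each candidate returned this way would still need manual inspection of its fragmentation spectrum before any structure is proposed.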

Conclusions
This investigation of the rumen fluid metabolome of a non-lactating Holstein cow using different extraction methods, liquid chromatography-mass spectrometry and Molecular Networking analysis resulted in 1,882 observed molecular features, which is the greatest number of molecular features reported to date. The extraction methods differed in the number of exclusive molecular features they yielded. The liquid-liquid extraction method had more exclusive and total molecular features than the other methods. Only 67 of the observed molecular features had a positive match with the Global Natural Products Social Molecular Networking database and public libraries, which corresponded to 3.56% of the total molecular features. Of those, amino acids, dicarboxylic acids, carboxylic acids, lactones, lignans, fatty acid derivatives and indole compounds were observed. Four clusters observed in the Molecular Networking were presented, and two clusters were manually explored so that novel compounds could be identified. This highlights the use of ultra-high performance liquid chromatography-tandem mass spectrometry and Molecular Networking to explore the potential of the rumen as a reservoir of novel compounds.

Data Availability Statement
The data set has been submitted to Global Natural Products Social Molecular Networking (GNPS) and is available via [https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=ab6ca77deedc4c65bc71b3043608c415], study identifier [171122_LRC4_attributesok].