New plasma preparation approach to enrich metabolome coverage in untargeted metabolomics: plasma protein bound hydrophobic metabolite release with proteinase K

Plasma untargeted metabolomics is a common method for evaluation of the mechanisms underlying human pathologies and identification of novel biomarkers. The plasma proteins provide the environment for transport of hydrophobic metabolites. The current sample preparation protocol relies on the immediate precipitation of proteins and thus leads to co-precipitation of a significant fraction of hydrophobic metabolites. Here we present a new simple procedure that overcomes the co-precipitation problem and improves metabolome coverage. Introducing an additional step preceding the protein precipitation, namely limited digestion with proteinase K, allows release of associated metabolites through the relaxation of the native proteins tertiary structure. The modified protocol allows clear detection of hydrophobic metabolites including fatty acids and phospholipids. Considering the potential involvement of the hydrophobic metabolites in human cardiovascular and cancer diseases, the method may constitute a novel approach in plasma untargeted metabolomics.

and it should include a metabolism-quenching step to reflect adequate metabolome composition at the time of sample collection as well as to minimize metabolite losses 6 . Without doubt, a properly chosen and optimized sample preparation procedure is a key factor in the reliable evaluation of a comprehensive metabolic profile and biological interpretation of the data.
In the case of LC-MS-based plasma untargeted metabolomics, the commonly used sample preparation approaches include protein precipitation (PP) 7,8 with various organic solvents, liquid-liquid extraction (LLE) 7 , solid-phase extraction (SPE) 7,8 or ultrafiltration 7 . Nevertheless, protein fraction removal using organic solvents or a mixture thereof, followed by centrifugation and supernatant filtration, constitutes the most frequently used approach in plasma untargeted metabolomics with the use of the LC-MS technique. This procedure is fast, simple, reproducible and provides good plasma metabolome coverage. Importantly, this common approach accounts for the fact that plasma constitutes a hydrophilic environment and thus limits the solubility of hydrophobic metabolites such as lipids, fatty acids, steroids and thyroid hormones. The efficient transport and distribution of these hydrophobic compounds in plasma in vivo is achieved through their interaction with proteins that bind them as substrates, products, cofactors and ligands via relatively strong protein-hydrophobic interactions 9 . Indeed, the most abundant human plasma protein, serum albumin, besides maintaining osmotic pressure, acts as a carrier of hydrophobic hormones, fatty acids and lipids as well as binds toxic substances and drugs 10 . Furthermore, lipoproteins govern the specific transport of cholesterol and phospholipids. Finally, plasma immunoglobulins bind antigens that are often insoluble in a hydrophilic environment to mediate their removal 10 .
Hence, the immediate plasma protein precipitation during untargeted metabolomics procedures leads to co-precipitation of a significant fraction of plasma metabolites that results in their respective intensities being below the limit of detection (LOD). The ablation of these metabolites during the PP step can be a limiting factor for disease related metabolomic analyses. This highlights the necessity for optimization of sample preparation methods to include this plasma protein-associated metabolite fraction. Thus, the main goal of this study was to develop and apply a new simple plasma preparation approach which may overcome the co-precipitation problem and improve metabolome coverage. An additional step, aiming to misfold the native proteins and their complexes to release strongly associated metabolites, such as plasma limited digestion with proteinase K (PK), prior to the global protein fraction precipitation was included in the proposed procedure. PK is a well-known proteolytic enzyme that belongs to the serine protease class produced by Tritirachium album 11 . This enzyme possesses a broad cleavage specificity and breaks the peptide bond adjacent to the carboxylic group of aliphatic and aromatic amino acids which constitutes a useful feature for protein digestion in biological samples 12 . PK is active over a wide pH range, namely 4.3-12.0. The maximum activity of the enzyme is observed at the temperature of 37 °C and requires the presence of calcium ions 12 . Furthermore, the stability of PK in the presence of anionic surfactants including sodium dodecyl sulfate (SDS) and its ability to digest native proteins mean that this enzyme is commonly used in a variety of applications such as preparation of chromosomal DNA for pulsed-filed gel electrophoresis, protein fingerprinting and removal of nucleases from DNA and RNA preparations 12 .
In the present study, a commonly applied plasma preparation procedure was modified with an additional step including incubation with PK. Additionally, various organic solvents for PP in LC-MS-based plasma non-targeted metabolomics were used and evaluated. The developed and optimized approach was tested for plasma samples derived from urogenital tract cancer patients.
The proposed new and optimized procedure increased the number as well as signal intensity of putatively identified endogenous metabolites. Furthermore, the proteinase K procedure allows detection of individual metabolites in plasma samples, significantly enriching plasma metabolome coverage. To the best of the authors' knowledge, this is the first study to apply this new plasma preparation procedure for LC-MS-based untargeted metabolomics. The proposed methodology may provide an enrichment of plasma metabolome coverage, mainly in terms of metabolite groups such as phospholipids, sphingolipids, fatty and bile acids, acylcarnitines and amino acids.
Finally, enriching the procedure with the PK digestion step allows clear detection of hydrophobic metabolites including, among others, hydrophobic amino acids, fatty acids and phospholipids. Given the potential involvement of hydrophobic metabolites in human cardiovascular and cancer diseases, the method proposed here may constitute a novel approach in related metabolomics analysis. Plasma preparation procedures. Plasma samples were prepared with the use of a commonly applied procedure including protein precipitation, centrifugation and supernatant filtration. Three organic solvents, i.e. acetonitrile, methanol and methanol:ethanol (J.T. Baker, Netherlands) (1:1, v/v), were used in a 3:1 (v/v) ratio with plasma and then compared. The second proposed approach includes addition of proteinase K (A&A Biotechnology, Poland; 1 µl, 20 mg/ml, 6 U) into 50 µl (approx. 2-2.5 mg of total protein) of plasma before the PP step using the above-mentioned organic solvents. Hence, the PK activity was not the limiting factor. To ensure optimal activity of PK, 1 µl of 250 mM CaCl 2 (Fluka TM , Switzerland) was added to 50 µl of plasma sample to obtain a 5 mM final concentration. Different plasma incubation times (15,30 and 45 minutes) with PK at 37 °C were tested and compared. The detailed methodology for both procedures is graphically presented in Fig. 1. For sample preparation development, six replicates of pooled plasma were used for each tested procedure. Next, the procedure optimized with the use of PK was applied for plasma samples derived from urogenital tract cancer patients and this was then compared with an approach including only protein precipitation.

Analytical measurements. Plasma metabolic fingerprints were measured with an HPLC system (Agilent
Technologies 1200 series, Waldbronn, Germany) coupled with a Time-of-Flight (TOF) mass analyzer (Agilent Technologies 6224 series, Waldbronn, Germany) equipped with a dual electrospray ionization source. The detailed chromatographic and mass spectrometer parameters are described in the Supplementary information section. The prepared plasma samples were analyzed in a randomized order. Two separate sequences were conducted: first in positive and second in negative ionization mode.
Data processing and metabolite identification. The raw data sets were deconvoluted using MassHunter Qualitative Analysis B.06.00 software (Agilent Technologies, Waldbronn, Germany). The parameters of data extraction were similar to those previously published 13 . To correct the common shift of retention time and measured monoisotopic mass during LC-MS analyses, peak alignment is necessary to mark detected analytical signals as the same features in all measured plasma samples. The alignment step was performed using Mass Profiler Professional B.02.01. (Agilent Technologies, Waldbronn, Germany). After alignment, the obtained data matrices were filtered and normalized using the intensity of the internal standard (IS), namely 1-(4-fluorobenzyl)-5-oxoproline. Only variables present in all plasma samples (n = 6, for each sample preparation procedure evaluated) and with a coefficient of variation (CV) lower than 20% were used for further identification. The filtered analytical signals were putatively identified based on monoisotopic mass, isotopic distribution, formula and hits found in available databases, such as METLIN (www.metlin.scripps.edu), KEGG (www.genome. jp/kegg), LIPIDMAPS (www.lipidmaps.org/), HMDB (www.hmdb.ca) and CEU MassMediator (http://ceumass. eps.uspceu.es/mediator).

Results
Comparison of various organic solvents used for plasma protein precipitation and metabolite extraction. Various organic solvents were used and compared, such as acetonitrile, methanol and methanol:ethanol (1:1, v/v) in a 3:1 (v/v) ratio with plasma. The determined plasma metabolomic profiles were evaluated for both the number of extracted features and reproducibility of signal intensities derived from these features. The obtained results are collected and presented in Table S1 in the Supplementary information section.

Development and optimization of a new plasma preparation procedure with the use of proteinase K.
The commonly applied PP and metabolite extraction approaches using organic solvents were compared with a procedure including additional plasma incubation with PK. Although the new approach with PK was tested and evaluated for all the above-mentioned organic solvents, the results obtained only for the mixture of methanol:ethanol (1:1, v/v) are described and presented here. The results obtained for the comparison of both procedures and different incubation times with proteinase K are presented in Fig. 2. A comparison of results observed for the optimization of the procedure with the use of PK in terms of incubation time is presented in Table 1. The obtained data were evaluated based on the total number measured features, the CV of their intensities, the number of putatively identified endogenous metabolites and the number of peptide fragments.
Importantly, the plasma proteins interact with the metabolites mainly through these proteins tertiary structure hydrophobic interaction (not by covalent bounds) and the mild digestion (that leads to the protein misfolding) is required in order to relax the protein structure and release the metabolites. Too intense or complete digestion of plasma proteins will result in their fragmentation into short peptides, that are very difficult to precipitate, and may reaggregate randomly with metabolites. Furthermore, extensive protein digestion affects matrix effect by changing plasma composition, mainly in terms of short peptide fragments. This may result in ion suppression phenomenon. Ion suppression negatively affects several analytical figures of merit, such as ionization efficiency and detection capability. Therefore, the number of detected features may be reduced with increased plasma incubation time with proteinase K (Table 1). Taken together, too extensive digestion of plasma proteins is the method limiting factor and establishing optimal limited digestion conditions is crucial for the method sensitivity.
The identified metabolites present only in plasma samples prepared by the procedure including incubation with PK were classified in terms of their chemical groups (Fig. 3). Additionally, the intensities of common features detected in plasma samples prepared with the use of both compared procedures were also verified and evaluated. The identified metabolites representing higher abundances for the procedure with proteinase K were categorized based on their chemical classes (Fig. 4).
Application of the newly developed procedure for plasma samples derived from urogenital tract cancer patients. Based on previously described results, the new and optimized procedure including 15 min plasma incubation with proteinase K was applied to plasma samples (n = 10) derived from urogenital tract cancer patients and compared with the commonly used protein precipitation approach. The results of the comparison of plasma metabolic fingerprints obtained after application of both compared sample preparation procedures are presented in Figs 5-6 and Table 2.

ME15
ME30 ME45    The parameters, such as recovery, coverage, repeatability, matrix effects, selectivity as well as the orthogonality of all methods tested, were evaluated and compared. The results obtained in the study showed a high degree of overlap, which resulted in metabolite coverage increases of 34-80% depending on the LC-MS method used as compared to the single extraction procedure (methanol/ethanol precipitation). However, the increase in MS analysis time and sample consumption should be also underlined. Moreover, ion exchange solid-phase extraction and liquid-liquid extraction with methyl-tertbutyl ether were observed to be the most orthogonal methods to methanol-based precipitation 14 .
Among recently published studies, some different approaches have aimed at improvements in plasma metabolome coverage in untargeted metabolomics 7,[18][19][20][21][22] . Godzień et al. developed and applied a one-step extraction method using methyl-tert-butyl-ether consisting of a lipophilic and hydrophilic layer within a single vial insert, in-vial dual extraction (IVDE) 18 . In the case of the lipophilic phase, a 60 min lipid profiling method with LC-qTOF was developed and this enabled identification of fatty acids, glycerolipids, and glycerophospholipids as well as sterols. The aqueous phase of the obtained extract was analyzed with a previously reported method for plasma metabolic fingerprinting 23 . The proposed approach provides detection of over 4,500 metabolic features from a single 20 µl of plasma sample. Yang et al. proposed and optimized a combined LL and SPE approach for LC-MS-based plasma profiling 19 . The new procedure was compared to PP using methanol and spiked internal standards as controls. As a result, five separate fractions enriched for aqueous species, phospholipids, fatty acids, neutral lipids, and hydrophobic lipids were obtained 19 . The combined approach demonstrated better reproducibility with CV's below 15% for the combined procedure as compared to 30% observed for PP. Additionally, the proposed fractionation approach resulted in greater overall plasma metabolome coverage.
Skova et al. compared two different sample preparation techniques prior to LC-qTOF/MS-based plasma metabolic fingerprinting 20 . The first was PP and the second constituted PP followed by SPE resulted in three sub-fractions: phospholipid, lipid and polar. The proposed procedure was able to detect 4234 molecular features measured by the LC-qTOF/MS technique as compared to 1792 signals detected for the PP approach 20 . The proposed and optimized sample preparation procedure was applied for plasma samples collected from rats administered an environmental pollutant, namely perfluorononanoic acid.
Patterson et al. tested Folch, Bligh-Dyer, and Matyash lipid extractions 22 . Additionally, these methods were compared with the phospholipid removal plate procedure. Such a plate was used for lipid extraction and this has been proved to be a robust approach for targeted lipid analysis 22 . Folch and Matyash ensured reproducible recovery for a wide range of lipid classes; however, the Matyash aqueous layer was observed to be comparable with common methanol preparations for extraction and determination of polar metabolites. Therefore, the Matyash procedure was suggested as the best choice for biphasic extraction for untargeted plasma metabolomics and lipidomics 22 .
All the above-mentioned studies constitute recent attempts which have been undertaken to enhance plasma metabolomic profiles. Although improvements in metabolite coverage and good method reproducibility were observed, the proposed procedures are time-consuming and expensive, and include a lot of additional and difficult steps which can generate unwanted sources of analytical variability and therefore may be not suitable for analysis of larger set of biological samples. In the case of the commonly applied protein precipitation approach, the possible issue of co-precipitation still remains unaddressed in plasma untargeted metabolomics 24 .
In the present study, a new procedure with the use of a single additional step including limited digestion with proteinase K was developed and evaluated. The best results for the number of detected and reproducible analytical signals as well as the number of putatively identified metabolites were observed for 15 min plasma incubation with PK (Table 1). However, it should also be underlined that unspecific enrichment of peptide fragments can be observed in the obtained plasma metabolic fingerprints ( Table 1). The new optimized sample preparation approach was applied to plasma samples collected from urogenital tract cancer and compared with methanol/ethanol precipitation previously observed as the most efficient and reproducible procedure ( Table 2). The number of all detected analytical signals using proteinase K procedure was twice as high as that with the typical deproteinization approach ( Table 2). Around 53% of these analytical signals were potentially annotated as endogenous metabolites. The putatively identified metabolites which were present only in plasma samples prepared with the use of proteinase K procedure representing mainly phospholipid, sphingolipid, fatty acid, bile acid, amino acid and cardiolipin classes. Additionally, for around 92% of common analytical signals detected in plasma metabolic profiles for both compared sample preparation approaches, at least twofold higher intensities were observed ( Table 2). Among these analytical signals, metabolites belonging mainly to phospholipids, sphingolipids, carnitines, bile and fatty acids were tentatively identified.
The results of these preliminary studies indicate the beneficial effect of the new plasma sample preparation procedure including incubation with PK for metabolome coverage as well as the intensities of the signals derived from putatively identified endogenous metabolites. The proposed methodology, as compared to previously reported ones, is simple, fast, reproducible and not expensive or time-consuming; therefore, it seems to be a useful approach for plasma untargeted metabolomics which often require quite a large set of samples to provide biologically reliable results.
The individual metabolites within the context of the biological processes. To find how the results of the optimized procedure could improve the biological and biomedical application of untargeted metabolomics analysis, we integrated the uniquely detected small metabolites (Table S2, Supplementary information) with biochemical pathway networks. Our results suggest that the optimized procedure allows the detection of unique crucial small metabolites and could provide an additional dimension of regulatory information. The KEGG pathway analysis (combined in Table S2, Supplementary information) assigned the proteinase K-procedure's unique metabolites into several categories including: amino acids biosynthesis and metabolism, purine and pyrimidine metabolism, sphingolipid metabolism as well as pentose phosphate pathway, biosynthesis of terpenoids and steroids, and biosynthesis of secondary messengers. Although it is highly plausible to argue that the majority of these uniquely detected hydrophobic features were released from plasma albumin upon PK digestion, we cannot exclude the possibility that the identification of some phospholipids was in part a result of lipoprotein degradation. Indeed, our PK enriched procedure resulted in unique detection of phospholipids, such as phosphatidic acid (PA), lysophosphatidic acid (LPA), phosphatidylinositol (PI), diglyceride/diacylglycerol (DG/DAG), phosphatidylinositol bisphosphate (PIP2), and phosphatidylinositol trisphosphate (PIP3). It is highly plausible to suggest that their identification is a direct result of lipoprotein and plasma phospholipid transfer protein (PLTP) digestion [25][26][27][28] . Since plasma phospholipids levels are widely associated with cardiovascular diseases, our modified procedure could provide a novel initial metabolomic insight into these pathologies.
Furthermore, although we have observed enrichment in hydrophobic amino acids that resulted in assignment of the metabolites into protein degradation and absorption pathways, this enrichment could be partially related to PK activity.
Moreover, the procedure resulted in the detection of a large fraction of small metabolites that are usually assigned to plant (e.g., hexadecanedioate) and microbial metabolisms (e.g., ectoine). However, we cannot exclude the possibility that the presence of some of these features which are unusual for human metabolism could be a direct result of hemolysis.

Conclusions
The majority of blood plasma proteins make contact with various small metabolites, and although these encounters may not always have functional consequences, it is obvious that for hydrophobic metabolites these interactions constitute a crucial transport and tissue delivery mechanism. Although achieving an in vivo global insight into these protein encounters fraction could provide important information and novel biomarkers for human pathologies, current metabolomic procedures focus mainly on circulating protein-unassociated small metabolites. As demonstrated in this study, a simple modification of the sample preparation procedure with a step allowing release of plasma protein-associated metabolites resulted in a significant enrichment of detected signals with individual features. The functional assignment of these metabolites covered a wide range of both anabolic and catabolic cellular processes (including hormones, aromatic amino acids and lipids). Although the performance and reproducibility of the new sample preparation approach should be tested and evaluated in a larger set of plasma samples, the modified protocol is simple, fast and easy to apply, and allows clear detection of hydrophobic metabolites including fatty acids and phospholipids. To the best of the authors' knowledge, this is the first study to apply new sample preparation including addition of PK, which is the focus of the issue of metabolite co-precipitation, for LC-MS-based untargeted metabolomics. Taken together, the proposed methodology may provide an enrichment of plasma metabolome coverage mainly in terms of certain metabolite groups, such as phospholipids, sphingolipids, fatty and bile acids, acyl-carnitines and amino acids, and this would seem to be crucial in expanding current knowledge about the pathological processes of various diseases, for instance cancer and cardiovascular disorders.