liputils: a Python module to manage individual fatty acid moieties from complex lipids

Lipidomic analyses address the problem of characterizing the lipid components of given cells, tissues and organisms by means of chromatographic separations coupled to high-resolution tandem mass spectrometry. A number of software tools have been developed to help in the daunting task of mass spectrometry signal processing and cleaning, peak analysis and compound identification, and a typical finished lipidomic dataset contains hundreds to thousands of individual molecular lipid species. To give researchers without specific technical expertise in mass spectrometry the possibility of broadening the exploration of lipidomic datasets, we have developed liputils, a Python module that specializes in the extraction of fatty acid moieties from individual molecular lipids. There is no prerequisite data format, as liputils extracts residues from RefMet-compliant textual identifiers and from annotations of other commercially available services. We provide three examples of real-world data processing with liputils, as well as a detailed protocol on how to readily process an existing dataset, which can be followed with basic informatics skills.


Supplementary Figure S1
Graphical visualization of tabular data. For a quick tutorial on how to use liputils to extract residue information from lipidomics data, the following table is used (A). Data are arranged with samples in columns and analytes in rows. To be processed directly with the liputils make_residues_table() function, a table must avoid column multi-indexing and have lipids as the row index. This table must either be loaded in pandas by specifying to skip the first two rows and to index on the second column, or be edited beforehand so that all unwanted columns and rows are removed. In B, all lipid residues have been counted per sample. NaN stands for "not a number", and it is the way NumPy represents blank or missing values. When saved to a csv or Excel file, these values will result in empty cells. Missing values derive from residues that are present in the data but are not represented in that particular sample.
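The loading step described in this caption can be illustrated with a minimal, standard-library sketch. The file content, the helper name load_table and its arguments are assumptions made for illustration only; in pandas, the same effect is typically obtained with the skiprows and index_col arguments of read_csv or read_excel.

```python
import csv
import io

# Hypothetical export: two descriptive rows, then the real header.
RAW = (
    "Lipidomics export,,,\n"
    "Units: pmol/ug protein,,,\n"
    "Class,Lipid,sample_1,sample_2\n"
    "PC,PC 16:0/18:1,12.5,9.1\n"
    "PE,PE 18:0/20:4,3.2,\n"
)


def load_table(text, skip_rows=2, index_col=1):
    """Skip leading rows and index the table by the lipid column.

    Illustrative helper, not part of liputils. Columns to the left of
    the index column are discarded; empty cells become NaN.
    """
    rows = list(csv.reader(io.StringIO(text)))[skip_rows:]
    header, body = rows[0], rows[1:]
    samples = header[index_col + 1:]
    table = {}
    for row in body:
        values = [float(v) if v else float("nan") for v in row[index_col + 1:]]
        table[row[index_col]] = dict(zip(samples, values))
    return table


table = load_table(RAW)
```

The result has lipids as the row index and a single level of sample columns, which is the shape the caption asks for.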

Fatty acid moieties extracted from stock media. Residues extracted from lipidomic data of the culture media of HepG2 cells, human hepatocyte-like cells (HLCs) and primary human hepatocytes (PHHs) are shown (A-B) (n=1 per sample). HLC and PHH media were chemically defined, while HepG2 medium included 10% FBS, which contains complex lipid molecules. Units were unchanged during the extraction and are the same as in the input data (pmol/µg protein).

[Figure residue: color scale from less abundant to more abundant; units pmol/µg protein; panels A and B]

Graphical output of the online RefMet translator. The RefMet compound name translator, available at https://www.metabolomicsworkbench.org/databases/refmet/name_to_refmet_form.php, attempts to translate non-RefMet-compliant strings into official RefMet compounds (A). For low-resolution mass isomers, a list of possible molecular lipids that can be the parent compound is given (B).

Table 2
Detailed statistically significant differences for each comparison, per residue. Statistically significant differences were determined with ANOVA followed by Tukey's post-hoc test (n=7 subjects per group).

Supplemental Materials and Methods
General description. We have developed liputils as a lightweight Python library focused on text-based recognition of lipid identifiers that can be seamlessly integrated into a Python-based data analysis pipeline. Its primary function is to provide fatty acid moiety information from any RefMet-compliant lipid annotation.
An in-depth, Python-based analysis of a dataset is available online as the Supplemental analysis protocol in PDF form, or in its native Jupyter notebook file format at the following address.
In the main text, the User is presented with a basic protocol that automatically processes lipidomics data in tabular format in order to extract residue information and package the results into a new, easily readable data table. Here, we discuss some useful functions and methods of the library that can be exploited by the more advanced User and that are suited to personalized analysis pipelines.
In this case, liputils reports each residue in the list it returns, but it also returns 3 as the ambiguity index, to reflect the fact that the molecular lipid was not fully resolved and that there were three different but equivalent mass isobars.
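The idea of returning residues together with an ambiguity index can be sketched as follows. This is a minimal illustration, not liputils' own parser: the function name extract_residues, the use of "|" to separate candidate isobars, and the choice of 0 as the index for a fully resolved lipid are all assumptions made for the example.

```python
import re

# Matches residue tokens such as "18:1" (carbons:unsaturations).
RESIDUE = re.compile(r"\d+:\d+")


def extract_residues(identifier):
    """Hypothetical stand-in for a residue extractor.

    Returns (residues, ambiguity_index). Candidate isobars are assumed
    to be separated by "|". The index is the number of candidates when
    there is more than one, and 0 otherwise.
    """
    candidates = [c for c in identifier.split("|") if RESIDUE.search(c)]
    residues = [r for c in candidates for r in RESIDUE.findall(c)]
    ambiguity = len(candidates) if len(candidates) > 1 else 0
    return residues, ambiguity


resolved = extract_residues("PC 16:0/18:1")
isobars = extract_residues("PC 16:0/18:1 | PC 16:1/18:0 | PC 14:0/20:1")
unrecognized = extract_residues("total cholesterol")
```

Under these assumptions, the three-isobar identifier yields all six residues with an ambiguity index of 3, while an identifier with no recognizable residues yields an empty list and an index of 0.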

Functionality of the
When using non-RefMet-compliant lipids, Users are encouraged to use the Lipid.lipid_class() and Lipid.residues() methods instead.
In this case, an empty list is returned together with an ambiguity index of 0.
Defaults to "residues_table".
replace_nan: this replaces any missing value (empty cells) in the input table with the desired value. It defaults to 0.
cleanup: this parameter is used to avoid processing some lipids that may be found in the data but are not meaningful to process. These may include, for example, identifiers with "total" (like "total cholesterol"), or abbreviations thereof (like "TC" for total cholesterol, or "FC" for free cholesterol). The list of unwanted terms is read from another parameter.
absolute_amount: if this is set to True, then the actual number of residues will be counted. The default unit is "picomoles", but this can be changed by passing another unit as a keyword argument. It defaults to False.
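The behaviour described for replace_nan and cleanup can be sketched with plain Python. This is an illustration under stated assumptions, not the library's implementation: the function prepare_table, the parameter name unwanted_terms and the whole-word matching rule are hypothetical.

```python
import math


def is_missing(value):
    """True for None or NaN, the two ways a blank cell may appear here."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def is_unwanted(name, unwanted_terms):
    """True if any unwanted term appears as a whole word (case-insensitive)."""
    words = [w.lower() for w in name.split()]
    return any(term.lower() in words for term in unwanted_terms)


def prepare_table(table, replace_nan=0, cleanup=True,
                  unwanted_terms=("total", "TC", "FC")):
    """Drop non-meaningful identifiers and fill missing values.

    `table` maps each lipid identifier to a list of per-sample values.
    `unwanted_terms` is a hypothetical name for the list of terms.
    """
    prepared = {}
    for lipid, values in table.items():
        if cleanup and is_unwanted(lipid, unwanted_terms):
            continue  # skip e.g. "total cholesterol" or "TC"
        prepared[lipid] = [replace_nan if is_missing(v) else v for v in values]
    return prepared


table = {
    "PC 16:0/18:1": [12.5, float("nan")],
    "total cholesterol": [140.0, 152.0],
    "TC": [140.0, 152.0],
    "PE 18:0/20:4": [None, 3.2],
}
cleaned = prepare_table(table)
kept_all = prepare_table(table, cleanup=False)
```

Matching on whole words rather than substrings is a design choice of this sketch, made so that short abbreviations such as "TC" do not accidentally match inside longer identifiers.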
In addition to the use case presented in the manuscript, a further example can be found on page 5 of the Supplemental analysis protocol.
Two other functions come in handy when processing subsets of residues. To focus on particular residues, it is possible to combine saturated() and max_carbon() to dictate which residues to keep and which to discard. Conveniently, saturated() can be negated via the not operator if the residue is required to have exactly zero unsaturations.
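The kind of filtering described above can be sketched as follows. The helpers parse_residue and keep_residue are hypothetical and written only for illustration; they do not reproduce the signatures or return values of saturated() and max_carbon(), which should be checked in the online documentation.

```python
def parse_residue(residue):
    """Split a residue such as "18:1" into (carbons, unsaturations)."""
    carbons, unsaturations = residue.split(":")
    return int(carbons), int(unsaturations)


def keep_residue(residue, carbon_limit=18, allow_unsaturated=False):
    """Hypothetical filter combining a chain-length and a saturation rule.

    Keeps residues with at most `carbon_limit` carbons and, unless
    `allow_unsaturated` is True, with exactly zero unsaturations.
    """
    carbons, unsaturations = parse_residue(residue)
    if carbons > carbon_limit:
        return False
    if unsaturations > 0 and not allow_unsaturated:
        return False
    return True


residues = ["14:0", "16:0", "18:1", "20:4", "22:6"]
short_saturated = [r for r in residues if keep_residue(r)]
short_any = [r for r in residues if keep_residue(r, allow_unsaturated=True)]
```

A list comprehension of this form can be applied to the row index of a residues table to restrict plots or statistics to the residues of interest.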
Example uses of these functions can be found in the Supplemental analysis protocol, starting from page 16, where they are used to restrict the plots to specific residues.
Online Documentation. As liputils is in active development, the online documentation reflects the latest changes and updates to the library. It can be accessed at the liputils pip package manager page, or in the liputils GitHub repository.